If we are resetting just one engine, we know it has stalled. So we can
pass the stalled parameter directly to i915_gem_reset_engine(), which
alleviates the necessity to poke at the generic engine->hangcheck.stalled
magic variable, leaving that under control of hangcheck as its name
implies. Other than simplifying by removing the indirect parameter along
this path, this allows us to introduce new reset mechanisms that run
independently of hangcheck.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180406220354.18911-1-chris@chris-wilson.co.uk
We don't handle resetting the kernel context very well, or presumably any
context executing its breadcrumb commands in the ring as opposed to the
batchbuffer and flush. If we trigger a device reset twice in quick
succession while the kernel context is executing, we may end up skipping
the breadcrumb. This is really only a problem for the selftest as
normally there is a large interlude between resets (hangcheck), or we
focus on resetting just one engine and so avoid repeatedly resetting
innocents.
Something to try would be a preempt-to-idle to quiesce the engine before
reset, so that innocent contexts would be spared the reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
CC: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180330131801.18327-1-chris@chris-wilson.co.uk
After
commit dd9f31c7a3
Author: Imre Deak <imre.deak@intel.com>
Date: Wed Aug 16 17:46:07 2017 +0300
drm/i915/gen9+: Set same power state before hibernation image
save/restore
during hibernation/suspend the power domain functionality got disabled,
after which resume could leave it incorrectly disabled if the ACPI
target state was S0 during suspend and i915 was not loaded by the loader
kernel.
This was caused by not considering if we resumed from hibernation as the
condition for power domains reiniting.
Fix this by simply tracking if we suspended power domains during system
suspend and reinit power domains accordingly during resume. This will
result in reiniting power domains always when resuming from hibernation,
regardless of the platform and whether or not i915 is loaded by the
loader kernel.
The reason we didn't catch this earlier is that the enabled/disabled
state of power domains during PMSG_FREEZE/PMSG_QUIESCE is platform
and kernel config dependent: on my SKL the target state is S4
during PMSG_FREEZE and (with the driver loaded in the loader kernel)
S0 during PMSG_QUIESCE. On the reporter's machine it's S0 during
PMSG_FREEZE but (contrary to this) power domains are not initialized
during PMSG_QUIESCE since i915 is not loaded in the loader kernel, or
it's loaded but without the DMC firmware being available.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105196
Reported-and-tested-by: amn-bas@hotmail.com
Fixes: dd9f31c7a3 ("drm/i915/gen9+: Set same power state before hibernation image save/restore")
Cc: amn-bas@hotmail.com
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180322143642.26883-1-imre.deak@intel.com
We were fetching uC firmwares in separate uc_init_fw step, while
there is no reason why we can't fetch them during init_early.
This will also simplify upcoming patches, as size of the firmware
may be used for register initialization.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180323123451.59244-2-michal.wajdeczko@intel.com
In upcoming patch, we want to perform more actions in early
initialization of the uC. This reordering will help resolve
new dependencies that will be introduced by future patch.
v2: s/i915_gem_load_init/i915_gem_init_early (Chris)
v3: s/i915_gem_load_cleanup/i915_gem_cleanup_early (Michal)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180323123451.59244-1-michal.wajdeczko@intel.com
We try to keep all HuC related code in dedicated file.
There is no need to peek HuC register directly during
handling getparam ioctl.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180314200429.40132-1-michal.wajdeczko@intel.com
Not all callers want the GPU error to handled in the same way, so expose
a control parameter. In the first instance, some callers do not want the
heavyweight error capture so add a bit to request the state to be
captured and saved.
v2: Pass msg down to i915_reset/i915_reset_engine so that we include the
reason for the reset in the dev_notice(), superseding the earlier option
to not print that notice.
v3: Stash the reason inside the i915->gpu_error to handover to the direct
reset from the blocking waiter.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180320100449.1360-2-chris@chris-wilson.co.uk
In Gen11, the Video Decode engines (aka VDBOX, aka VCS, aka BSD) and the
Video Enhancement engines (aka VEBOX, aka VECS) could be fused off. Also,
each VDBOX and VEBOX has its own power well, which only exist if the
related engine exists in the HW.
Unfortunately, we have a Catch-22 situation going on: we need the blitter
forcewake to read the register with the fuse info, but we cannot initialize
the forcewake domains without knowin about the engines present in the HW.
We workaround this problem by allowing the initialization of all forcewake
domains and then pruning the fused off ones, as per the fuse information.
Bspec: 20680
v2: We were shifting incorrectly for vebox disable (Vinay)
v3: Assert mmio is ready and warn if we have attempted to initialize
forcewake for fused-off engines (Paulo)
v4:
- Use INTEL_GEN in new code (Tvrtko)
- Shorter local variable (Tvrtko, Michal)
- Keep "if (!...) continue" style (Tvrtko)
- No unnecessary BUG_ON (Tvrtko)
- WARN_ON and cleanup if wrong mask (Tvrtko, Michal)
- Use I915_READ_FW (Michal)
- Use I915_MAX_VCS/VECS macros (Michal)
v5: Rebased by Rodrigo fixing conflicts on top of:
"drm/i915: Simplify intel_engines_init"
v6: Fix v5. Remove info->num_rings. (by Oscar)
v7: Rebase (Rodrigo).
v8:
- s/intel_device_info_fused_off_engines/
intel_device_info_init_mmio (Chris)
- Make vdbox_disable & vebox_disable local variables (Chris)
v9:
- Move function declaration to intel_device_info.h (Michal)
- Missing indent in bit fields definitions (Michal)
- When RC6 is enabled by BIOS, the fuse register cannot be read until
the blitter powerwell is awake. Shuffle where the fuse is read, prune
the forcewake domains after the fact and change the commit message
accordingly (Vinay, Sagar, Chris).
v10:
- Improved commit message (Sagar)
- New line in header file (Sagar)
- Specify the message in fw_domain_reset applies to ICL+ (Sagar)
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180316121456.11577-1-mika.kuoppala@linux.intel.com
[Mika: soothe checkpatch on commit msg]
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Those two concepts are really separate. Since GuC is writing data into
its own buffer and we even provide a way for userspace to read directly
from it using i915_guc_log_dump debugfs, there's no real reason to tie
log level with relay creation.
Let's create a separate debugfs, giving userspace a way to create a
relay on demand, when it wants to read a continuous log rather than a
snapshot.
v2: Don't touch guc_log_level on relay creation error, adjust locking
after rebase, s/dev_priv/i915, pass guc to file->private_data (Sagar)
Use struct_mutex rather than runtime.lock for set_log_level
v3: Tidy ordering of definitions (Sagar)
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180319095348.9716-5-michal.winiarski@intel.com
If we fail to reset the GPU, we declare the machine wedged. However, the
GPU may well still be running in the background with an in-flight
request. So despite our efforts in cleaning up the request queue and
faking the breadcrumb in the HWSP, the GPU may eventually write the
in-flght seqno there breaking all of our assumptions and throwing the
driver into a deep turmoil, wedging beyond wedged.
To avoid this we ideally want to reset the GPU. Since that has already
failed, make sure the rings have the stop bit set instead. This is part
of the normal GPU reset sequence, but that is actually disabled by
igt/gem_eio to force the wedged state. If we assume the worst, we must
poke at the bit again before we give up.
v2: Move the intel_gpu_reset() from set-wedged in the reset error path
into i915_gem_set_wedged() itself. Even if the reset fails (e.g. if it is
disabled by gem_eio), it still tries to make sure the engines are
stopped. For i915_gem_set_wedged() callers from outside of i915_reset(),
this should make sure the GPU is disabled while the driver is marked as
being wedged.
Testcase: igt/gem_eio
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180315151015.22741-1-chris@chris-wilson.co.uk
Hardware may have specific restrictions on GuC WOPCM offset and size. On
Gen9, the value of the GuC WOPCM size register needs to be larger than the
value of GuC WOPCM offset register + a Gen9 specific offset (144KB) for
reserved GuC WOPCM. Fail to enforce such a restriction on GuC WOPCM size
will lead to GuC firmware execution failures. On the other hand, with
current static GuC WOPCM offset and size values (512KB for both offset and
size), the GuC WOPCM size verification will fail on Gen9 even if it can be
fixed by lowering the GuC WOPCM offset by calculating its value based on
HuC firmware size (which is likely less than 200KB on Gen9), so that we can
have a GuC WOPCM size value which is large enough to pass the GuC WOPCM
size check.
This patch updates the reserved GuC WOPCM size for RC6 context on Gen9 to
24KB to strictly align with the Gen9 GuC WOPCM layout. It also adds support
to verify the GuC WOPCM size aganist the Gen9 hardware restrictions. To
meet all above requirements, let's provide dynamic partitioning of the
WOPCM that will be based on platform specific HuC/GuC firmware sizes.
v2:
- Removed intel_wopcm_init (Ville/Sagar/Joonas)
- Renamed and Moved the intel_wopcm_partition into intel_guc (Sagar)
- Removed unnecessary function calls (Joonas)
- Init GuC WOPCM partition as soon as firmware fetching is completed
v3:
- Fixed indentation issues (Chris)
- Removed layering violation code (Chris/Michal)
- Created separat files for GuC wopcm code (Michal)
- Used inline function to avoid code duplication (Michal)
v4:
- Preset the GuC WOPCM top during early GuC init (Chris)
- Fail intel_uc_init_hw() as soon as GuC WOPCM partitioning failed
v5:
- Moved GuC DMA WOPCM register updating code into intel_wopcm.c
- Took care of the locking status before writing to GuC DMA
Write-Once registers. (Joonas)
v6:
- Made sure the GuC WOPCM size to be multiple of 4K (4K aligned)
v8:
- Updated comments and fixed naming issues (Sagar/Joonas)
- Updated commit message to include more description about the hardware
restriction on GuC WOPCM size (Sagar)
v9:
- Minor changes variable names and code comments (Sagar)
- Added detailed GuC WOPCM layout drawing (Sagar/Michal)
- Refined macro definitions to be reader friendly (Michal)
- Removed redundent check to valid flag (Michal)
- Unified first parameter for exported GuC WOPCM functions (Michal)
- Refined the name and parameter list of hardware restriction checking
functions (Michal)
v10:
- Used shorter function name for internal functions (Joonas)
- Moved init-ealry function into c file (Joonas)
- Consolidated and removed redundant size checks (Joonas/Michal)
- Removed unnecessary unlikely() from code which is only called once
during boot (Joonas)
- More fixes to kernel-doc format and content (Michal)
- Avoided the use of PAGE_MASK for 4K pages (Michal)
- Added error log messages to error paths (Michal)
v11:
- Replaced intel_guc_wopcm with more generic intel_wopcm and attached
intel_wopcm to drm_i915_private instead intel_guc (Michal)
- dynamic calculation of GuC non-wopcm memory start (a.k.a WOPCM Top
offset from GuC WOPCM base) (Michal)
- Moved WOPCM marco definitions into .c source file (Michal)
- Exported WOPCM layout diagram as kernel-doc (Michal)
v12:
- Updated naming, function kernel-doc to align with new changes (Michal)
v13:
- Updated the ordering of s-o-b/cc/r-b tags (Sagar)
- Corrected one tense error in comment (Sagar)
- Corrected typos and removed spurious comments (Joonas)
Bspec: 12690
Signed-off-by: Jackie Li <yaodong.li@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v8)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v9)
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> (v11)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v12)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520987574-19351-2-git-send-email-yaodong.li@intel.com
We are sanitizing uC related modparams together with other driver
modparams in intel_sanitize_options called from i915_driver_init_hw,
but this is too late for us as we will want to use USES_GUC/USES_HUC
macros at earlier stage. Since our sanitizing does not require any
MMIO access, we can do it in intel_uc_init_early right after we resolve
firmware names.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180312130308.22952-2-michal.wajdeczko@intel.com
We have many functions responsible for allocating different parts of
GuC log runtime called from multiple places. Let's stick with keeping
everything in guc_log_register instead.
v2: Use more generic intel_uc_register name, keep using "misc" suffix (Michał)
s/dev_priv/i915 (Sagar)
Make guc_log_relay_* static (sparse)
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308154707.21716-2-michal.winiarski@intel.com
There are a number of information that are readable from hardware
registers and that we would like to make accessible to userspace. One
particular example is the topology of the execution units (how are
execution units grouped in subslices and slices and also which ones
have been fused off for die recovery).
At the moment the GET_PARAM ioctl covers some basic needs, but
generally is only able to return a single value for each defined
parameter. This is a bit problematic with topology descriptions which
are array/maps of available units.
This change introduces a new ioctl that can deal with requests to fill
structures of potentially variable lengths. The user is expected fill
a query with length fields set at 0 on the first call, the kernel then
sets the length fields to the their expected values. A second call to
the kernel with length fields at their expected values will trigger a
copy of the data to the pointed memory locations.
The scope of this uAPI is only to provide information to userspace,
not to allow configuration of the device.
v2: Simplify dispatcher code iteration (Tvrtko)
Tweak uapi drm_i915_query_item structure (Tvrtko)
v3: Rename pad fields into flags (Chris)
Return error on flags field != 0 (Chris)
Only copy length back to userspace in drm_i915_query_item (Chris)
v4: Use array of functions instead of switch (Chris)
v5: More comments in uapi (Tvrtko)
Return query item errors in length field (All)
v6: Tweak uapi comments style to match the coding style (Lionel)
v7: Add i915_query.h (Joonas)
v8: (Lionel) Change the behavior of the item iterator to report
invalid queries into the query item rather than stopping the
iteration. This enables userspace applications to query newer
items on older kernels and only have failure on the items that are
not supported.
v9: Edit copyright headers (Joonas)
v10: Typos & comments in uapi (Joonas)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-6-lionel.g.landwerlin@intel.com
Up to now, subslice mask was assumed to be uniform across slices. But
starting with Cannonlake, slices can be asymmetric (for example slice0
has different number of subslices as slice1+). This change stores all
subslices masks for all slices rather than having a single mask that
applies to all slices.
v2: Rework how we store total numbers in sseu_dev_info (Tvrtko)
Fix CHV eu masks, was reading disabled as enabled (Tvrtko)
Readability changes (Tvrtko)
Add EU index helper (Tvrtko)
v3: Turn ALIGN(v, 8) / 8 into DIV_ROUND_UP(v, BITS_PER_BYTE) (Tvrtko)
Reuse sseu_eu_idx() for setting eu_mask on CHV (Tvrtko)
Reformat debug prints for subslices (Tvrtko)
v4: Change eu_mask helper into sseu_set_eus() (Tvrtko)
v5: With Haswell reporting masks & counts, bump sseu_*_eus() functions
to use u16 (Lionel)
v6: Fix sseu_get_eus() for > 8 EUs per subslice (Lionel)
v7: Change debugfs enabels for number of subslices per slice, will
need a small igt/pm_sseu change (Lionel)
Drop subslice_total field from sseu_dev_info, rely on
sseu_subslice_total() to recompute the value instead (Lionel)
v8: Remove unused function compute_subslice_total() (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-2-lionel.g.landwerlin@intel.com
We want to use higher level 'uc' functions as the main entry points to
the GuC/HuC code to hide some details and keep code layered.
While here, move call to disable_guc_interrupts after sending suspend
action to the GuC to allow it work also with CTB as comm mechanism.
v2: update commit msg (Sagar)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302111550.21328-1-michal.wajdeczko@intel.com
We want to de-emphasize the link between the request (dependency,
execution and fence tracking) from GEM and so rename the struct from
drm_i915_gem_request to i915_request. That is we may implement the GEM
user interface on top of requests, but they are an abstraction for
tracking execution rather than an implementation detail of GEM. (Since
they are not tied to HW, we keep the i915 prefix as opposed to intel.)
In short, the spatch:
@@
@@
- struct drm_i915_gem_request
+ struct i915_request
A corollary to contracting the type name, we also harmonise on using
'rq' shorthand for local variables where space if of the essence and
repetition makes 'request' unwieldy. For globals and struct members,
'request' is still much preferred for its clarity.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Rather than deriving the platform_mask from the
intel_device_static_info->platform at runtime, pre-fill it in the static
data.
v2: Undefine macros at end of their scope
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180215081930.11477-3-chris@chris-wilson.co.uk
Add an intel_bios_cleanup() function to act as counterpart of
intel_bios_init() and move the cleanup of vbt related resources there,
putting it in the same file as the allocation.
Changed in v2:
-While touching the code anyways, remove the unnecessary:
if (dev_priv->vbt.child_dev) done before kfree(dev_priv->vbt.child_dev)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180214082151.25015-1-hdegoede@redhat.com
Simplify intel_virt_detect_pch() by making it return a PCH id rather
than returning the PCH type and setting PCH id for some PCHs. Map the
PCH id to PCH type using the shared routine. This gives us sanity check
on the supported combinations also in the virtualized setting.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/197cf635261a1c628371ffaaee90e8647493af4d.1517851783.git.jani.nikula@intel.com
If we fail to reset the GPU (i915_reset()), we do one final
intel_gpu_reset() attempt as we mark the device wedged. The idea here is
even though the GPU has proven unreliable (and so we want to stop using
it for the time being), we don't want it spinning away in the background
whilst the driver idles so we try to reset it one more time. However, we
want to dump the i915_gem_set_wedged() debugging info before we do, so
that we can see the accurate state of the GPU when it failed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180209114056.9957-1-chris@chris-wilson.co.uk
Most of our ioctl functions have an _ioctl suffix in the name. I like
that idea since it makes it easy to figure out how the function is
going to get called. Rename the handful of exceptions to follow the
same pattern.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207164841.19431-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Rather than having the high level ioctl interface guess the underlying
implementation details, having the implementation declare what
capabilities it exports. We define an intel_driver_caps, similar to the
intel_device_info, which instead of trying to describe the HW gives
details on what the driver itself supports. This is then populated by
the engine backend for the new scheduler capability field for use
elsewhere.
v2: Use caps.scheduler for validating CONTEXT_PARAM_SET_PRIORITY (Mika)
One less assumption of engine[RCS] \o/
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-2-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
On blb and pnv, we are seeing sporadic
i915 0000:00:02.0: Resetting chip after gpu hang
[drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
[drm:i915_reset [i915]] *ERROR* Failed hw init on reset -5
which notably lack the actual root cause of the error. Ostensibly it
should be the init_ring_common() that failed, but it's error paths are
covered by DRM_ERROR.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207111545.17078-1-chris@chris-wilson.co.uk
We're using i915_inject_load_failure() to inject dummy
faults during driver load, but since this is debug utility
we shouldn't expose it in default config as it consumes
both code and data.
add/remove: 0/1 grow/shrink: 0/2 up/down: 0/-302 (-302)
Function old new delta
__i915_inject_load_failure 61 - -61
i915_gem_init 1331 1268 -63
i915_driver_load 5923 5745 -178
Total: Before=1177454, After=1177152, chg -0.03%
add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-4 (-4)
Data old new delta
i915_load_fail_count 4 - -4
Total: Before=56762, After=56758, chg -0.01%
add/remove: 4/8 grow/shrink: 0/1 up/down: 245/-591 (-346)
RO Data old new delta
__param_str_inject_load_failure 20 - -20
__UNIQUE_ID_inject_load_failuretype200 34 - -34
__param_inject_load_failure 40 - -40
__func__ 4998 4896 -102
__UNIQUE_ID_inject_load_failure201 150 - -150
Total: Before=119095, After=118749, chg -0.29%
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180201173248.3912-1-michal.wajdeczko@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
GuC log streaming needs interrupts enabled prior to GuC resume but
runtime pm interrupt setup was happening post GuC resume. Fix it.
While at it, fix the unwinding of steps in the runtime suspend path.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104695
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1516808821-3638-2-git-send-email-sagar.a.kamble@intel.com
intel_power_domains_init_hw() calls set_init_power, but when using
runtime power management this call is skipped. This prevents hw readout
from taking place.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104172
Link: https://patchwork.freedesktop.org/patch/msgid/20180116155324.75120-1-maarten.lankhorst@linux.intel.com
Fixes: bc87229f32 ("drm/i915/skl: enable PC9/10 power states during suspend-to-idle")
Cc: Nivedita Swaminathan <nivedita.swaminathan@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <stable@vger.kernel.org> # v4.5+
Reviewed-by: Imre Deak <imre.deak@intel.com>
During initialization of the runtime part of the intel_device_info
we are dumping that part using DRM_DEBUG_DRIVER mechanism.
As we already have pretty printer for const part of the info,
make similar function for the runtime part and use it separately.
v2: add runtime dump to debugfs (Chris)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171221185334.17396-7-michal.wajdeczko@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171221215735.30314-6-chris@chris-wilson.co.uk
Convert intel_device_info_dump into pretty printer to be
consistent with the rest of the driver code.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171219114346.26308-2-michal.wajdeczko@intel.com
Inside i915_gem_reset(), we start touching the HW and so require the
low-level HW to be re-enabled, in particular the PCI BARs.
Fixes: 7b6da818d8 ("drm/i915: Restore the kernel context after a GPU reset on an idle engine")
References: 0db8c96120 ("drm/i915: Re-enable GTT following a device reset")
Testcase: igt/drv_hangman #i915g/i915gm
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171217132852.30642-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
At the beginning of a reset, we disable the submission method and find
the stuck request. We expect to find a stuck request for we have
declared the engine stalled. However, if we find no active request, the
engine must have recovered from its stall before we could issue a reset,
so let the engine continue on without a reset. If the engine is truly
stuck, we will back soon enough with the next reset attempt.
v2: Remove the stale debug message.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171216002206.31737-1-chris@chris-wilson.co.uk
After GPU reset, GuC HW needs to be reinitialized (with FW reload).
Unfortunately, we're doing some extra work there (mostly allocating stuff),
work that can be moved to guc_init and called once at driver load time.
As a side effect we're no longer hitting an assert in
i915_ggtt_enable_guc on suspend/resume.
v2: Do not duplicate disable_communication / reset_guc_interrupts
v3: Add proper teardown after rebase
References: 04f7b24ecc ("drm/i915/guc: Assert that we switch between known ggtt->invalidate functions")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171213221352.7173-3-michal.winiarski@intel.com
Since on gen2, we do not universally have a GPU reset implementation, we
fail i915_reset() at intel_has_gpu_reset(). However, this is also
intentionally disabled for CI testing and so it only has a debug
message. Promote that debug message to a user-facing error message that
should explain why their machine became unusable following the GPU hang.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211204040.22858-1-chris@chris-wilson.co.uk
Now that we are using struct resource to track the stolen region, it is
more convenient if we track the mappable region in a resource as well.
v2: prefer iomap and gmadr naming scheme
prefer DEFINE_RES_MEM
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-8-matthew.auld@intel.com
Chris requested this backmerge for a reconciliation on
drm_print.h between drm-misc-next and drm-intel-next-queued
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
- Init clock gate fix (Ville)
- Execlists event handling corrections (Chris, Michel)
- Improvements on GPU Cache invalidation and context switch (Chris)
- More perf OA changes (Lionel)
- More selftests improvements and fixes (Chris, Matthew)
- Clean-up on modules parameters (Chris)
- Clean-up around old ringbuffer submission and hw semaphore on old platforms (Chris)
- More Cannonlake stabilization effort (David, James)
- Display planes clean-up and improvements (Ville)
- New PMU interface for perf queries... (Tvrtko)
- ... and other subsequent PMU changes and fixes (Tvrtko, Chris)
- Remove success dmesg noise from rotation (Chris)
- New DMC for Kabylake (Anusha)
- Fixes around atomic commits (Daniel)
- GuC updates and fixes (Sagar, Michal, Chris)
- Couple gmbus/i2c fixes (Ville)
- Use exponential backoff for all our wait_for() (Chris)
- Fixes for i915/fbdev (Chris)
- Backlight fixes (Arnd)
- Updates on shrinker (Chris)
- Make Hotplug enable more robuts (Chris)
- Disable huge pages (TPH) on lack of a needed workaround (Joonas)
- New GuC images for SKL, KBL, BXT (Sagar)
- Add HW Workaround for Geminilake performance (Valtteri)
- Fixes for PPS timings (Imre)
- More IPS fixes (Maarten)
- Many fixes for Display Port on gen2-gen4 (Ville)
- Retry GPU reset making the recover from hang more robust (Chris)
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJaIf0wAAoJEPpiX2QO6xPK/wAH/0JrYX+oml+ZJ7xV4wxm0S5S
K8jcSWQPlF78xkuiVaA0200qOPJP4Yp32Gmv7dVBFKq7IaLn3c8B2DjqE3D0QPue
ZPSMMFz7NTMVjfKpmkCp9V3ln87Cc/Cj0L66CKRuFnvefDCGLN4hULj3NAxQfkwm
9CsUH+p5d7usXgIB7tIovNEKw4hLiTQiTUd5bwok6iJ3EB79DHxxlY8nG6n5ukEr
9XryS1ADGmTUy7YifIdJKPyDLjEeP+xRMApUva67y7zaTk7iW6mETkBJ7WpQbCCL
fQaLlmU/SrBD8ViJBx19U6ofKIVJ+vVMoU9JNg4Ak5WZOfjw7aIPiZaaHU3Aj+w=
=IutN
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-2017-12-01' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
[airlied: fix conflict in intel_dsi.c]
drm-intel-next-2017-12-01:
- Init clock gate fix (Ville)
- Execlists event handling corrections (Chris, Michel)
- Improvements on GPU Cache invalidation and context switch (Chris)
- More perf OA changes (Lionel)
- More selftests improvements and fixes (Chris, Matthew)
- Clean-up on modules parameters (Chris)
- Clean-up around old ringbuffer submission and hw semaphore on old platforms (Chris)
- More Cannonlake stabilization effort (David, James)
- Display planes clean-up and improvements (Ville)
- New PMU interface for perf queries... (Tvrtko)
- ... and other subsequent PMU changes and fixes (Tvrtko, Chris)
- Remove success dmesg noise from rotation (Chris)
- New DMC for Kabylake (Anusha)
- Fixes around atomic commits (Daniel)
- GuC updates and fixes (Sagar, Michal, Chris)
- Couple gmbus/i2c fixes (Ville)
- Use exponential backoff for all our wait_for() (Chris)
- Fixes for i915/fbdev (Chris)
- Backlight fixes (Arnd)
- Updates on shrinker (Chris)
- Make Hotplug enable more robuts (Chris)
- Disable huge pages (TPH) on lack of a needed workaround (Joonas)
- New GuC images for SKL, KBL, BXT (Sagar)
- Add HW Workaround for Geminilake performance (Valtteri)
- Fixes for PPS timings (Imre)
- More IPS fixes (Maarten)
- Many fixes for Display Port on gen2-gen4 (Ville)
- Retry GPU reset making the recover from hang more robust (Chris)
* tag 'drm-intel-next-2017-12-01' of git://anongit.freedesktop.org/drm/drm-intel: (101 commits)
drm/i915: Update DRIVER_DATE to 20171201
drm/i915/cnl: Mask previous DDI - PLL mapping
drm/i915: Remove unsafe i915.enable_rc6
drm/i915: Sleep and retry a GPU reset if at first we don't succeed
drm/i915: Interlaced DP output doesn't work on VLV/CHV
drm/i915: Pass crtc state to intel_pipe_{enable,disable}()
drm/i915: Wait for pipe to start on i830 as well
drm/i915: Fix vblank timestamp/frame counter jumps on gen2
drm/i915: Fix deadlock in i830_disable_pipe()
drm/i915: Fix has_audio readout for DDI A
drm/i915: Don't add the "force audio" property to DP connectors that don't support audio
drm/i915: Disable DP audio for g4x
drm/i915/selftests: Wake the device before executing requests on the GPU
drm/i915: Set fake_vma.size as well as fake_vma.node.size for capture
drm/i915: Tidy up signed/unsigned comparison
drm/i915: Enable IPS with only sprite plane visible too, v4.
drm/i915: Make ips_enabled a property depending on whether IPS is enabled, v3.
drm/i915: Avoid PPS HW/SW state mismatch due to rounding
drm/i915: Skip switch-to-kernel-context on suspend when wedged
drm/i915/glk: Apply WaProgramL3SqcReg1DefaultForPerf for GLK too
...
History tells us that if we cannot reset the GPU now, we never will. This
then impacts everything that is run subsequently. On failing the reset,
we mark the driver as wedged, trying to prevent further execution on the
GPU, forcing userspace to fallback to using the CPU to update its
framebuffers and let the user know what happened.
We also want to go one step further and add a taint to the kernel so that
any subsequent faults can be traced back to this failure. This is
useful for CI, where if the GPU/driver fails we want to reboot and
restart testing rather than continue on into oblivion. For everyone
else, the warning taint is a testament to the system unreliability.
TAINT_WARN is used anytime a WARN() is emitted, which is suitable for
our purposes here as well; the driver/system may behave unexpectedly
after the failure.
v2: Also taint if the recovery fails (again history shows us that is
typically fatal).
v3: Use TAINT_WARN
References: https://bugs.freedesktop.org/show_bug.cgi?id=103514
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171205172757.32609-1-chris@chris-wilson.co.uk
- Many improvements for selftests and other igt tests (Chris)
- Forcewake with PUNIT->PMIC bus fixes and robustness (Hans)
- Define an engine class for uABI (Tvrtko)
- Context switch fixes and improvements (Chris)
- GT powersavings and power gating simplification and fixes (Chris)
- Other general driver clean-ups (Chris, Lucas, Ville)
- Removing old, useless and/or bad workarounds (Chris, Oscar, Radhakrishna)
- IPS, pipe config, etc in preparation for another Fast Boot attempt (Maarten)
- OA perf fixes and support to Coffee Lake and Cannonlake (Lionel)
- Fixes around GPU fault registers (Michel)
- GEM Proxy (Tina)
- Refactor of Geminilake and Cannonlake plane color handling (James)
- Generalize transcoder loop (Mika Kahola)
- New HW Workaround for Cannonlake and Geminilake (Rodrigo)
- Resume GuC before using GEM (Chris)
- Stolen Memory handling improvements (Ville)
- Initialize entry in PPAT for older compilers (Chris)
- Other fixes and robustness improvements on execbuf (Chris)
- Improve logs of GEM_BUG_ON (Mika Kuoppala)
- Rework with massive rename of GuC functions and files (Sagar)
- Don't sanitize frame start delay if pipe is off (Ville)
- Cannonlake clock fixes (Rodrigo)
- Cannonlake HDMI 2.0 support (Rodrigo)
- Add a GuC doorbells selftest (Michel)
- Add might_sleep() check to our wait_for() (Chris)
Many GVT changes for 4.16:
- CSB HWSP update support (Weinan)
- GVT debug helpers, dyndbg and debugfs (Chuanxiao, Shuo)
- full virtualized opregion (Xiaolin)
- VM health check for sane fallback (Fred)
- workload submission code refactor for future enabling (Zhi)
- Updated repo URL in MAINTAINERS (Zhenyu)
- other many misc fixes
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJaD2cyAAoJEPpiX2QO6xPKuiEH/2/J7Ebf5IRZtaTU+ke2uOI4
2YCdrn9F1guz6d+cZtsLPkJ9JwQlz9EftfB7KT+9dT8viEG0FFna9bV+Xz3wyGQ6
DRlP9tCFnCDaOyZBI5QshubuzldabPpfscPJI7/EMr91jtveGhKIhsRzHBxKCEZF
LKlAHtXAWSkTozmh6bU+wf5TEOFzYv2oquTVn5ZJrpYlqup/wEKh+KnL9eBQ3+Qp
FLnmKjInaadOV/uXQfeWstJuohG/pfcNm68OmDOxYNmwpeNnwbtfKT9eZeDtDZDy
dXj9mokeTwg4fBrXX/tyxuKogywxQSNFTqCU2yY9up+35ykmjVN8p/1BYi+GGe0=
=ePes
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-2017-11-17-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
More change sets for 4.16:
- Many improvements for selftests and other igt tests (Chris)
- Forcewake with PUNIT->PMIC bus fixes and robustness (Hans)
- Define an engine class for uABI (Tvrtko)
- Context switch fixes and improvements (Chris)
- GT powersavings and power gating simplification and fixes (Chris)
- Other general driver clean-ups (Chris, Lucas, Ville)
- Removing old, useless and/or bad workarounds (Chris, Oscar, Radhakrishna)
- IPS, pipe config, etc in preparation for another Fast Boot attempt (Maarten)
- OA perf fixes and support to Coffee Lake and Cannonlake (Lionel)
- Fixes around GPU fault registers (Michel)
- GEM Proxy (Tina)
- Refactor of Geminilake and Cannonlake plane color handling (James)
- Generalize transcoder loop (Mika Kahola)
- New HW Workaround for Cannonlake and Geminilake (Rodrigo)
- Resume GuC before using GEM (Chris)
- Stolen Memory handling improvements (Ville)
- Initialize entry in PPAT for older compilers (Chris)
- Other fixes and robustness improvements on execbuf (Chris)
- Improve logs of GEM_BUG_ON (Mika Kuoppala)
- Rework with massive rename of GuC functions and files (Sagar)
- Don't sanitize frame start delay if pipe is off (Ville)
- Cannonlake clock fixes (Rodrigo)
- Cannonlake HDMI 2.0 support (Rodrigo)
- Add a GuC doorbells selftest (Michel)
- Add might_sleep() check to our wait_for() (Chris)
Many GVT changes for 4.16:
- CSB HWSP update support (Weinan)
- GVT debug helpers, dyndbg and debugfs (Chuanxiao, Shuo)
- full virtualized opregion (Xiaolin)
- VM health check for sane fallback (Fred)
- workload submission code refactor for future enabling (Zhi)
- Updated repo URL in MAINTAINERS (Zhenyu)
- other many misc fixes
* tag 'drm-intel-next-2017-11-17-1' of git://anongit.freedesktop.org/drm/drm-intel: (260 commits)
drm/i915: Update DRIVER_DATE to 20171117
drm/i915: Add a policy note for removing workarounds
drm/i915/selftests: Report ENOMEM clearly for an allocation failure
Revert "drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk"
drm/i915: Calculate g4x intermediate watermarks correctly
drm/i915: Calculate vlv/chv intermediate watermarks correctly, v3.
drm/i915: Pass crtc_state to ips toggle functions, v2
drm/i915: Pass idle crtc_state to intel_dp_sink_crc
drm/i915: Enable FIFO underrun reporting after initial fastset, v4.
drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
drm/i915: Add might_sleep() check to wait_for()
drm/i915/selftests: Add a GuC doorbells selftest
drm/i915/cnl: Extend HDMI 2.0 support to CNL.
drm/i915/cnl: Simplify dco_fraction calculation.
drm/i915/cnl: Don't blindly replace qdiv.
drm/i915/cnl: Fix wrpll math for higher freqs.
drm/i915/cnl: Fix, simplify and unify wrpll variable sizes.
drm/i915/cnl: Remove useless conversion.
drm/i915/cnl: Remove spurious central_freq.
drm/i915/selftests: exercise_ggtt may have nothing to do
...
It has been many years since the last confirmed sighting (and fix) of an
RC6 related bug (usually a system hang). Remove the parameter to stop
users from setting dangerous values, as they often set it during triage
and end up disabling the entire runtime pm instead (the option is not a
fine scalpel!).
Furthermore, it allows users to set known dangerous values which were
intended for testing and not for production use. For testing, we can
always patch in the required setting without having to expose ourselves
to random abuse.
v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
lack of ilk support better.
v3: Clear intel_info->rc6p if we don't support rc6 itself.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171201113030.18360-1-chris@chris-wilson.co.uk
As we declare the GPU wedged if the reset fails, such a failure is quite
terminal. Before taking that drastic action, let's sleep first and try
active, in the hope that the hardware has quietened down and is then
able to reset. After a few such attempts, it is fair to say that the HW
is truly wedged.
v2: Always print the failure message now, we precheck whether resets are
disabled.
References: https://bugs.freedesktop.org/show_bug.cgi?id=104007
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171201122011.16841-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>