OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Chris Wilson	c41166f9a1	drm/i915: Beware temporary wedging when determining -EIO At a few points in our uABI, we check to see if the driver is wedged and report -EIO back to the user in that case. However, as we perform the check and reset asynchronously (where once before they were both serialised by the struct_mutex), we may instead see the temporary wedging used to cancel inflight rendering to avoid a deadlock during reset (caused by either us timing out in our reset handler, i915_wedge_on_timeout or with malice aforethought in intel_reset_prepare for a stuck modeset). If we suspect this is the case, that is we see a wedged driver and reset in progress, then wait until the reset is resolved before reporting upon the wedged status. v2: might_sleep() (Mika) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109580 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190220145637.23503-1-chris@chris-wilson.co.uk	2019-02-20 16:31:08 +00:00
Chris Wilson	e4106dae0f	drm/i915/selftests: Make unbannable contexts for reset handling igt_ctx_sseu was caught using bannable contexts, and in the course of resetting rapidly to run its test, was banned. Don't let ourselves ban the test! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190218145051.18981-1-chris@chris-wilson.co.uk	2019-02-18 16:00:34 +00:00
Chris Wilson	c836eb79c0	drm/i915/selftests: Always use an active engine while resetting Currently, we only try to reset a live engine for checking the whitelist retention across a per-engine reset. For safety, it appears we need to prime the system with a hanging spinner before performing a full-device reset. (Figuring out the root cause behind the instability with handling a reset during a no-op request is a challenge for another test, the whitelist test has its own purpose.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109626 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190213224805.32021-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2019-02-15 09:37:31 +00:00
Chris Wilson	eb8d0f5af4	drm/i915: Remove GPU reset dependence on struct_mutex Now that the submission backends are controlled via their own spinlocks, with a wave of a magic wand we can lift the struct_mutex requirement around GPU reset. That is we allow the submission frontend (userspace) to keep on submitting while we process the GPU reset as we can suspend the backend independently. The major change is around the backoff/handoff strategy for performing the reset. With no mutex deadlock, we no longer have to coordinate with any waiter, and just perform the reset immediately. Testcase: igt/gem_mmap_gtt/hang # regresses Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-3-chris@chris-wilson.co.uk	2019-01-25 14:27:22 +00:00
Chris Wilson	9f58892ea9	drm/i915: Pull all the reset functionality together into i915_reset.c Currently the code to reset the GPU and our state is spread widely across a few files. Pull the logic together into a common file. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190116153304.787-1-chris@chris-wilson.co.uk	2019-01-16 22:45:31 +00:00
Chris Wilson	d4225a535b	drm/i915: Syntatic sugar for using intel_runtime_pm Frequently, we use intel_runtime_pm_get/_put around a small block. Formalise that usage by providing a macro to define such a block with an automatic closure to scope the intel_runtime_pm wakeref to that block, i.e. macro abuse smelling of python. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-15-chris@chris-wilson.co.uk	2019-01-14 16:18:25 +00:00
Chris Wilson	c9d08cc3e3	drm/i915/selftests: Mark up rpm wakerefs Track the temporary wakerefs used within the selftests so that leaks are clear. v2: Add a couple of coarse annotations for mock selftests as we now loudly warn about the errors. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-14-chris@chris-wilson.co.uk	2019-01-14 16:18:20 +00:00
Chris Wilson	16e4dd0342	drm/i915: Markup paired operations on wakerefs The majority of runtime-pm operations are bounded and scoped within a function; these are easy to verify that the wakeref are handled correctly. We can employ the compiler to help us, and reduce the number of wakerefs tracked when debugging, by passing around cookies provided by the various rpm_get functions to their rpm_put counterpart. This makes the pairing explicit, and given the required wakeref cookie the compiler can verify that we pass an initialised value to the rpm_put (quite handy for double checking error paths). For regular builds, the compiler should be able to eliminate the unused local variables and the program growth should be minimal. Fwiw, it came out as a net improvement as gcc was able to refactor rpm_get and rpm_get_if_in_use together, v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual mark up for smaller more targeted patches. v3: Mention the cookie in Returns Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk	2019-01-14 16:17:53 +00:00
Daniele Ceraolo Spurio	f663b0ca9b	drm/i915/selftests: recreate WA lists inside the selftest By using the wa lists inside the live driver structures, we won't catch issues where those are incorrectly setup or corrupted. To cover this gap, update the workaround framework to allow saving the wa lists to independent structures and use them in the selftests. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190110013232.8972-1-daniele.ceraolospurio@intel.com [tursulin: Fixup checkpatch whitespace complaint in memset.]	2019-01-10 09:15:18 +00:00
Chris Wilson	3abd6143f9	drm/i915/selftests: verify_gt_engine_wa() needs rpm wakeref The mmio readback for verify_gt_engine_wa() also needs a runtime-pm wakeref, so effectively do the entirety of both engine workarounds tests. As such simplify the rpm behaviour here by acquiring the wakeref for the whole of each subtest. It would be still useful to later verify the registers retain their magic values across rpm suspend/resume. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181206180713.6827-1-chris@chris-wilson.co.uk	2018-12-06 20:45:09 +00:00
Tvrtko Ursulin	69bcdecf1a	drm/i915: Move register white-listing to the common workaround framework Instead of having a separate list of white-listed registers we can trivially move this to the common workarounds framework. This brings us one step closer to the goal of driving all workaround classes using the same code. v2: * Use GEM_DEBUG_WARN_ON for the sanity check. (Chris Wilson) v3: * API rename. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20181203125014.3219-6-tvrtko.ursulin@linux.intel.com	2018-12-04 12:23:21 +00:00
Tvrtko Ursulin	28d6ccce73	drm/i915/selftests: Add tests for GT and engine workaround verification Two simple selftests which test that both GT and engine workarounds are not lost after either a full GPU reset, or after the per-engine ones. (Including checks that one engine reset is not affecting workarounds not belonging to itself.) v2: * Rebase for series refactoring. * Add spinner for actual engine reset! * Add idle reset test as well. (Chris Wilson) * Share existing global_reset_lock. (Chris Wilson) v3: * intel_engine_verify_workarounds can be static. * API rename. (Chris Wilson) * Move global reset lock out of the loop. (Chris Wilson) v4: * Add missing rpm puts. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20181203125014.3219-5-tvrtko.ursulin@linux.intel.com	2018-12-04 12:23:19 +00:00
Tvrtko Ursulin	b9f78d6752	drm/i915/selftests: Fix live_workarounds to actually do resets The test was missing some magic ingredients to actually trigger the resets. In case of the full reset we need the I915_RESET_HANDOFF flag set, and in case of engine reset we need a busy request. Thanks to Chris for helping with reset magic. v2: * Grab RPM ref over reset. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20181130095211.23849-1-tvrtko.ursulin@linux.intel.com	2018-11-30 15:09:16 +00:00
Chris Wilson	b8bdd9cc60	drm/i915/selftests: Live tests emit requests and so require rpm As we emit requests or touch HW directly for some of the live tests, the requirement is that we hold the rpm wakeref before doing so. We want a mix of granularity since we will want to test runtime suspend, so try to mark up only the critical sections where we need rpm for the live test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108002 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180920144934.16611-1-chris@chris-wilson.co.uk	2018-09-20 17:01:26 +01:00
Chris Wilson	cb4dc8daf4	drm/i915/selftests: Add a safety net to live_workarounds Since live_workarounds poke around the w/a registers and checks to see if they survive across a reset, we are prone to fouling the machine and leaving it in a non-recoverable state. Wrap the probe inside a timeout to abort the test if the reset fails. v2: Include GEM_TRACE on declaring wedged. v3: Add a few includes to make the header look standalone. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107188 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180711122952.18448-1-chris@chris-wilson.co.uk	2018-07-11 14:13:56 +01:00
Chris Wilson	a523697857	drm/i915: Start returning an error from i915_vma_move_to_active() Handling such a late error in request construction is tricky, but to accommodate future patches which may allocate here, we potentially could err. To handle the error after already adjusting global state to track the new request, we must finish and submit the request. But we don't want to use the request as not everything is being tracked by it, so we opt to cancel the commands inside the request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-3-chris@chris-wilson.co.uk	2018-07-06 18:22:37 +01:00
Chris Wilson	da99fe5f85	drm/i915: Refactor export_fence() after i915_vma_move_to_active() Currently all callers are responsible for adding the vma to the active timeline and then exporting its fence. Combine the two operations into i915_vma_move_to_active() to move all the extra handling from the callers to the single site. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-1-chris@chris-wilson.co.uk	2018-07-06 18:22:34 +01:00
Chris Wilson	47e61a7980	drm/i915/selftests: Skip workaround tests when wedged If the GPU is irrecoverably wedged, we cannot submit any request and therefore cannot query the register state of the context (which is done using the GPU command stream). So skip over the test as it expectedly fails. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180706065332.15214-6-chris@chris-wilson.co.uk	2018-07-06 11:24:54 +01:00
Chris Wilson	697b9a8714	drm/i915: Make closing request flush mandatory For symmetry, simplicity and ensuring the request is always truly idle upon its completion, always emit the closing flush prior to emitting the request breadcrumb. Previously, we would only emit the flush if we had started a user batch, but this just leaves all the other paths open to speculation (do they affect the GPU caches or not?) With mm switching, a key requirement is that the GPU is flushed and invalidated before hand, so for absolute safety, we want that closing flush be mandatory. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180612105135.4459-1-chris@chris-wilson.co.uk	2018-06-14 08:16:12 +01:00
Chris Wilson	82ad6443a5	drm/i915/gtt: Rename i915_hw_ppgtt base member In the near future, I want to subclass gen6_hw_ppgtt as it contains a few specialised members and I wish to add more. To avoid the ugliness of using ppgtt->base.base, rename the i915_hw_ppgtt base member (i915_address_space) as vm, which is our common shorthand for an i915_address_space local. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk	2018-06-05 21:11:20 +01:00
Gustavo A. R. Silva	a399715913	drm/i915/selftests: Fix uninitialized variable There is a potential execution path in which variable err is returned without being properly initialized previously. Fix this by initializing variable err to 0. Addresses-Coverity-ID: 1468362 ("Uninitialized scalar variable") Fixes: `f4ecfbfc32` ("drm/i915: Check whitelist registers across resets") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180424131545.GA4053@embeddedor.com	2018-04-24 16:44:25 +01:00
Oscar Mateo	94f8dfc6cd	drm/i915/selftests: Handle a potential failure of intel_ring_begin Silence smatch over: drivers/gpu/drm/i915/selftests/intel_workarounds.c:58 read_nonprivs() error: 'cs' dereferencing possible ERR_PTR() by handling a potential (but unlikely) failure of intel_ring_begin. Fixes: `f4ecfbfc32` ("drm/i915: Check whitelist registers across resets") Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/1523915821-30624-1-git-send-email-oscar.mateo@intel.com	2018-04-17 11:40:27 +01:00
Chris Wilson	f4ecfbfc32	drm/i915: Check whitelist registers across resets Add a selftest to ensure that we restore the whitelisted registers after rewrite the registers everytime they might be scrubbed, e.g. module load, reset and resume. For the other volatile workaround registers, we export their presence via debugfs and check in igt/gem_workarounds. However, we don't export the whitelist and rather than do so, let's test them directly in the kernel. The test we use is to read the registers back from the CS (this helps us be sure that the registers will be valid for MI_LRI etc). In order to generate the expected list, we split intel_whitelist_workarounds_emit into two phases, the first to build the list and the second to apply. Inside the test, we only build the list and then check that list against the hw. v2: Filter out pre-gen8 as they do not have RING_NONPRIV. v3: Drop unused engine parameter, no plans to use it now or future. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180414122754.569-1-chris@chris-wilson.co.uk	2018-04-14 18:36:45 +01:00

23 Commits