OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Chris Wilson	c4e8ba7390	drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore If we find ourselves waiting on a MI_SEMAPHORE_WAIT, either within the user batch or in our own preamble, the engine raises a GT_WAIT_ON_SEMAPHORE interrupt. We can unmask that interrupt and so respond to a semaphore wait by yielding the timeslice, if we have another context to yield to! The only real complication is that the interrupt is only generated for the start of the semaphore wait, and is asynchronous to our process_csb() -- that is, we may not have registered the timeslice before we see the interrupt. To ensure we don't miss a potential semaphore blocking forward progress (e.g. selftests/live_timeslice_preempt) we mark the interrupt and apply it to the next timeslice regardless of whether it was active at the time. v2: We use semaphores in preempt-to-busy, within the timeslicing implementation itself! Ergo, when we do insert a preemption due to an expired timeslice, the new context may start with the missed semaphore flagged by the retired context and be yielded, ad infinitum. To avoid this, read the context id at the time of the semaphore interrupt and only yield if that context is still active. Fixes: `8ee36e048c` ("drm/i915/execlists: Minimalistic timeslicing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200407130811.17321-1-chris@chris-wilson.co.uk	2020-04-07 14:43:58 +01:00
Chris Wilson	53f5da74c7	drm/i915/selftests: Wait until we start timeslicing after a submit If we submit, we do not start timeslicing until we process the CS event that marks the start of the context running on HW. So in the selftest, be sure to wait until we have processed the pending events before asserting that timeslicing has begun. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200403190209.21818-1-chris@chris-wilson.co.uk	2020-04-03 21:38:27 +01:00
Chris Wilson	98d513167f	drm/i915/selftests: Check for has-reset before testing hostile contexts In order to kill off a hostile context, we need to be able to reset the GPU. So check that is supported prior to beginning the test. Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200402205839.25065-1-chris@chris-wilson.co.uk	2020-04-02 22:00:54 +01:00
Mika Kuoppala	708c82d59b	drm/i915: Report all failed registers for ctx isolation For CI it is enough to point out a single failure in isolation. However it is beneficial to gather info in logs for transients further down the line. Do not stop into first comparison failure but continue probing forward. v2: for all engines and poisons (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200331135403.16906-1-mika.kuoppala@linux.intel.com	2020-03-31 21:42:12 +01:00
Chris Wilson	71a6688e81	drm/i915/selftests: Tidy up an error message for live_error_interrupt Since we don't wait for the error interrupt to reset, restart and then complete the guilty request, clean up the error messages. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200331091459.29179-1-chris@chris-wilson.co.uk	2020-03-31 21:42:12 +01:00
Chris Wilson	4b379a48de	drm/i915/selftests: Check timeout before flush and cond checks Allow a bit of leniency for the CPU scheduler to be distracted while we flush the tasklet and so ensure that we always check the status of the request once more before timing out. v2: Wait until the HW acked the submit, and we do any secondary actions for the submit (e.g. timeslices) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200330121644.25277-1-chris@chris-wilson.co.uk	2020-03-30 17:56:00 +01:00
Chris Wilson	bb4328f6b9	drm/i915/selftest: Add more poison patterns Throw in the inverse patterns to create more examples of poison to use against the LRC state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200313102812.30173-1-chris@chris-wilson.co.uk	2020-03-13 11:36:34 +00:00
Aditya Swarup	9b234d2643	drm/i915/selftests: Fix uninitialized variable Static code analysis tool identified struct lrc_timestamp data as being uninitialized and then data.ce[] is being checked for NULL/negative value in the error path. Initializing data variable fixes the issue. Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Aditya Swarup <aditya.swarup@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200303142347.15696-1-aditya.swarup@intel.com	2020-03-03 17:30:20 +00:00
Chris Wilson	280e285dc7	drm/i915/selftests: Be a little more lenient for reset workers Give the reset worker a kick before losing help when waiting for hang recovery, as the CPU scheduler is a little unreliable. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-15-chris@chris-wilson.co.uk	2020-02-28 15:45:42 +00:00
Chris Wilson	b0158b9132	drm/i915/selftests: Wait for the context switch As we require a context switch to ensure that the current context is switched out and saved to memory, perform an explicit switch to the kernel context and wait for it. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1336 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200228082330.2411941-18-chris@chris-wilson.co.uk	2020-02-28 15:18:55 +00:00
Chris Wilson	24eba7a998	drm/i915/selftests: Check recovery from corrupted LRC Check that we can recover if the LRC is totally corrupted. Based on a very simple theory that anything that can be adjusted via the context (i.e. on behalf of the user), should be under the purview of the per-engine-reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-13-chris@chris-wilson.co.uk	2020-02-28 13:04:14 +00:00
Chris Wilson	efb69b9832	drm/i915/selftests: Verify LRC isolation Record the LRC registers before/after a preemption event to ensure that the first context sees nothing from the second client; at least in the normal per-context register state. References: https://gitlab.freedesktop.org/drm/intel/issues/1233 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-12-chris@chris-wilson.co.uk	2020-02-28 13:01:14 +00:00
Daniele Ceraolo Spurio	065273f76d	drm/i915/guc: Kill USES_GUC_SUBMISSION macro use intel_uc_uses_guc_submission() directly instead, to be consistent in the way we check what we want to do with the GuC. v2: do not go through ctx->vm->gt, use i915->gt instead Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> #v1 Reviewed-by: Andi Shyti <andi.shyti@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200218223327.11058-3-daniele.ceraolospurio@intel.com	2020-02-20 17:48:03 +00:00
Chris Wilson	bd3d1f8673	drm/i915/selftests: Mark GPR checking more hostile Currently, we check that a new context has a clear set of general purpose registers. Add a little bit of hostility by preempting our new context and re-poisoning the GPR to ensure that there is no context leakage from preemption. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200219123418.1447428-1-chris@chris-wilson.co.uk	2020-02-19 14:09:19 +00:00
Chris Wilson	e7aa531e84	drm/i915/selftest: Analyse timestamp behaviour across context switches Check that the CTX_TIMESTAMP is monotonic across context save/restore and upon preemption. References: https://gitlab.freedesktop.org/drm/intel/issues/1233 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200219112004.1412791-1-chris@chris-wilson.co.uk	2020-02-19 14:09:18 +00:00
Chris Wilson	d30d3d5f58	drm/i915/selftests: Flush tasklet on wait_for_submit() Always flush the tasklet if we have pending submissions in wait_for_submit(), so that even if we see the HW has started before we process its ack, when we return the execlists state is well defined. Fixes: `06289949b8` ("drm/i915/selftests: Check for any sign of request starting in wait_for_submit()") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200218211215.1336341-1-chris@chris-wilson.co.uk	2020-02-18 21:21:53 +00:00
Chris Wilson	06289949b8	drm/i915/selftests: Check for any sign of request starting in wait_for_submit() We only want to wait until the request has been submitted at least once; that is it is either in flight, or has been. References: `fcf7df7aae` ("drm/i915/selftests: Check for the error interrupt before we wait!") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200218141305.1258394-1-chris@chris-wilson.co.uk	2020-02-18 19:45:08 +00:00
Tvrtko Ursulin	1883a0a465	drm/i915: Track hw reported context runtime GPU saves accumulated context runtime (in CS timestamp units) in PPHWSP which will be useful for us in cases when we are not able to track context busyness ourselves (like with GuC). Keep a copy of this in struct intel_context from where it can be easily read even if the context is not pinned. v2: (Chris) * Do not store pphwsp address in intel_context. * Log CS wrap-around. * Simplify calculation by relying on integer wraparound. v3: * Include total/avg in traces and error state for debugging Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200216133620.394962-1-chris@chris-wilson.co.uk	2020-02-16 15:16:22 +00:00
Chris Wilson	fcf7df7aae	drm/i915/selftests: Check for the error interrupt before we wait! Sometimes the error interrupt can fire even before we have seen the request go active -- in which case, we end up waiting until the timeout as the request is already completed. Double check for this case! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200214120659.3888735-1-chris@chris-wilson.co.uk	2020-02-14 15:47:24 +00:00
Chris Wilson	4c8ed8b126	drm/i915/selftests: Exercise timeslice rewinding Originally, I did not expect having to rewind a context upon timeslicing: the point was to replace the executing context with a non-executing one! However, given a second context that depends on requests from the first, we may have to split the requests along the first context to execute the second, causing us to partially replay the first context and so have to rewind its RING_TAIL. References: `5ba32c7be8` ("drm/i915/execlists: Always force a context reload when rewinding RING_TAIL") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200213140150.3639027-1-chris@chris-wilson.co.uk	2020-02-13 16:41:23 +00:00
Chris Wilson	37305ede63	drm/i915/selftests: Sabotague the RING_HEAD Apply vast quantities of poison and not tell anyone to see if we fall for the trap of using a stale RING_HEAD. References: `42827350f7` ("drm/i915/gt: Avoid resetting ring->head outside of its timeline mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200211205615.1190127-2-chris@chris-wilson.co.uk	2020-02-12 10:07:13 +00:00
Chris Wilson	f16ccb6445	drm/i915: Disable use of hwsp_cacheline for kernel_context Currently on execlists, we use a local hwsp for the kernel_context, rather than the engine's HWSP, as this is the default for execlists. However, seqno wrap requires allocating a new HWSP cacheline, and may require pinning a new HWSP page in the GGTT. This operation requiring pinning in the GGTT is not allowed within the kernel_context timeline, as doing so may require re-entering the kernel_context in order to evict from the GGTT. As we want to avoid requiring a new HWSP for the kernel_context, we can use the permanently pinned engine's HWSP instead. However to do so we must prevent the use of semaphores reading the kernel_context's HWSP, as the use of semaphores do not support rollover onto the same cacheline. Fortunately, the kernel_context is mostly isolated, so unlikely to give benefit to semaphores. Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200210205722.794180-5-chris@chris-wilson.co.uk	2020-02-11 17:42:17 +00:00
Chris Wilson	6313e78e72	drm/i915/selftests: Relax timeout for error-interrupt reset processing We can not require that the system process a tasklet in reasonable time (thanks be to ksoftirqd), but we can insist that having waited sufficiently for the error interrupt to have been raised and having kicked the tasklet, the reset has begun and the request will be marked as in error (if not already completed). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200210205722.794180-3-chris@chris-wilson.co.uk	2020-02-11 15:33:50 +00:00
Chris Wilson	42827350f7	drm/i915/gt: Avoid resetting ring->head outside of its timeline mutex We manipulate ring->head while active in i915_request_retire underneath the timeline manipulation. We cannot rely on a stable ring->head outside of the timeline->mutex, in particular while setting up the context for resume and reset. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1126 Fixes: `0881954965` ("drm/i915: Introduce intel_context.pin_mutex for pin management") Fixes: `e5dadff4b0` ("drm/i915: Protect request retirement with timeline->mutex") References: `f3c0efc9fe` ("drm/i915/execlists: Leave resetting ring to intel_ring") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200211120131.958949-1-chris@chris-wilson.co.uk	2020-02-11 12:03:22 +00:00
Chris Wilson	b656000782	drm/i915/selftests: Drop live_preempt_hang live_preempt_hang's use of hang injection has been superseded by live_preempt_reset's use of an non-preemptible spinner. The latter does not require intrusive hacks into the code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200209230838.361154-2-chris@chris-wilson.co.uk	2020-02-10 16:46:21 +00:00
Chris Wilson	b0e02a73c5	drm/i915/selftests: Disable heartbeat around hang tests If the heartbeat fires in the middle of the preempt-hang test, it consumes our forced hang disrupting the test. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200131130319.2998318-1-chris@chris-wilson.co.uk	2020-01-31 15:10:02 +00:00
Chris Wilson	bd46aa22a8	drm/i915/selftests: Also wait for the scratch buffer to be bound Since PIN_GLOBAL is no longer guaranteed to be synchronous, we must not forget to include a wait-for-vma prior to execution. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200131142610.3100998-1-chris@chris-wilson.co.uk	2020-01-31 15:10:02 +00:00
Chris Wilson	70a76a9b8e	drm/i915/gt: Hook up CS_MASTER_ERROR_INTERRUPT Now that we have offline error capture and can reset an engine from inside an atomic context while also preserving the GPU state for post-mortem analysis, it is time to handle error interrupts thrown by the command parser. This provides a much, much faster mechanism for us to detect known problems than using heartbeats/hangchecks, and also provides a mechanism for when those are disabled. However, it is limited to problems the HW can detect in the CS and so not a complete solution for detecting lockups. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200128204318.4182039-2-chris@chris-wilson.co.uk	2020-01-29 15:16:52 +00:00
Chris Wilson	989df3a7bd	drm/i915/execlists: Reclaim the hanging virtual request If we encounter a hang on a virtual engine, as we process the hang the request may already have been moved back to the virtual engine (we are processing the hang on the physical engine). We need to reclaim the request from the virtual engine so that the locking is consistent and local to the real engine on which we will hold the request for error state capturing. v2: Pull the reclamation into execlists_hold() and assert that cannot be called from outside of the reset (i.e. with the tasklet disabled). v3: Added selftest v4: Drop the reference owned by the virtual engine Fixes: `748317386a` ("drm/i915/execlists: Offline error capture") Testcase: igt/gem_exec_balancer/hang Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200122140243.495621-2-chris@chris-wilson.co.uk	2020-01-22 17:10:15 +00:00
Chris Wilson	4ba5c086a1	drm/i915/execlists: Take a reference while capturing the guilty request Thanks to preempt-to-busy, we leave the request on the HW as we submit the preemption request. This means that the request may complete at any moment as we process HW events, and in particular the request may be retired as we are planning to capture it for a preemption timeout. Be more careful while obtaining the request to capture after a preemption timeout, and check to see if it completed before we were able to put it on the on-hold list. If we do see it did complete just before we capture the request, proclaim the preemption-timeout a false positive and pardon the reset as we should hit an arbitration point momentarily and so be able to process the preemption. Note that even after we move the request to be on hold it may be retired (as the reset to stop the HW comes after), so we do require to hold our own reference as we work on the request for capture (and all of the peeking at state within the request needs to be carefully protected). Fixes: `32ff621fd7` ("drm/i915/gt: Allow temporary suspension of inflight requests") Closes: https://gitlab.freedesktop.org/drm/intel/issues/997 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200122140243.495621-1-chris@chris-wilson.co.uk	2020-01-22 17:10:15 +00:00
Chris Wilson	32ff621fd7	drm/i915/gt: Allow temporary suspension of inflight requests In order to support out-of-line error capture, we need to remove the active request from HW and put it to one side while a worker compresses and stores all the details associated with that request. (As that compression may take an arbitrary user-controlled amount of time, we want to let the engine continue running on other workloads while the hanging request is dumped.) Not only do we need to remove the active request, but we also have to remove its context and all requests that were dependent on it (both in flight, queued and future submission). Finally once the capture is complete, we need to be able to resubmit the request and its dependents and allow them to execute. v2: Replace stack recursion with a simple list. v3: Check all the parents, not just the first, when searching for a stuck ancestor! References: https://gitlab.freedesktop.org/drm/intel/issues/738 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-2-chris@chris-wilson.co.uk	2020-01-16 19:56:16 +00:00
Chris Wilson	e1c31fb5dd	drm/i915: Merge i915_request.flags with i915_request.fence.flags As we already have a flags field buried within i915_request, reuse it! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200106114234.2529613-3-chris@chris-wilson.co.uk	2020-01-06 14:38:55 +00:00
Chris Wilson	6d728d92d8	drm/i915/selftests: Impose a timeout for request submission Avoid spinning indefinitely waiting for the request to be submitted, and instead apply a timeout. A secondary benefit is that the error message will show which suspect is blocked. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200106114234.2529613-2-chris@chris-wilson.co.uk	2020-01-06 14:38:55 +00:00
Chris Wilson	d1813ca2bb	drm/i915/gt: Clear LRC image inline When creating the initial LRC image, we also want to clear the MI_NOOPs and register values. Rather than use a blanket memset beforehand, apply the clears inline, close the context image and force inhibition of the uninitialised reminder. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200102131707.1463945-2-chris@chris-wilson.co.uk	2020-01-03 11:26:01 +00:00
Chris Wilson	6a505e644c	drm/i915/gt: Include a bunch more rcs image state Empirically the minimal context image we use for rcs is insufficient to state the engine. This is demonstrated if we poison the context image such that any uninitialised state is invalid, and so if the engine samples beyond our defined region, will fail to start. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200102131707.1463945-1-chris@chris-wilson.co.uk	2020-01-03 11:26:01 +00:00
Chris Wilson	e6ba764802	drm/i915: Remove i915->kernel_context Allocate only an internal intel_context for the kernel_context, forgoing a global GEM context for internal use as we only require a separate address space (for our own protection). Now having weaned GT from requiring ce->gem_context, we can stop referencing it entirely. This also means we no longer have to create random and unnecessary GEM contexts for internal use. GEM contexts are now entirely for tracking GEM clients, and intel_context the execution environment on the GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk	2019-12-21 16:37:10 +00:00
Chris Wilson	9f3ccd40ac	drm/i915: Drop GEM context as a direct link from i915_request Keep the intel_context as being the primary state for i915_request, with the GEM context a backpointer from the low level state for the rarer cases we need client information. Our goal is to remove such references to clients from the backend, and leave the HW submission agnostic to client interfaces and self-contained. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-1-chris@chris-wilson.co.uk	2019-12-20 10:52:21 +00:00
Chris Wilson	de5825beae	drm/i915: Serialise with engine-pm around requests on the kernel_context As the engine->kernel_context is used within the engine-pm barrier, we have to be careful when emitting requests outside of the barrier, as the strict timeline locking rules do not apply. Instead, we must ensure the engine_park() cannot be entered as we build the request, which is simplest by taking an explicit engine-pm wakeref around the request construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-1-chris@chris-wilson.co.uk	2019-11-25 13:17:18 +00:00
Chris Wilson	cfd821b243	drm/i915/selftests: Force bonded submission to overlap Bonded request submission is designed to allow requests to execute in parallel as laid out by the user. If the master request is already finished before its bonded pair is submitted, the pair were not destined to run in parallel and we lose the information about the master engine to dictate selection of the secondary. If the second request was required to be run on a particular engine in a virtual set, that should have been specified, rather than left to the whims of a random unconnected requests! In the selftest, I made the mistake of not ensuring the master would overlap with its bonded pairs, meaning that it could indeed complete before we submitted the bonds. Those bonds were then free to select any available engine in their virtual set, and not the one expected by the test. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191122112152.660743-1-chris@chris-wilson.co.uk	2019-11-22 13:06:36 +00:00
Chris Wilson	1ff2f9e26c	drm/i915/selftests: Always hold a reference on a waited upon request Whenever we wait on a request, make sure we actually hold a reference to it and that it cannot be retired/freed on another CPU! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191121071044.97798-4-chris@chris-wilson.co.uk	2019-11-21 16:14:57 +00:00
Chris Wilson	2d19a71ce6	drm/i915/selftests: Exercise long preemption chains Verify that we can execute a long chain of dependent requests from userspace, each one slightly more important than the last. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191114225736.616885-4-chris@chris-wilson.co.uk	2019-11-15 16:46:18 +00:00
Chris Wilson	b0b1024886	drm/i915/execlists: Verify context register state before execution Check that the context's ring register state still matches our expectations prior to execution. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191102125739.24626-1-chris@chris-wilson.co.uk	2019-11-02 13:39:13 +00:00
Chris Wilson	e5661c6ab0	drm/i915/selftests: Start kthreads before stopping An interesting observation made with our parallel selftests was that on our small/single cpu systems we would call kthread_stop() before the kthreads were spawned. If this happens, the kthread is never run at all; completely bypassing the test. A simple yield() from the parent will ensure that all children have the opportunity to start before we reap them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191101084940.31838-1-chris@chris-wilson.co.uk	2019-11-01 10:12:29 +00:00
Chris Wilson	b79029b2e8	drm/i915/gt: Make timeslice duration configurable Execlists uses a scheduling quantum (a timeslice) to alternate execution between ready-to-run contexts of equal priority. This ensures that all users (though only if they of equal importance) have the opportunity to run and prevents livelocks where contexts may have implicit ordering due to userspace semaphores. However, not all workloads necessarily benefit from timeslicing and in the extreme some sysadmin may want to disable or reduce the timeslicing granularity. The timeslicing mechanism can be compiled out^W^W disabled (but should DCE!) with ./scripts/config --set-val DRM_I915_TIMESLICE_DURATION 0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191029091632.26281-1-chris@chris-wilson.co.uk	2019-10-29 16:23:55 +00:00
Chris Wilson	13670f4ce9	drm/i915/selftests: Check a few more fixed locations within the context image As we use hard coded offsets for a few locations within the context image, include those in the selftests to assert that they are valid. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191028121803.29408-1-chris@chris-wilson.co.uk	2019-10-28 16:09:43 +00:00
Chris Wilson	35865aef05	drm/i915/tgl: Adjust the location of RING_MI_MODE in the context image The location of RING_MI_MODE (used to stop the ring across resets) moved for Tigerlake. Fixup the new location and include a selftest to verify the location in the default context image. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191026082220.32632-1-chris@chris-wilson.co.uk	2019-10-26 09:48:34 +01:00
Chris Wilson	d12acee84f	drm/i915/execlists: Cancel banned contexts on schedule-out On schedule-out (CS completion) of a banned context, scrub the context image so that we do not replay the active payload. The intent is that we skip banned payloads on request submission so that the timeline advancement continues on in the background. However, if we are returning to a preempted request, i915_request_skip() is ineffective and instead we need to patch up the context image so that it continues from the start of the next request. v2: Fixup cancellation so that we only scrub the payload of the active request and do not short-circuit the breadcrumbs (which might cause other contexts to execute out of order). v3: Grammar pass Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-3-chris@chris-wilson.co.uk	2019-10-23 23:52:10 +01:00
Chris Wilson	3a7a92aba8	drm/i915/execlists: Force preemption If the preempted context takes too long to relinquish control, e.g. it is stuck inside a shader with arbitration disabled, evict that context with an engine reset. This ensures that preemptions are reasonably responsive, providing a tighter QoS for the more important context at the cost of flagging unresponsive contexts more frequently (i.e. instead of using an ~10s hangcheck, we now evict at ~100ms). The challenge of lies in picking a timeout that can be reasonably serviced by HW for typical workloads, balancing the existing clients against the needs for responsiveness. Note that coupled with timeslicing, this will lead to rapid GPU "hang" detection with multiple active contexts vying for GPU time. The forced preemption mechanism can be compiled out with ./scripts/config --set-val DRM_I915_PREEMPT_TIMEOUT 0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-2-chris@chris-wilson.co.uk	2019-10-23 23:52:10 +01:00
Chris Wilson	0587152bf9	drm/i915: Drop assertion that ce->pin_mutex guards state updates The actual conditions are that we know the GPU is not accessing the context, and we hold a pin on the context image to allow CPU access. We used a fake lock on ce->pin_mutex so that we could try and use lockdep to assert that access is serialised, but the various different hardirq/softirq contexts where we need to fake holding the pin_mutex are causing more trouble. Still it would be nice if we did have a way to reassure ourselves that the direct update to the context image is serialised with GPU execution. In the meantime, stop lockdep complaining about false irq inversions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111923 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191022122845.25038-1-chris@chris-wilson.co.uk	2019-10-22 13:32:01 +01:00
Chris Wilson	253a774bb0	drm/i915/execlists: Don't merely skip submission if maybe timeslicing Normally, we try and skip submission if ELSP[1] is filled. However, we may desire to enable timeslicing due to the queue priority, even if ELSP[1] itself does not require timeslicing. That is the queue is equal priority to ELSP[0] and higher priority then ELSP[1]. Previously, we would wait until the context switch to preempt the current ELSP[1], but with timeslicing, we want to preempt ELSP[0] and replace it with the queue. In writing the test case, it become quickly apparent that we were also suppressing the tasklet during promotion and so failing to notice when the queue started requiring timeslicing. Fixes: `2229adc813` ("drm/i915/execlist: Trim immediate timeslice expiry") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191018072027.31948-1-chris@chris-wilson.co.uk	2019-10-18 11:23:26 +01:00
Tvrtko Ursulin	5d904e3c5d	drm/i915: Pass in intel_gt at some for_each_engine sites Where the function, or code segment, operates on intel_gt, we need to start passing it instead of i915 to for_each_engine(_masked). This is another partial step in migration of i915->engines[] to gt->engines[]. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191017094500.21831-2-tvrtko.ursulin@linux.intel.com	2019-10-18 00:06:27 +01:00
Chris Wilson	1357fa8136	drm/i915/selftests: Teach execlists to take intel_gt as its argument The execlists selftests are hardware centric and so want to use the gt as its target. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191016120249.22714-1-chris@chris-wilson.co.uk	2019-10-16 18:19:29 +01:00
Chris Wilson	8574685547	drm/i915/selftests: Drop stale struct_mutex A lately added test was missed when applying the struct_mutex removal patches. Do so now. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191015085911.10317-1-chris@chris-wilson.co.uk	2019-10-16 09:54:28 +01:00
Chris Wilson	9506c23dfa	drm/i915/selftests: Check that GPR are cleared for new contexts We want the general purpose registers to be clear in all new contexts so that we can be confident that no information is leaked from one to the next. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191014090757.32111-7-chris@chris-wilson.co.uk	2019-10-14 11:10:28 +01:00
Chris Wilson	9c27462c89	drm/i915/selftests: Check known register values within the context Check the logical ring context by asserting that the registers hold expected start during execution. (It's a bit chicken-and-egg for how could we manage to execute our request if the registers were not being updated. Still, it's nice to verify that the HW is working as expected.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191014090757.32111-6-chris@chris-wilson.co.uk	2019-10-14 11:10:18 +01:00
Chris Wilson	86027e312c	drm/i915/selftests: Check that registers are preserved between virtual engines Make sure that we copy across the registers from one engine to the next, as we hop around a virtual engine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191010110252.17289-1-chris@chris-wilson.co.uk	2019-10-10 13:53:58 +01:00
Chris Wilson	1664f35aa7	drm/i915/selftests: Appease lockdep Disable irqs around updating the context image to keep lockdep happy: <4>[ 673.483340] WARNING: possible irq lock inversion dependency detected <4>[ 673.483342] 5.4.0-rc1-CI-Trybot_5118+ #1 Tainted: G U <4>[ 673.483342] -------------------------------------------------------- <4>[ 673.483343] swapper/2/0 just changed the state of lock: <4>[ 673.483344] ffff88845db885a0 (&i915_request_get(rq)->submit/1){-...}, at: __i915_sw_fence_complete+0x1b2/0x250 [i915] <4>[ 673.483387] but this lock took another, HARDIRQ-unsafe lock in the past: <4>[ 673.483388] (&ce->pin_mutex/2){+...} <4>[ 673.483389] and interrupts could create inverse lock ordering between them. <4>[ 673.483390] other info that might help us debug this: <4>[ 673.483390] Chain exists of: &i915_request_get(rq)->submit/1 --> &engine->active.lock --> &ce->pin_mutex/2 <4>[ 673.483392] Possible interrupt unsafe locking scenario: <4>[ 673.483392] CPU0 CPU1 <4>[ 673.483393] ---- ---- <4>[ 673.483393] lock(&ce->pin_mutex/2); <4>[ 673.483394] local_irq_disable(); <4>[ 673.483395] lock(&i915_request_get(rq)->submit/1); <4>[ 673.483396] lock(&engine->active.lock); <4>[ 673.483396] <Interrupt> <4>[ 673.483397] lock(&i915_request_get(rq)->submit/1); <4>[ 673.483398] * DEADLOCK * <4>[ 673.483398] 2 locks held by swapper/2/0: <4>[ 673.483399] #0: ffff8883f61ac9b0 (&(&gt->irq_lock)->rlock){-.-.}, at: gen11_gt_irq_handler+0x42/0x280 [i915] <4>[ 673.483433] #1: ffff88845db8c418 (&(&rq->lock)->rlock){-.-.}, at: intel_engine_breadcrumbs_irq+0x34a/0x5a0 [i915] <4>[ 673.483463] the shortest dependencies between 2nd lock and 1st lock: <4>[ 673.483466] -> (&ce->pin_mutex/2){+...} ops: 614520 { <4>[ 673.483468] HARDIRQ-ON-W at: <4>[ 673.483471] lock_acquire+0xa7/0x1c0 <4>[ 673.483501] live_unlite_restore+0x1d8/0x6c0 [i915] <4>[ 673.483543] __i915_subtests+0xb8/0x210 [i915] <4>[ 673.483581] __run_selftests+0x112/0x170 [i915] <4>[ 673.483615] i915_live_selftests+0x2c/0x60 [i915] <4>[ 673.483644] i915_pci_probe+0x93/0x1b0 [i915] <4>[ 673.483646] pci_device_probe+0x9e/0x120 <4>[ 673.483648] really_probe+0xea/0x420 <4>[ 673.483649] driver_probe_device+0x10b/0x120 <4>[ 673.483651] device_driver_attach+0x4a/0x50 <4>[ 673.483652] __driver_attach+0x97/0x130 <4>[ 673.483653] bus_for_each_dev+0x74/0xc0 <4>[ 673.483654] bus_add_driver+0x142/0x220 <4>[ 673.483655] driver_register+0x56/0xf0 <4>[ 673.483657] do_one_initcall+0x58/0x2ff <4>[ 673.483659] do_init_module+0x56/0x1f8 <4>[ 673.483660] load_module+0x243e/0x29f0 <4>[ 673.483661] __do_sys_finit_module+0xe9/0x110 <4>[ 673.483662] do_syscall_64+0x4f/0x210 <4>[ 673.483665] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4>[ 673.483665] INITIAL USE at: <4>[ 673.483667] lock_acquire+0xa7/0x1c0 <4>[ 673.483698] live_unlite_restore+0x1d8/0x6c0 [i915] <4>[ 673.483733] __i915_subtests+0xb8/0x210 [i915] <4>[ 673.483764] __run_selftests+0x112/0x170 [i915] <4>[ 673.483793] i915_live_selftests+0x2c/0x60 [i915] <4>[ 673.483821] i915_pci_probe+0x93/0x1b0 [i915] <4>[ 673.483822] pci_device_probe+0x9e/0x120 <4>[ 673.483824] really_probe+0xea/0x420 <4>[ 673.483825] driver_probe_device+0x10b/0x120 <4>[ 673.483826] device_driver_attach+0x4a/0x50 <4>[ 673.483827] __driver_attach+0x97/0x130 <4>[ 673.483828] bus_for_each_dev+0x74/0xc0 <4>[ 673.483829] bus_add_driver+0x142/0x220 <4>[ 673.483830] driver_register+0x56/0xf0 <4>[ 673.483831] do_one_initcall+0x58/0x2ff <4>[ 673.483833] do_init_module+0x56/0x1f8 <4>[ 673.483834] load_module+0x243e/0x29f0 <4>[ 673.483835] __do_sys_finit_module+0xe9/0x110 <4>[ 673.483836] do_syscall_64+0x4f/0x210 <4>[ 673.483837] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4>[ 673.483838] } <4>[ 673.483868] ... key at: [<ffffffffa0a8f132>] __key.70113+0x2/0xffffffffffef2ed0 [i915] <4>[ 673.483869] ... acquired at: <4>[ 673.483935] __execlists_reset+0xfb/0xc20 [i915] <4>[ 673.483965] execlists_reset+0x3d/0x50 [i915] <4>[ 673.483995] intel_engine_reset+0xdf/0x230 [i915] <4>[ 673.484022] live_preempt_hang+0x1d7/0x2e0 [i915] <4>[ 673.484064] __i915_subtests+0xb8/0x210 [i915] <4>[ 673.484130] __run_selftests+0x112/0x170 [i915] <4>[ 673.484163] i915_live_selftests+0x2c/0x60 [i915] <4>[ 673.484193] i915_pci_probe+0x93/0x1b0 [i915] <4>[ 673.484194] pci_device_probe+0x9e/0x120 <4>[ 673.484195] really_probe+0xea/0x420 <4>[ 673.484196] driver_probe_device+0x10b/0x120 <4>[ 673.484197] device_driver_attach+0x4a/0x50 <4>[ 673.484198] __driver_attach+0x97/0x130 <4>[ 673.484199] bus_for_each_dev+0x74/0xc0 <4>[ 673.484200] bus_add_driver+0x142/0x220 <4>[ 673.484202] driver_register+0x56/0xf0 <4>[ 673.484203] do_one_initcall+0x58/0x2ff <4>[ 673.484204] do_init_module+0x56/0x1f8 <4>[ 673.484205] load_module+0x243e/0x29f0 <4>[ 673.484206] __do_sys_finit_module+0xe9/0x110 <4>[ 673.484207] do_syscall_64+0x4f/0x210 <4>[ 673.484208] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4>[ 673.484209] -> (&engine->active.lock){..-.} ops: 972791 { <4>[ 673.484211] IN-SOFTIRQ-W at: <4>[ 673.484213] lock_acquire+0xa7/0x1c0 <4>[ 673.484214] _raw_spin_lock_irqsave+0x33/0x50 <4>[ 673.484244] execlists_submission_tasklet+0xaf/0x100 [i915] <4>[ 673.484246] tasklet_action_common.isra.18+0x6c/0x1c0 <4>[ 673.484247] __do_softirq+0xdf/0x47f <4>[ 673.484248] irq_exit+0xba/0xc0 <4>[ 673.484249] do_IRQ+0x83/0x160 <4>[ 673.484250] ret_from_intr+0x0/0x1d <4>[ 673.484252] cpuidle_enter_state+0xb2/0x450 <4>[ 673.484253] cpuidle_enter+0x24/0x40 <4>[ 673.484254] do_idle+0x1e7/0x250 <4>[ 673.484256] cpu_startup_entry+0x14/0x20 <4>[ 673.484257] start_secondary+0x15f/0x1b0 <4>[ 673.484258] secondary_startup_64+0xa4/0xb0 <4>[ 673.484259] INITIAL USE at: <4>[ 673.484261] lock_acquire+0xa7/0x1c0 <4>[ 673.484290] intel_engine_init_active+0x7e/0xb0 [i915] <4>[ 673.484305] intel_engines_setup+0x1cd/0x3b0 [i915] <4>[ 673.484305] i915_gem_init+0x12d/0x900 [i915] <4>[ 673.484305] i915_driver_probe+0xb70/0x15d0 [i915] <4>[ 673.484305] i915_pci_probe+0x43/0x1b0 [i915] <4>[ 673.484305] pci_device_probe+0x9e/0x120 <4>[ 673.484305] really_probe+0xea/0x420 <4>[ 673.484305] driver_probe_device+0x10b/0x120 <4>[ 673.484305] device_driver_attach+0x4a/0x50 <4>[ 673.484305] __driver_attach+0x97/0x130 <4>[ 673.484305] bus_for_each_dev+0x74/0xc0 <4>[ 673.484305] bus_add_driver+0x142/0x220 <4>[ 673.484305] driver_register+0x56/0xf0 <4>[ 673.484305] do_one_initcall+0x58/0x2ff <4>[ 673.484305] do_init_module+0x56/0x1f8 <4>[ 673.484305] load_module+0x243e/0x29f0 <4>[ 673.484305] __do_sys_finit_module+0xe9/0x110 <4>[ 673.484305] do_syscall_64+0x4f/0x210 <4>[ 673.484305] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4>[ 673.484305] } <4>[ 673.484305] ... key at: [<ffffffffa0a8f160>] __key.70307+0x0/0xffffffffffef2ea0 [i915] <4>[ 673.484305] ... acquired at: <4>[ 673.484305] _raw_spin_lock_irqsave+0x33/0x50 <4>[ 673.484305] execlists_submit_request+0x2b/0x1e0 [i915] <4>[ 673.484305] submit_notify+0xa8/0x13c [i915] <4>[ 673.484305] __i915_sw_fence_complete+0x81/0x250 [i915] <4>[ 673.484305] i915_sw_fence_wake+0x51/0x70 [i915] <4>[ 673.484305] __i915_sw_fence_complete+0x1ee/0x250 [i915] <4>[ 673.484305] dma_i915_sw_fence_wake+0x1b/0x30 [i915] <4>[ 673.484305] dma_fence_signal_locked+0x9e/0x1b0 <4>[ 673.484305] dma_fence_signal+0x1f/0x40 <4>[ 673.484305] fence_work+0x28/0x80 [i915] <4>[ 673.484305] process_one_work+0x26a/0x620 <4>[ 673.484305] worker_thread+0x37/0x380 <4>[ 673.484305] kthread+0x119/0x130 <4>[ 673.484305] ret_from_fork+0x24/0x50 <4>[ 673.484305] -> (&i915_request_get(rq)->submit/1){-...} ops: 857694 { <4>[ 673.484305] IN-HARDIRQ-W at: <4>[ 673.484305] lock_acquire+0xa7/0x1c0 <4>[ 673.484305] _raw_spin_lock_irqsave_nested+0x39/0x50 <4>[ 673.484305] __i915_sw_fence_complete+0x1b2/0x250 [i915] <4>[ 673.484305] intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915] <4>[ 673.484305] cs_irq_handler+0x39/0x50 [i915] <4>[ 673.484305] gen11_gt_irq_handler+0x17b/0x280 [i915] <4>[ 673.484305] gen11_irq_handler+0x54/0xf0 [i915] <4>[ 673.484305] __handle_irq_event_percpu+0x41/0x2c0 <4>[ 673.484305] handle_irq_event_percpu+0x2b/0x70 <4>[ 673.484305] handle_irq_event+0x2f/0x50 <4>[ 673.484305] handle_edge_irq+0x99/0x1b0 <4>[ 673.484305] do_IRQ+0x7e/0x160 <4>[ 673.484305] ret_from_intr+0x0/0x1d <4>[ 673.484305] cpuidle_enter_state+0xb2/0x450 <4>[ 673.484305] cpuidle_enter+0x24/0x40 <4>[ 673.484305] do_idle+0x1e7/0x250 <4>[ 673.484305] cpu_startup_entry+0x14/0x20 <4>[ 673.484305] start_secondary+0x15f/0x1b0 <4>[ 673.484305] secondary_startup_64+0xa4/0xb0 <4>[ 673.484305] INITIAL USE at: <4>[ 673.484305] lock_acquire+0xa7/0x1c0 <4>[ 673.484305] _raw_spin_lock_irqsave_nested+0x39/0x50 <4>[ 673.484305] __i915_sw_fence_complete+0x1b2/0x250 [i915] <4>[ 673.484305] __engine_park+0x233/0x420 [i915] <4>[ 673.484305] ____intel_wakeref_put_last+0x1c/0x70 [i915] <4>[ 673.484305] intel_gt_resume+0x202/0x2c0 [i915] <4>[ 673.484305] i915_gem_init+0x36e/0x900 [i915] <4>[ 673.484305] i915_driver_probe+0xb70/0x15d0 [i915] <4>[ 673.484305] i915_pci_probe+0x43/0x1b0 [i915] <4>[ 673.484305] pci_device_probe+0x9e/0x120 <4>[ 673.484305] really_probe+0xea/0x420 <4>[ 673.484305] driver_probe_device+0x10b/0x120 <4>[ 673.484305] device_driver_attach+0x4a/0x50 <4>[ 673.484305] __driver_attach+0x97/0x130 <4>[ 673.484305] bus_for_each_dev+0x74/0xc0 <4>[ 673.484305] bus_add_driver+0x142/0x220 <4>[ 673.484305] driver_register+0x56/0xf0 <4>[ 673.484305] do_one_initcall+0x58/0x2ff <4>[ 673.484305] do_init_module+0x56/0x1f8 <4>[ 673.484305] load_module+0x243e/0x29f0 <4>[ 673.484305] __do_sys_finit_module+0xe9/0x110 <4>[ 673.484305] do_syscall_64+0x4f/0x210 <4>[ 673.484305] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4>[ 673.484305] } <4>[ 673.484305] ... key at: [<ffffffffa0a8f6a1>] __key.80173+0x1/0xffffffffffef2960 [i915] <4>[ 673.484305] ... acquired at: <4>[ 673.484305] mark_lock+0x382/0x500 <4>[ 673.484305] __lock_acquire+0x7e1/0x15d0 <4>[ 673.484305] lock_acquire+0xa7/0x1c0 <4>[ 673.484305] _raw_spin_lock_irqsave_nested+0x39/0x50 <4>[ 673.484305] __i915_sw_fence_complete+0x1b2/0x250 [i915] <4>[ 673.484305] intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915] <4>[ 673.484305] cs_irq_handler+0x39/0x50 [i915] <4>[ 673.484305] gen11_gt_irq_handler+0x17b/0x280 [i915] <4>[ 673.484305] gen11_irq_handler+0x54/0xf0 [i915] <4>[ 673.484305] __handle_irq_event_percpu+0x41/0x2c0 <4>[ 673.484305] handle_irq_event_percpu+0x2b/0x70 <4>[ 673.484305] handle_irq_event+0x2f/0x50 <4>[ 673.484305] handle_edge_irq+0x99/0x1b0 <4>[ 673.484305] do_IRQ+0x7e/0x160 <4>[ 673.484305] ret_from_intr+0x0/0x1d <4>[ 673.484305] cpuidle_enter_state+0xb2/0x450 <4>[ 673.484305] cpuidle_enter+0x24/0x40 <4>[ 673.484305] do_idle+0x1e7/0x250 <4>[ 673.484305] cpu_startup_entry+0x14/0x20 <4>[ 673.484305] start_secondary+0x15f/0x1b0 <4>[ 673.484305] secondary_startup_64+0xa4/0xb0 <4>[ 673.484305] stack backtrace: <4>[ 673.484305] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G U 5.4.0-rc1-CI-Trybot_5118+ #1 <4>[ 673.484305] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019 <4>[ 673.484305] Call Trace: <4>[ 673.484305] <IRQ> <4>[ 673.484305] dump_stack+0x67/0x9b <4>[ 673.484305] check_usage_forwards+0x13c/0x150 <4>[ 673.484305] ? mark_lock+0x382/0x500 <4>[ 673.484305] mark_lock+0x382/0x500 <4>[ 673.484305] ? check_usage_backwards+0x140/0x140 <4>[ 673.484305] __lock_acquire+0x7e1/0x15d0 <4>[ 673.484305] ? debug_object_deactivate+0x17e/0x190 <4>[ 673.484305] lock_acquire+0xa7/0x1c0 <4>[ 673.484305] ? __i915_sw_fence_complete+0x1b2/0x250 [i915] <4>[ 673.484305] _raw_spin_lock_irqsave_nested+0x39/0x50 <4>[ 673.484305] ? __i915_sw_fence_complete+0x1b2/0x250 [i915] <4>[ 673.484305] __i915_sw_fence_complete+0x1b2/0x250 [i915] <4>[ 673.484305] intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915] <4>[ 673.484305] cs_irq_handler+0x39/0x50 [i915] <4>[ 673.484305] gen11_gt_irq_handler+0x17b/0x280 [i915] <4>[ 673.484305] gen11_irq_handler+0x54/0xf0 [i915] <4>[ 673.484305] __handle_irq_event_percpu+0x41/0x2c0 <4>[ 673.484305] handle_irq_event_percpu+0x2b/0x70 <4>[ 673.484305] handle_irq_event+0x2f/0x50 <4>[ 673.484305] handle_edge_irq+0x99/0x1b0 <4>[ 673.484305] do_IRQ+0x7e/0x160 <4>[ 673.484305] common_interrupt+0xf/0xf <4>[ 673.484305] </IRQ> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004203121.31138-1-chris@chris-wilson.co.uk	2019-10-07 21:44:02 +01:00
Chris Wilson	2af402982a	drm/i915/selftests: Drop vestigal struct_mutex guards We no longer need struct_mutex to serialise request emission, so remove it from the gt selftests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-20-chris@chris-wilson.co.uk	2019-10-04 15:39:41 +01:00
Chris Wilson	a4e7ccdac3	drm/i915: Move context management under GEM Keep track of the GEM contexts underneath i915->gem.contexts and assign them their own lock for the purposes of list management. v2: Focus on lock tracking; ctx->vm is protected by ctx->mutex v3: Correct split with removal of logical HW ID Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-15-chris@chris-wilson.co.uk	2019-10-04 15:39:34 +01:00
Chris Wilson	7e80576266	drm/i915: Drop struct_mutex from around i915_retire_requests() We don't need to hold struct_mutex now for retiring requests, so drop it from i915_retire_requests() and i915_gem_wait_for_idle(), finally removing I915_WAIT_LOCKED for good. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-8-chris@chris-wilson.co.uk	2019-10-04 15:39:17 +01:00
Chris Wilson	b1e3177bd1	drm/i915: Coordinate i915_active with its own mutex Forgo the struct_mutex serialisation for i915_active, and interpose its own mutex handling for active/retire. This is a multi-layered sleight-of-hand. First, we had to ensure that no active/retire callbacks accidentally inverted the mutex ordering rules, nor assumed that they were themselves serialised by struct_mutex. More challenging though, is the rule over updating elements of the active rbtree. Instead of the whole i915_active now being serialised by struct_mutex, allocations/rotations of the tree are serialised by the i915_active.mutex and individual nodes are serialised by the caller using the i915_timeline.mutex (we need to use nested spinlocks to interact with the dma_fence callback lists). The pain point here is that instead of a single mutex around execbuf, we now have to take a mutex for active tracker (one for each vma, context, etc) and a couple of spinlocks for each fence update. The improvement in fine grained locking allowing for multiple concurrent clients (eventually!) should be worth it in typical loads. v2: Add some comments that barely elucidate anything :( Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk	2019-10-04 15:39:12 +01:00
Chris Wilson	fcde8c7eea	drm/i915/selftests: Exercise potential false lite-restore If execlists's lite-restore is based on the common GEM context tag rather than the per-intel_context LRCA, then a context switch between two intel_contexts on the same engine derived from the same GEM context will perform a lite-restore instead of a full context switch. We can exploit this by poisoning the ringbuffer of the first context and trying to trick a simple RING_TAIL update (i.e. lite-restore) v2: Also check what happens if preempt ce[0] with ce[1] (both instances on the same engine from the same parent context) [Tvrtko] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191002183459.26614-1-chris@chris-wilson.co.uk	2019-10-03 00:26:02 +01:00
Chris Wilson	260e6b7127	drm/i915: Pass intel_gt to has-reset? As we execute GPU resets on a gt/ basis, and use the intel_gt as the primary for all other reset functions, also use it for the has-reset? predicates. Gradually simplifying the churn of pointers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190927211749.2181-1-chris@chris-wilson.co.uk	2019-09-27 23:25:14 +01:00
Chris Wilson	7dc56af526	drm/i915/selftests: Verify the LRC register layout between init and HW Before we submit the first context to HW, we need to construct a valid image of the register state. This layout is defined by the HW and should match the layout generated by HW when it saves the context image. Asserting that this should be equivalent should help avoid any undefined behaviour and verify that we haven't missed anything important! Of course, having insisted that the initial register state within the LRC should match that returned by HW, we need to ensure that it does. v2: Drop the RELATIVE_MMIO flag from gen11, we ignore it for constructing the lrc image. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190924145950.3011-1-chris@chris-wilson.co.uk	2019-09-24 17:27:19 +01:00
Chris Wilson	d19d71fc2b	drm/i915: Mark i915_request.timeline as a volatile, rcu pointer The request->timeline is only valid until the request is retired (i.e. before it is completed). Upon retiring the request, the context may be unpinned and freed, and along with it the timeline may be freed. We therefore need to be very careful when chasing rq->timeline that the pointer does not disappear beneath us. The vast majority of users are in a protected context, either during request construction or retirement, where the timeline->mutex is held and the timeline cannot disappear. It is those few off the beaten path (where we access a second timeline) that need extra scrutiny -- to be added in the next patch after first adding the warnings about dangerous access. One complication, where we cannot use the timeline->mutex itself, is during request submission onto hardware (under spinlocks). Here, we want to check on the timeline to finalize the breadcrumb, and so we need to impose a second rule to ensure that the request->timeline is indeed valid. As we are submitting the request, it's context and timeline must be pinned, as it will be used by the hardware. Since it is pinned, we know the request->timeline must still be valid, and we cannot submit the idle barrier until after we release the engine->active.lock, ergo while submitting and holding that spinlock, a second thread cannot release the timeline. v2: Don't be lazy inside selftests; hold the timeline->mutex for as long as we need it, and tidy up acquiring the timeline with a bit of refactoring (i915_active_add_request) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk	2019-09-20 10:24:09 +01:00
Chris Wilson	0b8d6273db	drm/i915/selftests: Keep the engine awake while we keep for preemption Keep the engine awake to ensure that we don't inject any pm-idle requests. References: https://bugs.freedesktop.org/show_bug.cgi?id=111108 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190912122639.25224-1-chris@chris-wilson.co.uk	2019-09-12 21:02:54 +01:00
Chris Wilson	70d6894d14	drm/i915: Serialize against vma moves Make sure that when submitting requests, we always serialize against potential vma moves and clflushes. Time for a i915_request_await_vma() interface! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190819112033.30638-1-chris@chris-wilson.co.uk	2019-08-19 15:25:56 +01:00
Chris Wilson	acb9488dca	drm/i915/selftests: Prevent the timeslice expiring during suppression tests When testing whether we prevent suppressing preemption, it helps to avoid a time slice expiring prematurely. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111108 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190812091045.29587-2-chris@chris-wilson.co.uk	2019-08-12 13:18:13 +01:00
Chris Wilson	f1c4d157ab	drm/i915: Fix up the inverse mapping for default ctx->engines[] The order in which we store the engines inside default_engines() for the legacy ctx->engines[] has to match the legacy I915_EXEC_RING selector mapping in execbuf::user_map. If we present VCS2 as being the second instance of the video engine, legacy userspace calls that I915_EXEC_BSD2 and so we need to insert it into the second video slot. v2: Record the legacy mapping (hopefully we can remove this need in the future) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111328 Fixes: `2edda80db3` ("drm/i915: Rename engines to match their user interface") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> #v1 Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190808110612.23539-2-chris@chris-wilson.co.uk	2019-08-08 15:45:35 +01:00
Chris Wilson	750e76b4f9	drm/i915/gt: Move the [class][inst] lookup for engines onto the GT To maintain a fast lookup from a GT centric irq handler, we want the engine lookup tables on the intel_gt. To avoid having multiple copies of the same multi-dimension lookup table, move the generic user engine lookup into an rbtree (for fast and flexible indexing). v2: Split uabi_instance cf uabi_class v3: Set uabi_class/uabi_instance after collating all engines to provide a stable uabi across parallel unordered construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2 Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk	2019-08-06 15:00:43 +01:00
Chris Wilson	f277bc0c98	drm/i915/selftests: Pass intel_context to igt_spinner Teach igt_spinner to only use our internal structs, decoupling the interface from the GEM contexts. This makes it easier to avoid requiring ce->gem_context back references for kernel_context that may have them in future. v2: Lift engine lock to verify_wa() caller. v3: Less than v2, but more so Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190731081126.9139-1-chris@chris-wilson.co.uk	2019-07-31 09:45:27 +01:00
Chris Wilson	09975b861a	drm/i915/execlists: Disable preemption under GVT Preempt-to-busy uses a GPU semaphore to enforce an idle-barrier across preemption, but mediated gvt does not fully support semaphores. v2: Fiddle around with the flags and settle on using has-semaphores for the core bits so that we retain the ability to preempt our own semaphores. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Xiaolin Zhang <xiaolin.zhang@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190709091233.8573-1-chris@chris-wilson.co.uk	2019-07-16 14:06:45 +01:00
Chris Wilson	506927ec8b	drm/i915/selftests: Ignore self-preemption suppression under gvt GVT forces single port submission of individual requests. We do not enjoy the context amalgamation that the test depends upon for setting up the test (where port 0 has a large number of requests with a priority change somewhere in the middle). Under single request submission of gvt it is quite able for the preemption event to occur while another context is active and so there be a real need to act upon that preemption. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111108 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190712082549.25053-1-chris@chris-wilson.co.uk	2019-07-15 10:24:15 +01:00
Chris Wilson	cb823ed991	drm/i915/gt: Use intel_gt as the primary object for handling resets Having taken the first step in encapsulating the functionality by moving the related files under gt/, the next step is to start encapsulating by passing around the relevant structs rather than the global drm_i915_private. In this step, we pass intel_gt to intel_reset.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190712192953.9187-1-chris@chris-wilson.co.uk	2019-07-12 21:06:56 +01:00
Lionel Landwerlin	2a98f4e65b	drm/i915: add infrastructure to hold off preemption on a request We want to set this flag in the next commit on requests containing perf queries so that the result of the perf query can just be a delta of global counters, rather than doing post processing of the OA buffer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [ickle: add basic selftest for nopreempt] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190709164227.25859-1-chris@chris-wilson.co.uk	2019-07-09 21:26:40 +01:00
Chris Wilson	cbcec57e9d	drm/i915/selftests: Fill in a little more of the dummy fence Initialise the dma_fence innards in preparation for making dma_fence_signal() always check the callback list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190708113038.19251-1-chris@chris-wilson.co.uk	2019-07-09 08:34:22 +01:00
Chris Wilson	63251685c1	drm/i915/selftests: Common live setup/teardown We frequently, but not frequently enough!, remember to flush residual operations and objects at the end of a live subtest. The purpose is to cleanup after every subtest, leaving a clean slate for the next subtest, and perform early detection of leaky state. As this should ideally be common for all live subtests, pull the task into a common teardown routine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190703091726.11690-1-chris@chris-wilson.co.uk	2019-07-03 11:07:57 +01:00
Chris Wilson	8ee36e048c	drm/i915/execlists: Minimalistic timeslicing If we have multiple contexts of equal priority pending execution, activate a timer to demote the currently executing context in favour of the next in the queue when that timeslice expires. This enforces fairness between contexts (so long as they allow preemption -- forced preemption, in the future, will kick those who do not obey) and allows us to avoid userspace blocking forward progress with e.g. unbounded MI_SEMAPHORE_WAIT. For the starting point here, we use the jiffie as our timeslice so that we should be reasonably efficient wrt frequent CPU wakeups. Testcase: igt/gem_exec_scheduler/semaphore-resolve Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-2-chris@chris-wilson.co.uk	2019-06-20 16:52:36 +01:00
Chris Wilson	2f5309452d	drm/i915: Stop passing I915_WAIT_LOCKED to i915_request_wait() Since commit `eb8d0f5af4` ("drm/i915: Remove GPU reset dependence on struct_mutex"), the I915_WAIT_LOCKED flags passed to i915_request_wait() has been defunct. Now go ahead and remove it from all callers. References: `eb8d0f5af4` ("drm/i915: Remove GPU reset dependence on struct_mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-3-chris@chris-wilson.co.uk	2019-06-19 12:58:38 +01:00
Daniele Ceraolo Spurio	d858d5695f	drm/i915: update rpm_get/put to use the rpm structure The functions where internally already only using the structure, so we need to just flip the interface. v2: rebase Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com	2019-06-14 15:58:33 +01:00
Chris Wilson	e568ac3874	drm/i915: Pull kref into i915_address_space Make the kref common to both derived structs (i915_ggtt and i915_ppgtt) so that we can safely reference count an abstract ctx->vm address space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-1-chris@chris-wilson.co.uk	2019-06-11 11:44:24 +01:00
Dan Carpenter	81a04d2e90	drm/i915: selftest_lrc: Check the correct variable We should check "request[n]" instead of just "request". Fixes: `78e41ddd21` ("drm/i915: Apply an execution_mask to the virtual_engine") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190529110355.GA19119@mwanda	2019-05-29 12:07:59 +01:00
Chris Wilson	6951e5893b	drm/i915: Move GEM object domain management from struct_mutex to local Use the per-object local lock to control the cache domain of the individual GEM objects, not struct_mutex. This is a huge leap forward for us in terms of object-level synchronisation; execbuffers are coordinated using the ww_mutex and pread/pwrite is finally fully serialised again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk	2019-05-28 12:45:29 +01:00
Chris Wilson	10be98a77c	drm/i915: Move more GEM objects under gem/ Continuing the theme of separating out the GEM clutter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk	2019-05-28 12:45:29 +01:00
Chris Wilson	ee1136908e	drm/i915/execlists: Virtual engine bonding Some users require that when a master batch is executed on one particular engine, a companion batch is run simultaneously on a specific slave engine. For this purpose, we introduce virtual engine bonding, allowing maps of master:slaves to be constructed to constrain which physical engines a virtual engine may select given a fence on a master engine. For the moment, we continue to ignore the issue of preemption deferring the master request for later. Ideally, we would like to then also remove the slave and run something else rather than have it stall the pipeline. With load balancing, we should be able to move workload around it, but there is a similar stall on the master pipeline while it may wait for the slave to be executed. At the cost of more latency for the bonded request, it may be interesting to launch both on their engines in lockstep. (Bubbles abound.) Opens: Also what about bonding an engine as its own master? It doesn't break anything internally, so allow the silliness. v2: Emancipate the bonds v3: Couple in delayed scheduling for the selftests v4: Handle invalid mutually exclusive bonding v5: Mention what the uapi does v6: s/nbond/num_bonds/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-9-chris@chris-wilson.co.uk	2019-05-22 08:40:46 +01:00
Chris Wilson	78e41ddd21	drm/i915: Apply an execution_mask to the virtual_engine Allow the user to direct which physical engines of the virtual engine they wish to execute one, as sometimes it is necessary to override the load balancing algorithm. v2: Only kick the virtual engines on context-out if required Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-7-chris@chris-wilson.co.uk	2019-05-22 08:40:43 +01:00
Chris Wilson	6d06779e86	drm/i915: Load balancing across a virtual engine Having allowed the user to define a set of engines that they will want to only use, we go one step further and allow them to bind those engines into a single virtual instance. Submitting a batch to the virtual engine will then forward it to any one of the set in a manner as best to distribute load. The virtual engine has a single timeline across all engines (it operates as a single queue), so it is not able to concurrently run batches across multiple engines by itself; that is left up to the user to submit multiple concurrent batches to multiple queues. Multiple users will be load balanced across the system. The mechanism used for load balancing in this patch is a late greedy balancer. When a request is ready for execution, it is added to each engine's queue, and when an engine is ready for its next request it claims it from the virtual engine. The first engine to do so, wins, i.e. the request is executed at the earliest opportunity (idle moment) in the system. As not all HW is created equal, the user is still able to skip the virtual engine and execute the batch on a specific engine, all within the same queue. It will then be executed in order on the correct engine, with execution on other virtual engines being moved away due to the load detection. A couple of areas for potential improvement left! - The virtual engine always take priority over equal-priority tasks. Mostly broken up by applying FQ_CODEL rules for prioritising new clients, and hopefully the virtual and real engines are not then congested (i.e. all work is via virtual engines, or all work is to the real engine). - We require the breadcrumb irq around every virtual engine request. For normal engines, we eliminate the need for the slow round trip via interrupt by using the submit fence and queueing in order. For virtual engines, we have to allow any job to transfer to a new ring, and cannot coalesce the submissions, so require the completion fence instead, forcing the persistent use of interrupts. - We only drip feed single requests through each virtual engine and onto the physical engines, even if there was enough work to fill all ELSP, leaving small stalls with an idle CS event at the end of every request. Could we be greedy and fill both slots? Being lazy is virtuous for load distribution on less-than-full workloads though. Other areas of improvement are more general, such as reducing lock contention, reducing dispatch overhead, looking at direct submission rather than bouncing around tasklets etc. sseu: Lift the restriction to allow sseu to be reconfigured on virtual engines composed of RENDER_CLASS (rcs). v2: macroize check_user_mbz() v3: Cancel virtual engines on wedging v4: Commence commenting v5: Replace 64b sibling_mask with a list of class:instance v6: Drop the one-element array in the uabi v7: Assert it is an virtual engine in to_virtual_engine() v8: Skip over holes in [class][inst] so we can selftest with (vcs0, vcs2) Link: https://github.com/intel/media-driver/pull/283 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-6-chris@chris-wilson.co.uk	2019-05-22 08:40:38 +01:00
Chris Wilson	6e7eb7a807	drm/i915: Bump signaler priority on adding a waiter The handling of the no-preemption priority level imposes the restriction that we need to maintain the implied ordering even though preemption is disabled. Otherwise we may end up with an AB-BA deadlock across multiple engine due to a real preemption event reordering the no-preemption WAITs. To resolve this issue we currently promote all requests to WAIT on unsubmission, however this interferes with the timeslicing requirement that we do not apply any implicit promotion that will defeat the round-robin timeslice list. (If we automatically promote the active request it will go back to the head of the queue and not the tail!) So we need implicit promotion to prevent reordering around semaphores where we are not allowed to preempt, and we must avoid implicit promotion on unsubmission. So instead of at unsubmit, if we apply that implicit promotion on adding the dependency, we avoid the semaphore deadlock and we also reduce the gains made by the promotion for user space waiting. Furthermore, by keeping the earlier dependencies at a higher level, we reduce the search space for timeslicing without altering runtime scheduling too badly (no dependencies at all will be assigned a higher priority for rrul). v2: Limit the bump to external edges (as originally intended) i.e. between contexts and out to the user. Testcase: igt/gem_concurrent_blit Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190515130052.4475-3-chris@chris-wilson.co.uk	2019-05-17 16:04:46 +01:00
Chris Wilson	25d851adbf	drm/i915: Only reschedule the submission tasklet if preemption is possible If we couple the scheduler more tightly with the execlists policy, we can apply the preemption policy to the question of whether we need to kick the tasklet at all for this priority bump. v2: Rephrase it as a core i915 policy and not an execlists foible. v3: Pull the kick together. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190507122544.12698-1-chris@chris-wilson.co.uk	2019-05-07 17:40:20 +01:00
Chris Wilson	46472b3efb	drm/i915: Move i915_request_alloc into selftests/ Having transitioned GEM over to using intel_context as its primary means of tracking the GEM context and engine combined and using i915_request_create(), we can move the older i915_request_alloc() helper function into selftests/ where the remaining users are confined. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-9-chris@chris-wilson.co.uk	2019-04-26 18:32:20 +01:00
Chris Wilson	112ed2d31a	drm/i915: Move GraphicsTechnology files under gt/ Start partitioning off the code that talks to the hardware (GT) from the uapi layers and move the device facing code under gt/ One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to subdivide that header and body further (and split out the submission code from the ringbuffer and logical context handling). This patch aims to be simple motion so git can fixup inflight patches with little mess. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk	2019-04-24 21:01:46 +01:00

1 2 3

141 Commits