Commit Graph

75 Commits

Author SHA1 Message Date
Chris Wilson 117172c8f9 drm/i915/breadcrumbs: Ignore unsubmitted signalers
When a request is preempted, it is unsubmitted from the HW queue and
removed from the active list of breadcrumbs. In the process, this
however triggers the signaler and it may see the clear rbtree with the
old, and still valid, seqno, or it may match the cleared seqno with the
now zero rq->global_seqno. This confuses the signaler into action and
signaling the fence.

Fixes: d6a2289d9d ("drm/i915: Remove the preempted request from the execution queue")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.12+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180206094633.30181-1-chris@chris-wilson.co.uk
(cherry picked from commit fd10e2ce99)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213090154.17373-1-chris@chris-wilson.co.uk
2018-02-13 16:55:45 -08:00
Chris Wilson b26a32a82a drm/i915: Always run hangcheck while the GPU is busy
Previously, we relied on only running the hangcheck while somebody was
waiting on the GPU, in order to minimise the amount of time hangcheck
had to run. (If nobody was watching the GPU, nobody would notice if the
GPU wasn't responding -- eventually somebody would care and so kick
hangcheck into action.) However, this falls apart from around commit
4680816be3 ("drm/i915: Wait first for submission, before waiting for
request completion"), as not all waiters declare themselves to hangcheck
and so we could switch off hangcheck and miss GPU hangs even when
waiting under the struct_mutex.

If we enable hangcheck from the first request submission, and let it run
until the GPU is idle again, we forgo all the complexity involved with
only enabling around waiters. We just have to remember to be careful that
we do not declare a GPU hang when idly waiting for the next request to
be come ready, as we will run hangcheck continuously even when the
engines are stalled waiting for external events. This should be true
already as we should only be tracking requests submitted to hardware for
execution as an indicator that the engine is busy.

Fixes: 4680816be3 ("drm/i915: Wait first for submission, before waiting for request completion"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104840
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180129144104.3921-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
(cherry picked from commit 889230489b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-02-01 07:33:02 -08:00
Chris Wilson b92326a040 drm/i915: Only report a wakeup if the waiter was truly asleep
If we attempt to wake up a waiter, who is currently checking the seqno
it will be in the TASK_INTERRUPTIBLE state and ttwu will report success.
However, it is actually awake and functioning -- so delay reporting the
actual wake up until it sleeps. This fixes some spurious claims of
missed_breadcrumbs when running under heavy load; i.e. sufficient load to
preempt away the newly woken waiter before they complete their checks.
However, it does so at the cost of a rare false negative; where the
waiter changes between the check and ttwu -- the only way to fix that
would be to extend the reporting from ttwu where the check could be done
atomically.

v2: Defend against !CONFIG_SMP
v3: Don't filter out calls to wake_up_process
v4: Drop risky microoptimisation to skip wakeups

Testcase: igt/drv_missed_irq # sanity check we do detect missed_breadcrumb()
Testcase: igt/gem_concurrent_blit # for generating false positives
References: https://bugs.freedesktop.org/show_bug.cgi?id=100007
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171209124710.1606-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-12-11 17:23:02 +00:00
Chris Wilson 776bc27fd8 drm/i915: Stop listening to request resubmission from the signaler kthread
The intent here was that we would be listening to
i915_gem_request_unsubmit in order to cancel the signaler quickly and
release the reference on the request. Cancelling the signaler is done
directly via intel_engine_cancel_signaling (called from unsubmit), but
that does not directly wake up the signaling thread, and neither does
setting the request->global_seqno back to zero wake up listeners to the
request->execute waitqueue. So the only time that listening to the
request->execute waitqueue would wake up the signaling kthread would be
on the request resubmission, during which time we would already receive
wake ups from rejoining the global breadcrumbs wait rbtree.

Trying to wake up to release the request remains an issue. If the
signaling was cancelled and no other request required signaling, then it
is possible for us to shutdown with the reference on the request still
held. To ensure that we do not try to shutdown, leaking that request, we
kick the signaling threads whenever we disarm the breadcrumbs, i.e. on
parking the engine when idle.

v2: We do need to be sure to release the last reference on stopping the
kthread; asserting that it has been dropped already is insufficient.

Fixes: d6a2289d9d ("drm/i915: Remove the preempted request from the execution queue")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171208121033.5236-1-chris@chris-wilson.co.uk
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2017-12-11 15:31:33 +00:00
Chris Wilson 832265d38c drm/i915: Include engine state on detecting a missed breadcrumb/seqno
Now that we have a common engine state pretty printer, we can use that
instead of the adhoc information printed when we miss a breadcrumb.

v2: Rearrange intel_engine_disarm_breadcrumbs() to avoid calling
intel_engine_dump() under the rb spinlock (Mika) and to pretty-print the
error state early so that we include the full list of waiters.
v3: Pass missed breadcrumb msg to pretty-printer as the header
v4: Preserve DRM_DEBUG_DRIVER filtering.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171208012303.25504-3-chris@chris-wilson.co.uk
2017-12-08 18:48:36 +00:00
Chris Wilson c534612e78 drm/i915: Clear breadcrumb node when cancelling signaling
When we call intel_engine_cancel_signaling() to stop reporting when
a request is completed via an asynchronous signal, we remove that request
from the breadcrumb wait queue. However, we may be concurrently
processing that request in the signaler itself, the actual operations on
the request's node itself are serialised but we do not actually clear the
waiter after removing it from the tree allowing both parties to attempt
to do so and corrupting the rbtree. (Previously removing from the
breadcrumb wait queue could only be done on behalf of i915_wait_request,
so this race could not happen).

Reported-by: "He, Bo" <bo.he@intel.com>
Fixes: 9eb143bbec ("drm/i915: Allow a request to be cancelled")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "He, Bo" <bo.he@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115121458.24655-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-16 14:17:08 +00:00
Chris Wilson c16c4ba725 drm/i915: Move irqs enabled assertion deeper for mock breadcrumbs
In order to allow the mock breadcrumbs tests to run without device irqs
being enabled, move the intel_irqs_enabled() assert deeper to just
before we commit to enabling the HW irq.

v2: Add a FIXME explaining that placing the assertion so deep is not
ideal, but a compromise for mock breadcrumbs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171107102003.1802-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-10 11:52:57 +00:00
Chris Wilson e5330ac1f5 drm/i915: Check that the breadcrumb wasn't disarmed automatically before parking
We will disarm the breadcrumb interrupt if we see a user interrupt
whilst no one is waiting. This may race with the call to
intel_engine_disarm_breadcrumbs() triggering an assert that we aren't
trying to do the same job twice. Prevent this by checking that the irq
is still armed after flushing the interrupt (for the irq spinlock).

Fixes: bcbd5c33a3 ("drm/i915/guc: Always enable the breadcrumbs irq")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171031122235.1395-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-01 13:43:14 +00:00
Chris Wilson bcbd5c33a3 drm/i915/guc: Always enable the breadcrumbs irq
The execlists emulation on top of the GuC (used for scheduling and
preemption) depends on the MI_USER_INTERRUPT for its notifications and
tasklet action. As we always employ the irq, there is no advantage in
ever disabling it while we are using the GuC, so allow us to arm the
breadcrumb irq when enabling GuC submission and disarm upon disabling.
The impact should be lessened by the delayed irq disabling we do (we
only disable after receiving an interrupt for which no one was wanting),
but allowing guc to explicitly manage the irq in relation to itself is
simpler and prevents an issue with losing an interrupt for preemption
as it is not coupled to an active request.

Internally, we add a reference counter (breadcrumbs.irq_enabled) as a
simple mechanism to allow GuC to keep the breadcrumb irq enabled. To
improve upon always enabling the irq while guc is selected, we need
to hook into the parking facility of intel_engines so that we only enable
the breadcrumbs while the GT is active (one step better would be to
individually park/unpark each engine).

In effect, this means that we keep the breadcrumb irq always enabled for
the entire duration the guc is busy, whereas before we would try to
switch it off whenever we idled for more than interrupt with no
associated waiters. The difference *should* be negligible in practice!

v2: Stop abusing fence signaling (and its auxiliary data structures) to
enable the breadcrumbs irqs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>,
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>,
Link: https://patchwork.freedesktop.org/patch/msgid/20171025143943.7661-3-chris@chris-wilson.co.uk
2017-10-25 18:47:30 +01:00
Kees Cook 39cbf2aa41 drm/i915: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171017065304.3358-1-joonas.lahtinen@linux.intel.com
2017-10-18 14:56:10 +03:00
Helge Deller 516726d46d i915: Use %pS printk format for direct addresses
Use the %pS printk format for printing symbols from direct addresses.
This is important for the ia64, ppc64 and parisc64 architectures, while on
other architectures there is no difference between %pS and %pF.
Fix it for consistency across the kernel.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1504729681-3504-6-git-send-email-deller@gmx.de
2017-09-27 14:23:30 +02:00
Chris Wilson 735e0eb669 drm/i915: Skip adding the request to the signal tree is complete
Enabling the interrupt for the signaler takes a finite amount of time (a
few microseconds) during which it is possible for the request to
complete. Check afterwards and skip adding the request to the signal
rbtree if it complete.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608111405.16466-3-chris@chris-wilson.co.uk
2017-06-08 12:33:08 +01:00
Chris Wilson bac2ef4b47 drm/i915: Report back whether the irq was armed when adding the waiter
The important condition that we need to check after enabling the
interrupt for signaling is whether the request completed in the process
(and so we missed that interrupt). A large cost in enabling the
signaling (rather than waiters) is in waking up the auxiliary signaling
thread, but we only need to do so to catch that missed interrupt. If we
know we didn't miss any interrupts (because we didn't arm the interrupt)
then we can skip waking the auxiliary thread.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608111405.16466-2-chris@chris-wilson.co.uk
2017-06-08 12:33:08 +01:00
Chris Wilson f7b02a529a drm/i915: Skip waking the signaler when enabling before request submission
If we are enabling the breadcrumbs signaling prior to submitting the
request, we know that we cannot have missed the interrupt and can
therefore skip immediately waking the signaler to check.

This reduces a significant chunk of the __i915_gem_request_submit()
overhead for inter-engine synchronisation, for example in gem_exec_whisper.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170426080659.28771-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-04-26 11:51:31 +01:00
Chris Wilson 695eaa3b60 drm/i915: Include interesting seqno in the missed breadcrumb debug
Knowing the neighbouring seqno (current on hw, last submitted to hw)
provide some useful breadcrumbs to the debug log.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170423170619.7156-4-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-04-24 15:53:30 +01:00
Chris Wilson a7980a640c drm/i915: Apply a cond_resched() to the saturated signaler
If the engine is continually completing nops, we can saturate the
signaler and keep it working indefinitely. This angers the NMI watchdog!

A good example is to disable semaphores on snb and run igt/gem_exec_nop -
the parallel, multi-engine workloads are more than sufficient to hog the
CPU, preventing the system from even processing ICMP echo replies.

v2: Tvrtko dug into cond_resched() on x86 and found that it only
depended upon preempt_count and not tif_need_resched() - which means
that we would always call schedule() at that point.

Fixes: c81d46138d ("drm/i915: Convert trace-irq to the breadcrumb waiter")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170404120531.10737-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-04-04 13:48:21 +01:00
Chris Wilson b1becb8826 drm/i915: Park the signaler before sleeping
If the signal to park arrives before we sleep, then we need to check
kthread_should_park() before sleeping to avoid missing the signal.
Otherwise, if the signal arrives whilst we are processing completed
requests, we will reset the current->state back to TASK_INTERRUPTIBLE
and so miss the wakeup.

Fixes: fe3288b5da ("drm/i915: Park the breadcrumbs signaler across a GPU reset")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170403105124.8969-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-04-04 11:03:33 +01:00
Chris Wilson 467221bc60 drm/i915: Protect intel_engine_wakeup() for call from irq context
intel_engine_wakeup() is called by nop_request_submit() which is
installed to handle third party fences completed from within irq
context. As such, it needs the full irqsave/irqrestore and not the
partial spin_irq_lock handling.

[18942.714467] =================================
[18942.719076] [ INFO: inconsistent lock state ]
[18942.723522] 4.11.0-rc2-CI-CI_DRM_2368+ #1 Tainted: G     U  W
[18942.729970] ---------------------------------
[18942.734466] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[18942.740594] gem_eio/1275 [HC0[0]:SC0[0]:HE1:SE1] takes:
[18942.745932]  (&(&fence->lock)->rlock){+.?...}, at: [<ffffffff815ec100>] dma_fence_signal+0x100/0x
230
[18942.755331] {IN-SOFTIRQ-W} state was registered at:
[18942.760356]   __lock_acquire+0x5d0/0x1bb0
[18942.764444]   lock_acquire+0xc9/0x220
[18942.768196]   _raw_spin_lock_irqsave+0x41/0x60
[18942.772747]   dma_fence_signal+0x100/0x230
[18942.776927]   vgem_fence_timeout+0x9/0x10 [vgem]
[18942.781701]   call_timer_fn+0x92/0x380
[18942.785557]   expire_timers+0x150/0x1f0
[18942.789491]   run_timer_softirq+0x7c/0x160
[18942.793705]   __do_softirq+0x116/0x4c0
[18942.797560]   irq_exit+0xa9/0xc0
[18942.800873]   smp_apic_timer_interrupt+0x38/0x50
[18942.805611]   apic_timer_interrupt+0x90/0xa0
[18942.810008]   cpuidle_enter_state+0x135/0x380
[18942.814503]   cpuidle_enter+0x12/0x20
[18942.818250]   call_cpuidle+0x1e/0x40
[18942.821906]   do_idle+0x17e/0x1f0
[18942.825333]   cpu_startup_entry+0x18/0x20
[18942.829463]   rest_init+0x127/0x130
[18942.833025]   start_kernel+0x3f1/0x3fe
[18942.836908]   x86_64_start_reservations+0x2a/0x2c
[18942.841733]   x86_64_start_kernel+0x173/0x186
[18942.846234]   verify_cpu+0x0/0xfc
[18942.849604] irq event stamp: 30568
[18942.853140] hardirqs last  enabled at (30567): [<ffffffff8110b81f>] ktime_get+0xef/0x120
[18942.861468] hardirqs last disabled at (30568): [<ffffffff81876377>] _raw_spin_lock_irqsave+0x17/0
x60
[18942.870812] softirqs last  enabled at (30462): [<ffffffff81085cd9>] __do_softirq+0x1d9/0x4c0
[18942.879443] softirqs last disabled at (30439): [<ffffffff81086139>] irq_exit+0xa9/0xc0
[18942.887616]
[18942.887616] other info that might help us debug this:
[18942.894279]  Possible unsafe locking scenario:
[18942.894279]
[18942.900336]        CPU0
[18942.902851]        ----
[18942.905362]   lock(&(&fence->lock)->rlock);
[18942.909647]   <Interrupt>
[18942.912330]     lock(&(&fence->lock)->rlock);
[18942.916821]
[18942.916821]  *** DEADLOCK ***
[18942.916821]
[18942.922862] 1 lock held by gem_eio/1275:
[18942.926859]  #0:  (&(&fence->lock)->rlock){+.?...}, at: [<ffffffff815ec100>] dma_fence_signal+0x1
00/0x230
[18942.936651]
[18942.936651] stack backtrace:
[18942.941142] CPU: 3 PID: 1275 Comm: gem_eio Tainted: G     U  W       4.11.0-rc2-CI-CI_DRM_2368+ #
1
[18942.950367] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F21 01/06/2
017
[18942.959756] Call Trace:
[18942.962244]  dump_stack+0x67/0x92
[18942.965626]  print_usage_bug.part.23+0x259/0x268
[18942.970362]  mark_lock+0x12c/0x6f0
[18942.973851]  ? check_usage_forwards+0x130/0x130
[18942.978487]  mark_held_locks+0x6f/0xa0
[18942.982329]  ? _raw_spin_unlock_irq+0x27/0x50
[18942.986797]  trace_hardirqs_on_caller+0x150/0x200
[18942.991599]  trace_hardirqs_on+0xd/0x10
[18942.995515]  _raw_spin_unlock_irq+0x27/0x50
[18942.999796]  intel_engine_wakeup+0x26/0x30 [i915]
[18943.004670]  intel_engine_init_global_seqno+0x131/0x1a0 [i915]
[18943.010745]  nop_submit_request+0x2e/0x40 [i915]
[18943.015476]  submit_notify+0x3f/0x5c [i915]
[18943.019763]  __i915_sw_fence_complete+0x176/0x220 [i915]
[18943.025234]  ? try_to_del_timer_sync+0x4d/0x60
[18943.029825]  i915_sw_fence_complete+0x25/0x40 [i915]
[18943.034887]  dma_i915_sw_fence_wake+0x26/0x60 [i915]
[18943.039959]  dma_fence_signal+0x146/0x230
[18943.044109]  vgem_fence_signal_ioctl+0x6c/0xc0 [vgem]
[18943.049275]  drm_ioctl+0x200/0x450
[18943.052758]  ? vgem_fence_attach_ioctl+0x270/0x270 [vgem]
[18943.058334]  do_vfs_ioctl+0x90/0x6e0
[18943.061991]  ? entry_SYSCALL_64_fastpath+0x5/0xb1
[18943.066843]  ? __this_cpu_preempt_check+0x13/0x20
[18943.071643]  ? trace_hardirqs_on_caller+0xe7/0x200
[18943.076532]  SyS_ioctl+0x3c/0x70
[18943.079842]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[18943.084558] RIP: 0033:0x7f0dfcc14357
[18943.088240] RSP: 002b:00007ffeb4628da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[18943.095996] RAX: ffffffffffffffda RBX: ffffffff8147eb93 RCX: 00007f0dfcc14357
[18943.103311] RDX: 00007ffeb4628de0 RSI: 0000000040086442 RDI: 0000000000000005
[18943.110574] RBP: ffffc9000176ff88 R08: 0000000000000004 R09: 0000000000000000
[18943.117845] R10: 0000000000000029 R11: 0000000000000246 R12: 0000000000000001
[18943.125168] R13: 0000000000000005 R14: 0000000040086442 R15: 0000000000000000
[18943.132520]  ? __this_cpu_preempt_check+0x13/0x20

Fixes: cdc3a45390 ("drm/i915: No need to save/restore irq status in intel_engine_wakeup")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170320143133.1507-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-03-21 09:20:07 +00:00
Chris Wilson a6b0a14128 drm/i915/breadcrumbs: Tweak commentary
Tvrtko spotted a stale reference to b->lock (now b->rb_lock) so review
the comments and try to improve them in passing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170315222259.1469-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-03-16 08:49:28 +00:00
Chris Wilson 908a6cbf84 drm/i915/breadcrumbs: Assert that we do not shortcut the current bottom-half
We need to ensure that we always serialize updates to the bottom-half
using the breadcrumbs.irq_lock so that we don't race with a concurrent
interrupt handler. This is most important just prior to leaving the
waiter (when the intel_wait will be overwritten), so make sure we are
not the current bottom-half when skipping the irq locks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170315210726.12095-4-chris@chris-wilson.co.uk
2017-03-15 21:45:40 +00:00
Chris Wilson a5cae7b8ed drm/i915/breadcrumbs: Disable interrupt bottom-half first on idling
Before walking the rbtree of waiters (marking them as complete and waking
them), decouple the interrupt handler. This prevents a race between the
missed waiter waking up and removing its intel_wait (which skips
checking the lock) and the interrupt handler dereferencing the
intel_wait. (Though we do not expect to encounter waiters during idle!)

Fixes: e1c0c91bda ("drm/i915: Wake up all waiters before idling")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170315210726.12095-3-chris@chris-wilson.co.uk
2017-03-15 21:45:39 +00:00
Chris Wilson 429732e860 drm/i915/breadcrumbs: Update bottom-half before marking as complete
When adding a new request to the breadcrumb rbtree, we mark all those
requests inside the rbtree that are already completed as complete. This
wakes those waiters up and allows them to skip the spinlock before
returning to userspace. If one of those is the current bottom-half and
allocated its intel_wait on the stack, it may then overwrite the
b->irq_wait upon exiting i915_wait_request() just as the interrupt handler
dereferences it.

Fixes: 56299fb7d9 ("drm/i915: Signal first fence from irq handler if complete")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170315210726.12095-2-chris@chris-wilson.co.uk
2017-03-15 21:45:38 +00:00
Chris Wilson 4bd66391dd drm/i915/breadcrumbs: Use booleans for intel_breadcrumbs_busy()
Since commit 9b6586ae9f ("drm/i915: Keep a global seqno per-engine")
converted intel_breadcrumbs_busy() to reporting a single boolean, we
need only compute a boolean internally (and not needlessly compute the
flag).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170315210726.12095-1-chris@chris-wilson.co.uk
2017-03-15 21:45:38 +00:00
Daniel Vetter 7ffe939dd9 Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Backmerge drm-next to get at all the good stuff in drm-misc. We need
that because:

- drm_connector_list_iter conversion for i915 needs the core patches.
- Maarten's patches to use the new atomic state iterators also need
  the core patches.
- We need the new link status property to complete the DP retraining
  work, merging through 2 branches wasn't a good idea and we had to
  partially backtrack.
- Chris needs reservation_object_trylock and we want to roll out
  kref_read everywhere.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2017-03-08 10:54:45 +01:00
Dave Airlie 2e16101780 Merge tag 'drm-intel-next-2017-03-06' of git://anongit.freedesktop.org/git/drm-intel into drm-next
4 weeks worth of stuff since I was traveling&lazy:

- lspcon improvements (Imre)
- proper atomic state for cdclk handling (Ville)
- gpu reset improvements (Chris)
- lots and lots of polish around fences, requests, waiting and
  everything related all over (both gem and modeset code), from Chris
- atomic by default on gen5+ minus byt/bsw (Maarten did the patch to
  flip the default, really this is a massive joint team effort)
- moar power domains, now 64bit (Ander)
- big pile of in-kernel unit tests for various gem subsystems (Chris),
  including simple mock objects for i915 device and and the ggtt
  manager.
- i915_gpu_info in debugfs, for taking a snapshot of the current gpu
  state. Same thing as i915_error_state, but useful if the kernel didn't
  notice something is stick. From Chris.
- bxt dsi fixes (Umar Shankar)
- bxt w/a updates (Jani)
- no more struct_mutex for gem object unreference (Chris)
- some execlist refactoring (Tvrtko)
- color manager support for glk (Ander)
- improve the power-well sync code to better take over from the
  firmware (Imre)
- gem tracepoint polish (Tvrtko)
- lots of glk fixes all around (Ander)
- ctx switch improvements (Chris)
- glk dsi support&fixes (Deepak M)
- dsi fixes for vlv and clanups, lots of them (Hans de Goede)
- switch to i915.ko types in lots of our internal modeset code (Ander)
- byt/bsw atomic wm update code, yay (Ville)

* tag 'drm-intel-next-2017-03-06' of git://anongit.freedesktop.org/git/drm-intel: (432 commits)
  drm/i915: Update DRIVER_DATE to 20170306
  drm/i915: Don't use enums for hardware engine id
  drm/i915: Split breadcrumbs spinlock into two
  drm/i915: Refactor wakeup of the next breadcrumb waiter
  drm/i915: Take reference for signaling the request from hardirq
  drm/i915: Add FIFO underrun tracepoints
  drm/i915: Add cxsr toggle tracepoint
  drm/i915: Add VLV/CHV watermark/FIFO programming tracepoints
  drm/i915: Add plane update/disable tracepoints
  drm/i915: Kill level 0 wm hack for VLV/CHV
  drm/i915: Workaround VLV/CHV sprite1->sprite0 enable underrun
  drm/i915: Sanitize VLV/CHV watermarks properly
  drm/i915: Only use update_wm_{pre,post} for pre-ilk platforms
  drm/i915: Nuke crtc->wm.cxsr_allowed
  drm/i915: Compute proper intermediate wms for vlv/cvh
  drm/i915: Skip useless watermark/FIFO related work on VLV/CHV when not needed
  drm/i915: Compute vlv/chv wms the atomic way
  drm/i915: Compute VLV/CHV FIFO sizes based on the PM2 watermarks
  drm/i915: Plop vlv/chv fifo sizes into crtc state
  drm/i915: Plop vlv wm state into crtc_state
  ...
2017-03-08 12:41:47 +10:00
Tvrtko Ursulin cdc3a45390 drm/i915: No need to save/restore irq status in intel_engine_wakeup
It is called from either the process or timer context so it is
correct to always disable interrupts.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170306150321.29024-1-tvrtko.ursulin@linux.intel.com
2017-03-07 07:17:59 +00:00
Tvrtko Ursulin a9e64931ee drm/i915: No need to save/restore irq status in intel_breadcrumbs_fake_irq
Timer callback is a known context so it is correct to always
disable interrupts.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-03-07 07:17:55 +00:00
Chris Wilson e1c0c91bda drm/i915: Wake up all waiters before idling
When we idle, we wakeup the first waiter (checking to see if it missed
an earlier wakeup) and disarm the breadcrumbs. However, we now assert
that there are no waiter when the interrupt is disabled, triggering an
assert if there were multiple waiters when we idled.

[  420.842275] invalid opcode: 0000 [#1] PREEMPT SMP
[  420.842285] Modules linked in: vgem snd_hda_codec_realtek x86_pkg_temp_thermal snd_hda_codec_generic intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep mei_me snd_hda_core mei snd_pcm lpc_ich i915 r8169 mii prime_numbers
[  420.842357] CPU: 4 PID: 8714 Comm: kms_pipe_crc_ba Tainted: G     U  W       4.10.0-CI-CI_DRM_2280+ #1
[  420.842377] Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
[  420.842395] task: ffff880117ddce40 task.stack: ffffc90001114000
[  420.842439] RIP: 0010:__intel_engine_remove_wait+0x1f4/0x200 [i915]
[  420.842454] RSP: 0018:ffffc90001117b18 EFLAGS: 00010046
[  420.842467] RAX: 0000000000000000 RBX: ffff88010c25c2a8 RCX: 0000000000000001
[  420.842481] RDX: 0000000000000001 RSI: 00000000ffffffff RDI: ffffc90001117c50
[  420.842495] RBP: ffffc90001117b58 R08: 0000000011e52352 R09: c4d16acc00000000
[  420.842511] R10: ffffffff82789eb0 R11: ffff880117ddce40 R12: ffffc90001117c50
[  420.842525] R13: ffffc90001117c50 R14: 0000000000000078 R15: 0000000000000000
[  420.842540] FS:  00007fe47dda0a40(0000) GS:ffff88011fb00000(0000) knlGS:0000000000000000
[  420.842559] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  420.842571] CR2: 00007fd6c0a2cec4 CR3: 000000010a5e5000 CR4: 00000000001406e0
[  420.842586] Call Trace:
[  420.842595]  ? do_raw_spin_lock+0xad/0xb0
[  420.842635]  intel_engine_remove_wait.part.3+0x26/0x40 [i915]
[  420.842678]  intel_engine_remove_wait+0xe/0x20 [i915]
[  420.842721]  i915_wait_request+0x4f0/0x8c0 [i915]
[  420.842736]  ? wake_up_q+0x70/0x70
[  420.842747]  ? wake_up_q+0x70/0x70
[  420.842787]  i915_gem_object_wait_fence+0x7d/0x1a0 [i915]
[  420.842829]  i915_gem_object_wait+0x30d/0x520 [i915]
[  420.842842]  ? __this_cpu_preempt_check+0x13/0x20
[  420.842884]  i915_gem_wait_ioctl+0x12e/0x2e0 [i915]
[  420.842924]  ? i915_gem_wait_ioctl+0x22/0x2e0 [i915]
[  420.842939]  drm_ioctl+0x200/0x450
[  420.842976]  ? i915_gem_set_wedged+0x90/0x90 [i915]
[  420.842993]  do_vfs_ioctl+0x90/0x6e0
[  420.843003]  ? entry_SYSCALL_64_fastpath+0x5/0xb1
[  420.843017]  ? __this_cpu_preempt_check+0x13/0x20
[  420.843030]  ? trace_hardirqs_on_caller+0xe7/0x200
[  420.843042]  SyS_ioctl+0x3c/0x70
[  420.843054]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  420.843065] RIP: 0033:0x7fe47c4b9357
[  420.843075] RSP: 002b:00007ffc3c0633c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  420.843094] RAX: ffffffffffffffda RBX: ffffffff81482393 RCX: 00007fe47c4b9357
[  420.843109] RDX: 00007ffc3c063400 RSI: 00000000c010646c RDI: 0000000000000004
[  420.843123] RBP: ffffc90001117f88 R08: 0000000000000008 R09: 0000000000000000
[  420.843137] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  420.843151] R13: 0000000000000004 R14: 00000000c010646c R15: 0000000000000000
[  420.843168]  ? __this_cpu_preempt_check+0x13/0x20
[  420.843180] Code: 81 48 c7 c1 40 6a 16 a0 48 c7 c2 47 29 15 a0 be 17 01 00 00 48 c7 c7 10 6a 16 a0 e8 c7 ea fe e0 e9 5d ff ff ff 0f 0b 0f 0b 0f 0b <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 e8 67 41 7e e1
[  420.843325] RIP: __intel_engine_remove_wait+0x1f4/0x200 [i915] RSP: ffffc90001117b18

Fixes: b66255f0f7 ("drm/i915: Refactor wakeup of the next breadcrumb waiter")
Fixes: 67b807a892 ("drm/i915: Delay disabling the user interrupt for breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170306092916.11623-2-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-03-06 13:45:33 +00:00
Chris Wilson 61d3dc7080 drm/i915: Split breadcrumbs spinlock into two
As we now take the breadcrumbs spinlock within the interrupt handler, we
wish to minimise its hold time. During the interrupt we do not care
about the state of the full rbtree, only that of the first element, so
we can guard that with a separate lock.

v2: Rename first_wait to irq_wait to make it clearer that it is guarded
by irq_lock.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170303190824.1330-1-chris@chris-wilson.co.uk
2017-03-03 20:19:13 +00:00
Chris Wilson b66255f0f7 drm/i915: Refactor wakeup of the next breadcrumb waiter
Refactor the common task of updating the first_waiter, serialised with
the interrupt handler. When we update the first_waiter, we also need to
wakeup the new bottom-half in order to complete the actions that we may
have delegated to it (such as checking the irq-seqno coherency or waking
up other lower priority concurrent waiters).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170303171422.4735-1-chris@chris-wilson.co.uk
2017-03-03 18:31:37 +00:00
Chris Wilson 675204153e drm/i915: s/assert_spin_locked/lockdep_assert_held/
assert_spin_locked() becomes an unconditionally compiled BUG_ON(),
adding debug code right into the heart of critical routines like
interrupt handlers.

   text	   data	    bss	    dec	    hex
1296480	  19944	   2272	1318696	 141f28	before (lockdep disabled)
1295984	  19944	   2272	1318200	 141d38	after

1336261	  21139	   3208	1360608	 14c2e0	before (lockdep enabled)
1339920	  21139	   3208	1364267	 14d12b	after

Small saving for release; hopefully more instructive in debug.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170302132801.599-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-03-02 15:18:55 +00:00
Chris Wilson e60a870d7f drm/i915: Assert that fence->lock is held in an irq-safe manner
Everytime we take the fence->lock (aka request->lock), we must do so
with irqs disabled since it may be used from within an hardirq context.
As sometimes we are taking the lock in a nested manner, assert that the
caller did disable the irqs for us.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170302115130.28434-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-03-02 15:18:55 +00:00
Ingo Molnar ae7e81c077 sched/headers: Prepare for new header dependencies before moving code to <uapi/linux/sched/types.h>
We are going to move scheduler ABI details to <uapi/linux/sched/types.h>,
which will be used from a number of .c files.

Create empty placeholder header that maps to <linux/types.h>.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:27 +01:00
Chris Wilson 80166e406e drm/i915: Consolidate reporting of "missed breadcrumbs"
Move the setting of gpu_error->missed_irq_ring bit to a common function
so that we can get the debug logging for either path.

v2: Add %pF caller

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170228085018.3225-1-chris@chris-wilson.co.uk
2017-02-28 10:09:06 +00:00
Chris Wilson 67b807a892 drm/i915: Delay disabling the user interrupt for breadcrumbs
A significant cost in setting up a wait is the overhead of enabling the
interrupt. As we disable the interrupt whenever the queue of waiters is
empty, if we are frequently waiting on alternating batches, we end up
re-enabling the interrupt on a frequent basis. We do want to disable the
interrupt during normal operations as under high load it may add several
thousand interrupts/s - we have been known in the past to occupy whole
cores with our interrupt handler after accidentally leaving user
interrupts enabled. As a compromise, leave the interrupt enabled until
the next IRQ, or the system is idle. This gives a small window for a
waiter to keep the interrupt active and not be delayed by having to
re-enable the interrupt.

v2: Restore hangcheck/missed-irq detection for continuations
v3: Be more careful restoring the hangcheck timer after reset
v4: Be more careful restoring the fake irq after reset (if required!)
v5: Redo changes to intel_engine_wakeup()
v6: Factor out __intel_engine_wakeup()
v7: Improve commentary for declaring a missed wakeup

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-4-chris@chris-wilson.co.uk
2017-02-27 21:57:23 +00:00
Chris Wilson 19d0a57271 drm/i915: Defer enabling hangcheck to the first fake breadcrumb interrupt
By deferring hangcheck to the fake breadcrumb interrupt, we can simply
the enabling procedure slightly - as by enabling the fake, we then
enable the hangcheck. By always enabling the hangcheck from each fake
interrupt (it will be a no-op for an already queued hangcheck), it will
make restoring the breadcrumbs after a reset simpler in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-3-chris@chris-wilson.co.uk
2017-02-27 21:57:22 +00:00
Chris Wilson 56299fb7d9 drm/i915: Signal first fence from irq handler if complete
As execlists and other non-semaphore multi-engine devices coordinate
between engines using interrupts, we can shave off a few 10s of
microsecond of scheduling latency by doing the fence signaling from the
interrupt as opposed to a RT kthread. (Realistically the delay adds
about 1% to an individual cross-engine workload.) We only signal the
first fence in order to limit the amount of work we move into the
interrupt handler. We also have to remember that our breadcrumbs may be
unordered with respect to the interrupt and so we still require the
waiter process to perform some heavyweight coherency fixups, as well as
traversing the tree of waiters.

v2: No need for early exit in irq handler - it breaks the flow between
patches and prevents the tracepoint
v3: Restore rcu hold across irq signaling of request

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-2-chris@chris-wilson.co.uk
2017-02-27 21:57:20 +00:00
Chris Wilson 8d769ea7bc drm/i915: Report both waiters and success from intel_engine_wakeup()
The two users of the return value from intel_engine_wakeup() are
expecting different results. In the breadcrumbs hangcheck, we are using
it to determine whether wake_up_process() detected the waiter was
currently running (and if so we presume that it hasn't yet missed the
interrupt). However, in the fake_irq path, we are using the return value
as a check as to whether there are any waiters, and so we may
incorrectly stop the fake-irq if that waiter was currently running.

To handle the two different needs, return both bits of information! We
uninline it from the irq path in preparation for the next patch which
makes the irq hotpath special and relegates intel_engine_wakeup() to the
slow fixup paths.

v2: s/ret/result/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-1-chris@chris-wilson.co.uk
2017-02-27 21:57:19 +00:00
Chris Wilson d6a2289d9d drm/i915: Remove the preempted request from the execution queue
After the request is cancelled, we then need to remove it from the
global execution timeline and return it to the context timeline, the
inverse of submit_request().

v2: Move manipulation of struct intel_wait to helpers

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-12-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-02-23 14:50:07 +00:00
Chris Wilson 9eb143bbec drm/i915: Allow a request to be cancelled
If we preempt a request and remove it from the execution queue, we need
to undo its global seqno and restart any waiters.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-11-chris@chris-wilson.co.uk
2017-02-23 14:49:35 +00:00
Chris Wilson cced5e2f09 drm/i915: Take a reference whilst processing the signaler request
The plan in the near-future is to allow requests to be removed from the
signaler. We can no longer then rely on holding a reference to the
request for the duration it is in the signaling tree, and instead must
obtain a reference to the request for the current operation using RCU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-10-chris@chris-wilson.co.uk
2017-02-23 14:49:34 +00:00
Chris Wilson 754c9fd576 drm/i915: Protect the request->global_seqno with the engine->timeline lock
A request is assigned a global seqno only when it is on the hardware
execution queue. The global seqno can be used to maintain a list of
requests on the same engine in retirement order, for example for
constructing a priority queue for waiting. Prior to its execution, or
if it is subsequently removed in the event of preemption, its global
seqno is zero. As both insertion and removal from the execution queue
may operate in IRQ context, it is not guarded by the usual struct_mutex
BKL. Instead those relying on the global seqno must be prepared for its
value to change between reads. Only when the request is complete can
the global seqno be stable (due to the memory barriers on submitting
the commands to the hardware to write the breadcrumb, if the HWS shows
that it has passed the global seqno and the global seqno is unchanged
after the read, it is indeed complete).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-9-chris@chris-wilson.co.uk
2017-02-23 14:49:32 +00:00
Chris Wilson 9b6586ae9f drm/i915: Keep a global seqno per-engine
Replace the global device seqno with one for each engine, and account
for in-flight seqno on each separately. This is consistent with
dma-fence as each timeline has separate fence-contexts for each engine
and a seqno is only ordered within a fence-context (i.e.  seqno do not
need to be ordered wrt to other engines, just ordered within a single
engine). This is required to enable request rewinding for preemption on
individual engines (we have to rewind the global seqno to avoid
overflow, and we do not have to rewind all engines just to preempt one.)

v2: Rename active_seqno to inflight_seqnos to more clearly indicate that
it is a counter and not equivalent to the existing seqno. Update
functions that operated on active_seqno similarly.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-3-chris@chris-wilson.co.uk
2017-02-23 14:49:26 +00:00
Chris Wilson 6ef98ea0da drm/i915: Only start with the fake-irq timer if interrupts are dead
As a backup to waiting on a user-interrupt from the GPU, we use a heavy
and frequent timer to wake up the waiting process should we detect an
inconsistency whilst waiting. After seeing a "missed interrupt", the
next time we wait, we restart the heavy timer. This patch is more
reluctant to restart the timer and will only do so if we have not see any
interrupts since when we started the fake irq timer. If we are seeing
interrupts, then the waiters are being woken normally and we had an
incoherency that caused to miss last time - that is unlikely to reoccur
and so taking the risk of stalling again seems pragmatic.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170217151304.16665-5-chris@chris-wilson.co.uk
2017-02-17 15:31:14 +00:00
Chris Wilson 8998567b51 drm/i915: Defer declaration of missed-interrupt until the waiter is asleep
If the waiter was currently running, assume it hasn't had a chance
to process the pending interrupt (e.g, low priority task on a loaded
system) and wait until it sleeps before declaring a missed interrupt.

References: https://bugs.freedesktop.org/show_bug.cgi?id=99816
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170217151304.16665-4-chris@chris-wilson.co.uk
2017-02-17 15:31:14 +00:00
Chris Wilson 2246bea6cf drm/i915: Postpone fake breadcrumb interrupt until real interrupts cease
When the timer expires for checking on interrupt processing, check to
see if any interrupts arrived within the last time period. If real
interrupts are still being delivered, we can be reassured that we
haven't missed the final interrupt as the waiter will still be woken.
Only once all activity ceases, do we have to worry about the waiter
never being woken and so need to install a timer to kick the waiter for
a slow arrival of a seqno.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170217151304.16665-2-chris@chris-wilson.co.uk
2017-02-17 15:30:50 +00:00
Chris Wilson f97fbf9606 drm/i915: Add unit tests for the breadcrumb rbtree, insert/remove
First retroactive test, make sure that the waiters are in global seqno
order after random inserts and removals.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-3-chris@chris-wilson.co.uk
2017-02-13 20:45:24 +00:00
Chris Wilson fe3288b5da drm/i915: Park the breadcrumbs signaler across a GPU reset
The signal threads may be running concurrently with the GPU reset. The
completion from the GPU run asynchronous with the reset and two threads
may see different snapshots of the state, and the signaler may mark a
request as complete as we try to reset it. We don't tolerate 2 different
views of the same state and complain if we try to mark a request as
failed if it is already complete. Disable the signal threads during
reset to prevent this conflict (even though the conflict implies that
the state we resetting to is invalid, we have already made our
decision!).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99733
References: https://bugs.freedesktop.org/show_bug.cgi?id=99671
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170212172002.23072-4-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-02-13 11:18:28 +00:00
Chris Wilson 7c9e934ef8 drm/i915: Emit dma-fence (and execlists submit) first from signaler
When introduced, I thought that reducing client latency from the
signaler was the priority. Since its inception the signaler has become
responsible for keeping the execlists full, via the dma-fence. As this
is very important to minimise overall execution time, signal the
dma-fence first and then signal any waiting clients.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170124110009.28947-8-chris@chris-wilson.co.uk
2017-01-24 16:00:26 +00:00
Chris Wilson 538b257dae drm/i915: Move breadcrumbs irq_posted up a level to engine
In the next patch, we will use the irq_posted technique for another
engine interrupt, rather than use two members for the atomic updates, we
can use two bits of one instead. First, we need to update the
breadcrumbs to use the new common engine->irq_posted.

v2: Use set_bit() rather than __set_bit() to ensure atomicity with
respect to other bits in the mask

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170124151805.26146-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-01-24 15:55:36 +00:00