OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Chris Wilson	5590af3e11	drm/i915: Drive request submission through fence callbacks Drive final request submission from a callback from the fence. This way the request is queued until all dependencies are resolved, at which point it is handed to the backend for queueing to hardware. At this point, no dependencies are set on the request, so the callback is immediate. A side-effect of imposing a heavier-irqsafe spinlock for execlist submission is that we lose the softirq enabling after scheduling the execlists tasklet. To compensate, we manually kickstart the softirq by disabling and enabling the bh around the fence signaling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: John Harrison <john.c.harrison@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-14-chris@chris-wilson.co.uk	2016-09-09 14:23:05 +01:00
Chris Wilson	821ed7df6e	drm/i915: Update reset path to fix incomplete requests Update reset path in preparation for engine reset which requires identification of incomplete requests and associated context and fixing their state so that engine can resume correctly after reset. The request that caused the hang will be skipped and head is reset to the start of breadcrumb. This allows us to resume from where we left-off. Since this request didn't complete normally we also need to cleanup elsp queue manually. This is vital if we employ nonblocking request submission where we may have a web of dependencies upon the hung request and so advancing the seqno manually is no longer trivial. ABI: gem_reset_stats / DRM_IOCTL_I915_GET_RESET_STATS We change the way we count pending batches. Only the active context involved in the reset is marked as either innocent or guilty, and not mark the entire world as pending. By inspection this only affects igt/gem_reset_stats (which assumes implementation details) and not piglit. ARB_robustness gives this guide on how we expect the user of this interface to behave: * Provide a mechanism for an OpenGL application to learn about graphics resets that affect the context. When a graphics reset occurs, the OpenGL context becomes unusable and the application must create a new context to continue operation. Detecting a graphics reset happens through an inexpensive query. And with regards to the actual meaning of the reset values: Certain events can result in a reset of the GL context. Such a reset causes all context state to be lost. Recovery from such events requires recreation of all objects in the affected context. The current status of the graphics reset state is returned by enum GetGraphicsResetStatusARB(); The symbolic constant returned indicates if the GL context has been in a reset state at any point since the last call to GetGraphicsResetStatusARB. NO_ERROR indicates that the GL context has not been in a reset state since the last call. GUILTY_CONTEXT_RESET_ARB indicates that a reset has been detected that is attributable to the current GL context. INNOCENT_CONTEXT_RESET_ARB indicates a reset has been detected that is not attributable to the current GL context. UNKNOWN_CONTEXT_RESET_ARB indicates a detected graphics reset whose cause is unknown. The language here is explicit in that we must mark up the guilty batch, but is loose enough for us to relax the innocent (i.e. pending) accounting as only the active batches are involved with the reset. In the future, we are looking towards single engine resetting (with minimal locking), where it seems inappropriate to mark the entire world as innocent since the reset occurred on a different engine. Reducing the information available means we only have to encounter the pain once, and also reduces the information leaking from one context to another. v2: Legacy ringbuffer submission required a reset following hibernation, or else we restore stale values to the RING_HEAD and walked over stolen garbage. v3: GuC requires replaying the requests after a reset. v4: Restore engine IRQ after reset (so waiters will be woken!) Rearm hangcheck if resetting with a waiter. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-13-chris@chris-wilson.co.uk	2016-09-09 14:23:05 +01:00
Chris Wilson	ea746f3659	drm/i915: Expand bool interruptible to pass flags to i915_wait_request() We need finer control over wakeup behaviour during i915_wait_request(), so expand the current bool interruptible to a bitmask. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-9-chris@chris-wilson.co.uk	2016-09-09 14:23:03 +01:00
Chris Wilson	70c2a24dbf	drm/i915: Simplify ELSP queue request tracking Emulate HW to track and manage ELSP queue. A set of SW ports are defined and requests are assigned to these ports before submitting them to HW. This helps in cleaning up incomplete requests during reset recovery easier especially after engine reset by decoupling elsp queue management. This will become more clear in the next patch. In the engine reset case we want to resume where we left-off after skipping the incomplete batch which requires checking the elsp queue, removing element and fixing elsp_submitted counts in some cases. Instead of directly manipulating the elsp queue from reset path we can examine these ports, fix up ringbuffer pointers using the incomplete request and restart submissions again after reset. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-3-git-send-email-arun.siluvery@linux.intel.com Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-6-chris@chris-wilson.co.uk	2016-09-09 14:23:01 +01:00
Chris Wilson	9d80841ea4	drm/i915: Allow ringbuffers to be bound anywhere Now that we have WC vmapping available, we can bind our rings anywhere in the GGTT and do not need to restrict them to the mappable region. Except for stolen objects, for which direct access is verbatim and we must use the mappable aperture. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-17-chris@chris-wilson.co.uk	2016-08-18 22:36:48 +01:00
Tvrtko Ursulin	318f89ca20	drm/i915: Initialize legacy semaphores from engine hw id indexed array Build the legacy semaphore initialisation array using the engine hardware ids instead of driver internal ones. This makes the static array size dependent only on the number of gen6 semaphore engines. Also makes the per-engine semaphore wait and signal tables hardware id indexed saving some more space. v2: Refactor I915_GEN6_NUM_ENGINES to GEN6_SEMAPHORE_LAST. (Chris Wilson) v3: More polish. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1471363461-9973-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-08-17 11:29:56 +01:00
Tvrtko Ursulin	5ec2cf7e34	drm/i915: Add enum for hardware engine identifiers Put the engine hardware id in the common header so they are not only associated with the GuC since they are needed for the legacy semaphores implementation. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-08-17 11:29:56 +01:00
Chris Wilson	48bb74e48b	drm/i915: Use VMA for wa_ctx tracking Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-25-git-send-email-chris@chris-wilson.co.uk	2016-08-15 11:01:11 +01:00
Chris Wilson	51d545d026	drm/i915: Use VMA as the primary tracker for semaphore page Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-23-git-send-email-chris@chris-wilson.co.uk	2016-08-15 11:01:10 +01:00
Chris Wilson	adc320c4b7	drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c Since the scratch allocation and cleanup is shared by all engine submission backends, move it out of the legacy intel_ringbuffer.c and into the new home for common routines, intel_engine_cs.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-20-git-send-email-chris@chris-wilson.co.uk	2016-08-15 11:01:08 +01:00
Chris Wilson	56c0f1a7c1	drm/i915: Use VMA for scratch page tracking Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-19-git-send-email-chris@chris-wilson.co.uk	2016-08-15 11:01:07 +01:00
Chris Wilson	57e8853181	drm/i915: Use VMA for ringbuffer tracking Use the GGTT VMA as the primary cookie for handing ring objects as the most common action upon the ring is mapping and unmapping which act upon the VMA itself. By restructuring the code to work with the ring VMA, we can shrink the code and remove a few cycles from context pinning. v2: Move the flush of the object back to before the first pin. We use the am-I-bound? query to only have to check the flush on the first bind and so avoid stalling on active rings. Lots of little renames and small hoops. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-18-git-send-email-chris@chris-wilson.co.uk	2016-08-15 11:01:05 +01:00
Chris Wilson	dbd6ef29a7	drm/i915: Use RCU to annotate and enforce protection for breadcrumb's bh The bottom-half we use for processing the breadcrumb interrupt is a task, which is an RCU protected struct. When accessing this struct, we need to be holding the RCU read lock to prevent it disappearing beneath us. We can use the RCU annotation to mark our irq_seqno_bh pointer as being under RCU guard and then use the RCU accessors to both provide correct ordering of access through the pointer. Most notably, this fixes the access from hard irq context to use the RCU read lock, which both Daniel and Tvrtko complained about. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470761272-1245-3-git-send-email-chris@chris-wilson.co.uk	2016-08-10 10:37:49 +01:00
Chris Wilson	83348ba84e	drm/i915: Move missed interrupt detection from hangcheck to breadcrumbs In commit `2529d57050` ("drm/i915: Drop racy markup of missed-irqs from idle-worker") the racy detection of missed interrupts was removed when we went idle. This however opened up the issue that the stuck waiters were not being reported, causing a test case failure. If we move the stuck waiter detection out of hangcheck and into the breadcrumb mechanims (i.e. the waiter) itself, we can avoid this issue entirely. This leaves hangcheck looking for a stuck GPU (inspecting for request advancement and HEAD motion), and breadcrumbs looking for a stuck waiter - hopefully make both easier to understand by their segregation. v2: Reduce the error message as we now run independently of hangcheck, and the hanging batch used by igt also counts as a stuck waiter causing extra warnings in dmesg. v3: Move the breadcrumb's hangcheck kickstart to the first missed wait. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97104 Fixes: `2529d57050` (waiter"drm/i915: Drop racy markup of missed-irqs...") Testcase: igt/drv_missed_irq Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470761272-1245-2-git-send-email-chris@chris-wilson.co.uk	2016-08-10 10:37:35 +01:00
Chris Wilson	1426f7157e	drm/i915: Correct typo for __i915_gem_active_get_rcu in a comment I mistyped and added an extra _request_ to __i915_gem_active_get_rcu() Also, the same happened to another comment for i915_gem_active_get_rcu() Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470758602-1338-1-git-send-email-chris@chris-wilson.co.uk	2016-08-09 17:17:56 +01:00
Chris Wilson	dcff85c844	drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex The principal motivation for this was to try and eliminate the struct_mutex from i915_gem_suspend - but we still need to hold the mutex current for the i915_gem_context_lost(). (The issue there is that there may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and struct_mutex via the stop_machine().) For the moment, enabling last request tracking for the engine, allows us to do busyness checking and waiting without requiring the struct_mutex - which is useful in its own right. As a side-effect of having a robust means for tracking engine busyness, we can replace our other busyness heuristic, that of comparing against the last submitted seqno. For paranoid reasons, we have a semi-ordered check of that seqno inside the hangchecker, which we can now improve to an ordered check of the engine's busyness (removing a locked xchg in the process). v2: Pass along "bool interruptible" as being unlocked we cannot rely on i915->mm.interruptible being stable or even under our control. v3: Replace check Ironlake i915_gpu_busy() with the common precalculated value Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:37 +01:00
Chris Wilson	675d9ad71b	drm/i915: Track requests inside each intel_ring By tracking each request occupying space inside an individual intel_ring, we can greatly simplify the logic of tracking available space and not worry about other timelines. (Each ring is an ordered timeline of committed requests.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-17-git-send-email-chris@chris-wilson.co.uk	2016-08-04 08:09:26 +01:00
Chris Wilson	fa545cbf97	drm/i915: Refactor activity tracking for requests With the introduction of requests, we amplified the number of atomic refcounted objects we use and update every execbuffer; from none to several references, and a set of references that need to be changed. We also introduced interesting side-effects in the order of retiring requests and objects. Instead of independently tracking the last request for an object, track the active objects for each request. The object will reside in the buffer list of its most recent active request and so we reduce the kref interchange to a list_move. Now retirements are entirely driven by the request, dramatically simplifying activity tracking on the object themselves, and removing the ambiguity between retiring objects and retiring requests. Furthermore with the consolidation of managing the activity tracking centrally, we can look forward to using RCU to enable lockless lookup of the current active requests for an object. In the future, we will be able to query the status or wait upon rendering to an object without even touching the struct_mutex BKL. All told, less code, simpler and faster, and more extensible. v2: Add a typedef for the function pointer for convenience later. v3: Make the noop retirement callback explicit. Allow passing NULL to the init_request_active() which is expanded to a common noop function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-16-git-send-email-chris@chris-wilson.co.uk	2016-08-04 08:09:25 +01:00
Chris Wilson	96a945aa42	drm/i915: Move the common engine cleanup to intel_engine_cs.c Now that we initialize the state to both legacy and execlists inside intel_engine_cs, we should also clean up that state from the common functions. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470226756-24401-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-08-03 14:01:12 +01:00
Chris Wilson	ad7bdb2b99	drm/i915: Rename engine->semaphore.sync_to, engine->sempahore.signal locals In order to be more consistent with the rest of the request construction and ring emission, use the common names for the ring and request. Rather than using signaler_req, waiter_req, and intel_ring *wait, we use plain req and ring. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-32-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-23-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:33 +01:00
Chris Wilson	ddf07be7a2	drm/i915: Simplify calling engine->sync_to Since requests can no longer be generated as a side-effect of intel_ring_begin(), we know that the seqno will be unchanged during ring-emission. This predicatablity then means we do not have to check for the seqno wrapping around whilst emitting the semaphore for engine->sync_to(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-31-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-22-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:32 +01:00
Chris Wilson	9242f974dc	drm/i915: Stop passing caller's num_dwords to engine->semaphore.signal() Rather than pass in the num_dwords that the caller wishes to use after the signal command packet, split the breadcrumb emission into two phases and have both the signal and breadcrumb individiually acquire space on the ring. This makes the interface simpler for the reader, and will simplify for patches. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-25-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-16-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:28 +01:00
Chris Wilson	ddd66c5154	drm/i915: Unify request submission Move request submission from emit_request into its own common vfunc from i915_add_request(). v2: Convert I915_DISPATCH_flags to BIT(x) whilst passing v3: Rename a few functions to match. v4: Reenable execlists submission after disabling guc. v5: Be aware that everyone calls i915_guc_submission_disable()! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-23-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-14-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:26 +01:00
Chris Wilson	8f9420184a	drm/i915: Move the modulus for ring emission to the register write Space reservation is already safe with respect to the ring->size modulus, but hardware only expects to see values in the range 0...ring->size-1 (inclusive) and so requires the modulus to prevent us writing the value ring->size instead of 0. As this is only required for the register itself, we can defer the modulus to the register update and not perform it after every command packet. We keep the intel_ring_advance() around in the code to provide demarcation for the end-of-packet (which then can be compared against intel_ring_begin() as the number of dwords emitted must match the reserved space). v2: Assert that the ring size is a power-of-two to match assumptions in the code. Simplify the comment before writing the tail value to explain why the modulus is necessary. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-13-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:25 +01:00
Chris Wilson	c5efa1ad09	drm/i915: Convert engine->write_tail to operate on a request If we rewrite the I915_WRITE_TAIL specialisation for the legacy ringbuffer as submitting the request onto the ringbuffer, we can unify the vfunc with both execlists and GuC in the next patch. v2: Drop the modulus from the I915_WRITE_TAIL as it is currently being applied in intel_ring_advance() after every command packet, and add a comment explaining why we need the modulus at all. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-22-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-12-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:24 +01:00
Chris Wilson	ba76d91bc0	drm/i915: Remove intel_ring_get_tail() Joonas doesn't like the tiny function, especially if I go around making it more complicated and using it elsewhere. To remove that temptation, remove the function! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-21-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-11-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:23 +01:00
Chris Wilson	803688babd	drm/i915: Unify legacy/execlists emission of MI_BATCHBUFFER_START Both the ->dispatch_execbuffer and ->emit_bb_start callbacks do exactly the same thing, add MI_BATCHBUFFER_START to the request's ringbuffer - we need only one vfunc. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-20-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-10-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:21 +01:00
Chris Wilson	7c9cf4e33a	drm/i915: Reduce engine->emit_flush() to a single mode parameter Rather than passing a complete set of GPU cache domains for either invalidation or for flushing, or even both, just pass a single parameter to the engine->emit_flush to determine the required operations. engine->emit_flush(GPU, 0) -> engine->emit_flush(EMIT_INVALIDATE) engine->emit_flush(0, GPU) -> engine->emit_flush(EMIT_FLUSH) engine->emit_flush(GPU, GPU) -> engine->emit_flush(EMIT_FLUSH \| EMIT_INVALIDATE) This allows us to extend the behaviour easily in future, for example if we want just a command barrier without the overhead of flushing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-8-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:19 +01:00
Chris Wilson	c7fe7d25ed	drm/i915: Remove obsolete engine->gpu_caches_dirty Space for flushing the GPU cache prior to completing the request is preallocated and so cannot fail - the GPU caches will always be flushed along with the completed request. This means we no longer have to track whether the GPU cache is dirty between batches like we had to with the outstanding_lazy_seqno. With the removal of the duplication in the per-backend entry points for emitting the obsolete lazy flush, we can then further unify the engine->emit_flush. v2: Expand a bit on the legacy of gpu_caches_dirty Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-18-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-7-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:19 +01:00
Chris Wilson	aad29fbbb8	drm/i915: Rename intel_pin_and_map_ring() For more consistent oop-naming, we would use intel_ring_verb, so pick intel_ring_pin() and intel_ring_unpin(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-17-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-6-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:18 +01:00
Chris Wilson	32c04f16f0	drm/i915: Rename residual ringbuf parameters Now that we have a clear ring/engine split and a struct intel_ring, we no longer need the stopgap ringbuf names. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-16-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-5-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:17 +01:00
Chris Wilson	7e37f889b5	drm/i915: Rename struct intel_ringbuffer to struct intel_ring The state stored in this struct is not only the information about the buffer object, but the ring used to communicate with the hardware. Using buffer here is overly specific and, for me at least, conflates with the notion of buffer objects themselves. s/struct intel_ringbuffer/struct intel_ring/ s/enum intel_ring_hangcheck/enum intel_engine_hangcheck/ s/describe_ctx_ringbuf()/describe_ctx_ring()/ s/intel_ring_get_active_head()/intel_engine_get_active_head()/ s/intel_ring_sync_index()/intel_engine_sync_index()/ s/intel_ring_init_seqno()/intel_engine_init_seqno()/ s/ring_stuck()/engine_stuck()/ s/intel_cleanup_engine()/intel_engine_cleanup()/ s/intel_stop_engine()/intel_engine_stop()/ s/intel_pin_and_map_ringbuffer_obj()/intel_pin_and_map_ring()/ s/intel_unpin_ringbuffer()/intel_unpin_ring()/ s/intel_engine_create_ringbuffer()/intel_engine_create_ring()/ s/intel_ring_flush_all_caches()/intel_engine_flush_all_caches()/ s/intel_ring_invalidate_all_caches()/intel_engine_invalidate_all_caches()/ s/intel_ringbuffer_free()/intel_ring_free()/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-15-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-4-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:16 +01:00
Chris Wilson	b5321f309b	drm/i915: Unify intel_logical_ring_emit and intel_ring_emit Both perform the same actions with more or less indirection, so just unify the code. v2: Add back a few intel_engine_cs locals Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-11-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-1-git-send-email-chris@chris-wilson.co.uk	2016-08-02 22:58:13 +01:00
Chris Wilson	33a051a5fc	drm/i915/cmdparser: Remove stray intel_engine_cs *ring When we refer to intel_engine_cs, we want to use engine so as not to confuse ourselves about ringbuffers. v2: Rename all the functions as well, as well as a few more stray comments. v3: Split the really long error message strings Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-6-git-send-email-chris@chris-wilson.co.uk Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-1-git-send-email-chris@chris-wilson.co.uk	2016-07-27 16:23:05 +01:00
Dave Gordon	38a0f2db5b	drm/i915: rename 'ring' where it refers to an engine or engine_id 'ring' is an old deprecated term for a GPU engine. Chris Wilson wants to use the name for what is currently known as an intel_ringbuffer, but it will be dreadfully confusing if some rings are ringbuffers but other rings are still engines. So this patch changes the names of a bunch of parameters called 'ring' to either 'engine' or 'engine_id' according to what they actually are. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469034967-15840-3-git-send-email-david.s.gordon@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-07-21 09:59:41 +01:00
Dave Gordon	bbdc070a79	drm/i915: rename macro parameter(ring) to (engine) 'ring' is an old deprecated term for a GPU engine. Here we make the terminology more consistent by renaming the 'ring' parameter of lots of macros that calculate addresses within the MMIO space of an engine. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469034967-15840-2-git-send-email-david.s.gordon@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-07-21 09:59:25 +01:00
Chris Wilson	f2f0ed718b	drm/i915: Rename ring->virtual_start as ring->vaddr Just a different colour to better match virtual addresses elsewhere. s/ring->virtual_start/ring->vaddr/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-9-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-8-git-send-email-chris@chris-wilson.co.uk	2016-07-20 13:40:39 +01:00
Chris Wilson	406ea8d22f	drm/i915: Treat ringbuffer writes as write to normal memory Ringbuffers are now being written to either through LLC or WC paths, so treating them as simply iomem is no longer adequate. However, for the older !llc hardware, the hardware is documentated as treating the TAIL register update as serialising, so we can relax the barriers when filling the rings (but even if it were not, it is still an uncached register write and so serialising anyway.). For simplicity, let's ignore the iomem annotation. v2: Remove iomem from ringbuffer->virtual_address v3: And for good measure add iomem elsewhere to keep sparse happy Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v2 Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-8-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-7-git-send-email-chris@chris-wilson.co.uk	2016-07-20 13:40:14 +01:00
Chris Wilson	04769652c8	drm/i915: Derive GEM requests from dma-fence dma-buf provides a generic fence class for interoperation between drivers. Internally we use the request structure as a fence, and so with only a little bit of interfacing we can rebase those requests on top of dma-buf fences. This will allow us, in the future, to pass those fences back to userspace or between drivers. v2: The fence_context needs to be globally unique, not just unique to this device. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1469002875-2335-4-git-send-email-chris@chris-wilson.co.uk	2016-07-20 09:29:53 +01:00
Tvrtko Ursulin	019bf27763	drm/i915: Pull out some more common engine init code Created two common helpers for engine setup and engine init phases respectively to help with code sharing. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1468422221-12132-1-git-send-email-tvrtko.ursulin@linux.intel.com Reviewed-by: Chris Wilson <chris-wilson.co.uk>	2016-07-14 11:17:24 +01:00
Tvrtko Ursulin	88d2ba2e95	drm/i915: Move common engine setup into intel_engine_cs.c Common code deserves to be put in a separate file from legacy and execlists implementation for clarity and ease of maintenance. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris-wilson.co.uk>	2016-07-14 11:17:20 +01:00
Tvrtko Ursulin	8b3e2d3639	drm/i915: Unify engine init loop With the unified common engine setup done, and the execlist engine initialization loop clearly split into two phases, we can eliminate the separate legacy engine initialization code. v2: Fix cleanup path for legacy. v3: Rename constructors. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris-wilson.co.uk>	2016-07-14 11:17:06 +01:00
Dave Gordon	c2c7f24008	drm/i915: unify first-stage engine struct setup intel_lrc.c has a table of "logical rings" (meaning engines), while intel_ringbuffer.c has separately open-coded initialisation for each engine. We can deduplicate this somewhat by using the same first-stage engine-setup function for both modes. So here we expose the function that transfers information from the static table of (all) known engines to the dev_priv->engine array of engines available on this device (adjusting the names along the way) and then embed calls to it in both the LRC and the legacy-mode setup. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Chris Wilson <chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2016-07-14 11:16:18 +01:00
Chris Wilson	aca34b6e1c	drm/i915: Group the irq breadcrumb variables into the same cacheline As we inspect both the tasklet (to check for an active bottom-half) and set the irq-posted flag at the same time (both in the interrupt handler and then in the bottom-halt), group those two together into the same cacheline. (Not having total control over placement of the struct means we can't guarantee the cacheline boundary, we need to align the kmalloc and then each struct, but the grouping should help.) v2: Try a couple of different names for the state touched by the user interrupt handler. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467805142-22219-3-git-send-email-chris@chris-wilson.co.uk	2016-07-06 12:47:39 +01:00
Chris Wilson	7b4d3a16dd	drm/i915: Remove stop-rings debugfs interface Now that we have (near) universal GPU recovery code, we can inject a real hang from userspace and not need any fakery. Not only does this mean that the testing is far more realistic, but we can simplify the kernel in the process. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-7-git-send-email-chris@chris-wilson.co.uk	2016-07-04 08:18:23 +01:00
Chris Wilson	67d97da349	drm/i915: Only start retire worker when idle The retire worker is a low frequency task that makes sure we retire outstanding requests if userspace is being lax. We only need to start it once as it remains active until the GPU is idle, so do a cheap test before the more expensive queue_work(). A consequence of this is that we need correct locking in the worker to make the hot path of request submission cheap. To keep the symmetry and keep hangcheck strictly bound by the GPU's wakelock, we move the cancel_sync(hangcheck) to the idle worker before dropping the wakelock. v2: Guard against RCU fouling the breadcrumbs bottom-half whilst we kick the waiter. v3: Remove the wakeref assertion squelching (now we hold a wakeref for the hangcheck, any rpm error there is genuine). v4: To prevent excess work when retiring requests, we split the busy flag into two, a boolean to denote whether we hold the wakeref and a bitmask of active engines. v5: Reorder cancelling hangcheck upon idling to avoid a race where we might cancel a hangcheck after being preempted by a new task Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> References: https://bugs.freedesktop.org/show_bug.cgi?id=88437 Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-1-git-send-email-chris@chris-wilson.co.uk	2016-07-04 08:18:19 +01:00
Chris Wilson	61ff75ac20	drm/i915: Simplify enabling user-interrupts with L3-remapping Borrow the idea from intel_lrc.c to precompute the mask of interrupts we wish to always enable to avoid having lots of conditionals inside the interrupt enabling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-19-git-send-email-chris@chris-wilson.co.uk	2016-07-01 21:04:17 +01:00
Chris Wilson	31bb59cc01	drm/i915: Move the get/put irq locking into the caller With only a single callsite for intel_engine_cs->irq_get and ->irq_put, we can reduce the code size by moving the common preamble into the caller, and we can also eliminate the reference counting. For completeness, as we are no longer doing reference counting on irq, rename the get/put vfunctions to enable/disable respectively and are able to review the use of posting reads. We only require the serialisation with hardware when enabling the interrupt (i.e. so we cannot miss an interrupt by going to sleep before the hardware truly enables it). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-18-git-send-email-chris@chris-wilson.co.uk	2016-07-01 21:04:16 +01:00
Chris Wilson	b3850855f4	drm/i915: Embed signaling node into the GEM request Under the assumption that enabling signaling will be a frequent operation, lets preallocate our attachments for signaling inside the (rather large) request struct (and so benefiting from the slab cache). v2: Convert from void * to more meaningful names and types. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-17-git-send-email-chris@chris-wilson.co.uk	2016-07-01 21:04:14 +01:00
Chris Wilson	c81d46138d	drm/i915: Convert trace-irq to the breadcrumb waiter If we convert the tracing over from direct use of ring->irq_get() and over to the breadcrumb infrastructure, we only have a single user of the ring->irq_get and so we will be able to simplify the driver routines (eliminating the redundant validation and irq refcounting). Process context is preferred over softirq (or even hardirq) for a couple of reasons: - we already utilize process context to have fast wakeup of a single client (i.e. the client waiting for the GPU inspects the seqno for itself following an interrupt to avoid the overhead of a context switch before it returns to userspace) - engine->irq_seqno() is not suitable for use from an softirq/hardirq context as we may require long waits (100-250us) to ensure the seqno write is posted before we read it from the CPU A signaling framework is a requirement for enabling dma-fences. v2: Move to a signaling framework based upon the waiter. v3: Track the first-signal to avoid having to walk the rbtree everytime. v4: Mark the signaler thread as RT priority to reduce latency in the indirect wakeups. v5: Make failure to allocate the thread fatal. v6: Rename kthreads to i915/signal:%u Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-16-git-send-email-chris@chris-wilson.co.uk	2016-07-01 21:02:34 +01:00

1 2 3 4 5 ...

283 Commits