Commit Graph

43 Commits

Author SHA1 Message Date
Chris Wilson 9b6586ae9f drm/i915: Keep a global seqno per-engine
Replace the global device seqno with one for each engine, and account
for in-flight seqno on each separately. This is consistent with
dma-fence as each timeline has separate fence-contexts for each engine
and a seqno is only ordered within a fence-context (i.e.  seqno do not
need to be ordered wrt to other engines, just ordered within a single
engine). This is required to enable request rewinding for preemption on
individual engines (we have to rewind the global seqno to avoid
overflow, and we do not have to rewind all engines just to preempt one.)

v2: Rename active_seqno to inflight_seqnos to more clearly indicate that
it is a counter and not equivalent to the existing seqno. Update
functions that operated on active_seqno similarly.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-3-chris@chris-wilson.co.uk
2017-02-23 14:49:26 +00:00
Tvrtko Ursulin 133b4bd74d drm/i915: Move common workaround code to intel_engine_cs
It is used by all submission backends.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-17 11:39:59 +00:00
Tvrtko Ursulin 8ee7c6e23b drm/i915: Simplify cleanup path in intel_engines_init
We can call the engine cleanup vfunc instead of duplicating the
decision making here.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-17 11:39:59 +00:00
Chris Wilson f97fbf9606 drm/i915: Add unit tests for the breadcrumb rbtree, insert/remove
First retroactive test, make sure that the waiters are in global seqno
order after random inserts and removals.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-3-chris@chris-wilson.co.uk
2017-02-13 20:45:24 +00:00
Chris Wilson bb8f0f5abd drm/i915: Split intel_engine allocation and initialisation
In order to reset the GPU early on in the module load sequence, we need
to allocate the basic engine structs (to populate the mmio offsets etc).
Currently, the engine initialisation allocates both the base struct and
also allocate auxiliary objects, which depend upon state setup quite
late in the load sequence. We split off the allocation callback for
later and allow ourselves to allocate the engine structs themselves
early.

v2: Different paint for the unwind following error.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170124110135.6418-1-chris@chris-wilson.co.uk
2017-01-24 12:29:47 +00:00
Chris Wilson a01cb37aff drm/i915: Remove i915_vma_create from VMA API
With the introduce of i915_vma_instance() for obtaining the VMA
singleton for a (obj, vm, view) tuple, we can remove the
i915_vma_create() in favour of a single entry point. We do incur a
lookup onto an empty tree, but the i915_vma_create() were being called
infrequently and during initialisation, so the small overhead is
negligible.

v2: Drop the i915_ prefix from the now static vma_create() function

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-4-chris@chris-wilson.co.uk
2017-01-19 10:17:39 +00:00
Chris Wilson e8a9c58fcd drm/i915: Unify active context tracking between legacy/execlists/guc
The requests conversion introduced a nasty bug where we could generate a
new request in the middle of constructing a request if we needed to idle
the system in order to evict space for a context. The request to idle
would be executed (and waited upon) before the current one, creating a
minor havoc in the seqno accounting, as we will consider the current
request to already be completed (prior to deferred seqno assignment) but
ring->last_retired_head would have been updated and still could allow
us to overwrite the current request before execution.

We also employed two different mechanisms to track the active context
until it was switched out. The legacy method allowed for waiting upon an
active context (it could forcibly evict any vma, including context's),
but the execlists method took a step backwards by pinning the vma for
the entire active lifespan of the context (the only way to evict was to
idle the entire GPU, not individual contexts). However, to circumvent
the tricky issue of locking (i.e. we cannot take struct_mutex at the
time of i915_gem_request_submit(), where we would want to move the
previous context onto the active tracker and unpin it), we take the
execlists approach and keep the contexts pinned until retirement.
The benefit of the execlists approach, more important for execlists than
legacy, was the reduction in work in pinning the context for each
request - as the context was kept pinned until idle, it could short
circuit the pinning for all active contexts.

We introduce new engine vfuncs to pin and unpin the context
respectively. The context is pinned at the start of the request, and
only unpinned when the following request is retired (this ensures that
the context is idle and coherent in main memory before we unpin it). We
move the engine->last_context tracking into the retirement itself
(rather than during request submission) in order to allow the submission
to be reordered or unwound without undue difficultly.

And finally an ulterior motive for unifying context handling was to
prepare for mock requests.

v2: Rename to last_retired_context, split out legacy_context tracking
for MI_SET_CONTEXT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-3-chris@chris-wilson.co.uk
2016-12-18 16:18:50 +00:00
Tvrtko Ursulin bf9e8429ab drm/i915: Make various init functions take dev_priv
Like GEM init, GUC init, MOCS init and context creation.

Enables them to lose dev_priv locals.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-12-01 18:01:15 +00:00
Tvrtko Ursulin 187685cb90 drm/i915: Make GEM object alloc/free and stolen created take dev_priv
Where it is more appropriate and also to be consistent with
the direction of the driver.

v2: Leave out object alloc/free inlining. (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-12-01 18:00:15 +00:00
Chris Wilson 20311bd350 drm/i915/scheduler: Execute requests in order of priorities
Track the priority of each request and use it to determine the order in
which we submit requests to the hardware via execlists.

The priority of the request is determined by the user (eventually via
the context) but may be overridden at any time by the driver. When we set
the priority of the request, we bump the priority of all of its
dependencies to match - so that a high priority drawing operation is not
stuck behind a background task.

When the request is ready to execute (i.e. we have signaled the submit
fence following completion of all its dependencies, including third
party fences), we put the request into a priority sorted rbtree to be
submitted to the hardware. If the request is higher priority than all
pending requests, it will be submitted on the next context-switch
interrupt as soon as the hardware has completed the current request. We
do not currently preempt any current execution to immediately run a very
high priority request, at least not yet.

One more limitation, is that this is first implementation is for
execlists only so currently limited to gen8/gen9.

v2: Replace recursive priority inheritance bumping with an iterative
depth-first search list.
v3: list_next_entry() for walking lists
v4: Explain how the dfs solves the recursion problem with PI.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-8-chris@chris-wilson.co.uk
2016-11-14 21:01:21 +00:00
Chris Wilson 0de9136dbb drm/i915/scheduler: Signal the arrival of a new request
The start of the scheduler, add a hook into request submission for the
scheduler to see the arrival of new requests and prepare its runqueues.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-6-chris@chris-wilson.co.uk
2016-11-14 21:00:26 +00:00
Chris Wilson 663f71e73f drm/i915: Remove engine->execlist_lock
The execlist_lock is now completely subsumed by the engine->timeline->lock,
and so we can remove the redundant layer of locking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-5-chris@chris-wilson.co.uk
2016-11-14 21:00:25 +00:00
Mika Kuoppala 3ac168a70b drm/i915: Move hangcheck code out from i915_irq.c
Create new file for hangcheck specific code, intel_hangcheck.c,
and move all related code in it.

v2: s/intel_engine_hangcheck/intel_engine (Chris)

No functional changes.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478018583-5816-1-git-send-email-mika.kuoppala@intel.com
2016-11-02 11:59:10 +02:00
Chris Wilson 85e17f5974 drm/i915: Move the global sync optimisation to the timeline
Currently we try to reduce the number of synchronisations (now the
number of requests we need to wait upon) by noting that if we have
earlier waited upon a request, all subsequent requests in the timeline
will be after the wait. This only applies to requests in this timeline,
as other timelines will not be ordered by that waiter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-30-chris@chris-wilson.co.uk
2016-10-28 20:53:54 +01:00
Chris Wilson 73cb97010d drm/i915: Combine seqno + tracking into a global timeline struct
Our timelines are more than just a seqno. They also provide an ordered
list of requests to be executed. Due to the restriction of handling
individual address spaces, we are limited to a timeline per address
space but we use a fence context per engine within.

Our first step to introducing independent timelines per context (i.e. to
allow each context to have a queue of requests to execute that have a
defined set of dependencies on other requests) is to provide a timeline
abstraction for the global execution queue.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
2016-10-28 20:53:51 +01:00
Chris Wilson 4e50f082ac drm/i915: Reuse the active golden render state batch
The golden render state is constant, but we recreate the batch setting
it up for every new context. If we keep that batch in a volatile cache
we can safely reuse it whenever we need to initialise a new context. We
mark the pages as purgeable and use the shrinker to recover pages from
the batch whenever we face memory pressues, recreating that batch afresh
on the next new context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtien@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-8-chris@chris-wilson.co.uk
2016-10-28 20:53:44 +01:00
Chris Wilson 920cf41949 drm/i915: Introduce an internal allocator for disposable private objects
Quite a few of our objects used for internal hardware programming do not
benefit from being swappable or from being zero initialised. As such
they do not benefit from using a shmemfs backing storage and since they
are internal and never directly exposed to the user, we do not need to
worry about providing a filp. For these we can use an
drm_i915_gem_object wrapper around a sg_table of plain struct page. They
are not swap backed and not automatically pinned. If they are reaped
by the shrinker, the pages are released and the contents discarded. For
the internal use case, this is fine as for example, ringbuffers are
pinned from being written by a request to be read by the hardware. Once
they are idle, they can be discarded entirely. As such they are a good
match for execlist ringbuffers and a small variety of other internal
objects.

In the first iteration, this is limited to the scratch batch buffers we
use (for command parsing and state initialisation).

v2: Allocate physically contiguous pages, where possible.
v3: Reduce maximum order on subsequent requests following an allocation
failure.
v4: Fix up mismatch between swiotlb segment size and page count (it
counts in 2k units, not 4k pages)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-7-chris@chris-wilson.co.uk
2016-10-28 20:53:44 +01:00
Chris Wilson f54d186700 dma-buf: Rename struct fence to dma_fence
I plan to usurp the short name of struct fence for a core kernel struct,
and so I need to rename the specialised fence/timeline for DMA
operations to make room.

A consensus was reached in
https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html
that making clear this fence applies to DMA operations was a good thing.
Since then the patch has grown a bit as usage increases, so hopefully it
remains a good thing!

(v2...: rebase, rerun spatch)
v3: Compile on msm, spotted a manual fixup that I broke.
v4: Try again for msm, sorry Daniel

coccinelle script:
@@

@@
- struct fence
+ struct dma_fence
@@

@@
- struct fence_ops
+ struct dma_fence_ops
@@

@@
- struct fence_cb
+ struct dma_fence_cb
@@

@@
- struct fence_array
+ struct dma_fence_array
@@

@@
- enum fence_flag_bits
+ enum dma_fence_flag_bits
@@

@@
(
- fence_init
+ dma_fence_init
|
- fence_release
+ dma_fence_release
|
- fence_free
+ dma_fence_free
|
- fence_get
+ dma_fence_get
|
- fence_get_rcu
+ dma_fence_get_rcu
|
- fence_put
+ dma_fence_put
|
- fence_signal
+ dma_fence_signal
|
- fence_signal_locked
+ dma_fence_signal_locked
|
- fence_default_wait
+ dma_fence_default_wait
|
- fence_add_callback
+ dma_fence_add_callback
|
- fence_remove_callback
+ dma_fence_remove_callback
|
- fence_enable_sw_signaling
+ dma_fence_enable_sw_signaling
|
- fence_is_signaled_locked
+ dma_fence_is_signaled_locked
|
- fence_is_signaled
+ dma_fence_is_signaled
|
- fence_is_later
+ dma_fence_is_later
|
- fence_later
+ dma_fence_later
|
- fence_wait_timeout
+ dma_fence_wait_timeout
|
- fence_wait_any_timeout
+ dma_fence_wait_any_timeout
|
- fence_wait
+ dma_fence_wait
|
- fence_context_alloc
+ dma_fence_context_alloc
|
- fence_array_create
+ dma_fence_array_create
|
- to_fence_array
+ to_dma_fence_array
|
- fence_is_array
+ dma_fence_is_array
|
- trace_fence_emit
+ trace_dma_fence_emit
|
- FENCE_TRACE
+ DMA_FENCE_TRACE
|
- FENCE_WARN
+ DMA_FENCE_WARN
|
- FENCE_ERR
+ DMA_FENCE_ERR
)
 (
 ...
 )

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk
2016-10-25 14:40:39 +02:00
Tvrtko Ursulin 70006ad6d8 drm/i915: Do not use INTEL_INFO(dev_priv)->ring_mask inside WARNs
Saves 1520 bytes of .rodata strings.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2016-10-14 12:23:19 +01:00
Akash Goel 3b3f1650b1 drm/i915: Allocate intel_engine_cs structure only for the enabled engines
With the possibility of addition of many more number of rings in future,
the drm_i915_private structure could bloat as an array, of type
intel_engine_cs, is embedded inside it.
	struct intel_engine_cs engine[I915_NUM_ENGINES];
Though this is still fine as generally there is only a single instance of
drm_i915_private structure used, but not all of the possible rings would be
enabled or active on most of the platforms. Some memory can be saved by
allocating intel_engine_cs structure only for the enabled/active engines.
Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
indexed using the enums defined in intel_engine_id.
To save memory and continue using the static engine/ring IDs, 'engine' is
defined as an array of pointers.
	struct intel_engine_cs *engine[I915_NUM_ENGINES];
dev_priv->engine[engine_ID] will be NULL for disabled engine instances.

There is a text size reduction of 928 bytes, from 1028200 to 1027272, for
i915.o file (but for i915.ko file text size remain same as 1193131 bytes).

v2:
- Remove the engine iterator field added in drm_i915_private structure,
  instead pass a local iterator variable to the for_each_engine**
  macros. (Chris)
- Do away with intel_engine_initialized() and instead directly use the
  NULL pointer check on engine pointer. (Chris)

v3:
- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
  can be used in place of it. (Chris)
- Protect the access to Render engine Fault register with a NULL check, as
  engine specific init is done later in Driver load sequence.

v4:
- Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
- Kill the superfluous init_engine_lists().

v5:
- Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
  allocation of intel_engine_cs structure. (Chris)

v6:
- Rebase.

v7:
- Optimize the for_each_engine_masked() macro. (Chris)
- Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
- Rebase.

v8: Rebase.

v9: Rebase.

v10:
- For index calculation use engine ID instead of pointer based arithmetic in
  intel_engine_sync_index() as engine pointers are not contiguous now (Chris)
- For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
- Use for_each_engine macro for cleanup in intel_engines_init() and remove
  check for NULL engine pointer in cleanup() routines. (Joonas)

v11: Rebase.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com
2016-10-14 09:58:43 +01:00
Chris Wilson 0e70447605 drm/i915: Move common code out of i915_gpu_error.c
In the next patch, I want to conditionally compile i915_gpu_error.c and
that requires moving the functions used by debug out of
i915_gpu_error.c!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-1-chris@chris-wilson.co.uk
2016-10-12 12:00:32 +01:00
Chris Wilson ad07dfcddf drm/i915: Reset the breadcrumbs IRQ more carefully
Along with the interrupt, we want to restore the fake-irq and
wait-timeout detection. If we use the breadcrumbs interface to setup the
interrupt as it wants, the auxiliary timers will also be restored.

v2: Cancel both timers as well, sanitize the IMR.

Fixes: 821ed7df6e ("drm/i915: Update reset path to fix incomplete requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161007065327.24515-3-chris@chris-wilson.co.uk
2016-10-07 08:27:23 +01:00
Chris Wilson 1b36595ffb drm/i915: Show RING registers through debugfs
Knowing where the RINGs are pointing is extremely useful in diagnosing
if the engines are executing the ringbuffers you expect - and igt may be
suppressing the usual method of looking in the GPU error state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161004201132.21801-7-chris@chris-wilson.co.uk
2016-10-05 08:40:06 +01:00
Chris Wilson 821ed7df6e drm/i915: Update reset path to fix incomplete requests
Update reset path in preparation for engine reset which requires
identification of incomplete requests and associated context and fixing
their state so that engine can resume correctly after reset.

The request that caused the hang will be skipped and head is reset to the
start of breadcrumb. This allows us to resume from where we left-off.
Since this request didn't complete normally we also need to cleanup elsp
queue manually. This is vital if we employ nonblocking request
submission where we may have a web of dependencies upon the hung request
and so advancing the seqno manually is no longer trivial.

ABI: gem_reset_stats / DRM_IOCTL_I915_GET_RESET_STATS

We change the way we count pending batches. Only the active context
involved in the reset is marked as either innocent or guilty, and not
mark the entire world as pending. By inspection this only affects
igt/gem_reset_stats (which assumes implementation details) and not
piglit.

ARB_robustness gives this guide on how we expect the user of this
interface to behave:

 * Provide a mechanism for an OpenGL application to learn about
   graphics resets that affect the context.  When a graphics reset
   occurs, the OpenGL context becomes unusable and the application
   must create a new context to continue operation. Detecting a
   graphics reset happens through an inexpensive query.

And with regards to the actual meaning of the reset values:

   Certain events can result in a reset of the GL context. Such a reset
   causes all context state to be lost. Recovery from such events
   requires recreation of all objects in the affected context. The
   current status of the graphics reset state is returned by

	enum GetGraphicsResetStatusARB();

   The symbolic constant returned indicates if the GL context has been
   in a reset state at any point since the last call to
   GetGraphicsResetStatusARB. NO_ERROR indicates that the GL context
   has not been in a reset state since the last call.
   GUILTY_CONTEXT_RESET_ARB indicates that a reset has been detected
   that is attributable to the current GL context.
   INNOCENT_CONTEXT_RESET_ARB indicates a reset has been detected that
   is not attributable to the current GL context.
   UNKNOWN_CONTEXT_RESET_ARB indicates a detected graphics reset whose
   cause is unknown.

The language here is explicit in that we must mark up the guilty batch,
but is loose enough for us to relax the innocent (i.e. pending)
accounting as only the active batches are involved with the reset.

In the future, we are looking towards single engine resetting (with
minimal locking), where it seems inappropriate to mark the entire world
as innocent since the reset occurred on a different engine. Reducing the
information available means we only have to encounter the pain once, and
also reduces the information leaking from one context to another.

v2: Legacy ringbuffer submission required a reset following hibernation,
or else we restore stale values to the RING_HEAD and walked over
stolen garbage.

v3: GuC requires replaying the requests after a reset.

v4: Restore engine IRQ after reset (so waiters will be woken!)
    Rearm hangcheck if resetting with a waiter.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-13-chris@chris-wilson.co.uk
2016-09-09 14:23:05 +01:00
Chris Wilson 7756e45407 drm/i915/cmdparser: Make initialisation failure non-fatal
If the developer adds a register in the wrong order, we BUG during boot.
That makes development and testing very difficult. Let's be a bit more
friendly and disable the command parser with a big warning if the tables
are invalid.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-31-chris@chris-wilson.co.uk
2016-08-18 22:36:58 +01:00
Tvrtko Ursulin 5ec2cf7e34 drm/i915: Add enum for hardware engine identifiers
Put the engine hardware id in the common header so they are
not only associated with the GuC since they are needed for
the legacy semaphores implementation.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-17 11:29:56 +01:00
Chris Wilson bde13ebdab drm/i915: Introduce i915_ggtt_offset()
This little helper only exists to safely discard the upper unused 32bits
of the general 64-bit VMA address - as we know that all Global GTT
currently are less than 4GiB in size and so that the upper bits must be
zero. In many places, we use a u32 for the global GTT offset and we want
to document where we are discarding the full VMA offset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-28-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:14 +01:00
Chris Wilson 19880c4a3f drm/i915: Consolidate i915_vma_unpin_and_release()
In a few places, we repeat a call to clear a pointer to a vma whilst
unpinning and releasing a reference to its owner. Refactor those into a
common function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-26-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:12 +01:00
Chris Wilson 51d545d026 drm/i915: Use VMA as the primary tracker for semaphore page
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-23-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:10 +01:00
Chris Wilson 57f275a22b drm/i915: Move common seqno reset to intel_engine_cs.c
Since the intel_engine_init_seqno() is shared by all engine submission
backends, move it out of the legacy intel_ringbuffer.c and
into the new home for common routines, intel_engine_cs.c

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-21-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:08 +01:00
Chris Wilson adc320c4b7 drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c
Since the scratch allocation and cleanup is shared by all engine
submission backends, move it out of the legacy intel_ringbuffer.c and
into the new home for common routines, intel_engine_cs.c

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-20-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:08 +01:00
Tvrtko Ursulin c1bb11451e drm/i915: Store number of active engines in device info
Until now code was calling hweight32 to figure out the
number from device_info->ring_mask at runtime. Instead
we can cache it at engine init time and use directly.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1470842530-35854-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-08-11 11:33:10 +01:00
Chris Wilson 83348ba84e drm/i915: Move missed interrupt detection from hangcheck to breadcrumbs
In commit 2529d57050 ("drm/i915: Drop racy markup of missed-irqs from
idle-worker") the racy detection of missed interrupts was removed when
we went idle. This however opened up the issue that the stuck waiters
were not being reported, causing a test case failure. If we move the
stuck waiter detection out of hangcheck and into the breadcrumb
mechanims (i.e. the waiter) itself, we can avoid this issue entirely.
This leaves hangcheck looking for a stuck GPU (inspecting for request
advancement and HEAD motion), and breadcrumbs looking for a stuck
waiter - hopefully make both easier to understand by their segregation.

v2: Reduce the error message as we now run independently of hangcheck,
and the hanging batch used by igt also counts as a stuck waiter causing
extra warnings in dmesg.
v3: Move the breadcrumb's hangcheck kickstart to the first missed wait.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97104
Fixes: 2529d57050 (waiter"drm/i915: Drop racy markup of missed-irqs...")
Testcase: igt/drv_missed_irq
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470761272-1245-2-git-send-email-chris@chris-wilson.co.uk
2016-08-10 10:37:35 +01:00
Chris Wilson dcff85c844 drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
The principal motivation for this was to try and eliminate the
struct_mutex from i915_gem_suspend - but we still need to hold the mutex
current for the i915_gem_context_lost(). (The issue there is that there
may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and
struct_mutex via the stop_machine().) For the moment, enabling last
request tracking for the engine, allows us to do busyness checking and
waiting without requiring the struct_mutex - which is useful in its own
right.

As a side-effect of having a robust means for tracking engine busyness,
we can replace our other busyness heuristic, that of comparing against
the last submitted seqno. For paranoid reasons, we have a semi-ordered
check of that seqno inside the hangchecker, which we can now improve to
an ordered check of the engine's busyness (removing a locked xchg in the
process).

v2: Pass along "bool interruptible" as being unlocked we cannot rely on
i915->mm.interruptible being stable or even under our control.
v3: Replace check Ironlake i915_gpu_busy() with the common precalculated value

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:37 +01:00
Chris Wilson 115003e9ff drm/i915: Double check the active status on the batch pool
We should not rely on obj->active being uptodate unless we manually
flush it. Instead, we can verify that the next available batch object is
idle by looking at its last active request (and checking it for
completion).

v2: remove the struct drm_device forward declaration added in the
process of removing its necessity

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-3-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:50 +01:00
Chris Wilson fa545cbf97 drm/i915: Refactor activity tracking for requests
With the introduction of requests, we amplified the number of atomic
refcounted objects we use and update every execbuffer; from none to
several references, and a set of references that need to be changed. We
also introduced interesting side-effects in the order of retiring
requests and objects.

Instead of independently tracking the last request for an object, track
the active objects for each request. The object will reside in the
buffer list of its most recent active request and so we reduce the kref
interchange to a list_move. Now retirements are entirely driven by the
request, dramatically simplifying activity tracking on the object
themselves, and removing the ambiguity between retiring objects and
retiring requests.

Furthermore with the consolidation of managing the activity tracking
centrally, we can look forward to using RCU to enable lockless lookup of
the current active requests for an object. In the future, we will be
able to query the status or wait upon rendering to an object without
even touching the struct_mutex BKL.

All told, less code, simpler and faster, and more extensible.

v2: Add a typedef for the function pointer for convenience later.
v3: Make the noop retirement callback explicit. Allow passing NULL to
the init_request_active() which is expanded to a common noop function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-16-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:25 +01:00
Chris Wilson 96a945aa42 drm/i915: Move the common engine cleanup to intel_engine_cs.c
Now that we initialize the state to both legacy and execlists inside
intel_engine_cs, we should also clean up that state from the common
functions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470226756-24401-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-08-03 14:01:12 +01:00
Chris Wilson 7e37f889b5 drm/i915: Rename struct intel_ringbuffer to struct intel_ring
The state stored in this struct is not only the information about the
buffer object, but the ring used to communicate with the hardware. Using
buffer here is overly specific and, for me at least, conflates with the
notion of buffer objects themselves.

s/struct intel_ringbuffer/struct intel_ring/
s/enum intel_ring_hangcheck/enum intel_engine_hangcheck/
s/describe_ctx_ringbuf()/describe_ctx_ring()/
s/intel_ring_get_active_head()/intel_engine_get_active_head()/
s/intel_ring_sync_index()/intel_engine_sync_index()/
s/intel_ring_init_seqno()/intel_engine_init_seqno()/
s/ring_stuck()/engine_stuck()/
s/intel_cleanup_engine()/intel_engine_cleanup()/
s/intel_stop_engine()/intel_engine_stop()/
s/intel_pin_and_map_ringbuffer_obj()/intel_pin_and_map_ring()/
s/intel_unpin_ringbuffer()/intel_unpin_ring()/
s/intel_engine_create_ringbuffer()/intel_engine_create_ring()/
s/intel_ring_flush_all_caches()/intel_engine_flush_all_caches()/
s/intel_ring_invalidate_all_caches()/intel_engine_invalidate_all_caches()/
s/intel_ringbuffer_free()/intel_ring_free()/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-15-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-4-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:16 +01:00
Chris Wilson 6ce213575c drm/i915: Add missing ring_mask to Pineview
It appears that we never told Pineview it has a RENDER_RING. This was
all fine until we started using the ring_mask for determining all the
available rings to initialise for legacy ringbuffer submission in commit
88d2ba2e95 ("drm/i915: Unify engine init loop"). Though really it is a
latent bug since the ring_mask inception in commit 73ae478cdf
("drm/i915: Replace has_bsd/blt/vebox with a mask").

To prevent similar mishaps in future, add a WARN_ON() if we find
ourselves with a device without any rings.

Fixes: 73ae478cdf ("drm/i915: Replace has_bsd/blt/vebox with a mask")
Fixes: 88d2ba2e95 ("drm/i915: Unify engine init loop")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Link: http://patchwork.freedesktop.org/patch/msgid/1469749535-2382-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
2016-07-29 16:22:42 +01:00
Chris Wilson 33a051a5fc drm/i915/cmdparser: Remove stray intel_engine_cs *ring
When we refer to intel_engine_cs, we want to use engine so as not to
confuse ourselves about ringbuffers.

v2: Rename all the functions as well, as well as a few more stray comments.
v3: Split the really long error message strings

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-6-git-send-email-chris@chris-wilson.co.uk
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-1-git-send-email-chris@chris-wilson.co.uk
2016-07-27 16:23:05 +01:00
Chris Wilson 04769652c8 drm/i915: Derive GEM requests from dma-fence
dma-buf provides a generic fence class for interoperation between
drivers. Internally we use the request structure as a fence, and so with
only a little bit of interfacing we can rebase those requests on top of
dma-buf fences. This will allow us, in the future, to pass those fences
back to userspace or between drivers.

v2: The fence_context needs to be globally unique, not just unique to
this device.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1469002875-2335-4-git-send-email-chris@chris-wilson.co.uk
2016-07-20 09:29:53 +01:00
Tvrtko Ursulin 019bf27763 drm/i915: Pull out some more common engine init code
Created two common helpers for engine setup and engine init phases
respectively to help with code sharing.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1468422221-12132-1-git-send-email-tvrtko.ursulin@linux.intel.com
Reviewed-by: Chris Wilson <chris-wilson.co.uk>
2016-07-14 11:17:24 +01:00
Tvrtko Ursulin 88d2ba2e95 drm/i915: Move common engine setup into intel_engine_cs.c
Common code deserves to be put in a separate file from legacy and
execlists implementation for clarity and ease of maintenance.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris-wilson.co.uk>
2016-07-14 11:17:20 +01:00