Commit Graph

14 Commits

Author SHA1 Message Date
Michal Wajdeczko ffd5ce22fa drm/i915/guc: Updates for GuC 32.0.3 firmware
New GuC 32.0.3 firmware made many changes around its ABI that
require driver updates:

* FW release version numbering schema now includes patch number
* FW release version encoding in CSS header
* Boot parameters
* Suspend/resume protocol
* Sample-forcewake command
* Additional Data Structures (ADS)

This commit is a squash of patches 3-8 from series [1].
[1] https://patchwork.freedesktop.org/series/58760/

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # numbering schema
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # ccs heaser
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # boot params
Acked-by: John Spotswood <john.a.spotswood@intel.com> # suspend/resume
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # sample-forcewake
Acked-by: John Spotswood <john.a.spotswood@intel.com> # sample-forcewake
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # ADS
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190527183613.17076-4-michal.wajdeczko@intel.com
2019-05-28 10:07:02 +01:00
Tvrtko Ursulin c5d3e39caa drm/i915: Engine discovery query
Engine discovery query allows userspace to enumerate engines, probe their
configuration features, all without needing to maintain the internal PCI
ID based database.

A new query for the generic i915 query ioctl is added named
DRM_I915_QUERY_ENGINE_INFO, together with accompanying structure
drm_i915_query_engine_info. The address of latter should be passed to the
kernel in the query.data_ptr field, and should be large enough for the
kernel to fill out all known engines as struct drm_i915_engine_info
elements trailing the query.

As with other queries, setting the item query length to zero allows
userspace to query minimum required buffer size.

Enumerated engines have common type mask which can be used to query all
hardware engines, versus engines userspace can submit to using the execbuf
uAPI.

Engines also have capabilities which are per engine class namespace of
bits describing features not present on all engine instances.

v2:
 * Fixed HEVC assignment.
 * Reorder some fields, rename type to flags, increase width. (Lionel)
 * No need to allocate temporary storage if we do it engine by engine.
   (Lionel)

v3:
 * Describe engine flags and mark mbz fields. (Lionel)
 * HEVC only applies to VCS.

v4:
 * Squash SFC flag into main patch.
 * Tidy some comments.

v5:
 * Add uabi_ prefix to engine capabilities. (Chris Wilson)
 * Report exact size of engine info array. (Chris Wilson)
 * Drop the engine flags. (Joonas Lahtinen)
 * Added some more reserved fields.
 * Move flags after class/instance.

v6:
 * Do not check engine info array was zeroed by userspace but zero the
   unused fields for them instead.

v7:
 * Simplify length calculation loop. (Lionel Landwerlin)

v8:
 * Remove MBZ comments where not applicable.
 * Rename ABI flags to match engine class define naming.
 * Rename SFC ABI flag to reflect it applies to VCS and VECS.
 * SFC is wired to even _logical_ engine instances.
 * SFC applies to VCS and VECS.
 * HEVC is present on all instances on Gen11. (Tony)
 * Simplify length calculation even more. (Chris Wilson)
 * Move info_ptr assigment closer to loop for clarity. (Chris Wilson)
 * Use vdbox_sfc_access from runtime info.
 * Rebase for RUNTIME_INFO.
 * Refactor for lower indentation.
 * Rename uAPI class/instance to engine_class/instance to avoid C++
   keyword.

v9:
 * Rebase for s/num_rings/num_engines/ in RUNTIME_INFO.

v10:
 * Use new copy_query_item.

v11:
 * Consolidate with struct i915_engine_class_instnace.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> # v7
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522090054.6007-1-tvrtko.ursulin@linux.intel.com
2019-05-22 14:17:55 +01:00
Chris Wilson 519a019491 drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD
After realising we need to sample RING_START to detect context switches
from preemption events that do not allow for the seqno to advance, we
can also realise that the seqno itself is just a distance along the ring
and so can be replaced by sampling RING_HEAD.

v2: Bonus comment for the mystery separate CS_STALL before MI_USER_INTERRUPT

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190508080704.24223-1-chris@chris-wilson.co.uk
2019-05-08 15:06:35 +01:00
Chris Wilson dc58958d08 drm/i915: Assert the local engine->wakeref is active
Due to the asynchronous tasklet and recursive GT wakeref, it may happen
that we submit to the engine (underneath it's own wakeref) prior to the
central wakeref being marked as taken. Switch to checking the local wakeref
for greater consistency.

Fixes: 79ffac8599 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190503115225.30831-3-chris@chris-wilson.co.uk
2019-05-07 12:00:10 +01:00
Chris Wilson c34c5bca33 drm/i915/execlists: Flush the tasklet on parking
Tidy up the cleanup sequence by always ensure that the tasklet is
flushed on parking (before we cleanup). The parking provides a
convenient point to ensure that the backend is truly idle.

v2: Do the full check for idleness before parking, to be sure we flush
any residual interrupt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190503080942.30151-1-chris@chris-wilson.co.uk
2019-05-03 11:35:31 +01:00
Chris Wilson 8c334f24e3 drm/i915: Include fence signaled bit in print_request()
Show the fence flags view of request completion in addition to the
normal hwsp check and whether signaling is enabled.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190501114541.10077-2-chris@chris-wilson.co.uk
2019-05-02 16:15:26 +01:00
Chris Wilson 45b9c968c5 drm/i915: Move the engine->destroy() vfunc onto the engine
Make the engine responsible for cleaning itself up!

This removes the i915->gt.cleanup vfunc that has been annoying the
casual reader and myself for the last several years, and helps keep a
future patch to add more cleanup tidy.

v2: Assert that engine->destroy is set after the backend starts
allocating its own state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190501103204.18632-1-chris@chris-wilson.co.uk
2019-05-01 12:13:57 +01:00
Chris Wilson 5e2a0419ef drm/i915: Switch back to an array of logical per-engine HW contexts
We switched to a tree of per-engine HW context to accommodate the
introduction of virtual engines. However, we plan to also support
multiple instances of the same engine within the GEM context, defeating
our use of the engine as a key to looking up the HW context. Just
allocate a logical per-engine instance and always use an index into the
ctx->engines[]. Later on, this ctx->engines[] may be replaced by a user
specified map.

v2: Add for_each_gem_engine() helper to iterator within the engines lock
v3: intel_context_create_request() helper
v4: s/unsigned long/unsigned int/ 4 billion engines is quite enough.
v5: Push iterator locking to caller

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-7-chris@chris-wilson.co.uk
2019-04-26 18:32:11 +01:00
Chris Wilson 11334c6aad drm/i915: Split engine setup/init into two phases
In the next patch, we require the engine vfuncs setup prior to
initialising the pinned kernel contexts, so split the vfunc setup from
the engine initialisation and call it earlier.

v2: s/setup_xcs/setup_common/ for intel_ring_submission_setup()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-6-chris@chris-wilson.co.uk
2019-04-26 18:32:07 +01:00
Chris Wilson fa9f668141 drm/i915: Export intel_context_instance()
We want to pass in a intel_context into intel_context_pin() and that
requires us to first be able to lookup the intel_context!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-2-chris@chris-wilson.co.uk
2019-04-26 18:32:00 +01:00
Chris Wilson 9ce9bdb00d drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)
Broadwater and the rest of gen4  do support being able to saving and
reloading context specific registers between contexts, providing isolation
of the basic GPU state (as programmable by userspace). This allows
userspace to assume that the GPU retains their state from one batch to the
next, minimising the amount of state it needs to reload and manually save
across batches.

v2: CONSTANT_BUFFER woes

Running through piglit turned up an interesting issue, a GPU hang inside
the context load. The context image includes the CONSTANT_BUFFER command
that loads an address into a on-gpu buffer, and the context load was
executing that immediately. However, since it was reading from the GTT
there is no guarantee that the GTT retains the same configuration as
when the context was saved, resulting in stray reads and a GPU hang.

Having tried issuing a CONSTANT_BUFFER (to disable the command) from the
ring before saving the context to no avail, we resort to patching out
the instruction inside the context image before loading.

This does impose that gen4 always reissues CONSTANT_BUFFER commands on
each batch, but due to the use of a shared GTT that was and will remain
a requirement.

v3: ECOSKPD to the rescue

Ville found the magic bit in the ECOSKPD to disable saving and restoring
the CONSTANT_BUFFER from the context image, thereby completely avoiding
the GPU hangs from chasing invalid pointers. This appears to be the
default behaviour for gen5, and so we just need to tweak gen4 to match.

v4: Fix spelling of ECOSKPD and discover it already exists

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419172720.5462-1-chris@chris-wilson.co.uk
2019-04-26 11:39:17 +01:00
Chris Wilson 1215d28e72 drm/i915: Enable render context support for Ironlake (gen5)
Ironlake does support being able to saving and reloading context specific
registers between contexts, providing isolation of the basic GPU state
(as programmable by userspace). This allows userspace to assume that the
GPU retains their state from one batch to the next, minimising the
amount of state it needs to reload, or manually save and restore.

v2: Fix off-by-one in reading CXT_SIZE, and add a comment that the
CXT_SIZE and context-layout do not match in bspec, but the difference is
irrelevant as we overallocate the full page anyway (Ville).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419111749.3910-2-chris@chris-wilson.co.uk
2019-04-26 11:39:17 +01:00
Chris Wilson 79ffac8599 drm/i915: Invert the GEM wakeref hierarchy
In the current scheme, on submitting a request we take a single global
GEM wakeref, which trickles down to wake up all GT power domains. This
is undesirable as we would like to be able to localise our power
management to the available power domains and to remove the global GEM
operations from the heart of the driver. (The intent there is to push
global GEM decisions to the boundary as used by the GEM user interface.)

Now during request construction, each request is responsible via its
logical context to acquire a wakeref on each power domain it intends to
utilize. Currently, each request takes a wakeref on the engine(s) and
the engines themselves take a chipset wakeref. This gives us a
transition on each engine which we can extend if we want to insert more
powermangement control (such as soft rc6). The global GEM operations
that currently require a struct_mutex are reduced to listening to pm
events from the chipset GT wakeref. As we reduce the struct_mutex
requirement, these listeners should evaporate.

Perhaps the biggest immediate change is that this removes the
struct_mutex requirement around GT power management, allowing us greater
flexibility in request construction. Another important knock-on effect,
is that by tracking engine usage, we can insert a switch back to the
kernel context on that engine immediately, avoiding any extra delay or
inserting global synchronisation barriers. This makes tracking when an
engine and its associated contexts are idle much easier -- important for
when we forgo our assumed execution ordering and need idle barriers to
unpin used contexts. In the process, it means we remove a large chunk of
code whose only purpose was to switch back to the kernel context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk
2019-04-24 22:26:49 +01:00
Chris Wilson 112ed2d31a drm/i915: Move GraphicsTechnology files under gt/
Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
2019-04-24 21:01:46 +01:00