With the introduction of ctx->engines[] we allow multiple logical
contexts to be used on the same engine (e.g. with virtual engines).
According to bspec, aach logical context requires a unique tag in order
for context-switching to occur correctly between them. [Simple
experiments show that it is not so easy to trick the HW into performing
a lite-restore with matching logical IDs, though my memory from early
Broadwell experiments do suggest that it should be generating
lite-restores.]
We only need to keep a unique tag for the active lifetime of the
context, and for as long as we need to identify that context. The HW
uses the tag to determine if it should use a lite-restore (why not the
LRCA?) and passes the tag back for various status identifies. The only
status we need to track is for OA, so when using perf, we assign the
specific context a unique tag.
v2: Calculate required number of tags to fill ELSP.
Fixes: 976b55f0e1 ("drm/i915: Allow a context to define its set of engines")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111895
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-14-chris@chris-wilson.co.uk
Forgo the struct_mutex serialisation for i915_active, and interpose its
own mutex handling for active/retire.
This is a multi-layered sleight-of-hand. First, we had to ensure that no
active/retire callbacks accidentally inverted the mutex ordering rules,
nor assumed that they were themselves serialised by struct_mutex. More
challenging though, is the rule over updating elements of the active
rbtree. Instead of the whole i915_active now being serialised by
struct_mutex, allocations/rotations of the tree are serialised by the
i915_active.mutex and individual nodes are serialised by the caller
using the i915_timeline.mutex (we need to use nested spinlocks to
interact with the dma_fence callback lists).
The pain point here is that instead of a single mutex around execbuf, we
now have to take a mutex for active tracker (one for each vma, context,
etc) and a couple of spinlocks for each fence update. The improvement in
fine grained locking allowing for multiple concurrent clients
(eventually!) should be worth it in typical loads.
v2: Add some comments that barely elucidate anything :(
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk
It might prove useful in the future to know if the vma is utilising
huge-GTT-pages. Related to this is the GTT cache, where there is some HW
"quirkiness" where it must be disabled if using 2M pages, so include
that for good measure.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909171646.22090-1-matthew.auld@intel.com
The sg_table for our backing store might contain addresses from
stolen-memory or in the future local-memory, at which point this is no
longer a dma-iterator. As a consequence we should now break on NULL
iter.sgp, instead of dmap == 0 which is considered an invalid dma
address.
As a bonus, gcc much prefers this construct,
Function old new delta
gen8_ggtt_insert_entries 211 192 -19
gen6_ggtt_insert_entries 292 262 -30
i915_error_object_create 996 954 -42
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829201919.21493-1-matthew.auld@intel.com
We need the rename of reservation_object to dma_resv.
The solution on this merge came from linux-next:
From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Wed, 14 Aug 2019 12:48:39 +1000
Subject: [PATCH] drm: fix up fallout from "dma-buf: rename reservation_object to dma_resv"
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
drivers/gpu/drm/i915/gt/intel_engine_pool.c | 8 ++++----
3 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.c b/drivers/gpu/drm/i915/gt/intel_engine_pool.c
index 03d90b49584a..4cd54c569911 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.c
@@ -43,12 +43,12 @@ static int pool_active(struct i915_active *ref)
{
struct intel_engine_pool_node *node =
container_of(ref, typeof(*node), active);
- struct reservation_object *resv = node->obj->base.resv;
+ struct dma_resv *resv = node->obj->base.resv;
int err;
- if (reservation_object_trylock(resv)) {
- reservation_object_add_excl_fence(resv, NULL);
- reservation_object_unlock(resv);
+ if (dma_resv_trylock(resv)) {
+ dma_resv_add_excl_fence(resv, NULL);
+ dma_resv_unlock(resv);
}
err = i915_gem_object_pin_pages(node->obj);
which is a simplified version from a previous one which had:
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Looking around the GT initialisation, we have a few log messages we
think are interesting enough present to the user (such as the amount of L4
cache) and a few to inform them of the result of actions or conflicting
HW restrictions (i.e. quirks). These are device specific messages, so
use the dev family of printk.
v2: shave off a few bytes of .rodata!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190815093604.3618-1-chris@chris-wilson.co.uk
We don't care about internal firmware status changes unless
we are doing some real debugging. Note that our CI is not
using DRM_I915_DEBUG_GUC config by default so use it.
v2: protect against accidental overwrites (Chris)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190813081559.23936-1-michal.wajdeczko@intel.com
It used to be handy that we only had a couple of headers, but over time
i915_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.
Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
only where needed, and sort the modified include directives while at it
and as needed.
No functional changes.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f2b887002150acdf218385ea846f7aa617aa5f15.1565271681.git.jani.nikula@intel.com
Skip printing out idle engines that did not contribute to the GPU hang.
As the number of engines gets ever larger, we have increasing noise in
the error state where typically there is only one guilty request on one
engine that we need to inspect.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808144511.32269-1-chris@chris-wilson.co.uk
For forklifted frankenkernels, the reported kernel version has no
bearing on the actual driver version. Include our own driver date that
is updated every time we tag drm-intel-next.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190807113827.28127-1-chris@chris-wilson.co.uk
- Selftests fixes and improvements (Chris)
- More work around engine tracking for better handling (Chris, Tvrtko)
- HDCP debug and info improvements (Ram, Ashuman)
- Add DSI properties (Vandita)
- Rework on sdvo support for better debuggability before fixing bugs (Ville)
- Display PLLs fixes and improvements, specially targeting Ice Lake (Imre, Matt, Ville)
- Perf fixes and improvements (Lionel)
- Enumerate scratch buffers (Lionel)
- Add infra to hold off preemption on a request (Lionel)
- Ice Lake color space fixes (Uma)
- Type-C fixes and improvements (Lucas)
- Fix and improvements around workarounds (Chris, John, Tvrtko)
- GuC related fixes and improvements (Chris, Daniele, Michal, Tvrtko)
- Fix on VLV/CHV display power domain (Ville)
- Improvements around Watermark (Ville)
- Favor intel_ types on intel_atomic functions (Ville)
- Don’t pass stack garbage to pcode (Ville)
- Improve display tracepoints (Steven)
- Don’t overestimate 4:2:0 link symbol clock (Ville)
- Add support for 4th pipe and transcoder (Lucas)
- Introduce initial support for Tiger Lake platform (Daniele, Lucas, Mahesh, Jose, Imre, Mika, Vandita, Rodrigo, Michel)
- PPGTT allocation simplification (Chris)
- Standardize function names and suffixes to make clean, symmetric and let checkpatch happy (Janusz)
- Skip SINK_COUNT read on CH7511 (Ville)
- Fix on kernel documentation (Chris, Michal)
- Add modular FIA (Anusha, Lucas)
- Fix EHL display (Matt, Vivek)
- Enable hotplug retry (Imre, Jose)
- Disable preemption under GVT (Chris)
- OA; Reconfigure context on the fly (Chris)
- Fixes and improvements around engine reset. (Chris)
- Small clean up on display pipe fault mask (Ville)
- Make sure cdclk is high enough for DP audio on VLV/CHV (Ville)
- Drop some wmb() and improve pwrite flush (Chris)
- Fix critical PSR regression (DK)
- Remove unused variables (YueHaibing)
- Use dev_get_drvdata for simplification (Chunhong)
- Use upstream version of header tests (Jani)
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJdQJHXAAoJEPpiX2QO6xPKIpkH/3lMqbuv6UXyX1zvcYj6Ap4g
c6ocA7O1ooQDfFBfnLJNd6D+Gs3uTt9KROL0WdhmolfgzfLihFnvSx1VP/pvi7gC
kVT1JbwbzuwYbBXQ8WhmtkfqDp/quy3wku/ThNchY9pG1IaqNuRiP35+pXRNLO08
Q+RUHl8j1OkoLTLuzxfYGFtY72F8mIlkki8zMwlthH2Skz9h9d8POh8phOv+3TDx
aQ7CsOfScnLSrEyWlnOeYFexps0LpNC7TAG8fGkVI28Jig16DSwg7QR3MhQ9UtB1
8IC3+Jz8+p83PQHx7mGS7Va/XTERVT4czsoNC/IK7cFMy1yFilzoqpFHH8Is3sk=
=dAkP
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-2019-07-30' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
- More changes on simplifying locking mechanisms (Chris)
- Selftests fixes and improvements (Chris)
- More work around engine tracking for better handling (Chris, Tvrtko)
- HDCP debug and info improvements (Ram, Ashuman)
- Add DSI properties (Vandita)
- Rework on sdvo support for better debuggability before fixing bugs (Ville)
- Display PLLs fixes and improvements, specially targeting Ice Lake (Imre, Matt, Ville)
- Perf fixes and improvements (Lionel)
- Enumerate scratch buffers (Lionel)
- Add infra to hold off preemption on a request (Lionel)
- Ice Lake color space fixes (Uma)
- Type-C fixes and improvements (Lucas)
- Fix and improvements around workarounds (Chris, John, Tvrtko)
- GuC related fixes and improvements (Chris, Daniele, Michal, Tvrtko)
- Fix on VLV/CHV display power domain (Ville)
- Improvements around Watermark (Ville)
- Favor intel_ types on intel_atomic functions (Ville)
- Don’t pass stack garbage to pcode (Ville)
- Improve display tracepoints (Steven)
- Don’t overestimate 4:2:0 link symbol clock (Ville)
- Add support for 4th pipe and transcoder (Lucas)
- Introduce initial support for Tiger Lake platform (Daniele, Lucas, Mahesh, Jose, Imre, Mika, Vandita, Rodrigo, Michel)
- PPGTT allocation simplification (Chris)
- Standardize function names and suffixes to make clean, symmetric and let checkpatch happy (Janusz)
- Skip SINK_COUNT read on CH7511 (Ville)
- Fix on kernel documentation (Chris, Michal)
- Add modular FIA (Anusha, Lucas)
- Fix EHL display (Matt, Vivek)
- Enable hotplug retry (Imre, Jose)
- Disable preemption under GVT (Chris)
- OA; Reconfigure context on the fly (Chris)
- Fixes and improvements around engine reset. (Chris)
- Small clean up on display pipe fault mask (Ville)
- Make sure cdclk is high enough for DP audio on VLV/CHV (Ville)
- Drop some wmb() and improve pwrite flush (Chris)
- Fix critical PSR regression (DK)
- Remove unused variables (YueHaibing)
- Use dev_get_drvdata for simplification (Chunhong)
- Use upstream version of header tests (Jani)
drm-intel-next-2019-07-08:
- Signal fence completion from i915_request_wait (Chris)
- Fixes and improvements around rings pin/unpin (Chris)
- Display uncore prep patches (Daniele)
- Execlists preemption improvements (Chris)
- Selftests fixes and improvements (Chris)
- More Elkhartlake enabling work (Vandita, Jose, Matt, Vivek)
- Defer address space cleanup to an RCU worker (Chris)
- Implicit dev_priv removal and GT compartmentalization and other related follow-ups (Tvrtko, Chris)
- Prevent dereference of engine before NULL check in error capture (Chris)
- GuC related fixes (Daniele, Robert)
- Many changes on active tracking, timelines and locking mechanisms (Chris)
- Disable SAMPLER_STATE prefetching on Gen11 (HW W/a) (Kenneth)
- I915_perf fixes (Lionel)
- Add Ice Lake PCI ID (Mika)
- eDP backlight fix (Lee)
- Fix various gen2 tracepoints (Ville)
- Some irq vfunc clean-up and improvements (Ville)
- Move OA files to separated folder (Michal)
- Display self contained headers clean-up (Jani)
- Preparation for 4th pile (Lucas)
- Move atomic commit, watermark and other places to use more intel_crtc_state (Maarten)
- Many Ice Lake Type C and Thunderbolt fixes (Imre)
- Fix some Ice Lake hw w/a whitelist regs (Lionel)
- Fix memleak in runtime wakeref tracking (Mika)
- Remove unused Private PPAT manager (Michal)
- Don't check PPGTT presence on PPGTT-only platforms (Michal)
- Fix ICL DSI suspend/resume (Chris)
- Fix ICL Bandwidth issues (Ville)
- Add N & CTS values for 10/12 bit deep color (Aditya)
- Moving more GT related stuff under gt folder (Chris)
- Forcewake related fixes (Chris)
- Show support for accurate sw PMU busyness tracking (Chris)
- Handle gtt double alloc failures (Chris)
- Upgrade to new GuC version (Michal)
- Improve w/a debug dumps and pull engine w/a initialization into a common (Chris)
- Look for instdone on all engines at hangcheck (Tvrtko)
- Engine lookup simplification (Chris)
- Many plane color formats fixes and improvements (Ville)
- Fix some compilation issues (YueHaibing)
- GTT page directory clean up and improvements (Mika)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801201314.GA23635@intel.com
The fault registers moved to another offset. The old location is now
taken by the global MOCS registers, to be added in a follow up change.
Based on previous patches by Michel Thierry <michel.thierry@intel.com>.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730180407.5993-2-lucas.demarchi@intel.com
We cannot let the request be retired and freed while we are trying to
dump it during error capture. It is not sufficient just to grab a
reference to the request, as during retirement we may free the ring
which we are also dumping. So take the engine lock to prevent retiring
and freeing of the request.
Reported-by: Alex Shumsky <alexthreed@gmail.com>
Fixes: 83c317832e ("drm/i915: Dump the ringbuffer of the active request for debugging")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Alex Shumsky <alexthreed@gmail.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190715080946.15593-6-chris@chris-wilson.co.uk
(cherry picked from commit cfe7288c27)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warnings:
drivers/gpu/drm/i915/gem/i915_gem_mman.c: In function ‘i915_gem_fault’:
drivers/gpu/drm/i915/gem/i915_gem_mman.c:342:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (!i915_terminally_wedged(i915))
^
drivers/gpu/drm/i915/gem/i915_gem_mman.c:345:2: note: here
case -EAGAIN:
^~~~
drivers/gpu/drm/i915/gem/i915_gem_pages.c: In function ‘i915_gem_object_map’:
./include/linux/compiler.h:78:22: warning: this statement may fall through [-Wimplicit-fallthrough=]
# define unlikely(x) __builtin_expect(!!(x), 0)
^~~~~~~~~~~~~~~~~~~~~~~~~~
./include/asm-generic/bug.h:136:2: note: in expansion of macro ‘unlikely’
unlikely(__ret_warn_on); \
^~~~~~~~
drivers/gpu/drm/i915/i915_utils.h:49:25: note: in expansion of macro ‘WARN’
#define MISSING_CASE(x) WARN(1, "Missing case (%s == %ld)\n", \
^~~~
drivers/gpu/drm/i915/gem/i915_gem_pages.c:270:3: note: in expansion of macro ‘MISSING_CASE’
MISSING_CASE(type);
^~~~~~~~~~~~
drivers/gpu/drm/i915/gem/i915_gem_pages.c:272:2: note: here
case I915_MAP_WB:
^~~~
drivers/gpu/drm/i915/i915_gpu_error.c: In function ‘error_record_engine_registers’:
./include/linux/compiler.h:78:22: warning: this statement may fall through [-Wimplicit-fallthrough=]
# define unlikely(x) __builtin_expect(!!(x), 0)
^~~~~~~~~~~~~~~~~~~~~~~~~~
./include/asm-generic/bug.h:136:2: note: in expansion of macro ‘unlikely’
unlikely(__ret_warn_on); \
^~~~~~~~
drivers/gpu/drm/i915/i915_utils.h:49:25: note: in expansion of macro ‘WARN’
#define MISSING_CASE(x) WARN(1, "Missing case (%s == %ld)\n", \
^~~~
drivers/gpu/drm/i915/i915_gpu_error.c:1196:5: note: in expansion of macro ‘MISSING_CASE’
MISSING_CASE(engine->id);
^~~~~~~~~~~~
drivers/gpu/drm/i915/i915_gpu_error.c:1197:4: note: here
case RCS0:
^~~~
drivers/gpu/drm/i915/display/intel_dp.c: In function ‘intel_dp_get_fia_supported_lane_count’:
./include/linux/compiler.h:78:22: warning: this statement may fall through [-Wimplicit-fallthrough=]
# define unlikely(x) __builtin_expect(!!(x), 0)
^~~~~~~~~~~~~~~~~~~~~~~~~~
./include/asm-generic/bug.h:136:2: note: in expansion of macro ‘unlikely’
unlikely(__ret_warn_on); \
^~~~~~~~
drivers/gpu/drm/i915/i915_utils.h:49:25: note: in expansion of macro ‘WARN’
#define MISSING_CASE(x) WARN(1, "Missing case (%s == %ld)\n", \
^~~~
drivers/gpu/drm/i915/display/intel_dp.c:233:3: note: in expansion of macro ‘MISSING_CASE’
MISSING_CASE(lane_info);
^~~~~~~~~~~~
drivers/gpu/drm/i915/display/intel_dp.c:234:2: note: here
case 1:
^~~~
drivers/gpu/drm/i915/display/intel_display.c: In function ‘check_digital_port_conflicts’:
CC [M] drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgv100.o
drivers/gpu/drm/i915/display/intel_display.c:12043:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (WARN_ON(!HAS_DDI(to_i915(dev))))
^
drivers/gpu/drm/i915/display/intel_display.c:12046:3: note: here
case INTEL_OUTPUT_DP:
^~~~
Also, notice that the Makefile is modified to stop ignoring
fall-through warnings. The -Wimplicit-fallthrough option
will be enabled globally in v5.3.
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Currently we use the engine->active.lock to ensure that the request is
not retired as we capture the data. However, we only need to ensure that
the vma are not removed prior to use acquiring their contents, and
since we have already relinquished our stop-machine protection, we
assume that the user will not be overwriting the contents before we are
able to record them.
In order to capture the vma outside of the spinlock, we acquire a
reference and mark the vma as active to prevent it from being unbound.
However, since it is tricky allocate an entry in the fence tree (doing
so would require taking a mutex) while inside the engine spinlock, we
use an atomic bit and special case the handling for i915_active_wait.
The core benefit is that we can use some non-atomic methods for mapping
the device pages, we can remove the slow compression phase out of atomic
context (i.e. stop antagonising the nmi-watchdog), and no we longer need
large reserves of atomic pages.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111215
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190725223843.8971-1-chris@chris-wilson.co.uk
We have several HAS_* checks for GuC and HuC but we mostly use HAS_GUC
and HAS_HUC, with only 1 exception. Since our HW always has either
both uC or neither of them, just replace all the checks with a unified
HAS_UC.
v2: use HAS_GT_UC (Michal)
v3: fix comment (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190725001813.4740-2-daniele.ceraolospurio@intel.com
Trust that we now have adequate protection over the low level structures
via the engine->active.lock to allow ourselves to capture the GPU error
state without the heavy hammer of stop_machine(). Sadly this does mean
that we have to forgo some of the lesser used information (not derived
from the active state) that is not controlled by the active locks. This
includes the list of buffers in the ppGTT and pinned globally in the
GGTT. Originally this was used to manually verify relocations, but
hasn't been required for sometime and modern mesa now has the habit of
ensuring that all interesting buffers within a batch are captured in their
entirety (that are the auxiliary state buffers, but not the textures).
A useful side-effect is that this allows us to restore error capturing
for Braswell and Broxton.
v2: Use pagevec for a typical arbitrary number of preallocated pages
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190722222847.24178-1-chris@chris-wilson.co.uk
We cannot let the request be retired and freed while we are trying to
dump it during error capture. It is not sufficient just to grab a
reference to the request, as during retirement we may free the ring
which we are also dumping. So take the engine lock to prevent retiring
and freeing of the request.
Reported-by: Alex Shumsky <alexthreed@gmail.com>
Fixes: 83c317832e ("drm/i915: Dump the ringbuffer of the active request for debugging")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Alex Shumsky <alexthreed@gmail.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190715080946.15593-6-chris@chris-wilson.co.uk
Being part of the GT HW, it make sense to keep the guc/huc structures
inside the GT structure. To help with the encapsulation work done by the
following patches, both structures are placed inside a new intel_uc
container. Although this results in code with ugly nested dereferences
(i915->gt.uc.guc...), it saves us the extra work required in moving
the structures twice (i915 -> gt -> uc). The following patches will
reduce the number of places where we try to access the guc/huc
structures directly from i915 and reduce the ugliness.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-7-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Scratch vma lives under gt but the API used to work on i915. Make this
consistent by renaming the function to intel_gt_scratch_offset and make
it take struct intel_gt.
v2:
* Move to intel_gt. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-33-tvrtko.ursulin@linux.intel.com
For gt related operations it makes more logical sense to stay in the realm
of gt instead of dereferencing via driver i915.
This patch handles a few of the easy ones with work requiring more
refactoring still outstanding.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-30-tvrtko.ursulin@linux.intel.com
When using a global seqno, we required a precise stop-the-workd event to
handle preemption and unwind the global seqno counter. To accomplish
this, we would preempt to a special out-of-band context and wait for the
machine to report that it was idle. Given an idle machine, we could very
precisely see which requests had completed and which we needed to feed
back into the run queue.
However, now that we have scrapped the global seqno, we no longer need
to precisely unwind the global counter and only track requests by their
per-context seqno. This allows us to loosely unwind inflight requests
while scheduling a preemption, with the enormous caveat that the
requests we put back on the run queue are still _inflight_ (until the
preemption request is complete). This makes request tracking much more
messy, as at any point then we can see a completed request that we
believe is not currently scheduled for execution. We also have to be
careful not to rewind RING_TAIL past RING_HEAD on preempting to the
running context, and for this we use a semaphore to prevent completion
of the request before continuing.
To accomplish this feat, we change how we track requests scheduled to
the HW. Instead of appending our requests onto a single list as we
submit, we track each submission to ELSP as its own block. Then upon
receiving the CS preemption event, we promote the pending block to the
inflight block (discarding what was previously being tracked). As normal
CS completion events arrive, we then remove stale entries from the
inflight tracker.
v2: Be a tinge paranoid and ensure we flush the write into the HWS page
for the GPU semaphore to pick in a timely fashion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-1-chris@chris-wilson.co.uk
Now that we have a new subdirectory for display code, continue by moving
modesetting core code.
display/intel_frontbuffer.h sticks out like a sore thumb, otherwise this
is, again, a surprisingly clean operation.
v2:
- don't move intel_sideband.[ch] (Ville)
- use tabs for Makefile file lists and sort them
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613084416.6794-3-jani.nikula@intel.com
To continue the onslaught of removing the assumption of a global
execution ordering, another casualty is the engine->timeline. Without an
actual timeline to track, it is overkill and we can replace it with a
much less grand plain list. We still need a list of requests inflight,
for the simple purpose of finding inflight requests (for retiring,
resetting, preemption etc).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-3-chris@chris-wilson.co.uk
As the fence registers only apply to regions inside the GGTT is makes
more sense that we track these as part of the i915_ggtt and not the
general mm. In the next patch, we will then pull the register locking
underneath the i915_ggtt.mutex.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613073254.24048-1-chris@chris-wilson.co.uk
Similar to earlier conversions, eliminate the implicit dev_priv by
introducing some helpers which take the engine parameter (since the
register itself is per engine).
v2:
* Always use parentheses in macro arguments.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607101535.767-1-tvrtko.ursulin@linux.intel.com
Currently, the subslice_mask runtime parameter is stored as an
array of subslices per slice. Expand the subslice mask array to
better match what is presented to userspace through the
I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
then calculated:
slice * subslice stride + subslice index / 8
v2: fix spacing in set_sseu_info args
use set_sseu_info to initialize sseu data when building
device status in debugfs
rename variables in intel_engine_types.h to avoid checkpatch
warnings
v3: update headers in intel_sseu.h
v4: add const to some sseu_dev_info variables
use sseu->eu_stride for EU stride calculations
v5: address review comments from Tvrtko and Daniele
v6: remove extra space in intel_sseu_get_subslices
return the correct subslice enable in for_each_instdone
add GEM_BUG_ON to ensure user doesn't pass invalid ss_mask size
use printk formatted string for subslice mask
v7: remove string.h header and rebase
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190524154022.13575-6-stuart.summers@intel.com
Out scatterlist utility routines can be pulled out of i915_gem.h for a
bit more decluttering.
v2: Push I915_GTT_PAGE_SIZE out of i915_scatterlist itself and into the
caller.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-9-chris@chris-wilson.co.uk
It used to be handy that we only had a couple of headers, but over time
intel_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.
Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
only where needed, and sort the modified include directives while at it
and as needed.
No functional changes.
v2: fix sparse warnings on undeclared global functions
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190429125331.32499-1-jani.nikula@intel.com
It used to be handy that we only had a couple of headers, but over time
intel_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.
Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
only where needed, and sort the modified include directives while at it
and as needed.
No functional changes.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/2e4fb1e67ed38870df3040bb0a1b1a58fd90cc86.1556540890.git.jani.nikula@intel.com
This discussion started because we use token pasting in the
GEN{2,3}_IRQ_INIT and GEN{2,3}_IRQ_RESET macros, so gen2-4 passes an
empty argument to those macros, making the code a little weird. The
original proposal was to just add a comment as the empty argument, but
Ville suggested we just add a prefix to the registers, and that indeed
sounds like a more elegant solution.
Now doing this is kinda against our rules for register naming since we
only add gens or platform names as register prefixes when the given
gen/platform changes a register that already existed before. On the
other hand, we have so many instances of IIR/IMR in comments that
adding a prefix would make the users of these register more easily
findable, in addition to make our token pasting macros actually
readable. So IMHO opening an exception here is worth it.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190410235344.31199-4-paulo.r.zanoni@intel.com
The PDP registers are an oddity inside the set of context saved
registers in that they take the engine as a parameter to the macro and
not the mmio_base as the others do. Make it accept the engine->mmio_base
for consistency in programming the context registers.
add/remove: 0/0 grow/shrink: 2/1 up/down: 3/-32 (-29)
Function old new delta
emit_ppgtt_update 324 326 +2
capture 5102 5103 +1
execlists_init_reg_state.isra 1128 1096 -32
And similar savings later!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190405123831.9724-1-chris@chris-wilson.co.uk
We want to use intel_engine_mask_t inside i915_request.h, which means
extracting it from the general header file mess and placing it inside a
types.h. A knock on effect is that the compiler wants to warn about
type-contraction of ALL_ENGINES into intel_engine_maskt_t, so prepare
for the worst.
v2: Use intel_engine_mask_t consistently
v3: Move I915_NUM_ENGINES to its natural home at the end of the enum
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190401162641.10963-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Concept of a sub-platform already exist in our code (like ULX and ULT
platform variants and similar),implemented via the macros which check a
list of device ids to determine a match.
With this patch we consolidate device ids checking into a single function
called during early driver load.
A few low bits in the platform mask are reserved for sub-platform
identification and defined as a per-platform namespace.
At the same time it future proofs the platform_mask handling by preparing
the code for easy extending, and tidies the very verbose WARN strings
generated when IS_PLATFORM macros are embedded into a WARN type
statements.
v2: Fixed IS_SUBPLATFORM. Updated commit msg.
v3: Chris was right, there is an ordering problem.
v4:
* Catch-up with new sub-platforms.
* Rebase for RUNTIME_INFO.
* Drop subplatform mask union tricks and convert platform_mask to an
array for extensibility.
v5:
* Fix subplatform check.
* Protect against forgetting to expand subplatform bits.
* Remove platform enum tallying.
* Add subplatform to error state. (Chris)
* Drop macros and just use static inlines.
* Remove redundant IRONLAKE_M. (Ville)
v6:
* Split out Ironlake change.
* Optimize subplatform check.
* Use __always_inline. (Lucas)
* Add platform_mask comment. (Paulo)
* Pass stored runtime info in error capture. (Chris)
v7:
* Rebased for new AML ULX device id.
* Bump platform mask array size for EHL.
* Stop mentioning device ids in intel_device_subplatform_init by using
the trick of splitting macros i915_pciids.h. (Jani)
* AML seems to be either a subplatform of KBL or CFL so express it like
that.
v8:
* Use one device id table per subplatform. (Jani)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Jose Souza <jose.souza@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190327142328.31780-1-tvrtko.ursulin@linux.intel.com
A few advantages:
- Prepares us for the planned split of display uncore from GT uncore
- Improves our engine-centric view of the world in the engine code
and allows us to avoid jumping back to dev_priv.
- Allows us to wrap accesses to engine register in nice macros that
automatically pick the right mmio base.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-10-daniele.ceraolospurio@intel.com
ffs() is 1-indexed, but we want to use it as an index into an array, so
use __ffs() instead.
Fixes: eb8d0f5af4 ("drm/i915: Remove GPU reset dependence on struct_mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190315163933.19352-1-chris@chris-wilson.co.uk
To find the active request, we need only search along the individual
engine for the right request. This does not require touching any global
GEM state, so move it into the engine compartment.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-3-chris@chris-wilson.co.uk
In the next patch, we are introducing a broad virtual engine to encompass
multiple physical engines, losing the 1:1 nature of BIT(engine->id). To
reflect the broader set of engines implied by the virtual instance, lets
store the full bitmask.
v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring
and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and
VCS[0-4] in later gen. We opt to keep the code consistent and use
0-index naming throughout.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk