The trick of using an uncached mmio read to ensure that the GGTT writes
are flushed does not require us to do the forcewake dance, so avoid it
in the hope of reducing the frequency that we do keep the device forced
awake.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170318104257.694-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
When we wedge the device, we override engine->submit_request with a nop
to ensure that all in-flight requests are marked in error. However, igt
would like to unwedge the device to test -EIO handling. This requires us
to flush those in-flight requests and restore the original
engine->submit_request.
v2: Use a vfunc to unify enabling request submission to engines
v3: Split new vfunc to a separate patch.
v4: Make the wait interruptible -- the third party fences we wait upon
may be indefinitely broken, so allow the reset to be aborted.
Fixes: 821ed7df6e ("drm/i915: Update reset path to fix incomplete requests")
Testcase: igt/gem_eio
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v3
Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-3-chris@chris-wilson.co.uk
I915_RESET_IN_PROGRESS is being used for both signaling the requirement
to i915_mutex_lock_interruptible() to avoid taking the struct_mutex and
to instruct a waiter (already holding the struct_mutex) to perform the
reset. To allow for a little more coordination, split these two meaning
into a couple of distinct flags. I915_RESET_BACKOFF tells
i915_mutex_lock_interruptible() not to acquire the mutex and
I915_RESET_HANDOFF tells the waiter to call i915_reset().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-1-chris@chris-wilson.co.uk
Current version of intel_guc_init_hw() does a lot:
- cares about submission
- loads huc
- implement WA
This change offloads some of the logic to intel_uc_init_hw(), which now
cares about the above.
v2: rename guc_hw_reset and fix typo in define name (M. Wajdeczko)
v3: rename once again
v4: remove spurious comments and add some style (J. Lahtinen)
v5: flow changes, got rid of dead checks (M. Wajdeczko)
v6: rebase
v7: rebase & onion teardown (J. Lahtinen)
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
GuC historically has two "startup" functions called _init() and _setup()
Then HuC came with it's _init() and _load().
This commit renames intel_guc_setup() and intel_huc_load() to
*uc_init_hw() as they called from the i915_gem_init_hw().
The aim is to be consistent in that entry points called during
particular driver init phases (e.g. init_hw) are all suffixed by that
phase. When reading the leaf functions, it should be clear at what stage
during the driver load it is called and therefore what operations are
legal at that point.
Also, since the functions start with intel_guc and intel_huc they take
appropiate structure.
v2: commit message update (Chris Wilson)
v3: change taken parameters to be more "semantic" (M. Wajdeczko)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
If the object is coherent, we can simply update the cache domain on the
whole object rather than calculate the before/after clflushes. The
advantage is that we then get correct tracking of ellided flushes when
changing coherency later.
Testcase: igt/gem_pwrite_snooped
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170310000942.11661-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Backmerge drm-next to get at all the good stuff in drm-misc. We need
that because:
- drm_connector_list_iter conversion for i915 needs the core patches.
- Maarten's patches to use the new atomic state iterators also need
the core patches.
- We need the new link status property to complete the DP retraining
work, merging through 2 branches wasn't a good idea and we had to
partially backtrack.
- Chris needs reservation_object_trylock and we want to roll out
kref_read everywhere.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
4 weeks worth of stuff since I was traveling&lazy:
- lspcon improvements (Imre)
- proper atomic state for cdclk handling (Ville)
- gpu reset improvements (Chris)
- lots and lots of polish around fences, requests, waiting and
everything related all over (both gem and modeset code), from Chris
- atomic by default on gen5+ minus byt/bsw (Maarten did the patch to
flip the default, really this is a massive joint team effort)
- moar power domains, now 64bit (Ander)
- big pile of in-kernel unit tests for various gem subsystems (Chris),
including simple mock objects for i915 device and and the ggtt
manager.
- i915_gpu_info in debugfs, for taking a snapshot of the current gpu
state. Same thing as i915_error_state, but useful if the kernel didn't
notice something is stick. From Chris.
- bxt dsi fixes (Umar Shankar)
- bxt w/a updates (Jani)
- no more struct_mutex for gem object unreference (Chris)
- some execlist refactoring (Tvrtko)
- color manager support for glk (Ander)
- improve the power-well sync code to better take over from the
firmware (Imre)
- gem tracepoint polish (Tvrtko)
- lots of glk fixes all around (Ander)
- ctx switch improvements (Chris)
- glk dsi support&fixes (Deepak M)
- dsi fixes for vlv and clanups, lots of them (Hans de Goede)
- switch to i915.ko types in lots of our internal modeset code (Ander)
- byt/bsw atomic wm update code, yay (Ville)
* tag 'drm-intel-next-2017-03-06' of git://anongit.freedesktop.org/git/drm-intel: (432 commits)
drm/i915: Update DRIVER_DATE to 20170306
drm/i915: Don't use enums for hardware engine id
drm/i915: Split breadcrumbs spinlock into two
drm/i915: Refactor wakeup of the next breadcrumb waiter
drm/i915: Take reference for signaling the request from hardirq
drm/i915: Add FIFO underrun tracepoints
drm/i915: Add cxsr toggle tracepoint
drm/i915: Add VLV/CHV watermark/FIFO programming tracepoints
drm/i915: Add plane update/disable tracepoints
drm/i915: Kill level 0 wm hack for VLV/CHV
drm/i915: Workaround VLV/CHV sprite1->sprite0 enable underrun
drm/i915: Sanitize VLV/CHV watermarks properly
drm/i915: Only use update_wm_{pre,post} for pre-ilk platforms
drm/i915: Nuke crtc->wm.cxsr_allowed
drm/i915: Compute proper intermediate wms for vlv/cvh
drm/i915: Skip useless watermark/FIFO related work on VLV/CHV when not needed
drm/i915: Compute vlv/chv wms the atomic way
drm/i915: Compute VLV/CHV FIFO sizes based on the PM2 watermarks
drm/i915: Plop vlv/chv fifo sizes into crtc state
drm/i915: Plop vlv wm state into crtc_state
...
Before we instantiate/pin the backing store for our use, we
can prepopulate the shmemfs filp efficiently using a write into the
pagecache. We avoid the penalty of instantiating all the pages, important
if the user is just writing to a few and never uses the object on the GPU,
and using a direct write into shmemfs allows it to avoid the cost of
retrieving a page (mostly the clear-before-use, but in theory we could
curtail swapin) before it is overwritten.
This can be extended later to provide additional specialisation for
other backends (other than shmemfs). For now it provides a defense
against very large write-only allocations from exhausting all of system
memory.
v2: Smelling fixes.
Fixes: fe115628d5 ("drm/i915: Implement pwrite without struct-mutex")
References: https://bugs.freedesktop.org/show_bug.cgi?id=99107
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.10+
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170307120338.7277-2-chris@chris-wilson.co.uk
Once the object has been truncated, it is unrecoverable. To facilitate
detection of this state store the error in obj->mm.pages.
This is required for the next patch which should be applied to v4.10
(via stable), so we also need to mark this patch for backporting. In
that regard, let's consider this to be a fix/improvement too.
v2: Avoid dereferencing the ERR_PTR when freeing the object.
Fixes: 1233e2db19 ("drm/i915: Move object backing storage manipulation to its own locking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.10+
Link: http://patchwork.freedesktop.org/patch/msgid/20170307132031.32461-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Adding to the tail of the client request list as the only other user is
in the throttle ioctl that iterates forwards over the list. It only
needs protection against deletion of a request as it reads it, it simply
won't see a new request added to the end of the list, or it would be too
early and rejected. We can further reduce the number of spinlocks
required when throttling by removing stale requests from the client_list
as we throttle.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170302122525.19675-1-chris@chris-wilson.co.uk
A significant cost in setting up a wait is the overhead of enabling the
interrupt. As we disable the interrupt whenever the queue of waiters is
empty, if we are frequently waiting on alternating batches, we end up
re-enabling the interrupt on a frequent basis. We do want to disable the
interrupt during normal operations as under high load it may add several
thousand interrupts/s - we have been known in the past to occupy whole
cores with our interrupt handler after accidentally leaving user
interrupts enabled. As a compromise, leave the interrupt enabled until
the next IRQ, or the system is idle. This gives a small window for a
waiter to keep the interrupt active and not be delayed by having to
re-enable the interrupt.
v2: Restore hangcheck/missed-irq detection for continuations
v3: Be more careful restoring the hangcheck timer after reset
v4: Be more careful restoring the fake irq after reset (if required!)
v5: Redo changes to intel_engine_wakeup()
v6: Factor out __intel_engine_wakeup()
v7: Improve commentary for declaring a missed wakeup
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-4-chris@chris-wilson.co.uk
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to
take a vma and vmf parameter when the vma already resides in vmf.
Remove the vma parameter to simplify things.
[arnd@arndb.de: fix ARM build]
Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch makes the I915_PARAM_HAS_EXEC_CONSTANTS getparam return 0
(indicating the optional feature is not supported), and makes execbuf
always return -EINVAL if the flags are used.
Apparently, no userspace ever shipped which used this optional feature:
I checked the git history of Mesa, xf86-video-intel, libva, and Beignet,
and there were zero commits showing a use of these flags. Kernel commit
72bfa19c8d apparently introduced the feature prematurely. According
to Chris, the intention was to use this in cairo-drm, but "the use was
broken for gen6", so I don't think it ever happened.
'relative_constants_mode' has always been tracked per-device, but this
has actually been wrong ever since hardware contexts were introduced, as
the INSTPM register is saved (and automatically restored) as part of the
render ring context. The software per-device value could therefore get
out of sync with the hardware per-context value. This meant that using
them is actually unsafe: a client which tried to use them could damage
the state of other clients, causing the GPU to interpret their BO
offsets as absolute pointers, leading to bogus memory reads.
These flags were also never ported to execlist mode, making them no-ops
on Gen9+ (which requires execlists), and Gen8 in the default mode.
On Gen8+, userspace can write these registers directly, achieving the
same effect. On Gen6-7.5, it likely makes sense to extend the command
parser to support them. I don't think anyone wants this on Gen4-5.
Based on a patch by Dave Gordon.
v3: Return -ENODEV for the getparam, as this is what we do for other
obsolete features. Suggested by Chris Wilson.
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92448
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215093446.21291-1-kenneth@whitecape.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
A request is assigned a global seqno only when it is on the hardware
execution queue. The global seqno can be used to maintain a list of
requests on the same engine in retirement order, for example for
constructing a priority queue for waiting. Prior to its execution, or
if it is subsequently removed in the event of preemption, its global
seqno is zero. As both insertion and removal from the execution queue
may operate in IRQ context, it is not guarded by the usual struct_mutex
BKL. Instead those relying on the global seqno must be prepared for its
value to change between reads. Only when the request is complete can
the global seqno be stable (due to the memory barriers on submitting
the commands to the hardware to write the breadcrumb, if the HWS shows
that it has passed the global seqno and the global seqno is unchanged
after the read, it is indeed complete).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-9-chris@chris-wilson.co.uk
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJYoM2fAAoJEHm+PkMAQRiGr9MH/izEAMri7rJ0QMc3ejt+WmD0
8pkZw3+MVn71z6cIEgpzk4QkEWJd5rfhkETCeCp7qQ9V6cDW1FDE9+0OmPjiphDt
nnzKs7t7skEBwH5Mq5xygmIfkv+Z0QGHZ20gfQWY3F56Uxo+ARF88OBHBLKhqx3v
98C7YbMFLKBslKClA78NUEIdx0UfBaRqerlERx0Lfl9aoOrbBS6WI3iuREiylpih
9o7HTrwaGKkU4Kd6NdgJP2EyWPsd1LGalxBBjeDSpm5uokX6ALTdNXDZqcQscHjE
RmTqJTGRdhSThXOpNnvUJvk9L442yuNRrVme/IqLpxMdHPyjaXR3FGSIDb2SfjY=
=VMy8
-----END PGP SIGNATURE-----
Merge tag 'v4.10-rc8' into drm-next
Linux 4.10-rc8
Backmerge Linus rc8 to fix some conflicts, but also
to avoid pulling it in via a fixes pull from someone.
Setting retire=true is identical to using origin=ORIGIN_CS, so make the
same simplification to intel_fb_obj_flush() as already employed for
intel_fb_obj_invalidate().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-6-chris@chris-wilson.co.uk
Flushing the cachelines for an object is slow, can be as much as 100ms
for a large framebuffer. We currently do this under the struct_mutex BKL
on execution or on pageflip. But now with the ability to add fences to
obj->resv for both flips and execbuf (and we naturally wait on the fence
before CPU access), we can move the clflush operation to a workqueue and
signal a fence for completion, thereby doing the work asynchronously and
not blocking the driver or its clients.
v2: Introduce i915_gem_clflush.h and use a new name, split out some
extras into separate patches.
Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-5-chris@chris-wilson.co.uk
We have three different paths by which userspace wants to flush the
display plane (i.e. objects with obj->pin_display). Use a common helper
to identify those paths and to simplify a later change.
v2: Include the conditional in the name, i915_gem_object_flush_if_display
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-3-chris@chris-wilson.co.uk
The change_domain tracepoint has been inaccurate for a few years - it
doesn't fully capture the domains, especially with userspace bypassing
them. It is defunct, misleading and time to be removed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-1-chris@chris-wilson.co.uk
If we wait upon the full (i.e. all shared fences, or upon an exclusive
fence) reservation object successfully, we know that all fences beneath
it have been signaled, so long as no new fences were added whilst we
slept. If the reservation_object remains the same, as detected by its
seqcount, we can then reap all the fences upon completion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170217151304.16665-6-chris@chris-wilson.co.uk
The physical object is treated as permanently pinned. If we fail to take
this initial pin during i915_gem_object_attach_phys() we need to revert
it back to an ordinary shmemfs object before reporting the failure.
v2: git-add
Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215163900.11606-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
We wait upon jiffies, but report the time elapsed using a
high-resolution timer. This discrepancy can lead to us timing out the
wait prior to us reporting the elapsed time as complete.
This restores the squelching lost in commit e95433c73a ("drm/i915:
Rearrange i915_wait_request() accounting with callers").
Fixes: e95433c73a ("drm/i915: Rearrange i915_wait_request() accounting with callers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/20170216125441.30923-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The i915_gem_object_wait_fence() uses an incoming timeout=0 to query
whether the current fence is busy or idle, without waiting. This can be
used by the wait-ioctl to implement a busy query.
Fixes: e95433c73a ("drm/i915: Rearrange i915_wait_request() accounting with callers")
Testcase: igt/gem_wait/basic-busy-write-all
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/20170212215344.16600-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit d892e9398e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Following a reset, the context and page directory registers are lost.
However, the queue of requests that we resubmit after the reset may
depend upon them - the registers are restored from a context image, but
that restore may be inhibited and may simply be absent from the request
if it was in the middle of a sequence using the same context. If we
prime the CCID/PD registers with the first request in the queue (even
for the hung request), we prevent invalid memory access for the
following requests (and continually hung engines).
v2: Magic BIT(8), reserved for future use but still appears unused.
v3: Some commentary on handling innocent vs guilty requests
v4: Add a wait for PD_BASE fetch. The reload appears to be instant on my
Ivybridge, but this bit probably exists for a reason.
Fixes: 821ed7df6e ("drm/i915: Update reset path to fix incomplete requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170207152437.4252-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
(cherry picked from commit c0dcb203fb)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The i915_gem_object_wait_fence() uses an incoming timeout=0 to query
whether the current fence is busy or idle, without waiting. This can be
used by the wait-ioctl to implement a busy query.
Fixes: e95433c73a ("drm/i915: Rearrange i915_wait_request() accounting with callers")
Testcase: igt/gem_wait/basic-busy-write-all
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/20170212215344.16600-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Write into an object using WB, WC, GTT, and GPU paths and make sure that
our internal API is sufficient to ensure coherent reads and writes.
v2: Avoid invalid free upon allocation error
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-21-chris@chris-wilson.co.uk
The phys object is a rarely used device (only very old machines require
a chunk of physically contiguous pages for a few hardware interactions).
As such, it is not exercised by CI and to combat that we want to add a
test that exercises the phys object on all platforms.
v2: Always set err on error paths and not rely on inheriting the err.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-17-chris@chris-wilson.co.uk
We would like to be able to exercise huge allocations even on memory
constrained devices. To do this we create an object that allocates only
a few pages and remaps them across its whole range - each page is reused
multiple times. We can therefore pretend we are rendering into a much
larger object.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-9-chris@chris-wilson.co.uk
Start exercising the scattergather lists, especially looking at
iteration after coalescing.
v2: Comment on the peculiarity of table construction (i.e. why this
sg_table might be interesting).
v3: Added one __func__ to identify expect_pfn_sg()
v4: Loop until we have crossed the chain boundary (forcing sg_table to
do multiple allocations) before squelching a potential ENOMEM from oom.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-2-chris@chris-wilson.co.uk
Following a hang and reset, we know that the engine is idle and all
context state has been saved or lost. Consequently, we know that the
engine is no longer referencing the last context and we can relinquish
our tracking.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170212172002.23072-5-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
The signal threads may be running concurrently with the GPU reset. The
completion from the GPU run asynchronous with the reset and two threads
may see different snapshots of the state, and the signaler may mark a
request as complete as we try to reset it. We don't tolerate 2 different
views of the same state and complain if we try to mark a request as
failed if it is already complete. Disable the signal threads during
reset to prevent this conflict (even though the conflict implies that
the state we resetting to is invalid, we have already made our
decision!).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99733
References: https://bugs.freedesktop.org/show_bug.cgi?id=99671
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170212172002.23072-4-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Disabling the tasklet leaves it if scheduled on the ready to run list
until it is re-enabled. This will leave the ksoftird thread spinning
until satisfied. To prevent this situation on starting the GPU reset, we
want to kill the tasklet first and then disable. The same problem will
arise when a tasklet is scheduled from another device, so a better
solution is required for the general case.
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 1f7b847d72 ("drm/i915: Disable engine->irq_tasklet around resets")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170212172002.23072-3-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
When updating the bookkeeping following the reset, we need the seqno to
be coherent on the CPU prior to trusting its result for deciding whether
any request is completed. We need the irq_barrier before we start making
these decisions, i.e. in reset_prepare.
References: https://bugs.freedesktop.org/show_bug.cgi?id=99733
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170210185214.23463-1-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
As dmabufs may live beyond the PCI device removal, we need to flush the
freed object worker on device release, and include a warning in case
there is a leak.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170210163523.17533-3-chris@chris-wilson.co.uk
Chris Wilson needs the new drm_driver->release callback to make sure
the shiny new dma-buf testcases don't oops the driver on unload.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
When we restart the engines, and we have active requests, a request on
the first engine may complete and queue a request to the second engine
before we try to restart the second engine. That queueing of the
request may race with the engine to restart, and so may corrupt the
current state. Disabling the engine->irq_tasklet prevents the two paths
from writing into ELSP simultaneously (and modifyin the execlists_port[]
at the same time).
Fixes: 821ed7df6e ("drm/i915: Update reset path to fix incomplete requests")
Testcase: igt/gem_exec_fence/await-hang
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170208143033.11651-3-chris@chris-wilson.co.uk
Currently we do a reset prepare/finish around the call to reset the GPU,
but it looks like we need a later stage after the hw has been
reinitialised to allow GEM to restart itself. Start by splitting the 2
GEM phases into 3:
prepare - before the reset, check if GEM recovered, then stop GEM
reset - after the reset, update GEM bookkeeping
finish - after the re-initialisation following the reset, restart GEM
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170208143033.11651-2-chris@chris-wilson.co.uk
Just a simple refactor to move a loop and some support code out of
i915_gem_init_hw(). This is in preparation for avoiding a race between
the tasklet writing to ELSP whilst simultaneously being written by
engine->init_hw() following a GPU reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170208143033.11651-1-chris@chris-wilson.co.uk
Since to unbind an object, we may need a powered up device to access the
GTT entries, we only shrink bound objects if awake. Callers to
i915_gem_shrink_all() had to take this into account and take the rpm
wakeref, but we can move this wakeref into the shrink_all itself for
convenience and making the function live up to its name.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170208104710.18089-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
The goal of the WARN was to catch when we are still actively using the
fence as we go into the runtime suspend. However, the reg->pin_count is
too coarse as it does not distinguish between exclusive ownership of the
fence register from activity.
I've not improved on the WARN, nor have we captured this WARN in an
exact igt, but it is showing up regularly in the wild:
[ 1915.935332] WARNING: CPU: 1 PID: 10861 at drivers/gpu/drm/i915/i915_gem.c:2022 i915_gem_runtime_suspend+0x116/0x130 [i915]
[ 1915.935383] WARN_ON(reg->pin_count)[ 1915.935399] Modules linked in:
snd_hda_intel i915 drm_kms_helper vgem netconsole scsi_transport_iscsi fuse vfat fat x86_pkg_temp_thermal coretemp intel_cstate intel_uncore snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mei_me mei serio_raw intel_rapl_perf intel_pch_thermal soundcore wmi acpi_pad i2c_algo_bit syscopyarea sysfillrect sysimgblt fb_sys_fops drm r8169 mii video [last unloaded: drm_kms_helper]
[ 1915.935785] CPU: 1 PID: 10861 Comm: kworker/1:0 Tainted: G U W 4.9.0-rc5+ #170
[ 1915.935799] Hardware name: LENOVO 80MX/Lenovo E31-80, BIOS DCCN34WW(V2.03) 12/01/2015
[ 1915.935822] Workqueue: pm pm_runtime_work
[ 1915.935845] ffffc900044fbbf0 ffffffffac3220bc ffffc900044fbc40 0000000000000000
[ 1915.935890] ffffc900044fbc30 ffffffffac059bcb 000007e6044fbc60 ffff8801626e3198
[ 1915.935937] ffff8801626e0000 0000000000000002 ffffffffc05e5d4e 0000000000000000
[ 1915.935985] Call Trace:
[ 1915.936013] [<ffffffffac3220bc>] dump_stack+0x4f/0x73
[ 1915.936038] [<ffffffffac059bcb>] __warn+0xcb/0xf0
[ 1915.936060] [<ffffffffac059c4f>] warn_slowpath_fmt+0x5f/0x80
[ 1915.936158] [<ffffffffc052d916>] i915_gem_runtime_suspend+0x116/0x130 [i915]
[ 1915.936251] [<ffffffffc04f1c74>] intel_runtime_suspend+0x64/0x280 [i915]
[ 1915.936277] [<ffffffffac0926f1>] ? dequeue_entity+0x241/0xbc0
[ 1915.936298] [<ffffffffac36bb85>] pci_pm_runtime_suspend+0x55/0x180
[ 1915.936317] [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0
[ 1915.936339] [<ffffffffac4514e2>] __rpm_callback+0x32/0x70
[ 1915.936356] [<ffffffffac451544>] rpm_callback+0x24/0x80
[ 1915.936375] [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0
[ 1915.936392] [<ffffffffac45222d>] rpm_suspend+0x12d/0x680
[ 1915.936415] [<ffffffffac69f6d7>] ? _raw_spin_unlock_irq+0x17/0x30
[ 1915.936435] [<ffffffffac0810b8>] ? finish_task_switch+0x88/0x220
[ 1915.936455] [<ffffffffac4534bf>] pm_runtime_work+0x6f/0xb0
[ 1915.936477] [<ffffffffac074353>] process_one_work+0x1f3/0x4d0
[ 1915.936501] [<ffffffffac074678>] worker_thread+0x48/0x4e0
[ 1915.936523] [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0
[ 1915.936542] [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0
[ 1915.936559] [<ffffffffac07a2c9>] kthread+0xd9/0xf0
[ 1915.936580] [<ffffffffac07a1f0>] ? kthread_park+0x60/0x60
[ 1915.936600] [<ffffffffac69fe62>] ret_from_fork+0x22/0x30
In the case the register is pinned, it should be present and we will
need to invalidate them to be restored upon resume as we cannot expect
the owner of the pin to call get_fence prior to use after resume.
Fixes: 7c108fd8fe ("drm/i915: Move fence cancellation to runtime suspend")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98804
Reported-by: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Imre Deak <imre.deak@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+
Link: http://patchwork.freedesktop.org/patch/msgid/20170203125717.8431-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit e0ec3ec698)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>