Inside the mock GEM device, we try to grab the runtime pm for the fake
device to prevent it from ever suspending. However, if CONFIG_PM is not
set, trying to obtain the wakref returns an error which we WARN about.
Suppress the expected warning.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706205947.11209-1-chris@chris-wilson.co.uk
Calling mock_engine() calls i915_timeline_init() and that requires
struct_mutex to be held as it adds itself to the global list of
timelines. This error was introduced by commit a89d1f921c ("drm/i915:
Split i915_gem_timeline into individual timelines") but the issue was
masked in CI by the earlier lockdep spam.
Fixes: a89d1f921c ("drm/i915: Split i915_gem_timeline into individual timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180508211056.17151-1-chris@chris-wilson.co.uk
When userspace is passing around swapbuffers using DRI, we frequently
have to open and close the same object in the foreign address space.
This shows itself as the same object being rebound at roughly 30fps
(with a second object also being rebound at 30fps), which involves us
having to rewrite the page tables and maintain the drm_mm range manager
every time.
However, since the object still exists and it is only the local handle
that disappears, if we are lazy and do not unbind the VMA immediately
when the local user closes the object but defer it until the GPU is
idle, then we can reuse the same VMA binding. We still have to be
careful to mark the handle and lookup tables as closed to maintain the
uABI, just allowing the underlying VMA to be resurrected if the user is
able to access the same object from the same context again.
If the object itself is destroyed (neither userspace keeping a handle to
it), the VMA will be reaped immediately as usual.
In the future, this will be even more useful as instantiating a new VMA
for use on the GPU will become heavier. A nuisance indeed, so nip it in
the bud.
v2: s/__i915_vma_final_close/i915_vma_destroy/ etc.
v3: Leave a hint as to why we deferred the unbind on close.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180503195115.22309-1-chris@chris-wilson.co.uk
We need to move to a more flexible timeline that doesn't assume one
fence context per engine, and so allow for a single timeline to be used
across a combination of engines. This means that preallocating a fence
context per engine is now a hindrance, and so we want to introduce the
singular timeline. From the code perspective, this has the notable
advantage of clearing up a lot of mirky semantics and some clumsy
pointer chasing.
By splitting the timeline up into a single entity rather than an array
of per-engine timelines, we can realise the goal of the previous patch
of tracking the timeline alongside the ring.
v2: Tweak wait_for_idle to stop the compiling thinking that ret may be
uninitialised.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-2-chris@chris-wilson.co.uk
In the future, we want to move a request between engines. To achieve
this, we first realise that we have two timelines in effect here. The
first runs through the GTT is required for ordering vma access, which is
tracked currently by engine. The second is implied by sequential
execution of commands inside the ringbuffer. This timeline is one that
maps to userspace's expectations when submitting requests (i.e. given the
same context, batch A is executed before batch B). As the rings's
timelines map to userspace and the GTT timeline an implementation
detail, move the timeline from the GTT into the ring itself (per-context
in logical-ring-contexts/execlists, or a global per-engine timeline for
the shared ringbuffers in legacy submission.
The two timelines are still assumed to be equivalent at the moment (no
migrating requests between engines yet) and so we can simply move from
one to the other without adding extra ordering.
v2: Reinforce that one isn't allowed to mix the engine execution
timeline with the client timeline from userspace (on the ring).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-1-chris@chris-wilson.co.uk
We don't need to track every ring for its lifetime as they are managed
by the contexts/engines. What we do want to track are the live rings so
that we can sporadically clean up requests if userspace falls behind. We
can simply restrict the gt->rings list to being only gt->live_rings.
v2: s/live/active/ for consistency with gt.active_requests
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180430131503.5375-4-chris@chris-wilson.co.uk
In the next patch, rings are the central timeline as requests may jump
between engines. Therefore in the future as we retire in order along the
engine timeline, we may retire out-of-order within a ring (as the ring now
occurs along multiple engines), leading to much hilarity in miscomputing
the position of ring->head.
As an added bonus, retiring along the ring reduces the penalty of having
one execlists client do cleanup for another (old legacy submission
shares a ring between all clients). The downside is that slow and
irregular (off the critical path) process of cleaning up stale requests
after userspace becomes a modicum less efficient.
In the long run, it will become apparent that the ordered
ring->request_list matches the ring->timeline, a fun challenge for the
future will be unifying the two lists to avoid duplication!
v2: We need both engine-order and ring-order processing to maintain our
knowledge of where individual rings have completed upto as well as
knowing what was last executing on any engine. And finally by decoupling
retiring the contexts on the engine and the timelines along the rings,
we do have to keep a reference to the context on each request
(previously it was guaranteed by the context being pinned).
v3: Not just a reference to the context, but we need to keep it pinned
as we manipulate the rings; i.e. we need a pin for both the manipulation
of the engine state during its retirements, and a separate pin for the
manipulation of the ring state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180430131503.5375-3-chris@chris-wilson.co.uk
We want to de-emphasize the link between the request (dependency,
execution and fence tracking) from GEM and so rename the struct from
drm_i915_gem_request to i915_request. That is we may implement the GEM
user interface on top of requests, but they are an abstraction for
tracking execution rather than an implementation detail of GEM. (Since
they are not tied to HW, we keep the i915 prefix as opposed to intel.)
In short, the spatch:
@@
@@
- struct drm_i915_gem_request
+ struct i915_request
A corollary to contracting the type name, we also harmonise on using
'rq' shorthand for local variables where space if of the essence and
repetition makes 'request' unwieldy. For globals and struct members,
'request' is still much preferred for its clarity.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
If we remove some hardcoded assumptions about the preempt context having
a fixed id, reserved from use by normal user contexts, we may only
allocate the i915_gem_context when required. Then the subsequent
decisions on using preemption reduce to having the preempt context
available.
v2: Include an assert that we don't allocate the preempt context twice.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-3-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Even for the mock i915 device, we need to initialise the
drm.mode_config, as we may ultimately query whether there are any KMS
users deep in the bowels of some paths (e.g. eviction). As we initialise
drm.mode_config we must cleanup after ourselves!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171209210835.32609-1-chris@chris-wilson.co.uk
In preparation for huge gtt pages expose page_sizes as part of the
device info, to indicate the page sizes supported by the HW. Currently
only 4K is supported.
v2: s/page_size_mask/page_sizes/
v3: introduce I915_GTT_MAX_PAGE_SIZE
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-5-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-4-chris@chris-wilson.co.uk
Not a fully blown gemfs, just our very own tmpfs kernel mount. Doing so
moves us away from the shmemfs shm_mnt, and gives us the much needed
flexibility to do things like set our own mount options, namely huge=
which should allow us to enable the use of transparent-huge-pages for
our shmem backed objects.
v2: various improvements suggested by Joonas
v3: move gemfs instance to i915.mm and simplify now that we have
file_setup_with_mnt
v4: fallback to tmpfs shm_mnt upon failure to setup gemfs
v5: make tmpfs fallback kinder
v5: better gemfs failure message
flags variable
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-3-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-2-chris@chris-wilson.co.uk
An earlier bugfix tried to work around this build failure:
drivers/gpu/drm/i915/selftests/mock_gem_device.c: In function 'mock_gem_device':
drivers/gpu/drm/i915/selftests/mock_gem_device.c:151:20: error: 'struct dev_archdata' has no member named 'iommu'
Checking for CONFIG_IOMMU_API is not sufficient as a compile-time
test since that may be enabled in configurations that have neither
INTEL_IOMMU not AMD_IOMMU enabled. This changes the check to
INTEL_IOMMU instead, as this is the only case we actually care about.
Fixes: f46f156ea7 ("drm/i915/selftests: Only touch archdata.iommu when it exists")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171005120749.400818-1-arnd@arndb.de
For the fake device we have our own set of mock contexts that need to
match the real contexts we normally create. Currently this requires us
to manually instantiate them for the selftests, which I forgot.
Reported-by: Matthew Auld <matthew.william.auld@gmail.com>
Fixes: e7af311683 ("drm/i915: Introduce a preempt context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171005105927.22991-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
archdata.iommu only exists when CONFIG_IOMMU_API is enabled (and only
applies to intel-iommu in our case) so conditionally compile it out when
it doesn't exist.
Fixes: b5891fb520 ("drm/i915/selftests: Disable iommu for the mock device")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170918164652.14200-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
On some machines, the iommu cannot allocate a domain for the mock device
causing the dma_map_sg() to fail, and the selftest to fail with -ENOMEM.
For the mock selftests, we are using a fake device and do not care about
iommu; so convince intel_iommu to treat us as a dummy device with an
identity mapping (and no iommu domain).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101080
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170914162240.18310-1-chris@chris-wilson.co.uk
Tested-by: Elizabeth De La Torre Mena <elizabethx.de.la.torre.mena@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com
In the original selftest, we didn't care what the engine->id was, just
that it could uniquely identify it. Later though, we started tracking
the mock engines in the fixed size arrays around the drm_i915_private and
so we now require their indices to be correct. This becomes an issue when
using the standalone harness which runs all available tests at module load,
and so we quickly assign an out-of-bounds index to an engine as we
reallocate the mock GEM device between tests. It doesn't show up in
igt/drv_selftest as that runs each subtest individually.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102045
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170809163930.26470-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Goto the right label in case of error, otherwise there is a leak.
This has been introduced by c5cf9a9147. In this patch a goto has not been
updated.
Fixes: c5cf9a9147 ("drm/i915: Create a kmem_cache to allocate struct i915_priolist from")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://patchwork.freedesktop.org/patch/msgid/20170719223503.30580-1-christophe.jaillet@wanadoo.fr
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We need to unpin the last retired context early in the shutdown sequence
so that its RCU free is done before we try to free the context ida. I
included this in a later patch ("drm/i915: Keep a recent cache of freed
contexts objects for reuse") and so missed that the selftests were broken
in the meantime.
Reported-by: Matthew Auld <matthew.auld@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101627
Fixes: 5f09a9c8ab ("drm/i915: Allow contexts to be unreferenced locklessly")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170719135957.14603-1-chris@chris-wilson.co.uk
Tested-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Supply a pm_domain and its ops for our mock GEM device so that
device runtime pm doesn't complain even though we only want to mark it
permanently active!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170718173028.31207-1-chris@chris-wilson.co.uk
Tested-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Workers on the i915->wq may rearm themselves so for completeness we need
to replace our flush_workqueue() with a call to drain_workqueue() before
unloading the device.
v2: Reinforce the drain_workqueue with an preceding rcu_barrier() as a
few of the tasks that need to be drained may first be armed by RCU.
References: https://bugs.freedesktop.org/show_bug.cgi?id=101627
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170718134124.14832-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
If we move the actual cleanup of the context to a worker, we can allow
the final free to be called from any context and avoid undue latency in
the caller.
v2: Negotiate handling the delayed contexts free by flushing the
workqueue before calling i915_gem_context_fini() and performing the final
free of the kernel context directly
v3: Flush deferred frees before new context allocations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620110547.15947-2-chris@chris-wilson.co.uk
More stuff for 4.13:
- skl+ wm fixes from Mahesh Kumar
- some refactor and tests for i915_sw_fence (Chris)
- tune execlist/scheduler code (Chris)
- g4x,g33 gpu reset improvements (Chris, Mika)
- guc code cleanup (Michal Wajdeczko, Michał Winiarski)
- dp aux backlight improvements (Puthikorn Voravootivat)
- buffer based guc/host communication (Michal Wajdeczko)
* tag 'drm-intel-next-2017-05-29' of git://anongit.freedesktop.org/git/drm-intel: (253 commits)
drm/i915: Update DRIVER_DATE to 20170529
drm/i915: Keep the forcewake timer alive for 1ms past the most recent use
drm/i915/guc: capture GuC logs if FW fails to load
drm/i915/guc: Introduce buffer based cmd transport
drm/i915/guc: Disable send function on fini
drm: Add definition for eDP backlight frequency
drm/i915: Drop AUX backlight enable check for backlight control
drm/i915: Consolidate #ifdef CONFIG_INTEL_IOMMU
drm/i915: Only GGTT vma may be pinned and prevent shrinking
drm/i915: Serialize GTT/Aperture accesses on BXT
drm/i915: Convert i915_gem_object_ops->flags values to use BIT()
drm/i915/selftests: Silence compiler warning in igt_ctx_exec
drm/i915/guc: Skip port assign on first iteration of GuC dequeue
drm/i915: Remove misleading comment in request_alloc
drm/i915/g33: Improve reset reliability
Revert "drm/i915: Restore lost "Initialized i915" welcome message"
drm/i915/huc: Update GLK HuC version
drm/i915: Check for allocation failure
drm/i915/guc: Remove action status and statistics from debugfs
drm/i915/g4x: Improve gpu reset reliability
...
The i915_priolist are allocated within an atomic context on a path where
we wish to minimise latency. If we use a dedicated kmem_cache, we have
the advantage of a local freelist from which to service new requests
that should keep the latency impact of an allocation small. Though
currently we expect the majority of requests to be at default priority
(and so hit the preallocate priolist), once userspace starts using
priorities they are likely to use many fine grained policies improving
the utilisation of a private slab.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-9-chris@chris-wilson.co.uk
Pull RCU updates from Ingo Molnar:
"The main changes are:
- Debloat RCU headers
- Parallelize SRCU callback handling (plus overlapping patches)
- Improve the performance of Tree SRCU on a CPU-hotplug stress test
- Documentation updates
- Miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits)
rcu: Open-code the rcu_cblist_n_lazy_cbs() function
rcu: Open-code the rcu_cblist_n_cbs() function
rcu: Open-code the rcu_cblist_empty() function
rcu: Separately compile large rcu_segcblist functions
srcu: Debloat the <linux/rcu_segcblist.h> header
srcu: Adjust default auto-expediting holdoff
srcu: Specify auto-expedite holdoff time
srcu: Expedite first synchronize_srcu() when idle
srcu: Expedited grace periods with reduced memory contention
srcu: Make rcutorture writer stalls print SRCU GP state
srcu: Exact tracking of srcu_data structures containing callbacks
srcu: Make SRCU be built by default
srcu: Fix Kconfig botch when SRCU not selected
rcu: Make non-preemptive schedule be Tasks RCU quiescent state
srcu: Expedite srcu_schedule_cbs_snp() callback invocation
srcu: Parallelize callback handling
kvm: Move srcu_struct fields to end of struct kvm
rcu: Fix typo in PER_RCU_NODE_PERIOD header comment
rcu: Use true/false in assignment to bool
rcu: Use bool value directly
...
A very simple mockery, just a random manager and timeline. Useful for
inserting objects and ordering retirement; and not much else.
v2: mock_fini_ggtt() to complement mock_init_ggtt().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-7-chris@chris-wilson.co.uk