Commit Graph

2270 Commits

Author SHA1 Message Date
Chris Wilson 092be382a2 drm/i915: Lift intel_engines_resume() to callers
Since the reset path wants to recover the engines itself, it only wants
to reinitialise the hardware using i915_gem_init_hw(). Pull the call to
intel_engines_resume() to the module init/resume path so we can avoid it
during reset.

Fixes: 79ffac8599 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190626154549.10066-3-chris@chris-wilson.co.uk
2019-06-26 18:01:01 +01:00
Chris Wilson 0c91621cad drm/i915/gt: Pass intel_gt to pm routines
Switch from passing the i915 container to newly named struct intel_gt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190625130128.11009-2-chris@chris-wilson.co.uk
2019-06-25 20:17:22 +01:00
Chris Wilson c6fe28b0c2 drm/i915/gt: Rename i915_gt_timelines
Since the anonymous i915_gt became struct intel_gt and encloses
struct i915_gt_timelines, rename i915_gt_timelines to intel_gt_timelines
to match its parentage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621131640.28864-1-chris@chris-wilson.co.uk
2019-06-21 16:04:07 +01:00
Tvrtko Ursulin db56f97494 drm/i915: Eliminate dual personality of i915_scratch_offset
Scratch vma lives under gt but the API used to work on i915. Make this
consistent by renaming the function to intel_gt_scratch_offset and make
it take struct intel_gt.

v2:
 * Move to intel_gt. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-33-tvrtko.ursulin@linux.intel.com
2019-06-21 13:49:00 +01:00
Tvrtko Ursulin f0c02c1b91 drm/i915: Rename i915_timeline to intel_timeline and move under gt
Move all timeline code under gt and rename to intel_gt prefix.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-32-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:53 +01:00
Tvrtko Ursulin 1d66377a76 drm/i915: Compartmentalize i915_gem_init_ggtt
Continuing on the theme of better logical organization of our code, make
the first step towards making the ggtt code better isolated from wider
struct drm_i915_private.

v2:
 * Bring the ickle onion unwind back. (Chris)
 * Rename to i915_init_ggtt. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-27-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:44 +01:00
Tvrtko Ursulin baea429dc5 drm/i915: Move i915_gem_chipset_flush to intel_gt
This aligns better with the rest of restructuring.

v2:
 * Move call out of line. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-24-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:40 +01:00
Tvrtko Ursulin a1c8a09e0c drm/i915: Convert i915_gem_flush_ggtt_writes to intel_gt
Having introduced struct intel_gt (named the anonymous structure in i915)
we can start using it to compartmentalize our code better. It makes more
sense logically to have the code internally like this and it will also
help with future split between gt and display in i915.

v2:
 * Keep ggtt flush before fb obj flush. (Chris)

v3:
 * Fix refactoring fail.
 * Always flush ggtt writes. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-23-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:38 +01:00
Tvrtko Ursulin 6b0a8dfdf2 drm/i915: Stop using I915_READ/WRITE in intel_wopcm_init_hw
More legacy mmio accessor removal. We pass in intel_gt explicitly allowing
code to use new intel_uncore_read/write helpers.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-17-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:32 +01:00
Tvrtko Ursulin 8649187a95 drm/i915: Move intel_engines_resume into common init
Since this part still operates on i915 and not intel_gt, move it to the
common (top-level) function.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-16-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:31 +01:00
Tvrtko Ursulin abc584f9aa drm/i915: Convert i915_gem_init_hw to intel_gt
More removal of implicit dev_priv from using old mmio accessors.

Actually the top level function remains but is split into a part which
writes to i915 and part which operates on intel_gt in order to initialize
the hardware.

GuC and engines are the only odd ones out remaining.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-15-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:31 +01:00
Tvrtko Ursulin acb56d97d9 drm/i915: Convert i915_ppgtt_init_hw to intel_gt
More removal of implicit dev_priv from using old mmio accessors.

v2:
 * Rebase for uncore_to_i915 removal.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-13-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:29 +01:00
Tvrtko Ursulin 20a7f2fc4d drm/i915: Convert intel_mocs_init_l3cc_table to intel_gt
More removal of implicit dev_priv from using old mmio accessors.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-12-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:28 +01:00
Tvrtko Ursulin d10cfee4d8 drm/i915: Convert gt workarounds to intel_gt
More conversion of i915_gem_init_hw to uncore.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-10-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:25 +01:00
Tvrtko Ursulin cf6844b234 drm/i915: Convert init_unused_rings to intel_gt
More removal of implicit dev_priv from using old mmio accessors.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-9-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:24 +01:00
Tvrtko Ursulin 500bfa380e drm/i915: Convert i915_gem_init_swizzling to intel_gt
Start using the newly introduced struct intel_gt to fuse together correct
logical init flow with uncore for more removal of implicit dev_priv in
mmio access.

v2:
 * Move code to i915_gem_fence_reg. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-7-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:22 +01:00
Tvrtko Ursulin 99f2eb9667 drm/i915: Move intel_gt_pm_init under intel_gt_init_early
And also rename to intel_gt_pm_init_early and make it operate on gt.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-5-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:18 +01:00
Tvrtko Ursulin 24635c5152 drm/i915: Move intel_gt initialization to a separate file
As it will grow in a following patch make a new home for it.

v2:
 * Convert mock_gem_device as well. (Chris)

v3:
 * Rename to intel_gt_init_early and move call site to i915_drv.c. (Chris)

v4:
 * Adjust SPDX tags.
 * No need to gt/ path when including intel_gt_types.h. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-3-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:15 +01:00
Jani Nikula df0566a641 drm/i915: move modesetting core code under display/
Now that we have a new subdirectory for display code, continue by moving
modesetting core code.

display/intel_frontbuffer.h sticks out like a sore thumb, otherwise this
is, again, a surprisingly clean operation.

v2:
- don't move intel_sideband.[ch] (Ville)
- use tabs for Makefile file lists and sort them

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613084416.6794-3-jani.nikula@intel.com
2019-06-17 11:48:32 +03:00
Chris Wilson ce476c80b8 drm/i915: Keep contexts pinned until after the next kernel context switch
We need to keep the context image pinned in memory until after the GPU
has finished writing into it. Since it continues to write as we signal
the final breadcrumb, we need to keep it pinned until the request after
it is complete. Currently we know the order in which requests execute on
each engine, and so to remove that presumption we need to identify a
request/context-switch we know must occur after our completion. Any
request queued after the signal must imply a context switch, for
simplicity we use a fresh request from the kernel context.

The sequence of operations for keeping the context pinned until saved is:

 - On context activation, we preallocate a node for each physical engine
   the context may operate on. This is to avoid allocations during
   unpinning, which may be from inside FS_RECLAIM context (aka the
   shrinker)

 - On context deactivation on retirement of the last active request (which
   is before we know the context has been saved), we add the
   preallocated node onto a barrier list on each engine

 - On engine idling, we emit a switch to kernel context. When this
   switch completes, we know that all previous contexts must have been
   saved, and so on retiring this request we can finally unpin all the
   contexts that were marked as deactivated prior to the switch.

We can enhance this in future by flushing all the idle contexts on a
regular heartbeat pulse of a switch to kernel context, which will also
be used to check for hung engines.

v2: intel_context_active_acquire/_release

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-1-chris@chris-wilson.co.uk
2019-06-14 19:03:32 +01:00
Daniele Ceraolo Spurio c447ff7db3 drm/i915: update with_intel_runtime_pm to use the rpm structure
Matching the underlying get/put functions.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-8-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Daniele Ceraolo Spurio d858d5695f drm/i915: update rpm_get/put to use the rpm structure
The functions where internally already only using the structure, so we
need to just flip the interface.

v2: rebase

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Chris Wilson 84383d2e8d drm/i915: Refine i915_reset.lock_map
We already use a mutex to serialise i915_reset() and wedging, so all we
need it to link that into i915_request_wait() and we have our lock cycle
detection.

v2.5: Take error mutex for selftests

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614071023.17929-3-chris@chris-wilson.co.uk
2019-06-14 15:17:54 +01:00
Chris Wilson 0cf289bd5d drm/i915: Move fence register tracking from i915->mm to ggtt
As the fence registers only apply to regions inside the GGTT is makes
more sense that we track these as part of the i915_ggtt and not the
general mm. In the next patch, we will then pull the register locking
underneath the i915_ggtt.mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613073254.24048-1-chris@chris-wilson.co.uk
2019-06-13 09:37:39 +01:00
Tvrtko Ursulin e33a4be83a drm/i915: Remove I915_POSTING_READ_FW
Only a few call sites remain which have been converted to uncore mmio
accessors and so the macro can be removed.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611104548.30545-2-tvrtko.ursulin@linux.intel.com
2019-06-12 15:33:10 +01:00
Chris Wilson ecab9be174 drm/i915: Combine unbound/bound list tracking for objects
With async binding, we don't want to manage a bound/unbound list as we
may end up running before we even acquire the pages. All that is
required is keeping track of shrinkable objects, so reduce it to the
minimum list.

Fixes: 6951e5893b ("drm/i915: Move GEM object domain management from struct_mutex to local")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190612105720.30310-1-chris@chris-wilson.co.uk
2019-06-12 13:36:43 +01:00
Chris Wilson 33df8a7697 drm/i915: Prevent lock-cycles between GPU waits and GPU resets
We cannot allow ourselves to wait on the GPU while holding any lock as we
may need to reset the GPU. While there is not an explicit lock between
the two operations, lockdep cannot detect the dependency. So let's tell
lockdep about the wait/reset dependency with an explicit lockmap.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190612085246.16374-1-chris@chris-wilson.co.uk
2019-06-12 12:06:11 +01:00
Chris Wilson a8cff4c828 drm/i915: Promote i915->mm.obj_lock to be irqsafe
The intent is to be able to update the mm.lists from inside an irqsoff
section (e.g. from a softirq rcu workqueue), ergo we need to make the
i915->mm.obj_lock irqsafe.

v2: can_discard_pages() ensures we are shrinkable
v3: Beware shadowing of 'flags'

Fixes: 3b4fa9640c ("drm/i915: Track the purgeable objects on a separate eviction list")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110869
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190610145430.17717-1-chris@chris-wilson.co.uk
2019-06-10 20:43:08 +01:00
Chris Wilson 155ab8836c drm/i915: Move object close under its own lock
Use i915_gem_object_lock() to guard the LUT and active reference to
allow us to break free of struct_mutex for handling GEM_CLOSE.

Testcase: igt/gem_close_race
Testcase: igt/gem_exec_parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190606112320.9704-1-chris@chris-wilson.co.uk
2019-06-06 12:51:13 +01:00
Chris Wilson d82b4b2621 drm/i915: Report all objects with allocated pages to the shrinker
Currently, we try to report to the shrinker the precise number of
objects (pages) that are available to be reaped at this moment. This
requires searching all objects with allocated pages to see if they
fulfill the search criteria, and this count is performed quite
frequently. (The shrinker tries to free ~128 pages on each invocation,
before which we count all the objects; counting takes longer than
unbinding the objects!) If we take the pragmatic view that with
sufficient desire, all objects are eventually reapable (they become
inactive, or no longer used as framebuffer etc), we can simply return
the count of pinned pages maintained during get_pages/put_pages rather
than walk the lists every time.

The downside is that we may (slightly) over-report the number of
objects/pages we could shrink and so penalize ourselves by shrinking
more than required. This is mitigated by keeping the order in which we
shrink objects such that we avoid penalizing active and frequently used
objects, and if memory is so tight that we need to free them we would
need to anyway.

v2: Only expose shrinkable objects to the shrinker; a small reduction in
not considering stolen and foreign objects.
v3: Restore the tracking from a "backup" copy from before the gem/ split

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-2-chris@chris-wilson.co.uk
2019-05-31 21:23:51 +01:00
Chris Wilson 3b4fa9640c drm/i915: Track the purgeable objects on a separate eviction list
Currently the purgeable objects, I915_MADV_DONTNEED, are mixed in the
normal bound/unbound lists. Every shrinker pass starts with an attempt
to purge from this set of unneeded objects, which entails us doing a
walk over both lists looking for any candidates. If there are none, and
since we are shrinking we can reasonably assume that the lists are
full!, this becomes a very slow futile walk.

If we separate out the purgeable objects into own list, this search then
becomes its own phase that is preferentially handled during shrinking.
Instead the cost becomes that we then need to filter the purgeable list
if we want to distinguish between bound and unbound objects.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-1-chris@chris-wilson.co.uk
2019-05-31 21:23:51 +01:00
Janusz Krzysztofik 47bc28d7ee drm/i915: Split off pci_driver.remove() tail to drm_driver.release()
In order to support driver hot unbind, some cleanup operations, now
performed on PCI driver remove, must be called later, after all device
file descriptors are closed.

Split out those operations from the tail of pci_driver.remove()
callback and put them into drm_driver.release() which is called as soon
as all references to the driver are put.  As a result, those cleanups
will be now run on last drm_dev_put(), either still called from
pci_driver.remove() if all device file descriptors are already closed,
or on last drm_release() file operation.

Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190530133105.30467-1-janusz.krzysztofik@linux.intel.com
2019-05-31 08:43:18 +01:00
Chris Wilson 446e2d16a1 drm/i915: Move GEM client throttling to its own file
Continuing the decluttering of i915_gem.c by moving the client self
throttling into its own file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-13-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson 3f43c8767e drm/i915: Move GEM object busy checking to its own file
Continuing the decluttering of i915_gem.c by moving the object busy
checking into its own file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-12-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson d45a1a5334 drm/i915: Move GEM object waiting to its own file
Continuing the decluttering of i915_gem.c by moving the object wait
decomposition into its own file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-11-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson 6951e5893b drm/i915: Move GEM object domain management from struct_mutex to local
Use the per-object local lock to control the cache domain of the
individual GEM objects, not struct_mutex. This is a huge leap forward
for us in terms of object-level synchronisation; execbuffers are
coordinated using the ww_mutex and pread/pwrite is finally fully
serialised again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson 37d63f8fdb drm/i915: Pull scatterlist utils out of i915_gem.h
Out scatterlist utility routines can be pulled out of i915_gem.h for a
bit more decluttering.

v2: Push I915_GTT_PAGE_SIZE out of i915_scatterlist itself and into the
caller.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-9-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson 10be98a77c drm/i915: Move more GEM objects under gem/
Continuing the theme of separating out the GEM clutter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson f0e4a06397 drm/i915: Move GEM domain management to its own file
Continuing the decluttering of i915_gem.c, that of the read/write
domains, perhaps the biggest of GEM's follies?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-7-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson b414fcd5be drm/i915: Move mmap and friends to its own file
Continuing the decluttering of i915_gem.c, now the turn of do_mmap and
the faulthandlers

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-6-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson f033428db2 drm/i915: Move phys objects to its own file
Continuing the decluttering of i915_gem.c, this time the legacy physical
object.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-5-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson 8475355f7a drm/i915: Move shmem object setup to its own file
Split the plain old shmem object into its own file to start decluttering
i915_gem.c

v2: Lose the confusing, hysterical raisins, suffix of _gtt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-4-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson afa1308596 drm/i915: Pull GEM ioctls interface to its own file
Declutter i915_drv/gem.h by moving the ioctl API into its own header.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-2-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Jani Nikula 2491b544ff Merge drm/drm-next into drm-intel-next-queued
Get the HDR dependencies originally merged via drm-misc. Sync up all
i915 changes applied via other trees. And get v5.2-rc2 as the baseline.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-05-28 11:07:59 +03:00
Chris Wilson b27e35ae5b drm/i915: Keep user GGTT alive for a minimum of 250ms
Do not allow runtime pm autosuspend to remove userspace GGTT mmaps too
quickly. For example, igt sets the autosuspend delay to 0, and so we
immediately attempt to perform runtime suspend upon releasing the
wakeref. Unfortunately, that involves tearing down GGTT mmaps as they
require an active device.

Override the autosuspend for GGTT mmaps, by keeping the wakeref around
for 250ms after populating the PTE for a fresh mmap.

v2: Prefer refcount_t for its under/overflow error detection
v3: Flush the user runtime autosuspend prior to system system.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190527115114.13448-1-chris@chris-wilson.co.uk
2019-05-28 08:23:09 +01:00
Dave Airlie 14ee642c2a Features:
- Engine discovery query (Tvrtko)
 - Support for DP YCbCr4:2:0 outputs (Gwan-gyeong)
 - HDCP revocation support, refactoring (Ramalingam)
 - Remove DRM_AUTH from IOCTLs which also have DRM_RENDER_ALLOW (Christian König)
 - Asynchronous display power disabling (Imre)
 - Perma-pin uC firmware and re-enable global reset (Fernando)
 - GTT remapping for display, for bigger fb size and stride (Ville)
 - Enable pipe HDR mode on ICL if only HDR planes are used (Ville)
 - Kconfig to tweak the busyspin durations for i915_wait_request (Chris)
 - Allow multiple user handles to the same VM (Chris)
 - GT/GEM runtime pm improvements using wakerefs (Chris)
 - Gen 4&5 render context support (Chris)
 - Allow userspace to clone contexts on creation (Chris)
 - SINGLE_TIMELINE flags for context creation (Chris)
 - Allow specification of parallel execbuf (Chris)
 
 Refactoring:
 - Header refactoring (Jani)
 - Move GraphicsTechnology files under gt/ (Chris)
 - Sideband code refactoring (Chris)
 
 Fixes:
 - ICL DSI state readout and checker fixes (Vandita)
 - GLK DSI picture corruption fix (Stanislav)
 - HDMI deep color fixes (Clinton, Aditya)
 - Fix driver unbinding from a device in use (Janusz)
 - Fix clock gating with pipe scaling (Radhakrishna)
 - Disable broken FBC on GLK (Daniel Drake)
 - Miscellaneous GuC fixes (Michal)
 - Fix MG PHY DP register programming (Imre)
 - Add missing combo PHY lane power setup (Imre)
 - Workarounds for early ICL VBT issues (Imre)
 - Fix fastset vs. pfit on/off on HSW EDP transcoder (Ville)
 - Add readout and state check for pch_pfit.force_thru (Ville)
 - Miscellaneous display fixes and refactoring (Ville)
 - Display workaround fixes (Ville)
 - Enable audio even if ELD is bogus (Ville)
 - Fix use-after-free in reporting create.size (Chris)
 - Sideband fixes to avoid BYT hard lockups (Chris)
 - Workaround fixes and improvements (Chris)
 
 Maintainer shortcomings:
 - Failure to adequately describe and give credit for all changes (Jani)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEFWWmW3ewYy4RJOWc05gHnSar7m8FAlzoK8oACgkQ05gHnSar
 7m8jZg//UuIkz4bIu7A0YfN/VH3/h3fthxboejj27HpO4OO9eFqLVqaEUFEngGvf
 66fnFKNwtLdW7Dsx9iQsKNsVTcdsEE5PvSA6FZ3rVtYOwBdZ9OKYRxci2KcSnjqz
 F0/8Jxgz2G0gu9TV6dgTLrfdJiuJrCbidRV3G5id0XHNEGbpABtmVxYfsbj/w9mU
 luckCgKyRDZNzfhyGIPV763bNGZWLQPcbP99yrZf4+EcsiQ2MfjHJdwe5Ko+iGDk
 sO3lFg/1iEf41gqaD4LPokOtUKZfXI1Sujs1w/0djDbqs9USq0eY1L5C3ZBq5Si1
 woz7ATXO71FfBcNRxLTejNqCVlQMLix/185/ItkDA4gDlHwWZPYaT5VTNgRtEEy6
 XNtscZyM6Z1ghqRqahWWu40g80sOdfYuiTFEAYonVbDAUootgF46uWO/2ib0Hya+
 tYlm60M097eMealzaXEyHPHlW1OeUUJTKxl9j7nHmqVn542OI8gn7xvIXX2VsYDY
 7D4gVPoFg0UpGXM2uuSHVgvxwtg4t083Wu+utYu76RjmwNye4LkHewWGFjmOkYRf
 BraHoA+gKPFtAJjjtkyE/ZnlT4c3tDoQ0a6+gRKVurXzu/Y6JVzquhJvH5mShyZ7
 oTv+erupcz7JEnEeKzgMCyon/Drumiut5I6zr29GNQ3eelpf4jQ=
 =U/nc
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2019-05-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

Features:
- Engine discovery query (Tvrtko)
- Support for DP YCbCr4:2:0 outputs (Gwan-gyeong)
- HDCP revocation support, refactoring (Ramalingam)
- Remove DRM_AUTH from IOCTLs which also have DRM_RENDER_ALLOW (Christian König)
- Asynchronous display power disabling (Imre)
- Perma-pin uC firmware and re-enable global reset (Fernando)
- GTT remapping for display, for bigger fb size and stride (Ville)
- Enable pipe HDR mode on ICL if only HDR planes are used (Ville)
- Kconfig to tweak the busyspin durations for i915_wait_request (Chris)
- Allow multiple user handles to the same VM (Chris)
- GT/GEM runtime pm improvements using wakerefs (Chris)
- Gen 4&5 render context support (Chris)
- Allow userspace to clone contexts on creation (Chris)
- SINGLE_TIMELINE flags for context creation (Chris)
- Allow specification of parallel execbuf (Chris)

Refactoring:
- Header refactoring (Jani)
- Move GraphicsTechnology files under gt/ (Chris)
- Sideband code refactoring (Chris)

Fixes:
- ICL DSI state readout and checker fixes (Vandita)
- GLK DSI picture corruption fix (Stanislav)
- HDMI deep color fixes (Clinton, Aditya)
- Fix driver unbinding from a device in use (Janusz)
- Fix clock gating with pipe scaling (Radhakrishna)
- Disable broken FBC on GLK (Daniel Drake)
- Miscellaneous GuC fixes (Michal)
- Fix MG PHY DP register programming (Imre)
- Add missing combo PHY lane power setup (Imre)
- Workarounds for early ICL VBT issues (Imre)
- Fix fastset vs. pfit on/off on HSW EDP transcoder (Ville)
- Add readout and state check for pch_pfit.force_thru (Ville)
- Miscellaneous display fixes and refactoring (Ville)
- Display workaround fixes (Ville)
- Enable audio even if ELD is bogus (Ville)
- Fix use-after-free in reporting create.size (Chris)
- Sideband fixes to avoid BYT hard lockups (Chris)
- Workaround fixes and improvements (Chris)

Maintainer shortcomings:
- Failure to adequately describe and give credit for all changes (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87sgt3n45z.fsf@intel.com
2019-05-28 09:26:52 +10:00
Ville Syrjälä aa5ca8b742 drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping
Align dumb buffer stride to 4k if the fb will be big enough to
require gtt remapping.

v2: Leave the stride alone for buffers that look to be for the cursor
v3: Make it not a hack (Daniel)

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-7-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-05-20 18:04:48 +03:00
Ville Syrjälä 1a74fc0b3f drm/i915: Add a new "remapped" gtt_view
To overcome display engine stride limits we'll want to remap the
pages in the GTT. To that end we need a new gtt_view type which
is just like the "rotated" type except not rotated.

v2: Use intel_remapped_plane_info base type
    s/unused/unused_mbz/ (Chris)
    Separate BUILD_BUG_ON()s (Chris)
    Use I915_GTT_PAGE_SIZE (Chris)
v3: Use i915_gem_object_get_dma_address() (Chris)
    Trim the sg (Tvrtko)
v4: Actually trim this time. Limit the max length
    to one row of pages to keep things simple

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-05-20 18:04:47 +03:00
Linus Torvalds a2d635decb drm pull request for 5.2
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJc04M6AAoJEAx081l5xIa+SJgP/0uIgIOM53vPpydgmr+2IEHF
 jbDqrd+mipgNriRVHjDsWdUHCUNtyhB7YEBCMrj3mY0rRFI7FlQQf4lOwYGoHiKP
 4JZg4kwC37997lFXl1uabGj3DmJLtxKL2/D15zCH/uLe+2EDzWznP6NVdFT3WK0P
 YKZQCWT19PWSsLoBRPutWxkmop4AYvkqE0a6vXUlJlFYZK3Bbytx6/179uWKfiX5
 ZkKEEtx1XiDAvcp5gBb6PISurycrBY0e/bkPBnK3ES5vawMbTU5IrmWOrQ4D8yOd
 z9qOVZawZ6+b2XBDgBWjQ9bM7I5R7Il1q/LglYEaFI9+wHUnlUdDSm6ft5/5BiCZ
 fqgkh5Bj2iEsajbSsacoljMOpxpYPqj63mqc+7fAGXF34V+B+9U1bpt8kCbMKowf
 7Abb7IuiCR6vLDapjP6VqTMvdQ4O466OEAN83ULGFTdmMqYYH4AxaIwc+xcAk/aP
 RNq7/RHhh4FRynRAj9fCkGlF3ArnM88gLINwWuEQq4SClWGcvdw7eaHpwWo77c4g
 iccCnTLqSIg5pDVu07AQzzBlW6KulWxh5o72x+Xx+EXWdYUDHQ1SlNs11bSNUBV1
 5MkrzY2GuD+NFEjsXJEDIPOr40mQOyJCXnxq8nXPsz/hD9kHeJPvWn3J3eVKyb5B
 Z6/knNqM0BDn3SaYR/rD
 =YFiQ
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "This has two exciting community drivers for ARM Mali accelerators.
  Since ARM has never been open source friendly on the GPU side of the
  house, the community has had to create open source drivers for the
  Mali GPUs. Lima covers the older t4xx and panfrost the newer 6xx/7xx
  series. Well done to all involved and hopefully this will help ARM
  head in the right direction.

  There is also now the ability if you don't have any of the legacy
  drivers enabled (pre-KMS) to remove all the pre-KMS support code from
  the core drm, this saves 10% or so in codesize on my machine.

  i915 also enable Icelake/Elkhart Lake Gen11 GPUs by default, vboxvideo
  moves out of staging.

  There are also some rcar-du patches which crossover with media tree
  but all should be acked by Mauro.

  Summary:

  uapi changes:
   - Colorspace connector property
   - fourcc - new YUV formts
   - timeline sync objects initially merged
   - expose FB_DAMAGE_CLIPS to atomic userspace

  new drivers:
   - vboxvideo: moved out of staging
   - aspeed: ASPEED SoC BMC chip display support
   - lima: ARM Mali4xx GPU acceleration driver support
   - panfrost: ARM Mali6xx/7xx Midgard/Bitfrost acceleration driver support

  core:
   - component helper docs
   - unplugging fixes
   - devm device init
   - MIPI/DSI rate control
   - shmem backed gem objects
   - connector, display_info, edid_quirks cleanups
   - dma_buf fence chain support
   - 64-bit dma-fence seqno comparison fixes
   - move initial fb config code to core
   - gem fence array helpers for Lima
   - ability to remove legacy support code if no drivers requires it (removes 10% of drm.ko size)
   - lease fixes

  ttm:
   - unified DRM_FILE_PAGE_OFFSET handling
   - Account for kernel allocations in kernel zone only

  panel:
   - OSD070T1718-19TS panel support
   - panel-tpo-td028ttec1 backlight support
   - Ronbo RB070D30 MIPI/DSI
   - Feiyang FY07024DI26A30-D MIPI-DSI panel
   - Rocktech jh057n00900 MIPI-DSI panel

  i915:
   - Comet Lake (Gen9) PCI IDs
   - Updated Icelake PCI IDs
   - Elkhartlake (Gen11) support
   - DP MST property addtions
   - plane and watermark fixes
   - Icelake port sync and VEBOX disable fixes
   - struct_mutex usage reduction
   - Icelake gamma fix
   - GuC reset fixes
   - make mmap more asynchronous
   - sound display power well race fixes
   - DDI/MIPI-DSI clocks for Icelake
   - Icelake RPS frequency changing support
   - Icelake workarounds

  amdgpu:
   - Use HMM for userptr
   - vega20 experimental smu11 support
   - RAS support for vega20
   - BACO support for vega12 + fixes for vega20
   - reworked IH interrupt handling
   - amdkfd RAS support
   - Freesync improvements
   - initial timeline sync object support
   - DC Z ordering fixes
   - NV12 planes support
   - colorspace properties for planes=
   - eDP opts if eDP already initialized

  nouveau:
   - misc fixes

  etnaviv:
   - misc fixes

  msm:
   - GPU zap shader support expansion
   - robustness ABI addition

  exynos:
   - Logging cleanups

  tegra:
   - Shared reset fix
   - CPU cache maintenance fix

  cirrus:
   - driver rewritten using simple helpers

  meson:
   - G12A support

  vmwgfx:
   - Resource dirtying management improvements
   - Userspace logging improvements

  virtio:
   - PRIME fixes

  rockchip:
   - rk3066 hdmi support

  sun4i:
   - DSI burst mode support

  vc4:
   - load tracker to detect underflow

  v3d:
   - v3d v4.2 support

  malidp:
   - initial Mali D71 support in komeda driver

  tfp410:
   - omap related improvement

  omapdrm:
   - drm bridge/panel support
   - drop some omap specific panels

  rcar-du:
   - Display writeback support"

* tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm: (1507 commits)
  drm/msm/a6xx: No zap shader is not an error
  drm/cma-helper: Fix drm_gem_cma_free_object()
  drm: Fix timestamp docs for variable refresh properties.
  drm/komeda: Mark the local functions as static
  drm/komeda: Fixed warning: Function parameter or member not described
  drm/komeda: Expose bus_width to Komeda-CORE
  drm/komeda: Add sysfs attribute: core_id and config_id
  drm: add non-desktop quirk for Valve HMDs
  drm/panfrost: Show stored feature registers
  drm/panfrost: Don't scream about deferred probe
  drm/panfrost: Disable PM on probe failure
  drm/panfrost: Set DMA masks earlier
  drm/panfrost: Add sanity checks to submit IOCTL
  drm/etnaviv: initialize idle mask before querying the HW db
  drm: introduce a capability flag for syncobj timeline support
  drm: report consistent errors when checking syncobj capibility
  drm/nouveau/nouveau: forward error generated while resuming objects tree
  drm/nouveau/fb/ramgk104: fix spelling mistake "sucessfully" -> "successfully"
  drm/nouveau/i2c: Disable i2c bus access after ->fini()
  drm/nouveau: Remove duplicate ACPI_VIDEO_NOTIFY_PROBE definition
  ...
2019-05-08 21:35:19 -07:00
Al Viro 79ea35bc20 don't open-code file_count()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-05-02 02:25:40 -04:00
Chris Wilson 45b9c968c5 drm/i915: Move the engine->destroy() vfunc onto the engine
Make the engine responsible for cleaning itself up!

This removes the i915->gt.cleanup vfunc that has been annoying the
casual reader and myself for the last several years, and helps keep a
future patch to add more cleanup tidy.

v2: Assert that engine->destroy is set after the backend starts
allocating its own state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190501103204.18632-1-chris@chris-wilson.co.uk
2019-05-01 12:13:57 +01:00
Chris Wilson 5e2a0419ef drm/i915: Switch back to an array of logical per-engine HW contexts
We switched to a tree of per-engine HW context to accommodate the
introduction of virtual engines. However, we plan to also support
multiple instances of the same engine within the GEM context, defeating
our use of the engine as a key to looking up the HW context. Just
allocate a logical per-engine instance and always use an index into the
ctx->engines[]. Later on, this ctx->engines[] may be replaced by a user
specified map.

v2: Add for_each_gem_engine() helper to iterator within the engines lock
v3: intel_context_create_request() helper
v4: s/unsigned long/unsigned int/ 4 billion engines is quite enough.
v5: Push iterator locking to caller

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-7-chris@chris-wilson.co.uk
2019-04-26 18:32:11 +01:00
Chris Wilson 11334c6aad drm/i915: Split engine setup/init into two phases
In the next patch, we require the engine vfuncs setup prior to
initialising the pinned kernel contexts, so split the vfunc setup from
the engine initialisation and call it earlier.

v2: s/setup_xcs/setup_common/ for intel_ring_submission_setup()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-6-chris@chris-wilson.co.uk
2019-04-26 18:32:07 +01:00
Chris Wilson 79ffac8599 drm/i915: Invert the GEM wakeref hierarchy
In the current scheme, on submitting a request we take a single global
GEM wakeref, which trickles down to wake up all GT power domains. This
is undesirable as we would like to be able to localise our power
management to the available power domains and to remove the global GEM
operations from the heart of the driver. (The intent there is to push
global GEM decisions to the boundary as used by the GEM user interface.)

Now during request construction, each request is responsible via its
logical context to acquire a wakeref on each power domain it intends to
utilize. Currently, each request takes a wakeref on the engine(s) and
the engines themselves take a chipset wakeref. This gives us a
transition on each engine which we can extend if we want to insert more
powermangement control (such as soft rc6). The global GEM operations
that currently require a struct_mutex are reduced to listening to pm
events from the chipset GT wakeref. As we reduce the struct_mutex
requirement, these listeners should evaporate.

Perhaps the biggest immediate change is that this removes the
struct_mutex requirement around GT power management, allowing us greater
flexibility in request construction. Another important knock-on effect,
is that by tracking engine usage, we can insert a switch back to the
kernel context on that engine immediately, avoiding any extra delay or
inserting global synchronisation barriers. This makes tracking when an
engine and its associated contexts are idle much easier -- important for
when we forgo our assumed execution ordering and need idle barriers to
unpin used contexts. In the process, it means we remove a large chunk of
code whose only purpose was to switch back to the kernel context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk
2019-04-24 22:26:49 +01:00
Chris Wilson 23c3c3d04f drm/i915: Pull the GEM powermangement coupling into its own file
Split out the powermanagement portion (GT wakeref, suspend/resume) of
GEM from i915_gem.c into its own file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-2-chris@chris-wilson.co.uk
2019-04-24 22:25:28 +01:00
Chris Wilson 112ed2d31a drm/i915: Move GraphicsTechnology files under gt/
Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
2019-04-24 21:01:46 +01:00
Chris Wilson 929eec99f5 drm/i915: Avoid use-after-free in reporting create.size
We have to avoid chasing after a userspace race!

<3>[  473.114328] BUG: KASAN: use-after-free in i915_gem_create+0x1d2/0x1f0 [i915]
<3>[  473.114389] Read of size 8 at addr ffff88815bf1d840 by task gem_flink_race/1541

<4>[  473.114464] CPU: 1 PID: 1541 Comm: gem_flink_race Tainted: G     U            5.1.0-rc4-g7d07e025e786-kasan_88+ #1
<4>[  473.114469] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.10 09/29/2016
<4>[  473.114474] Call Trace:
<4>[  473.114488]  dump_stack+0x7c/0xbb
<4>[  473.114612]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.114621]  print_address_description+0x65/0x270
<4>[  473.114728]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.114839]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.114848]  kasan_report+0x149/0x18d
<4>[  473.114962]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.115069]  i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.115176]  ? i915_gem_object_create.part.28+0x4b0/0x4b0 [i915]
<4>[  473.115289]  ? i915_gem_dumb_create+0x1a0/0x1a0 [i915]
<4>[  473.115297]  drm_ioctl_kernel+0x192/0x260
<4>[  473.115306]  ? drm_ioctl_permit+0x280/0x280
<4>[  473.115326]  drm_ioctl+0x67c/0x960
<4>[  473.115438]  ? i915_gem_dumb_create+0x1a0/0x1a0 [i915]
<4>[  473.115448]  ? drm_getstats+0x20/0x20
<4>[  473.115459]  ? __lock_acquire+0xa66/0x3fe0
<4>[  473.115474]  ? _raw_spin_unlock_irqrestore+0x39/0x60
<4>[  473.115485]  ? debug_object_active_state+0x2ea/0x4e0
<4>[  473.115496]  ? debug_show_all_locks+0x2d0/0x2d0
<4>[  473.115513]  do_vfs_ioctl+0x18d/0xfa0
<4>[  473.115522]  ? check_flags.part.27+0x440/0x440
<4>[  473.115532]  ? ioctl_preallocate+0x1a0/0x1a0
<4>[  473.115547]  ? __fget+0x2ac/0x410
<4>[  473.115561]  ? __ia32_sys_dup3+0xb0/0xb0
<4>[  473.115569]  ? rwlock_bug.part.0+0x90/0x90
<4>[  473.115590]  ksys_ioctl+0x35/0x70
<4>[  473.115597]  ? lockdep_hardirqs_off+0x1cb/0x2b0
<4>[  473.115608]  __x64_sys_ioctl+0x6a/0xb0
<4>[  473.115614]  ? lockdep_hardirqs_on+0x342/0x590
<4>[  473.115623]  do_syscall_64+0x97/0x400
<4>[  473.115633]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  473.115641] RIP: 0033:0x7fce590d55d7
<4>[  473.115649] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
<4>[  473.115655] RSP: 002b:00007fce4d525ba8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[  473.115662] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fce590d55d7
<4>[  473.115667] RDX: 00007fce4d525c10 RSI: 00000000c010645b RDI: 0000000000000007
<4>[  473.115672] RBP: 00007fce4d525c10 R08: 00007fce4d526700 R09: 00007fce4d526700
<4>[  473.115677] R10: 0000000000000054 R11: 0000000000000246 R12: 00000000c010645b
<4>[  473.115682] R13: 0000000000000007 R14: 0000000000000000 R15: 00007ffe0e4a7450

<3>[  473.115731] Allocated by task 1541:
<4>[  473.115766]  kmem_cache_alloc+0xce/0x290
<4>[  473.115895]  i915_gem_object_create.part.28+0x1c/0x4b0 [i915]
<4>[  473.116000]  i915_gem_create+0xe3/0x1f0 [i915]
<4>[  473.116008]  drm_ioctl_kernel+0x192/0x260
<4>[  473.116013]  drm_ioctl+0x67c/0x960
<4>[  473.116020]  do_vfs_ioctl+0x18d/0xfa0
<4>[  473.116026]  ksys_ioctl+0x35/0x70
<4>[  473.116032]  __x64_sys_ioctl+0x6a/0xb0
<4>[  473.116038]  do_syscall_64+0x97/0x400
<4>[  473.116044]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

<3>[  473.116071] Freed by task 1542:
<4>[  473.116101]  kmem_cache_free+0xb7/0x2f0
<4>[  473.116205]  __i915_gem_free_objects+0x7d4/0xe10 [i915]
<4>[  473.116311]  i915_gem_create_ioctl+0xaa/0xd0 [i915]
<4>[  473.116318]  drm_ioctl_kernel+0x192/0x260
<4>[  473.116323]  drm_ioctl+0x67c/0x960
<4>[  473.116330]  do_vfs_ioctl+0x18d/0xfa0
<4>[  473.116335]  ksys_ioctl+0x35/0x70
<4>[  473.116341]  __x64_sys_ioctl+0x6a/0xb0
<4>[  473.116347]  do_syscall_64+0x97/0x400
<4>[  473.116354]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Testcase: igt/gem_flink_race/flink_close
Fixes: e163484afa ("drm/i915: Update size upon return from GEM_CREATE")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190417132507.27133-1-chris@chris-wilson.co.uk
(cherry picked from commit 9953402349)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-04-24 09:39:07 +03:00
Chris Wilson 2d6692e642 drm/i915: Start writeback from the shrinker
When we are called to relieve mempressue via the shrinker, the only way
we can make progress is either by discarding unwanted pages (those
objects that userspace has marked MADV_DONTNEED) or by reclaiming the
dirty objects via swap. As we know that is the only way to make further
progress, we can initiate the writeback as we invalidate the objects.
This means the objects we put onto the inactive anon lru list are
already marked for reclaim+writeback and so will trigger a wait upon the
writeback inside direct reclaim, greatly improving the success rate of
direct reclaim on i915 objects.

The corollary is that we may start a slow swap on opportunistic
mempressure from the likes of the compaction + migration kthreads. This
is limited by those threads only being allowed to shrink idle pages, but
also that if we reactivate the page before it is swapped out by gpu
activity, we only page the cost of repinning the page. The cost is most
felt when an object is reused after mempressure, which hopefully
excludes the latency sensitive tasks (as we are just extending the
impact of swap thrashing to them).

Apparently this is not the first time we've had this idea. Back in
commit 5537252b6b ("drm/i915: Invalidate our pages under memory
pressure") we wanted to start writeback but settled on invalidate after
Hugh Dickins warned us about a possibility of a deadlock within shmemfs
if we started writeback from shrink_slab. Looking at the callchain,
using writeback from i915_gem_shrink should be equivalent to the pageout
also employed by shrink_slab, i.e. it should not be any riskier afaict.

v2: Leave mmapings intact. At this point, the only mmapings of our
objects will be via CPU mmaps on the shmemfs filp, which are
out-of-scope for our LRU tracking. Instead leave those pages to the
inactive anon LRU page list for aging and pageout as normal.

v3: Be selective on which paths trigger writeback, in particular
excluding paths shrinking just to reclaim vm space (e.g. mmap, vmap
reapers) and avoid starting writeback on the entire process space from
within the pm freezer.

References: https://bugs.freedesktop.org/show_bug.cgi?id=108686
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michal Hocko <mhocko@suse.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20190420115539.29081-1-chris@chris-wilson.co.uk
2019-04-20 15:06:31 +01:00
Chris Wilson 9953402349 drm/i915: Avoid use-after-free in reporting create.size
We have to avoid chasing after a userspace race!

<3>[  473.114328] BUG: KASAN: use-after-free in i915_gem_create+0x1d2/0x1f0 [i915]
<3>[  473.114389] Read of size 8 at addr ffff88815bf1d840 by task gem_flink_race/1541

<4>[  473.114464] CPU: 1 PID: 1541 Comm: gem_flink_race Tainted: G     U            5.1.0-rc4-g7d07e025e786-kasan_88+ #1
<4>[  473.114469] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.10 09/29/2016
<4>[  473.114474] Call Trace:
<4>[  473.114488]  dump_stack+0x7c/0xbb
<4>[  473.114612]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.114621]  print_address_description+0x65/0x270
<4>[  473.114728]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.114839]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.114848]  kasan_report+0x149/0x18d
<4>[  473.114962]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.115069]  i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.115176]  ? i915_gem_object_create.part.28+0x4b0/0x4b0 [i915]
<4>[  473.115289]  ? i915_gem_dumb_create+0x1a0/0x1a0 [i915]
<4>[  473.115297]  drm_ioctl_kernel+0x192/0x260
<4>[  473.115306]  ? drm_ioctl_permit+0x280/0x280
<4>[  473.115326]  drm_ioctl+0x67c/0x960
<4>[  473.115438]  ? i915_gem_dumb_create+0x1a0/0x1a0 [i915]
<4>[  473.115448]  ? drm_getstats+0x20/0x20
<4>[  473.115459]  ? __lock_acquire+0xa66/0x3fe0
<4>[  473.115474]  ? _raw_spin_unlock_irqrestore+0x39/0x60
<4>[  473.115485]  ? debug_object_active_state+0x2ea/0x4e0
<4>[  473.115496]  ? debug_show_all_locks+0x2d0/0x2d0
<4>[  473.115513]  do_vfs_ioctl+0x18d/0xfa0
<4>[  473.115522]  ? check_flags.part.27+0x440/0x440
<4>[  473.115532]  ? ioctl_preallocate+0x1a0/0x1a0
<4>[  473.115547]  ? __fget+0x2ac/0x410
<4>[  473.115561]  ? __ia32_sys_dup3+0xb0/0xb0
<4>[  473.115569]  ? rwlock_bug.part.0+0x90/0x90
<4>[  473.115590]  ksys_ioctl+0x35/0x70
<4>[  473.115597]  ? lockdep_hardirqs_off+0x1cb/0x2b0
<4>[  473.115608]  __x64_sys_ioctl+0x6a/0xb0
<4>[  473.115614]  ? lockdep_hardirqs_on+0x342/0x590
<4>[  473.115623]  do_syscall_64+0x97/0x400
<4>[  473.115633]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  473.115641] RIP: 0033:0x7fce590d55d7
<4>[  473.115649] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
<4>[  473.115655] RSP: 002b:00007fce4d525ba8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[  473.115662] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fce590d55d7
<4>[  473.115667] RDX: 00007fce4d525c10 RSI: 00000000c010645b RDI: 0000000000000007
<4>[  473.115672] RBP: 00007fce4d525c10 R08: 00007fce4d526700 R09: 00007fce4d526700
<4>[  473.115677] R10: 0000000000000054 R11: 0000000000000246 R12: 00000000c010645b
<4>[  473.115682] R13: 0000000000000007 R14: 0000000000000000 R15: 00007ffe0e4a7450

<3>[  473.115731] Allocated by task 1541:
<4>[  473.115766]  kmem_cache_alloc+0xce/0x290
<4>[  473.115895]  i915_gem_object_create.part.28+0x1c/0x4b0 [i915]
<4>[  473.116000]  i915_gem_create+0xe3/0x1f0 [i915]
<4>[  473.116008]  drm_ioctl_kernel+0x192/0x260
<4>[  473.116013]  drm_ioctl+0x67c/0x960
<4>[  473.116020]  do_vfs_ioctl+0x18d/0xfa0
<4>[  473.116026]  ksys_ioctl+0x35/0x70
<4>[  473.116032]  __x64_sys_ioctl+0x6a/0xb0
<4>[  473.116038]  do_syscall_64+0x97/0x400
<4>[  473.116044]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

<3>[  473.116071] Freed by task 1542:
<4>[  473.116101]  kmem_cache_free+0xb7/0x2f0
<4>[  473.116205]  __i915_gem_free_objects+0x7d4/0xe10 [i915]
<4>[  473.116311]  i915_gem_create_ioctl+0xaa/0xd0 [i915]
<4>[  473.116318]  drm_ioctl_kernel+0x192/0x260
<4>[  473.116323]  drm_ioctl+0x67c/0x960
<4>[  473.116330]  do_vfs_ioctl+0x18d/0xfa0
<4>[  473.116335]  ksys_ioctl+0x35/0x70
<4>[  473.116341]  __x64_sys_ioctl+0x6a/0xb0
<4>[  473.116347]  do_syscall_64+0x97/0x400
<4>[  473.116354]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Testcase: igt/gem_flink_race/flink_close
Fixes: e163484afa ("drm/i915: Update size upon return from GEM_CREATE")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190417132507.27133-1-chris@chris-wilson.co.uk
2019-04-17 14:54:25 +01:00
Chris Wilson 254e11864a drm/i915: Verify the engine workarounds stick on application
Read the engine workarounds back using the GPU after loading the initial
context state to verify that we are setting them correctly, and bail if
it fails.

v2: Break out the verification into its own loop

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190417075657.19456-3-chris@chris-wilson.co.uk
2019-04-17 10:58:20 +01:00
Chris Wilson 9726920b7e drm/i915: Only reset the pinned kernel contexts on resume
On resume, we know that the only pinned contexts in danger of seeing
corruption are the kernel context, and so we do not need to walk the
list of all GEM contexts as we tracked them on each engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190410190120.830-1-chris@chris-wilson.co.uk
2019-04-10 21:18:11 +01:00
Jani Nikula 696173b064 drm/i915: extract intel_pm.h from intel_drv.h
It used to be handy that we only had a couple of headers, but over time
intel_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.

Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
only where needed, and sort the modified include directives while at it
and as needed.

No functional changes.

v2: gen6_rps_reset_ei() is in i915_irq.c not intel_pm.c.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/adc6463b95eef3440fba9826793f7d1c5f3b0b4a.1554461791.git.jani.nikula@intel.com
2019-04-08 09:52:43 +03:00
Chris Wilson 6960d9cfc7 drm/i915: Be precise in types for i915_gem_busy
Mixing u8 and -1u together leads to zero-extended integer expansion, and
comparing 0x000000ff against 0xffffffff, causing us to report a mixed
uabi-class request as not busy.

The input flag is a u8, and we want to generate a u32 uABI response,
mark our functions so.

Fixes: c8b502422b ("drm/i915: Remove last traces of exec-id (GEM_BUSY)")
Testcase: igt/gem_exec_balance/busy
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190404101914.7231-1-chris@chris-wilson.co.uk
2019-04-04 16:21:00 +01:00
Chris Wilson b01720bfcd drm/i915: Prefault before locking pages in shmem_pwrite
If the user passes in a pointer to a GGTT mmaping of the same buffer
being written to, we can hit a deadlock in acquiring the shmemfs page
(once as the write destination and then as the read source).

[<0>] io_schedule+0xd/0x30
[<0>] __lock_page+0x105/0x1b0
[<0>] find_lock_entry+0x55/0x90
[<0>] shmem_getpage_gfp+0xbb/0x800
[<0>] shmem_read_mapping_page_gfp+0x2d/0x50
[<0>] shmem_get_pages+0x158/0x5d0 [i915]
[<0>] ____i915_gem_object_get_pages+0x17/0x90 [i915]
[<0>] __i915_gem_object_get_pages+0x57/0x70 [i915]
[<0>] i915_gem_fault+0x1b4/0x5c0 [i915]
[<0>] __do_fault+0x2d/0x80
[<0>] __handle_mm_fault+0xad4/0xfb0
[<0>] handle_mm_fault+0xe6/0x1f0
[<0>] __do_page_fault+0x18f/0x3f0
[<0>] page_fault+0x1b/0x20
[<0>] copy_user_enhanced_fast_string+0x7/0x10
[<0>] _copy_from_user+0x37/0x60
[<0>] shmem_pwrite+0xf0/0x160 [i915]
[<0>] i915_gem_pwrite_ioctl+0x14e/0x520 [i915]
[<0>] drm_ioctl_kernel+0x81/0xd0
[<0>] drm_ioctl+0x1a7/0x310
[<0>] do_vfs_ioctl+0x88/0x5d0
[<0>] ksys_ioctl+0x35/0x70
[<0>] __x64_sys_ioctl+0x11/0x20
[<0>] do_syscall_64+0x39/0xe0
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

We can reduce (but not eliminate!) the chance of this happening by
faulting the user_data before we take the page lock in
pagecache_write_begin(). One way to eliminate the potential recursion
here is by disabling pagefaults for the copy, and handling the fallback
to use an alternative method -- so convert to use kmap_atomic (which
should disable preemption and pagefaulting for the copy) and report
ENODEV instead of EFAULT so that our caller tries again with a different
copy mechanism -- we already check that the page should have been
faultable so a false negative should be rare.

Testcase: igt/gem_pwrite/self
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190401133909.31203-1-chris@chris-wilson.co.uk
2019-04-02 14:28:10 +01:00
Chris Wilson ee8efa8079 drm/i915: Check domains for userptr on release
When we return pages to the system, we release control over them and
should defensively return them to the CPU write domain so that we catch
any external writes on reacquiring them (e.g. to transparently
swapout/swapin). While we did this defensive clflushing for ordinary
shmem pages, it was forgotten for userptr. Fortunately, userptr objects
are normally cache coherent and so oblivious to the forgotten domain
tracking.

References: a679f58d05 ("drm/i915: Flush pages on acquisition")
References: 754a254427 ("drm/i915: Skip object locking around a no-op set-domain ioctl")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190331094620.15185-1-chris@chris-wilson.co.uk
2019-03-31 12:46:52 +01:00
Michał Winiarski e163484afa drm/i915: Update size upon return from GEM_CREATE
Since GEM_CREATE is trying to outsmart the user by rounding up unaligned
objects, we used to update the size returned to userspace.
This update seems to have been lost throughout the history.

v2: Use round_up(), reorder locals (Chris)

References: ff72145bad ("drm: dumb scanout create/mmap for intel/radeon (v3)")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190326170218.13255-1-michal.winiarski@intel.com
2019-03-26 20:24:44 +00:00
Chris Wilson dd19f6bf92 drm/i915: Remove defunct intel_suspend_gt_powersave()
Since commit b7137e0cf1 ("drm/i915: Defer enabling rc6 til after we
submit the first batch/context"), intel_suspend_gt_powersave() has been
a no-op. As we still do not need to do anything explicitly on suspend
(we do everything required on idling), remove the defunct function.

References: b7137e0cf1 ("drm/i915: Defer enabling rc6 til after we submit the first batch/context")
Suggested-by: "Hiatt, Don" <don.hiatt@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190323214009.23294-1-chris@chris-wilson.co.uk
2019-03-24 21:29:44 +00:00
Sujaritha Sundaresan b9d52d381e drm/i915/guc: GuC suspend path cleanup
Adding a call to intel_uc_suspend in i915_gem_suspend, which
is a common point for the suspend/resume and hibernate paths.
This fixes an unbalanced call that causes issues with the CTB
register/deregister.

v2: Making the call unconditional (Daniele)
	Moving the call to after the GEM_BUG_ON (Chris)

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321203804.6845-1-sujaritha.sundaresan@intel.com
2019-03-22 08:58:07 +00:00
Chris Wilson 754a254427 drm/i915: Skip object locking around a no-op set-domain ioctl
If we are already in the desired write domain of a set-domain ioctl,
then there is nothing for us to do and we can quickly return back to
userspace, avoiding any lock contention. By recognising that the
write_domain is always a subset of the read_domains, and excluding the
no-op case of requiring 0 read_domains in the ioctl, we can infer if the
current write_domain matches the target read_domains, there is nothing
for us to do.

Secondary aspect of this is that we undo the arbitrary fetching and
potential flushing of all pages for a set-domain(.write=CPU) call on a
fresh object -- which was introduced simply because we do the get-pages
before taking the struct_mutex.

References: 40e62d5d6b ("drm/i915: Acquire the backing storage outside of struct_mutex in set-domain")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321161908.8007-2-chris@chris-wilson.co.uk
2019-03-21 17:28:52 +00:00
Chris Wilson a679f58d05 drm/i915: Flush pages on acquisition
When we return pages to the system, we ensure that they are marked as
being in the CPU domain since any external access is uncontrolled and we
must assume the worst. This means that we need to always flush the pages
on acquisition if we need to use them on the GPU, and from the beginning
have used set-domain. Set-domain is overkill for the purpose as it is a
general synchronisation barrier, but our intent is to only flush the
pages being swapped in. If we move that flush into the pages acquisition
phase, we know then that when we have obj->mm.pages, they are coherent
with the GPU and need only maintain that status without resorting to
heavy handed use of set-domain.

The principle knock-on effect for userspace is through mmap-gtt
pagefaulting. Our uAPI has always implied that the GTT mmap was async
(especially as when any pagefault occurs is unpredicatable to userspace)
and so userspace had to apply explicit domain control itself
(set-domain). However, swapping is transparent to the kernel, and so on
first fault we need to acquire the pages and make them coherent for
access through the GTT. Our use of set-domain here leaks into the uABI
that the first pagefault was synchronous. This is unintentional and
baring a few igt should be unoticed, nevertheless we bump the uABI
version for mmap-gtt to reflect the change in behaviour.

Another implication of the change is that gem_create() is presumed to
create an object that is coherent with the CPU and is in the CPU write
domain, so a set-domain(CPU) following a gem_create() would be a minor
operation that merely checked whether we could allocate all pages for
the object. On applying this change, a set-domain(CPU) causes a clflush
as we acquire the pages. This will have a small impact on mesa as we move
the clflush here on !llc from execbuf time to create, but that should
have minimal performance impact as the same clflush exists but is now
done early and because of the clflush issue, userspace recycles bo and
so should resist allocating fresh objects.

Internally, the presumption that objects are created in the CPU
write-domain and remain so through writes to obj->mm.mapping is more
prevalent than I expected; but easy enough to catch and apply a manual
flush.

For the future, we should push the page flush from the central
set_pages() into the callers so that we can more finely control when it
is applied, but for now doing it one location is easier to validate, at
the cost of sometimes flushing when there is no need.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321161908.8007-1-chris@chris-wilson.co.uk
2019-03-21 17:28:12 +00:00
Daniele Ceraolo Spurio 3ceea6a1b4 drm/i915: use intel_uncore for all forcewake get/put
Now that the internal code all works on intel_uncore, flip the
external-facing interface.

v2: fix GVT.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190319183543.13679-4-daniele.ceraolospurio@intel.com
2019-03-20 21:12:31 +00:00
Andy Shevchenko 6e514e3717 drm/i915: Switch to bitmap_zalloc()
Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that it returns pointer of bitmap type instead of opaque void *.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190304092908.57382-2-andriy.shevchenko@linux.intel.com
2019-03-20 17:50:35 +00:00
Chris Wilson 000c4f90e3 drm/i915: Sanity check mmap length against object size
We assumed that vm_mmap() would reject an attempt to mmap past the end of
the filp (our object), but we were wrong.

Applications that tried to use the mmap beyond the end of the object
would be greeted by a SIGBUS. After this patch, those applications will
be told about the error on creating the mmap, rather than at a random
moment on later access.

Reported-by: Antonio Argenziano <antonio.argenziano@intel.com>
Testcase: igt/gem_mmap/bad-size
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314075829.16838-1-chris@chris-wilson.co.uk
(cherry picked from commit 794a11cb67)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-03-18 13:59:42 -07:00
Chris Wilson 794a11cb67 drm/i915: Sanity check mmap length against object size
We assumed that vm_mmap() would reject an attempt to mmap past the end of
the filp (our object), but we were wrong.

Applications that tried to use the mmap beyond the end of the object
would be greeted by a SIGBUS. After this patch, those applications will
be told about the error on creating the mmap, rather than at a random
moment on later access.

Reported-by: Antonio Argenziano <antonio.argenziano@intel.com>
Testcase: igt/gem_mmap/bad-size
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314075829.16838-1-chris@chris-wilson.co.uk
2019-03-18 12:58:15 +00:00
Chris Wilson 831ebf18d6 drm/i915: Suppress the "Failed to idle" warning for gem_eio
It is debatable whether having an error message on suspend for forcibly
cancelling outstanding work is worthwhile. We want to know if it occurs
in the wild (as we will then have to reconsider the approach!), but
equally is not fatal across suspend, as upon resume we automatically
clear the wedged status.

However, CI does trigger this scenario with gem_eio/suspend; as there we
are intentionally wedging the device upon suspend. The dilemma is how
not to trigger a failure report for the dmesg spam, for which the
quickest response is to suppress the warning in the kernel. I'd rather
mark it as accepted in gem_eio, but for now detecting when gem_eio is
playing games and cancelling the warning for that case seems a barely
acceptable hack.

Testcase: igt/gem_eio/suspend
Reference: 5861b013e2 ("drm/i915: Do a synchronous switch-to-kernel-context on idling")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308134512.19115-1-chris@chris-wilson.co.uk
2019-03-08 19:11:29 +00:00
Tvrtko Ursulin ca22f32a62 drm/i915: Relax mmap VMA check
Legacy behaviour was to allow non-page-aligned mmap requests, as does the
linux mmap(2) implementation by virtue of automatically rounding up for
the caller.

To avoid breaking legacy userspace relax the newly introduced fix.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 5c4604e757 ("drm/i915: Prevent a race during I915_GEM_MMAP ioctl with WC set")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Adam Zabrocki <adamza@microsoft.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.0+
Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305110409.28633-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit a90e1948ef)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-03-08 09:52:55 -08:00
Chris Wilson 0881954965 drm/i915: Introduce intel_context.pin_mutex for pin management
Introduce a mutex to start locking the HW contexts independently of
struct_mutex, with a view to reducing the coarse struct_mutex. The
intel_context.pin_mutex is used to guard the transition to and from being
pinned on the gpu, and so is required before starting to build any
request. The intel_context will then remain pinned until the request
completes, but the mutex can be released immediately unpin completion of
pinning the context.

A slight variant of the above is used by per-context sseu that wants to
inspect the pinned status of the context, and requires that it remains
stable (either !pinned or pinned) across its operation. By using the
pin_mutex to serialise operations while pin_count==0, we can take that
pin_mutex for stabilise the boolean pin status.

v2: for Tvrtko!
* Improved commit message.
* Dropped _gpu suffix from gen8_modify_rpcs_gpu.
v3: Repair the locking for sseu selftests

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-7-chris@chris-wilson.co.uk
2019-03-08 14:04:19 +00:00
Chris Wilson c4d52feb2c drm/i915: Move over to intel_context_lookup()
In preparation for an ever growing number of engines and so ever
increasing static array of HW contexts within the GEM context, move the
array over to an rbtree, allocated upon first use.

Unfortunately, this imposes an rbtree lookup at a few frequent callsites,
but we should be able to mitigate those by moving over to using the HW
context as our primary type and so only incur the lookup on the boundary
with the user GEM context and engines.

v2: Check for no HW context in guc_stage_desc_init

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-4-chris@chris-wilson.co.uk
2019-03-08 13:59:52 +00:00
Chris Wilson 7d6ce55887 drm/i915: Remove has-kernel-context
We can no longer assume execution ordering, and in particular we cannot
assume which context will execute last. One side-effect of this is that
we cannot determine if the kernel-context is resident on the GPU, so
remove the routines that claimed to do so.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308093657.8640-4-chris@chris-wilson.co.uk
2019-03-08 10:57:12 +00:00
Chris Wilson c6eeb4797e drm/i915: Reduce presumption of request ordering for barriers
Currently we assume that we know the order in which requests run and so
can determine if we need to reissue a switch-to-kernel-context prior to
idling. That assumption does not hold for the future, so instead of
tracking which barriers have been used, simply determine if we have ever
switched away from the kernel context by using the engine and before
idling ensure that all engines that have been used since the last idle
are synchronously switched back to the kernel context for safety (and
else of shrinking memory while idle).

v2: Use intel_engine_mask_t and ALL_ENGINES

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308093657.8640-3-chris@chris-wilson.co.uk
2019-03-08 10:57:08 +00:00
Chris Wilson 604c37d766 drm/i915: Refactor common code to load initial power context
We load a context (the kernel context) on both module load and resume in
order to initialise some logical state onto the GPU. We can use the same
routine for both operations, which will become more useful as we
refactor rc6/rps enabling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308093657.8640-2-chris@chris-wilson.co.uk
2019-03-08 10:57:07 +00:00
Chris Wilson 5861b013e2 drm/i915: Do a synchronous switch-to-kernel-context on idling
When the system idles, we switch to the kernel context as a defensive
measure (no users are harmed if the kernel context is lost). Currently,
we issue a switch to kernel context and then come back later to see if
the kernel context is still current and the system is idle. However,
if we are no longer privy to the runqueue ordering, then we have to
relax our assumptions about the logical state of the GPU and the only
way to ensure that the kernel context is currently loaded is by issuing
a request to run after all others, and wait for it to complete all while
preventing anyone else from issuing their own requests.

v2: Pull wedging into switch_to_kernel_context_sync() but only after
waiting (though only for the same short delay) for the active context to
finish.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308093657.8640-1-chris@chris-wilson.co.uk
2019-03-08 10:57:05 +00:00
Chris Wilson 50b022af5d drm/i915: Force GPU idle on suspend
To facilitate the next patch to allow preemptible kernels not to incur
the wrath of hangcheck, we need to ensure that we can still suspend and
shutdown. That is we will not be able to rely on hangcheck to terminate
a blocking kernel and instead must manually do so ourselves. The
advantage is that we can apply more pressure!

As we now perform a GPU reset to clean up any residual kernels, we leave
the GPU in an unknown state and in particular can not talk to the GuC
before we reinitialise it following resume. For example, we no longer
need to tell the GuC to suspend itself, as it is already reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190307104530.21745-2-chris@chris-wilson.co.uk
2019-03-07 18:09:27 +00:00
Chris Wilson 3d60624916 drm/i915: Make I915_GEM_IDLE_TIMEOUT into a macro
Currently we use HZ/5 for detecting a dead gpu on startup, and we will
wish to reuse this value for detecting a dead gpu on suspend, so convert
it into a macro for later convenience.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190307104530.21745-1-chris@chris-wilson.co.uk
2019-03-07 18:09:26 +00:00
Tvrtko Ursulin a90e1948ef drm/i915: Relax mmap VMA check
Legacy behaviour was to allow non-page-aligned mmap requests, as does the
linux mmap(2) implementation by virtue of automatically rounding up for
the caller.

To avoid breaking legacy userspace relax the newly introduced fix.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 5c4604e757 ("drm/i915: Prevent a race during I915_GEM_MMAP ioctl with WC set")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Adam Zabrocki <adamza@microsoft.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.0+
Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305110409.28633-1-tvrtko.ursulin@linux.intel.com
2019-03-06 11:37:01 +00:00
Chris Wilson cf4331dd39 drm/i915: Move find_active_request() to the engine
To find the active request, we need only search along the individual
engine for the right request. This does not require touching any global
GEM state, so move it into the engine compartment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-3-chris@chris-wilson.co.uk
2019-03-05 18:20:06 +00:00
Chris Wilson c8b502422b drm/i915: Remove last traces of exec-id (GEM_BUSY)
As we allow per-context engine allows the legacy concept of
I915_EXEC_RING no longer applies universally. We are still exposing the
unrelated exec-id in GEM_BUSY, so transition this ioctl (once more
slightly changing its ABI, but no one cares) over to only reporting the
uabi-class (not instance as we can not foreseeably fit those into the
small bitmask).

The only user of the extended ring information from GEM_BUSY is ddx/sna,
which tries to use the non-rcs business information to guide which
engine to use for subsequent operations on foreign bo. All that matters
for it is the decision between rcs and !rcs, so it is unaffected by the
change in higher bits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305162643.20243-1-chris@chris-wilson.co.uk
2019-03-05 16:40:14 +00:00
Chris Wilson d9948a10b9 drm/i915: Remove second level open-coded rcu work
We currently use a worker queued from an rcu callback to determine when
a how grace period has elapsed while we remained idle. We use this idle
delay to infer that we will be idle for a while and this is a suitable
point at which we can trim our global memory caches.

Since we wrote that, this mechanism now exists as rcu_work, and having
converted the idle shrinkers over to using that, we can remove our own
variant.

v2: Say goodbye to gt.epoch as well.
v3: Remove the misplaced and redundant comment before parking globals

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-3-chris@chris-wilson.co.uk
2019-02-28 11:08:09 +00:00
Chris Wilson 13f1bfd3b3 drm/i915: Make object/vma allocation caches global
As our allocations are not device specific, we can move our slab caches
to a global scope.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-2-chris@chris-wilson.co.uk
2019-02-28 11:08:02 +00:00
Chris Wilson 32eb6bcfdd drm/i915: Make request allocation caches global
As kmem_caches share the same properties (size, allocation/free behaviour)
for all potential devices, we can use global caches. While this
potential has worse fragmentation behaviour (one can argue that
different devices would have different activity lifetimes, but you can
also argue that activity is temporal across the system) it is the
default behaviour of the system at large to amalgamate matching caches.

The benefit for us is much reduced pointer dancing along the frequent
allocation paths.

v2: Defer shrinking until after a global grace period for futureproofing
multiple consumers of the slab caches, similar to the current strategy
for avoiding shrinking too early.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-1-chris@chris-wilson.co.uk
2019-02-28 11:07:56 +00:00
Chris Wilson 2d5eaad007 drm/i915: Compute the global scheduler caps
Do a pass over all the engines upon starting to determine the global
scheduler capability flags (those that are agreed upon by all).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226102404.29153-7-chris@chris-wilson.co.uk
2019-02-28 08:58:37 +00:00
Chengguang Xu 772b5408e3 drm/i915: remove redundant likely/unlikely annotation
unlikely has already included in IS_ERR(), so just
remove redundant likely/unlikely annotation.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190221020819.21832-1-cgxu519@gmx.com
2019-02-21 16:01:00 +00:00
Chris Wilson 43a8f684b6 drm/i915: Reorder struct_mutex-vs-reset_lock in i915_gem_fault()
Annoyingly, struct_mutex was not entirely eliminated from the reset
pathway; for reasons of its own, intel_display_resume() requires
struct_mutex to prepare the planes it already captured. To avoid the
immediate problem of a deadlock between the struct_mutex and the reset
srcu, we have to acquire the reset_lock before struct_mutex in
i915_gem_fault(). Now any wait underneath struct_mutex will result us in
having to forcibly reset all inflight rendering, less than ideal, but
better than a deadlock (and will do for the short term).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190221102924.13442-1-chris@chris-wilson.co.uk
2019-02-21 14:49:19 +00:00
Chris Wilson c41166f9a1 drm/i915: Beware temporary wedging when determining -EIO
At a few points in our uABI, we check to see if the driver is wedged and
report -EIO back to the user in that case. However, as we perform the
check and reset asynchronously (where once before they were both
serialised by the struct_mutex), we may instead see the temporary wedging
used to cancel inflight rendering to avoid a deadlock during reset
(caused by either us timing out in our reset handler,
i915_wedge_on_timeout or with malice aforethought in intel_reset_prepare
for a stuck modeset). If we suspect this is the case, that is we see a
wedged driver *and* reset in progress, then wait until the reset is
resolved before reporting upon the wedged status.

v2: might_sleep() (Mika)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109580
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190220145637.23503-1-chris@chris-wilson.co.uk
2019-02-20 16:31:08 +00:00
Joonas Lahtinen d0781a89c0 Merge drm/drm-next into drm-intel-next-queued
Doing a backmerge to be able to merge topic/mei-hdcp-2019-02-19 PR.

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-02-20 11:04:08 +02:00
Chris Wilson 62eb3c24b3 drm/i915: Apply rps waitboosting for dma_fence_wait_timeout()
As time goes by, usage of generic ioctls such as drm_syncobj and
sync_file are on the increase bypassing i915-specific ioctls like
GEM_WAIT. Currently, we only apply waitboosting to our driver ioctls as
we track the file/client and account the waitboosting to them. However,
since commit 7b92c1bd05 ("drm/i915: Avoid keeping waitboost active for
signaling threads"), we no longer have been applying the client
ratelimiting on waitboosts and so that information has only been used
for debug tracking.

Push the application of waitboosting down to the common
i915_request_wait, and apply it to all foreign fence waits as well.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190213092504.25709-1-chris@chris-wilson.co.uk
2019-02-13 12:16:39 +00:00
Chris Wilson aeaaa55c73 drm/i915: Recursive i915_reset_trylock() verboten
We cannot nest i915_reset_trylock() as the inner may wait for the
I915_RESET_BACKOFF which in turn is waiting upon sync_srcu who is
waiting for our outermost lock. As we take the reset srcu around the
fence update, we have to defer taking it in i915_gem_fault() until after
we acquire the pin on the fence to avoid nesting. This is a little ugly,
but still works. If a reset occurs between i915_vma_pin_fence() and the
second reset lock, the reset will restore the fence register back to the
pinned value before the reset lock allows us to proceed (our mmap won't
be revoked as we haven't yet marked it as being a userfault as that
requires us to hold the reset lock), so the pagefault is still
serialised with the revocation in reset.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109605
Fixes: 2caffbf117 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190212130831.14425-1-chris@chris-wilson.co.uk
2019-02-12 16:41:34 +00:00
Dave Airlie 5ea3998d56 UAPI Changes:
- Expose RPCS (SSEU) configuration to userspace for Ice Lake
 in order to allow userspace to reconfigure the subslice config
 per context basis. (Tvrtko, Lionel)
 
 Driver Changes:
 
 - Execbuf and preemption improvements including selftests (Chris)
 - Rename HAS_GMCH_DISPLAY/HAS_GMCH (Rodrigo)
 - Debugfs error handling fix for robustness (Greg)
 - Improve reg_rw traces (Ville)
 - Push clear_intel_crtc_state onto the heap (Chris)
 - Watermark fixes for Ice Lake (Ville)
 - Fix enable count array size and bounds checking (Tvrtko)
 - MST Fixes (Lyude)
 - Prevent race and handle error on I915_GEM_MMAP (Joonas)
 - Initial rework for an full atomic gamma mode (Ville)
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcXJw7AAoJEPpiX2QO6xPKWuQH/i3o86TqLheae2vdi8X/yp92
 XeZbmNfaL7Gui6ggZpRMREjIIiDZ0nFga5INcVtrak6JRjeefb+wQsy38tdb98ne
 EPn6eif0nO/uxc7BsdW2ndIDBSWC+ZW7gt4rzY5zLG/WCmX1qfOOS9GdUpXPF3jY
 sB8Af6lsGSxrCFkBzUB9Z7AyoP/Yq+YeamFOtXLufhtQb4GL3djjdugaUBBfQmNn
 FvZTxhhsXn3HU9Jy1AjRFALhqXvTjZt1L5myav4sCJ2qlsr2dL9y3c7evbO5jrZY
 4TSlKUi5XwfW1RlHTQN43fTq7yWFwoF1FsR4LaKS0a8bKIXiBLwset7YA1m6wmY=
 =Woo8
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2019-02-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

UAPI Changes:

- Expose RPCS (SSEU) configuration to userspace for Ice Lake
in order to allow userspace to reconfigure the subslice config
per context basis. (Tvrtko, Lionel)

Driver Changes:

- Execbuf and preemption improvements including selftests (Chris)
- Rename HAS_GMCH_DISPLAY/HAS_GMCH (Rodrigo)
- Debugfs error handling fix for robustness (Greg)
- Improve reg_rw traces (Ville)
- Push clear_intel_crtc_state onto the heap (Chris)
- Watermark fixes for Ice Lake (Ville)
- Fix enable count array size and bounds checking (Tvrtko)
- MST Fixes (Lyude)
- Prevent race and handle error on I915_GEM_MMAP (Joonas)
- Initial rework for an full atomic gamma mode (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208165000.GA30314@intel.com
2019-02-11 13:41:59 +10:00
Chris Wilson 2caffbf117 drm/i915: Revoke mmaps and prevent access to fence registers across reset
Previously, we were able to rely on the recursive properties of
struct_mutex to allow us to serialise revoking mmaps and reacquiring the
FENCE registers with them being clobbered over a global device reset.
I then proceeded to throw out the baby with the bath water in order to
pursue a struct_mutex-less reset.

Perusing LWN for alternative strategies, the dilemma on how to serialise
access to a global resource on one side was answered by
https://lwn.net/Articles/202847/ -- Sleepable RCU:

    1  int readside(void) {
    2      int idx;
    3      rcu_read_lock();
    4	   if (nomoresrcu) {
    5          rcu_read_unlock();
    6	       return -EINVAL;
    7      }
    8	   idx = srcu_read_lock(&ss);
    9	   rcu_read_unlock();
    10	   /* SRCU read-side critical section. */
    11	   srcu_read_unlock(&ss, idx);
    12	   return 0;
    13 }
    14
    15 void cleanup(void)
    16 {
    17     nomoresrcu = 1;
    18     synchronize_rcu();
    19     synchronize_srcu(&ss);
    20     cleanup_srcu_struct(&ss);
    21 }

No more worrying about stop_machine, just an uber-complex mutex,
optimised for reads, with the overhead pushed to the rare reset path.

However, we do run the risk of a deadlock as we allocate underneath the
SRCU read lock, and the allocation may require a GPU reset, causing a
dependency cycle via the in-flight requests. We resolve that by declaring
the driver wedged and cancelling all in-flight rendering.

v2: Use expedited rcu barriers to match our earlier timing
characteristics.
v3: Try to annotate locking contexts for sparse
v4: Reduce selftest lock duration to avoid a reset deadlock with fences
v5: s/srcu/reset_backoff_srcu/
v6: Remove more stale comments

Testcase: igt/gem_mmap_gtt/hang
Fixes: eb8d0f5af4 ("drm/i915: Remove GPU reset dependence on struct_mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208153708.20023-2-chris@chris-wilson.co.uk
2019-02-08 16:47:32 +00:00
Joonas Lahtinen ebfb697780 drm/i915: Handle vm_mmap error during I915_GEM_MMAP ioctl with WC set
Add err goto label and use it when VMA can't be established or changes
underneath.

v2:
- Dropping Fixes: as it's indeed impossible to race an object to the
  error address. (Chris)
v3:
- Use IS_ERR_VALUE (Chris)

Reported-by: Adam Zabrocki <adamza@microsoft.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Adam Zabrocki <adamza@microsoft.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v2
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190207085454.10598-2-joonas.lahtinen@linux.intel.com
2019-02-07 14:10:36 +02:00
Joonas Lahtinen 5c4604e757 drm/i915: Prevent a race during I915_GEM_MMAP ioctl with WC set
Make sure the underlying VMA in the process address space is the
same as it was during vm_mmap to avoid applying WC to wrong VMA.

A more long-term solution would be to have vm_mmap_locked variant
in linux/mmap.h for when caller wants to hold mmap_sem for an
extended duration.

v2:
- Refactor the compare function

Fixes: 1816f92363 ("drm/i915: Support creation of unbound wc user mappings for objects")
Reported-by: Adam Zabrocki <adamza@microsoft.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.0+
Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Adam Zabrocki <adamza@microsoft.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20190207085454.10598-1-joonas.lahtinen@linux.intel.com
2019-02-07 14:10:35 +02:00
Chris Wilson 21950ee7cc drm/i915: Pull i915_gem_active into the i915_active family
Looking forward, we need to break the struct_mutex dependency on
i915_gem_active. In the meantime, external use of i915_gem_active is
quite beguiling, little do new users suspect that it implies a barrier
as each request it tracks must be ordered wrt the previous one. As one
of many, it can be used to track activity across multiple timelines, a
shared fence, which fits our unordered request submission much better. We
need to steer external users away from the singular, exclusive fence
imposed by i915_gem_active to i915_active instead. As part of that
process, we move i915_gem_active out of i915_request.c into
i915_active.c to start separating the two concepts, and rename it to
i915_active_request (both to tie it to the concept of tracking just one
request, and to give it a longer, less appealing name).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk
2019-02-05 17:20:11 +00:00
Dave Airlie 2cc3b81dfa - Make background color and LUT more robust (Matt)
- Icelake display fixes (Ville, Imre)
 - Workarounds fixes and reorg (Tvrtko, Talha)
 - Enable fastboot by default on VLV and CHV (Hans)
 - Add another PCI ID for Coffee Lake (Rodrigo)
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcVVKOAAoJEPpiX2QO6xPKzpEH/11faCaucfkejXnR2ff3H/Rc
 EQILDB+SFwzKYaxd8pLHXJ7D8stmBGW4i086bic1JFTxIi/MtQv5rfOO87jqu1DU
 3FFgCLuovzmheKVMuPxnSwGXn2ZI3RWPoDrH7OGaOtKuNAfoFTL9upZYsmBOyA+8
 srraU1zHhhR3pawqqVpGrXCVToKSYQc/mh9Od1v491yoqMEhC6r2JaGiePZQldn9
 J99ouBDOHMM1f45UX4+ORNQB951sQhJ4SW8e2bi2jKuc5WNmX3+tGLYdKemq3OYN
 vi3a4xwSPkhbGWUSQtT7Cy6e2p43p/k7CwVl1iEESVB7HOqINwmxY/UIxm3ap1s=
 =Q8JK
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2019-02-02' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

- Make background color and LUT more robust (Matt)
- Icelake display fixes (Ville, Imre)
- Workarounds fixes and reorg (Tvrtko, Talha)
- Enable fastboot by default on VLV and CHV (Hans)
- Add another PCI ID for Coffee Lake (Rodrigo)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190202082911.GA6615@intel.com
2019-02-04 15:37:58 +10:00
Dave Airlie 37fdaa3390 drm-misc-next for 5.1:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - Split out some part of drm_crtc_helper.h into drm_probe_helper.h
   - DRIVER_* flags improvements
   - New tasks on the TODO-list
   - Improvements to the documentation
 
 Driver Changes:
   - Continual of drmP.h removal in multiple drivers
   - Removal of FBINFO_(FLAG_)DEFAULT in multiple drivers
   - sun4i: Addition of the A23 support, multiple fixes for the tiled
     formats
   - atmel-hlcdc: Fix of clipping and rotation properties
   - qxl: various BO-related improvements, prime and generic fbdev emulation
     support
   - dw-hdmi: Support for HDMI2.0 2160p modes and YUV420 output
   - New Sitronix ST7701 panel driver
   - New Kingdisplay KD097D04 panel driver
   - New LeMaker BL035-RGB-002 panel driver
   - New PDA 91-00156-A0 panel driver
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXFRb/wAKCRDj7w1vZxhR
 xXj2AQDFAkH0/tksJ+T6yrHZXQE7q/6fAWTlrDXLo7Nb1hhfswEA1AyaHmZJA0Ar
 u7S+4Gfzs/gisw9Jr4bqQaerEpYJxQA=
 =qumD
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-02-01' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.1:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - Split out some part of drm_crtc_helper.h into drm_probe_helper.h
  - DRIVER_* flags improvements
  - New tasks on the TODO-list
  - Improvements to the documentation

Driver Changes:
  - Continual of drmP.h removal in multiple drivers
  - Removal of FBINFO_(FLAG_)DEFAULT in multiple drivers
  - sun4i: Addition of the A23 support, multiple fixes for the tiled
    formats
  - atmel-hlcdc: Fix of clipping and rotation properties
  - qxl: various BO-related improvements, prime and generic fbdev emulation
    support
  - dw-hdmi: Support for HDMI2.0 2160p modes and YUV420 output
  - New Sitronix ST7701 panel driver
  - New Kingdisplay KD097D04 panel driver
  - New LeMaker BL035-RGB-002 panel driver
  - New PDA 91-00156-A0 panel driver

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190201144749.t3abxvguhstu6bcl@flea
2019-02-04 14:42:34 +10:00
Chris Wilson 8547444137 drm/i915: Identify active requests
To allow requests to forgo a common execution timeline, one question we
need to be able to answer is "is this request running?". To track
whether a request has started on HW, we can emit a breadcrumb at the
beginning of the request and check its timeline's HWSP to see if the
breadcrumb has advanced past the start of this request. (This is in
contrast to the global timeline where we need only ask if we are on the
global timeline and if the timeline has advanced past the end of the
previous request.)

There is still confusion from a preempted request, which has already
started but relinquished the HW to a high priority request. For the
common case, this discrepancy should be negligible. However, for
identification of hung requests, knowing which one was running at the
time of the hang will be much more important.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-2-chris@chris-wilson.co.uk
2019-01-29 19:59:59 +00:00
Chris Wilson 9407d3bdb0 drm/i915: Track active timelines
Now that we pin timelines around use, we have a clearly defined lifetime
and convenient points at which we can track only the active timelines.
This allows us to reduce the list iteration to only consider those
active timelines and not all.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-6-chris@chris-wilson.co.uk
2019-01-28 19:07:13 +00:00
Chris Wilson 5013eb8cd6 drm/i915: Track the context's seqno in its own timeline HWSP
Now that we have allocated ourselves a cacheline to store a breadcrumb,
we can emit a write from the GPU into the timeline's HWSP of the
per-context seqno as we complete each request. This drops the mirroring
of the per-engine HWSP and allows each context to operate independently.
We do not need to unwind the per-context timeline, and so requests are
always consistent with the timeline breadcrumb, greatly simplifying the
completion checks as we no longer need to be concerned about the
global_seqno changing mid check.

One complication though is that we have to be wary that the request may
outlive the HWSP and so avoid touching the potentially danging pointer
after we have retired the fence. We also have to guard our access of the
HWSP with RCU, the release of the obj->mm.pages should already be RCU-safe.

At this point, we are emitting both per-context and global seqno and
still using the single per-engine execution timeline for resolving
interrupts.

v2: s/fake_complete/mark_complete/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-5-chris@chris-wilson.co.uk
2019-01-28 19:07:09 +00:00
Chris Wilson 1e345568e3 drm/i915: Move list of timelines under its own lock
Currently, the list of timelines is serialised by the struct_mutex, but
to alleviate difficulties with using that mutex in future, move the
list management under its own dedicated mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-5-chris@chris-wilson.co.uk
2019-01-28 16:24:22 +00:00
Chris Wilson 528cbd17ce drm/i915: Move vma lookup to its own lock
Remove the struct_mutex requirement for looking up the vma for an
object.

v2: Highlight how the race for duplicate vma creation is resolved on
reacquiring the lock with a short comment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-3-chris@chris-wilson.co.uk
2019-01-28 16:24:16 +00:00
Chris Wilson 09d7e46b97 drm/i915: Pull VM lists under the VM mutex.
A starting point to counter the pervasive struct_mutex. For the goal of
avoiding (or at least blocking under them!) global locks during user
request submission, a simple but important step is being able to manage
each clients GTT separately. For which, we want to replace using the
struct_mutex as the guard for all things GTT/VM and switch instead to a
specific mutex inside i915_address_space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-2-chris@chris-wilson.co.uk
2019-01-28 16:24:13 +00:00
Chris Wilson 499197dc16 drm/i915: Stop tracking MRU activity on VMA
Our goal is to remove struct_mutex and replace it with fine grained
locking. One of the thorny issues is our eviction logic for reclaiming
space for an execbuffer (or GTT mmaping, among a few other examples).
While eviction itself is easy to move under a per-VM mutex, performing
the activity tracking is less agreeable. One solution is not to do any
MRU tracking and do a simple coarse evaluation during eviction of
active/inactive, with a loose temporal ordering of last
insertion/evaluation. That keeps all the locking constrained to when we
are manipulating the VM itself, neatly avoiding the tricky handling of
possible recursive locking during execbuf and elsewhere.

Note that discarding the MRU (currently implemented as a pair of lists,
to avoid scanning the active list for a NONBLOCKING search) is unlikely
to impact upon our efficiency to reclaim VM space (where we think a LRU
model is best) as our current strategy is to use random idle replacement
first before doing a search, and over time the use of softpinned 48b
per-ppGTT is growing (thereby eliminating any need to perform any eviction
searches, in theory at least) with the remaining users being found on
much older devices (gen2-gen6).

v2: Changelog and commentary rewritten to elaborate on the duality of a
single list being both an inactive and active list.
v3: Consolidate bool parameters into a single set of flags; don't
comment on the duality of a single variable being a multiplicity of
bits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk
2019-01-28 16:24:09 +00:00
Chris Wilson eb8d0f5af4 drm/i915: Remove GPU reset dependence on struct_mutex
Now that the submission backends are controlled via their own spinlocks,
with a wave of a magic wand we can lift the struct_mutex requirement
around GPU reset. That is we allow the submission frontend (userspace)
to keep on submitting while we process the GPU reset as we can suspend
the backend independently.

The major change is around the backoff/handoff strategy for performing
the reset. With no mutex deadlock, we no longer have to coordinate with
any waiter, and just perform the reset immediately.

Testcase: igt/gem_mmap_gtt/hang # regresses
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-3-chris@chris-wilson.co.uk
2019-01-25 14:27:22 +00:00
Daniel Vetter fcd70cd36b drm: Split out drm_probe_helper.h
Having the probe helper stuff (which pretty much everyone needs) in
the drm_crtc_helper.h file (which atomic drivers should never need) is
confusing. Split them out.

To make sure I actually achieved the goal here I went through all
drivers. And indeed, all atomic drivers are now free of
drm_crtc_helper.h includes.

v2: Make it compile. There was so much compile fail on arm drivers
that I figured I'll better not include any of the acks on v1.

v3: Massive rebase because i915 has lost a lot of drmP.h includes, but
not all: Through drm_crtc_helper.h > drm_modeset_helper.h -> drmP.h
there was still one, which this patch largely removes. Which means
rolling out lots more includes all over.

This will also conflict with ongoing drmP.h cleanup by others I
expect.

v3: Rebase on top of atomic bochs.

v4: Review from Laurent for bridge/rcar/omap/shmob/core bits:
- (re)move some of the added includes, use the better include files in
  other places (all suggested from Laurent adopted unchanged).
- sort alphabetically

v5: Actually try to sort them, and while at it, sort all the ones I
touch.

v6: Rebase onto i915 changes.

v7: Rebase once more.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: CK Hu <ck.hu@mediatek.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: virtualization@lists.linux-foundation.org
Cc: etnaviv@lists.freedesktop.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: intel-gfx@lists.freedesktop.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: spice-devel@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Cc: linux-renesas-soc@vger.kernel.org
Cc: linux-rockchip@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-tegra@vger.kernel.org
Cc: xen-devel@lists.xen.org
Link: https://patchwork.freedesktop.org/patch/msgid/20190117210334.13234-1-daniel.vetter@ffwll.ch
2019-01-24 13:20:42 +01:00
Dave Airlie 8ca4fd0406 - Unwind failure on pinning the gen7 PPGTT (Chris)
- Fastset updates to make sure DRRS and PSR are properly enabled (Hans)
 - Header include clean-up (Brajeswar, Jani)
 - Improvements and clean-up on debugfs (Chris, Jani)
 - Avoid division by zero on CNL clocks setup (Xiao)
 - Restrict PSMI context load w/a to Haswell GT1 (Chris)
 - Remove HW semaphores for gen7 inter-engine sync (Chris)
 - Pull the render flush into breadcrumb emission (Chris)
 - i915_params copy and free helpers and other reorgs and docs (Jani)
 - Remove has_pooled_eu static initializer (Tvrtko)
 - Updates on kerneldoc (Chris)
 - Remove redundant trailing request flush (Chris)
 - ringbuffer irq seqno fixes and clean-up (Chris)
 - splitting off runtime device info and other clean-up around (Jani)
 - Selftests improvements (Chris, Daniele)
 - Flush RING_IMR changes before changing the global GT IMR on gen6 and HSW (Chris)
 - Some improvements and fixes around GPU reset and GPU hang report (Chris)
 - Remove partial attempt to swizzle on pread/pwrite (Chris)
 - Return immediately if trylock fails for direct-reclaim (Chris)
 - Downgrade scare message for unknown HuC firmware (Jani)
 - ACPI / PMIC for MIPI / DSI (Hans)
 - Reduce i915_request_alloc retirement to local context (Chris)
 - Init per-engine WAs for all engines (Daniele)
 - drop DPF code for gen8+ (Daniele)
 - Guard error capture against unpinned vma (Chris)
 - Use mutex_lock_killable from inside the shrinker (Chris)
 - Removing pooling from struct_mutex from vmap shrinker (Chris)
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcN9waAAoJEPpiX2QO6xPKW3oH/AkR8oFAcCF8VCYtPHnRUV6F
 AsKYG57NMfWH3tfJRg+W5gVtPV+gxOv5IIf6DHjoT2nR0N/r57sGlpKaR4uhq14r
 NUK6+cEoNDSDr3zdGt5QI3p8YokEo5MFIVYNPD9THO0hz4uvlzowru1bSC5bJvjn
 RvMsb4SUYFMA3BmHNsSczPPVuJLccndl5JFv8oUOrtA5ksct/8G3CI6V2KSOwT7W
 ebjrQ56vFMLuDMYwgfrss5sFMoRT60Da0C6dCroHKOzB9L/w7oPOVfwNE9miDRCx
 7IRIfICiPTCB2GsJ635sXet2sMV7RFxSjKkDND5iwV2vp96yqAwEjhXJq3t8MCE=
 =MqY8
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2019-01-10' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

- Unwind failure on pinning the gen7 PPGTT (Chris)
- Fastset updates to make sure DRRS and PSR are properly enabled (Hans)
- Header include clean-up (Brajeswar, Jani)
- Improvements and clean-up on debugfs (Chris, Jani)
- Avoid division by zero on CNL clocks setup (Xiao)
- Restrict PSMI context load w/a to Haswell GT1 (Chris)
- Remove HW semaphores for gen7 inter-engine sync (Chris)
- Pull the render flush into breadcrumb emission (Chris)
- i915_params copy and free helpers and other reorgs and docs (Jani)
- Remove has_pooled_eu static initializer (Tvrtko)
- Updates on kerneldoc (Chris)
- Remove redundant trailing request flush (Chris)
- ringbuffer irq seqno fixes and clean-up (Chris)
- splitting off runtime device info and other clean-up around (Jani)
- Selftests improvements (Chris, Daniele)
- Flush RING_IMR changes before changing the global GT IMR on gen6 and HSW (Chris)
- Some improvements and fixes around GPU reset and GPU hang report (Chris)
- Remove partial attempt to swizzle on pread/pwrite (Chris)
- Return immediately if trylock fails for direct-reclaim (Chris)
- Downgrade scare message for unknown HuC firmware (Jani)
- ACPI / PMIC for MIPI / DSI (Hans)
- Reduce i915_request_alloc retirement to local context (Chris)
- Init per-engine WAs for all engines (Daniele)
- drop DPF code for gen8+ (Daniele)
- Guard error capture against unpinned vma (Chris)
- Use mutex_lock_killable from inside the shrinker (Chris)
- Removing pooling from struct_mutex from vmap shrinker (Chris)

Signed-off-by: Dave Airlie <airlied@redhat.com>

# gpg: Signature made Fri 11 Jan 2019 09:58:18 AEST
# gpg:                using RSA key FA625F640EEB13CA
# gpg: Good signature from "Rodrigo Vivi <rodrigo.vivi@intel.com>"
# gpg:                 aka "Rodrigo Vivi <rodrigo.vivi@gmail.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6D20 7068 EEDD 6509 1C2C  E2A3 FA62 5F64 0EEB 13CA

# Conflicts:
#	drivers/gpu/drm/i915/intel_dp.c
#	drivers/gpu/drm/i915/intel_drv.h
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114183820.GA2855@intel.com
2019-01-24 19:44:16 +10:00
Jani Nikula 739f3abdbf drm/i915: small isolated c99 types to kernel types switch
Mixed C99 and kernel types use is getting ugly. Prefer kernel types.

sed -i 's/\buint\(8\|16\|32\|64\)_t\b/u\1/g'

Minor checkpatch fixes sprinkled on top of the changed lines.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/14ed72e7f04c9340a057855c5950b54811f8a477.1547629303.git.jani.nikula@intel.com
2019-01-17 09:02:00 +02:00
Chris Wilson 9f58892ea9 drm/i915: Pull all the reset functionality together into i915_reset.c
Currently the code to reset the GPU and our state is spread widely
across a few files. Pull the logic together into a common file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190116153304.787-1-chris@chris-wilson.co.uk
2019-01-16 22:45:31 +00:00
Chris Wilson 18bb2bccb5 drm/i915: Serialise concurrent calls to i915_gem_set_wedged()
Make i915_gem_set_wedged() and i915_gem_unset_wedged() behaviour more
consistent if called concurrently, and only do the wedging and reporting
once, curtailing any possible race where we start unwedging in the middle
of a wedge.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114210408.4561-2-chris@chris-wilson.co.uk
2019-01-16 15:24:16 +00:00
Chris Wilson 484d9a844d drm/i915/userptr: Avoid struct_mutex recursion for mmu_invalidate_range_start
Since commit 93065ac753 ("mm, oom: distinguish blockable mode for mmu
notifiers") we have been able to report failure from
mmu_invalidate_range_start which allows us to use a trylock on the
struct_mutex to avoid potential recursion and report -EBUSY instead.
Furthermore, this allows us to pull the work into the main callback and
avoid the sleight-of-hand in using a workqueue to avoid lockdep.

However, not all paths to mmu_invalidate_range_start are prepared to
handle failure, so instead of reporting the recursion, deal with it by
propagating the failure upwards, who can decide themselves to handle it
or report it.

v2: Mark up the recursive lock behaviour and comment on the various weak
points.

v3: Follow commit 3824e41975 ("drm/i915: Use mutex_lock_killable() from
inside the shrinker") and also use mutex_lock_killable().
v3.1: No leak on EINTR.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108375
References: 93065ac753 ("mm, oom: distinguish blockable mode for mmu notifiers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190115124442.3500-1-chris@chris-wilson.co.uk
2019-01-15 17:07:23 +00:00
Chris Wilson decd29e6b5 drm/i915: Only dump GPU state on set-wedged if interesting
As we may frequently mark the device as wedged to flush requests off it
during the normal course of events, quite often we have a large state
dump that is of no interest. Don't bother dumping it all if the engines
are all idle.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190115122057.1677-1-chris@chris-wilson.co.uk
2019-01-15 14:09:08 +00:00
Chris Wilson 8d761e773e drm/i915: Combined gt.awake/gt.power wakerefs
As the GT_IRQ power domain implies a wakeref, we can use it inplace of
our existing redundant rpm grab.

v2: Drop papering over forgetting to take the runtime wakeref in
selftests

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-20-chris@chris-wilson.co.uk
2019-01-14 16:18:43 +00:00
Chris Wilson 0e6e0be4c9 drm/i915: Markup paired operations on display power domains
The majority of runtime-pm operations are bounded and scoped within a
function; these are easy to verify that the wakeref are handled
correctly. We can employ the compiler to help us, and reduce the number
of wakerefs tracked when debugging, by passing around cookies provided
by the various rpm_get functions to their rpm_put counterpart. This
makes the pairing explicit, and given the required wakeref cookie the
compiler can verify that we pass an initialised value to the rpm_put
(quite handy for double checking error paths).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-16-chris@chris-wilson.co.uk
2019-01-14 16:18:30 +00:00
Chris Wilson d4225a535b drm/i915: Syntatic sugar for using intel_runtime_pm
Frequently, we use intel_runtime_pm_get/_put around a small block.
Formalise that usage by providing a macro to define such a block with an
automatic closure to scope the intel_runtime_pm wakeref to that block,
i.e. macro abuse smelling of python.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-15-chris@chris-wilson.co.uk
2019-01-14 16:18:25 +00:00
Chris Wilson 538ef96b9d drm/i915/gem: Track the rpm wakerefs
Keep track of the temporary rpm wakerefs used for user access to the
device, so that we can cancel them upon release and clearly identify any
leaks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-10-chris@chris-wilson.co.uk
2019-01-14 16:18:13 +00:00
Chris Wilson 506d1f6245 drm/i915: Track GT wakeref
Record the wakeref used for keeping the device awake as the GPU is
executing requests and be sure to cancel the tracking upon parking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-3-chris@chris-wilson.co.uk
2019-01-14 16:18:04 +00:00
Chris Wilson 16e4dd0342 drm/i915: Markup paired operations on wakerefs
The majority of runtime-pm operations are bounded and scoped within a
function; these are easy to verify that the wakeref are handled
correctly. We can employ the compiler to help us, and reduce the number
of wakerefs tracked when debugging, by passing around cookies provided
by the various rpm_get functions to their rpm_put counterpart. This
makes the pairing explicit, and given the required wakeref cookie the
compiler can verify that we pass an initialised value to the rpm_put
(quite handy for double checking error paths).

For regular builds, the compiler should be able to eliminate the unused
local variables and the program growth should be minimal. Fwiw, it came
out as a net improvement as gcc was able to refactor rpm_get and
rpm_get_if_in_use together,

v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual
mark up for smaller more targeted patches.
v3: Mention the cookie in Returns

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk
2019-01-14 16:17:53 +00:00
Dave Airlie 8c1a765bc6 drm-misc-next for 5.1:
UAPI Changes:
 
 Cross-subsystem Changes:
   - Turn dma-buf fence sequence numbers into 64 bit numbers
 
 Core Changes:
   - Move to a common helper for the DP MST hotplug for radeon, i915 and
     amdgpu
   - i2c improvements for drm_dp_mst
   - Removal of drm_syncobj_cb
   - Introduction of an helper to create and attach the TV margin properties
 
 Driver Changes:
   - Improve cache flushes for v3d
   - Reflection support for vc4
   - HDMI overscan support for vc4
   - Add implicit fencing support for rockchip and sun4i
   - Switch to generic fbdev emulation for virtio
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXDOTqAAKCRDj7w1vZxhR
 xZ8QAQD4j8m9Ea3bzY5Rr8BYUx1k+Cjj6Y6abZmot2rSvdyOHwD+JzJFIFAPZjdd
 uOKhLnDlubaaoa6OGPDQShjl9p3gyQE=
 =WQGO
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-01-07-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.1:

UAPI Changes:

Cross-subsystem Changes:
  - Turn dma-buf fence sequence numbers into 64 bit numbers

Core Changes:
  - Move to a common helper for the DP MST hotplug for radeon, i915 and
    amdgpu
  - i2c improvements for drm_dp_mst
  - Removal of drm_syncobj_cb
  - Introduction of an helper to create and attach the TV margin properties

Driver Changes:
  - Improve cache flushes for v3d
  - Reflection support for vc4
  - HDMI overscan support for vc4
  - Add implicit fencing support for rockchip and sun4i
  - Switch to generic fbdev emulation for virtio

Signed-off-by: Dave Airlie <airlied@redhat.com>

[airlied: applied amdgpu merge fixup]
From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190107180333.amklwycudbsub3s5@flea
2019-01-10 05:58:52 +10:00
Jani Nikula 2f80d7bd8d drm/i915: drop all drmP.h includes
Needs just a few additional includes here and there.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190108082709.3748-1-jani.nikula@intel.com
2019-01-09 10:26:36 +02:00
Jani Nikula 3eb0930a42 Merge drm/drm-next into drm-intel-next-queued
Generally catch up with 5.0-rc1, and specifically get the changes:

96d4f267e4 ("Remove 'type' argument from access_ok() function")
0b2c8f8b6b ("i915: fix missing user_access_end() in page fault exception case")
594cc251fd ("make 'user_access_begin()' do 'access_ok()'")

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-01-08 10:50:22 +02:00
Chris Wilson b9d126e75b drm/i915: Remove partial attempt to swizzle on pread/pwrite
Our attempt to account for bit17 swizzling of pread/pwrite onto tiled
objects was flawed due to the simple fact that we do not always know the
swizzling for a particular page (due to the swizzling varying based on
location in certain unbalanced configurations). Furthermore, the
pread/pwrite paths are now unbalanced in that we are required to use the
GTT as in some cases we do not have direct CPU access to the backing
physical pages (thus some paths trying to account for the swizzle, but
others neglecting, chaos ensues).

There are no known users who do use pread/pwrite into a tiled object
(you need to manually detile anyway, so why now just use mmap and avoid
the copy?) and no user bug reports to indicate that it is being used in
the wild. As no one is hitting the buggy path, we can just remove the
buggy code.

v2: Just use the fault allowing kmap() + normal copy_(to|from)_user
v3: Avoid int overflow in computing 'length' from 'remain' (Tvrtko)

References: fe115628d5 ("drm/i915: Implement pwrite without struct-mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190105120758.9237-1-chris@chris-wilson.co.uk
2019-01-05 12:51:06 +00:00
Chris Wilson 55c15512a9 drm/i915: Do not allow unwedging following a failed driver initialisation
If we declare the driver wedged during early initialisation, we leave
the driver in an undefined state (with respect to GEM execution). As
this leads to unexpected behaviour if we allow the user to unwedge the
device (through debugfs, and performed by igt at test start), do not.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190103213340.1669-1-chris@chris-wilson.co.uk
2019-01-04 09:06:00 +00:00
Linus Torvalds 96d4f267e4 Remove 'type' argument from access_ok() function
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.

It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access.  But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.

A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model.  And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.

This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.

There were a couple of notable cases:

 - csky still had the old "verify_area()" name as an alias.

 - the iter_iov code had magical hardcoded knowledge of the actual
   values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
   really used it)

 - microblaze used the type argument for a debug printout

but other than those oddities this should be a total no-op patch.

I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something.  Any missed conversion should be trivially fixable, though.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-03 18:57:57 -08:00
Chris Wilson 55277e1f31 drm/i915: Always try to reset the GPU on takeover
When we first introduced the reset to sanitize the GPU on taking over
from the BIOS and before returning control to third parties (the BIOS!),
we restricted it to only systems utilizing HW contexts as we were
uncertain of how stable our reset mechanism truly was. We now have
reasonable coverage across all machines that expose a GPU reset method,
and so we should be safe to sanitize the GPU state everywhere.

v2: We _have_ to skip the reset if it would clobber the display.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190103112104.19561-1-chris@chris-wilson.co.uk
2019-01-03 12:40:42 +00:00
Chris Wilson 1216e3c3af drm/i915: Drop unused engine->irq_seqno_barrier w/a
Now that we have eliminated the CPU-side irq_seqno_barrier by moving the
delays on the GPU before emitting the MI_USER_INTERRUPT, we can remove
the engine->irq_seqno_barrier infrastructure. Though intentionally
slowing down the GPU is nasty, so is the code we can now remove!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181228171641.16531-6-chris@chris-wilson.co.uk
2018-12-31 15:35:45 +00:00
Arun KS ca79b0c211 mm: convert totalram_pages and totalhigh_pages variables to atomic
totalram_pages and totalhigh_pages are made static inline function.

Main motivation was that managed_page_count_lock handling was complicating
things.  It was discussed in length here,
https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemes
better to remove the lock and convert variables to atomic, with preventing
poteintial store-to-read tearing as a bonus.

[akpm@linux-foundation.org: coding style fixes]
Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org
Signed-off-by: Arun KS <arunks@codeaurora.org>
Suggested-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:47 -08:00
Chris Wilson 6faf5916e6 drm/i915: Remove HW semaphores for gen7 inter-engine synchronisation
The writing is on the wall for the existence of a single execution queue
along each engine, and as a consequence we will not be able to track
dependencies along the HW queue itself, i.e. we will not be able to use
HW semaphores on gen7 as they use a global set of registers (and unlike
gen8+ we can not effectively target memory to keep per-context seqno and
dependencies).

On the positive side, when we implement request reordering for gen7 we
also can not presume a simple execution queue and would also require
removing the current semaphore generation code. So this bring us another
step closer to request reordering for ringbuffer submission!

The negative side is that using interrupts to drive inter-engine
synchronisation is much slower (4us -> 15us to do a nop on each of the 3
engines on ivb). This is much better than it was at the time of introducing
the HW semaphores and equally important userspace weaned itself off
intermixing dependent BLT/RENDER operations (the prime culprit was glyph
rendering in UXA). So while we regress the microbenchmarks, it should not
impact the user.

References: https://bugs.freedesktop.org/show_bug.cgi?id=108888
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181228140736.32606-2-chris@chris-wilson.co.uk
2018-12-28 14:43:27 +00:00
Lucas De Marchi cf819eff90 drm/i915: replace IS_GEN<N> with IS_GEN(..., N)
Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of
gen_mask to do the comparison. Now callers can pass then gen as a parameter,
so we don't require one macro for each gen.

The following spatch was used to convert the users of these macros:

@@
expression e;
@@
(
- IS_GEN2(e)
+ IS_GEN(e, 2)
|
- IS_GEN3(e)
+ IS_GEN(e, 3)
|
- IS_GEN4(e)
+ IS_GEN(e, 4)
|
- IS_GEN5(e)
+ IS_GEN(e, 5)
|
- IS_GEN6(e)
+ IS_GEN(e, 6)
|
- IS_GEN7(e)
+ IS_GEN(e, 7)
|
- IS_GEN8(e)
+ IS_GEN(e, 8)
|
- IS_GEN9(e)
+ IS_GEN(e, 9)
|
- IS_GEN10(e)
+ IS_GEN(e, 10)
|
- IS_GEN11(e)
+ IS_GEN(e, 11)
)

v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than
    using the bitmask

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com
2018-12-12 16:52:10 -08:00
Mika Kuoppala dd847a7069 drm/i915: Compile fix for 64b dma-fence seqno
Many errs of the form:
drivers/gpu/drm/i915/selftests/intel_hangcheck.c: In function ‘__igt_reset_evict_vma’:
./include/linux/kern_levels.h:5:18: error: format ‘%x’ expects argument of type ‘unsigned int’, but argum

Fixes: b312d8ca3a ("dma-buf: make fence sequence numbers 64 bit v2")
Cc: Christian König <christian.koenig@amd.com>
Cc: Chunming Zhou <david1.zhou@amd.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181207123428.16257-1-mika.kuoppala@linux.intel.com
2018-12-07 13:10:11 +00:00
Chris Wilson 5179749925 drm/i915: Allocate a common scratch page
Currently we allocate a scratch page for each engine, but since we only
ever write into it for post-sync operations, it is not exposed to
userspace nor do we care for coherency. As we then do not care about its
contents, we can use one page for all, reducing our allocations and
avoid complications by not assuming per-engine isolation.

For later use, it simplifies engine initialisation (by removing the
allocation that required struct_mutex!) and means that we can always rely
on there being a scratch page.

v2: Check that we allocated a large enough scratch for I830 w/a

Fixes: 06e562e7f515 ("drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5") # v4.18.20
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108850
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181204141522.13640-1-chris@chris-wilson.co.uk
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.18.20+
2018-12-04 15:57:08 +00:00
Tvrtko Ursulin 094304beb4 drm/i915: Verify GT workaround state after GPU init
Since we now have all the GT workarounds in a table, by adding a simple
shared helper function we can now verify that their values are still
applied after some interesting events in the lifetime of the driver.

Initially we only do this after GPU initialization.

v2:
 Chris Wilson:
 * Simplify verification by realizing it's a simple xor and and.
 * Remove verification from engine reset path.
 * Return bool straight away from the verify API.

v3:
 * API rename. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203125014.3219-4-tvrtko.ursulin@linux.intel.com
2018-12-04 12:23:18 +00:00
Tvrtko Ursulin 25d140faaa drm/i915: Record GT workarounds in a list
To enable later verification of GT workaround state at various stages of
driver lifetime, we record the list of applicable ones per platforms to a
list, from which they are also applied.

The added data structure is a simple array of register, mask and value
items, which is allocated on demand as workarounds are added to the list.

This is a temporary implementation which later in the series gets fused
with the existing per context workaround list handling. It is separated at
this stage since the following patch fixes a bug which needs to be as easy
to backport as possible.

Also, since in the following patch we will be adding a new class of
workarounds (per engine) which can be applied from interrupt context, we
straight away make the provision for safe read-modify-write cycle.

v2:
 * Change dev_priv to i915 along the init path. (Chris Wilson)
 * API rename. (Chris Wilson)

v3:
 * Remove explicit list size tracking in favour of growing the allocation
   in power of two chunks. (Chris Wilson)

v4:
 Chris Wilson:
 * Change wa_list_finish to early return.
 * Copy workarounds using the compiler for static checking.
 * Do not bother zeroing unused entries.
 * Re-order struct i915_wa_list.

v5:
 * kmalloc_array.
 * Whitespace cleanup.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203133319.10174-1-tvrtko.ursulin@linux.intel.com
2018-12-04 12:23:14 +00:00
Chris Wilson 3800960afe drm/i915: Complete the fences as they are cancelled due to wedging
We inspect the requests under the assumption that they will be marked as
completed when they are removed from the queue. Currently however, in the
process of wedging the requests will be removed from the queue before they
are completed, so rearrange the code to complete the fences before the
locks are dropped.

<1>[  354.473346] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250
<6>[  354.473363] PGD 0 P4D 0
<4>[  354.473370] Oops: 0000 [#1] PREEMPT SMP PTI
<4>[  354.473380] CPU: 0 PID: 4470 Comm: gem_eio Tainted: G     U            4.20.0-rc4-CI-CI_DRM_5216+ #1
<4>[  354.473393] Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, BIOS JYGLKCPX.86A.0027.2018.0125.1347 01/25/2018
<4>[  354.473480] RIP: 0010:__i915_schedule+0x311/0x5e0 [i915]
<4>[  354.473490] Code: 49 89 44 24 20 4d 89 4c 24 28 4d 89 29 44 39 b3 a0 04 00 00 7d 3a 41 8b 44 24 78 85 c0 74 13 48 8b 93 78 04 00 00 48 83 e2 fc <39> 82 50 02 00 00 79 1e 44 89 b3 a0 04 00 00 48 8d bb d0 03 00 00
<4>[  354.473515] RSP: 0018:ffffc900001bba90 EFLAGS: 00010046
<4>[  354.473524] RAX: 0000000000000003 RBX: ffff8882624c8008 RCX: f34a737800000000
<4>[  354.473535] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8882624c8048
<4>[  354.473545] RBP: ffffc900001bbab0 R08: 000000005963f1f1 R09: 0000000000000000
<4>[  354.473556] R10: ffffc900001bba10 R11: ffff8882624c8060 R12: ffff88824fdd7b98
<4>[  354.473567] R13: ffff88824fdd7bb8 R14: 0000000000000001 R15: ffff88824fdd7750
<4>[  354.473578] FS:  00007f44b4b5b980(0000) GS:ffff888277e00000(0000) knlGS:0000000000000000
<4>[  354.473590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  354.473599] CR2: 0000000000000250 CR3: 000000026976e000 CR4: 0000000000340ef0
<4>[  354.473611] Call Trace:
<4>[  354.473622]  ? lock_acquire+0xa6/0x1c0
<4>[  354.473677]  ? i915_schedule_bump_priority+0x57/0xd0 [i915]
<4>[  354.473736]  i915_schedule_bump_priority+0x72/0xd0 [i915]
<4>[  354.473792]  i915_request_wait+0x4db/0x840 [i915]
<4>[  354.473804]  ? get_pwq.isra.4+0x2c/0x50
<4>[  354.473813]  ? ___preempt_schedule+0x16/0x18
<4>[  354.473824]  ? wake_up_q+0x70/0x70
<4>[  354.473831]  ? wake_up_q+0x70/0x70
<4>[  354.473882]  ? gen6_rps_boost+0x118/0x120 [i915]
<4>[  354.473936]  i915_gem_object_wait_fence+0x8a/0x110 [i915]
<4>[  354.473991]  i915_gem_object_wait+0x113/0x500 [i915]
<4>[  354.474047]  i915_gem_wait_ioctl+0x11c/0x2f0 [i915]
<4>[  354.474101]  ? i915_gem_unset_wedged+0x210/0x210 [i915]
<4>[  354.474113]  drm_ioctl_kernel+0x81/0xf0
<4>[  354.474123]  drm_ioctl+0x2de/0x390
<4>[  354.474175]  ? i915_gem_unset_wedged+0x210/0x210 [i915]
<4>[  354.474187]  ? finish_task_switch+0x95/0x260
<4>[  354.474197]  ? lock_acquire+0xa6/0x1c0
<4>[  354.474207]  do_vfs_ioctl+0xa0/0x6e0
<4>[  354.474217]  ? __fget+0xfc/0x1e0
<4>[  354.474225]  ksys_ioctl+0x35/0x60
<4>[  354.474233]  __x64_sys_ioctl+0x11/0x20
<4>[  354.474241]  do_syscall_64+0x55/0x190
<4>[  354.474251]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  354.474260] RIP: 0033:0x7f44b3de65d7
<4>[  354.474267] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
<4>[  354.474293] RSP: 002b:00007fff974948e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[  354.474305] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f44b3de65d7
<4>[  354.474316] RDX: 00007fff97494940 RSI: 00000000c010646c RDI: 0000000000000007
<4>[  354.474327] RBP: 00007fff97494940 R08: 0000000000000000 R09: 00007f44b40bbc40
<4>[  354.474337] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c010646c
<4>[  354.474348] R13: 0000000000000007 R14: 0000000000000000 R15: 0000000000000000

v2: Avoid floating requests.
v3: Can't call dma_fence_signal() under the timeline lock!
v4: Can't call dma_fence_signal() from inside another fence either.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203113701.12106-2-chris@chris-wilson.co.uk
2018-12-04 11:26:33 +00:00
Jani Nikula 2ac5e38ea4 Merge drm/drm-next into drm-intel-next-queued
Pull in v4.20-rc3 via drm-next.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2018-11-20 13:14:08 +02:00
Chris Wilson a1db9c54eb drm/i915: Track rcu_head for our idle worker
While our little rcu worker might be able to be replaced now by the
dedicated rcu_work, in the meantime we should mark up the rcu_head for
correct debugobjects tracking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181108092101.27598-2-chris@chris-wilson.co.uk
2018-11-09 12:02:09 +00:00
Chris Wilson 8811d616df drm/i915: Initialise the obj->rcu head
Make the rcu_head known to the system, in particular for debugobjects.
And having declared it for debugobjects, we need to tidy up afterwards.

v2: mark the obj->rcu as being destroyed when we reuse its location for
the freed list.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108691
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181109090311.15321-1-chris@chris-wilson.co.uk
2018-11-09 10:40:20 +00:00
Kuo-Hsin Yang 64e3d12f76 mm, drm/i915: mark pinned shmemfs pages as unevictable
The i915 driver uses shmemfs to allocate backing storage for gem
objects. These shmemfs pages can be pinned (increased ref count) by
shmem_read_mapping_page_gfp(). When a lot of pages are pinned, vmscan
wastes a lot of time scanning these pinned pages. In some extreme case,
all pages in the inactive anon lru are pinned, and only the inactive
anon lru is scanned due to inactive_ratio, the system cannot swap and
invokes the oom-killer. Mark these pinned pages as unevictable to speed
up vmscan.

Export pagevec API check_move_unevictable_pages().

This patch was inspired by Chris Wilson's change [1].

[1]: https://patchwork.kernel.org/patch/9768741/

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com> # mm part
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20181106132324.17390-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-11-07 15:28:32 +00:00
Chris Wilson e6db7f4d7c drm/i915: Break long iterations for get/put shmemfs pages
As we may have to iterate a few thousand elements to acquire and release
the shmemfs backing storage for a GPU object, we need to break up the
long loop with cond_resched() to retain a modicum of low latency for
other processes.

Testcase: igt/benchmarks/gem_syslatency
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kuo-Hsin Yang <vovoy@chromium.org>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181105170640.26905-1-chris@chris-wilson.co.uk
2018-11-06 13:14:05 +00:00
Linus Torvalds 53b3b6bbfd drm pull for 4.20-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbz8FfAAoJEAx081l5xIa+t8AQAJ6KaMM7rYlRDzIr1Vuh0++2
 kFmfEQnSZnOWxO+zpyQpCJQr+/1aAHQ6+QzWq+/fX/bwH01H1Q4+z6t+4QoPqYlw
 rUYXZYRxnqGVPYx8hwNLmRbJcsXMVKly1SemBXIoabKkEtPNX5AZ4FXCR2WEbsV3
 /YLxsYQv2KMd8aoeC9Hupa07Jj9GfOtEo0a9B1hKmo+XiF9HqPadxaofqOpQ6MCh
 54itBgP7Kj4mwwr8KxG2JCNJagG5aG8q3yiEwaU5b0KzWya4o1wrOAJcBaEAIxpj
 JAgPde+f2L6w9dQbBBWeVFYKNn0jJqJdmbPs5Ek3i/NNFyx01Mn/3vlTZoRUqJN7
 TGXwOI/BWz1iTaHyFPqVH6RPQAoUUDeCwgHkXonogFxvQLpiFG+dRNqxue0XVUMX
 9tDSdZefWPoH3n9J/gDhwbV2Qbw/2n6yzCRYCb8HkqX1Y1JTmdYVgKvcnOwyYwsJ
 QzcVkWUJ31UAaZcTLCEW6SVqcUR0mso3LJAPSKp2NJiVLL8mSd/ViUTUbxRNkkXf
 H0abVGDjWAAZaT5uqNVqg4kV1Vc4Kj+/9QtspW4ktGezOz9DsctwJtfhTgOmT8Fx
 zlEwWmAbf1iJP9UgqI7r4+Nq24saqUYmIX0bowEasLIRO+l14Pf9mQJjgKRcMs/j
 SK4W5EreSFosKsxtQU4H
 =3yxi
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2018-10-24' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "This is going to rebuild more than drm as it adds a new helper to
  list.h for doing bulk updates. Seemed like a reasonable addition to
  me.

  Otherwise the usual merge window stuff lots of i915 and amdgpu, not so
  much nouveau, and piles of everything else.

  Core:
   - Adds a new list.h helper for doing bulk list updates for TTM.
   - Don't leak fb address in smem_start to userspace (comes with EXPORT
     workaround for people using mali out of tree hacks)
   - udmabuf device to turn memfd regions into dma-buf
   - Per-plane blend mode property
   - ref/unref replacements with get/put
   - fbdev conflicting framebuffers code cleaned up
   - host-endian format variants
   - panel orientation quirk for Acer One 10

  bridge:
   - TI SN65DSI86 chip support

  vkms:
   - GEM support.
   - Cursor support

  amdgpu:
   - Merge amdkfd and amdgpu into one module
   - CEC over DP AUX support
   - Picasso APU support + VCN dynamic powergating
   - Raven2 APU support
   - Vega20 enablement + kfd support
   - ACP powergating improvements
   - ABGR/XBGR display support
   - VCN jpeg support
   - xGMI support
   - DC i2c/aux cleanup
   - Ycbcr 4:2:0 support
   - GPUVM improvements
   - Powerplay and powerplay endian fixes
   - Display underflow fixes

  vmwgfx:
   - Move vmwgfx specific TTM code to vmwgfx
   - Split out vmwgfx buffer/resource validation code
   - Atomic operation rework

  bochs:
   - use more helpers
   - format/byteorder improvements

  qxl:
   - use more helpers

  i915:
   - GGTT coherency getparam
   - Turn off resource streamer API
   - More Icelake enablement + DMC firmware
   - Full PPGTT for Ivybridge, Haswell and Valleyview
   - DDB distribution based on resolution
   - Limited range DP display support

  nouveau:
   - CEC over DP AUX support
   - Initial HDMI 2.0 support

  virtio-gpu:
   - vmap support for PRIME objects

  tegra:
   - Initial Tegra194 support
   - DMA/IOMMU integration fixes

  msm:
   - a6xx perf improvements + clock prefix
   - GPU preemption optimisations
   - a6xx devfreq support
   - cursor support

  rockchip:
   - PX30 support
   - rgb output interface support

  mediatek:
   - HDMI output support on mt2701 and mt7623

  rcar-du:
   - Interlaced modes on Gen3
   - LVDS on R8A77980
   - D3 and E3 SoC support

  hisilicon:
   - misc fixes

  mxsfb:
   - runtime pm support

  sun4i:
   - R40 TCON support
   - Allwinner A64 support
   - R40 HDMI support

  omapdrm:
   - Driver rework changing display pipeline ordering to use common code
   - DMM memory barrier and irq fixes
   - Errata workarounds

  exynos:
   - out-bridge support for LVDS bridge driver
   - Samsung 16x16 tiled format support
   - Plane alpha and pixel blend mode support

  tilcdc:
   - suspend/resume update

  mali-dp:
   - misc updates"

* tag 'drm-next-2018-10-24' of git://anongit.freedesktop.org/drm/drm: (1382 commits)
  firmware/dmc/icl: Add missing MODULE_FIRMWARE() for Icelake.
  drm/i915/icl: Fix signal_levels
  drm/i915/icl: Fix DDI/TC port clk_off bits
  drm/i915/icl: create function to identify combophy port
  drm/i915/gen9+: Fix initial readout for Y tiled framebuffers
  drm/i915: Large page offsets for pread/pwrite
  drm/i915/selftests: Disable shrinker across mmap-exhaustion
  drm/i915/dp: Link train Fallback on eDP only if fallback link BW can fit panel's native mode
  drm/i915: Fix intel_dp_mst_best_encoder()
  drm/i915: Skip vcpi allocation for MSTB ports that are gone
  drm/i915: Don't unset intel_connector->mst_port
  drm/i915: Only reset seqno if actually idle
  drm/i915: Use the correct crtc when sanitizing plane mapping
  drm/i915: Restore vblank interrupts earlier
  drm/i915: Check fb stride against plane max stride
  drm/amdgpu/vcn:Fix uninitialized symbol error
  drm: panel-orientation-quirks: Add quirk for Acer One 10 (S1003)
  drm/amd/amdgpu: Fix debugfs error handling
  drm/amdgpu: Update gc_9_0 golden settings.
  drm/amd/powerplay: update PPtable with DC BTC and Tvr SocLimit fields
  ...
2018-10-28 17:49:53 -07:00
Chris Wilson ab0d6a1418 drm/i915: Large page offsets for pread/pwrite
Handle integer overflow when computing the sub-page length for shmem
backed pread/pwrite.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181012140228.29783-1-chris@chris-wilson.co.uk
(cherry picked from commit a5e856a534)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-10-17 18:24:05 -07:00
Chris Wilson a5e856a534 drm/i915: Large page offsets for pread/pwrite
Handle integer overflow when computing the sub-page length for shmem
backed pread/pwrite.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181012140228.29783-1-chris@chris-wilson.co.uk
2018-10-15 12:52:03 +01:00
Chris Wilson e9eaf82d97 drm/i915: Priority boost for waiting clients
Latency is in the eye of the beholder. In the case where a client stops
and waits for the gpu, give that request chain a small priority boost
(not so that it overtakes higher priority clients, to preserve the
external ordering) so that ideally the wait completes earlier.

v2: Tvrtko recommends to keep the boost-from-user-stall as small as
possible and to allow new client flows to be preferred for interactivity
over stalls.

Testcase: igt/gem_sync/switch-default
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181001144755.7978-3-chris@chris-wilson.co.uk
2018-10-01 20:34:24 +01:00