Commit Graph

42 Commits

Author SHA1 Message Date
Chris Wilson 2850748ef8 drm/i915: Pull i915_vma_pin under the vm->mutex
Replace the struct_mutex requirement for pinning the i915_vma with the
local vm->mutex instead. Note that the vm->mutex is tainted by the
shrinker (we require unbinding from inside fs-reclaim) and so we cannot
allocate while holding that mutex. Instead we have to preallocate
workers to do allocate and apply the PTE updates after we have we
reserved their slot in the drm_mm (using fences to order the PTE writes
with the GPU work and with later unbind).

In adding the asynchronous vma binding, one subtle requirement is to
avoid coupling the binding fence into the backing object->resv. That is
the asynchronous binding only applies to the vma timeline itself and not
to the pages as that is a more global timeline (the binding of one vma
does not need to be ordered with another vma, nor does the implicit GEM
fencing depend on a vma, only on writes to the backing store). Keeping
the vma binding distinct from the backing store timelines is verified by
a number of async gem_exec_fence and gem_exec_schedule tests. The way we
do this is quite simple, we keep the fence for the vma binding separate
and only wait on it as required, and never add it to the obj->resv
itself.

Another consequence in reducing the locking around the vma is the
destruction of the vma is no longer globally serialised by struct_mutex.
A natural solution would be to add a kref to i915_vma, but that requires
decoupling the reference cycles, possibly by introducing a new
i915_mm_pages object that is own by both obj->mm and vma->pages.
However, we have not taken that route due to the overshadowing lmem/ttm
discussions, and instead play a series of complicated games with
trylocks to (hopefully) ensure that only one destruction path is called!

v2: Add some commentary, and some helpers to reduce patch churn.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
2019-10-04 15:39:02 +01:00
Chris Wilson 1f7fd484ff drm/i915: Replace i915_vma_put_fence()
Avoid calling i915_vma_put_fence() by using our alternate paths that
bind a secondary vma avoiding the original fenced vma. For the few
instances where we need to release the fence (i.e. on binding when the
GGTT range becomes invalid), replace the put_fence with a revoke_fence.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190822061557.18402-1-chris@chris-wilson.co.uk
2019-08-22 08:53:42 +01:00
Chris Wilson e2ccc50a3a drm/i915: Track ggtt fence reservations under its own mutex
We can reduce the locking for fence registers from the dev->struct_mutex
to a local mutex. We could introduce a mutex for the sole purpose of
tracking the fence acquisition, except there is a little bit of overlap
with the fault tracking, so use the i915_ggtt.mutex as it covers both.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190822060914.2671-1-chris@chris-wilson.co.uk
2019-08-22 08:53:40 +01:00
Chris Wilson 3d6792cf0a drm/i915: Forgo last_fence active request tracking
We were using the last_fence to track the last request that used this
vma that might be interpreted by a fence register and forced ourselves
to wait for this request before modifying any fence register that
overlapped our vma. Due to requirement that we need to track any XY_BLT
command, linear or tiled, this in effect meant that we have to track the
vma for its active lifespan anyway, so we can forgo the explicit
last_fence tracking and just use the whole vma->active.

Another solution would be to pipeline the register updates, and would
help resolve some long running stalls for gen3 (but only gen 2 and 3!)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190812174804.26180-1-chris@chris-wilson.co.uk
2019-08-12 19:29:16 +01:00
Jani Nikula 6da4a2c411 drm/i915: remove unnecessary includes of intel_display_types.h header
With its original name intel_drv.h the intel_display_types.h header was
superfluously cargo-cult included all over the place, while it's really
mostly about display internals. Remove the unnecessary includes.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e3d737f0ab87c55969e62c1e077e15c04c238297.1565085692.git.jani.nikula@intel.com
2019-08-07 12:43:55 +03:00
Jani Nikula 1d455f8de8 drm/i915: rename intel_drv.h to display/intel_display_types.h
Everything about the file is about display, and mostly about types
related to display. Move under display/ as intel_display_types.h to
reflect the facts.

There's still plenty to clean up, but start off with moving the file
where it logically belongs and naming according to contents.

v2: fix the include guard name in the renamed file

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190806113933.11799-1-jani.nikula@intel.com
2019-08-07 12:43:50 +03:00
Tvrtko Ursulin f88709bd1c drm/i915: Use intel_uncore_rmw in intel_gt_init_swizzling
Two easy opportunities to compact the code by using the existing helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-8-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:24 +01:00
Tvrtko Ursulin 500bfa380e drm/i915: Convert i915_gem_init_swizzling to intel_gt
Start using the newly introduced struct intel_gt to fuse together correct
logical init flow with uncore for more removal of implicit dev_priv in
mmio access.

v2:
 * Move code to i915_gem_fence_reg. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-7-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:22 +01:00
Daniele Ceraolo Spurio 58a111f03a drm/i915: make intel_wakeref work on the rpm struct
intel_runtime_pm is the only thing they use from the i915 structure,
so use that directly.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-9-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Daniele Ceraolo Spurio d858d5695f drm/i915: update rpm_get/put to use the rpm structure
The functions where internally already only using the structure, so we
need to just flip the interface.

v2: rebase

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Daniele Ceraolo Spurio 87b391b951 drm/i915: Remove rpm asserts that use i915
Quite a few of the call points have already switched to the version
working directly on the runtime_pm structure, so let's switch over the
rest and kill the i915-based asserts.

v2: rebase

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-3-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Chris Wilson 0cf289bd5d drm/i915: Move fence register tracking from i915->mm to ggtt
As the fence registers only apply to regions inside the GGTT is makes
more sense that we track these as part of the i915_ggtt and not the
general mm. In the next patch, we will then pull the register locking
underneath the i915_ggtt.mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613073254.24048-1-chris@chris-wilson.co.uk
2019-06-13 09:37:39 +01:00
Chris Wilson 1c8242c3a4 drm/i915: Use unchecked writes for setting up the fences
As the fence registers are not part of the engine powerwells, we do not
need to fiddle with forcewake in order to update a fence. Avoid using
the heavyweight debug checking normal mmio writes as the checking
dominates the selftest runtime and is superfluous!

In the process, retire the I915_WRITE() implicit macro with the new
intel_uncore_write interface.

v2: s/unc/uncore/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190604120022.20472-2-chris@chris-wilson.co.uk
2019-06-04 14:51:25 +01:00
Chris Wilson 37d63f8fdb drm/i915: Pull scatterlist utils out of i915_gem.h
Out scatterlist utility routines can be pulled out of i915_gem.h for a
bit more decluttering.

v2: Push I915_GTT_PAGE_SIZE out of i915_scatterlist itself and into the
caller.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-9-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Andy Shevchenko 6e514e3717 drm/i915: Switch to bitmap_zalloc()
Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that it returns pointer of bitmap type instead of opaque void *.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190304092908.57382-2-andriy.shevchenko@linux.intel.com
2019-03-20 17:50:35 +00:00
Ville Syrjälä 5c22786983 drm/i915: Reorder gen3/4 swizzle detection logic
g33/i964g/g45 are the exceptional cases when it comes to
the swizzle detection. Let's reorder the code to handle
them first and let everything else be handled by the
else branch. This allows us to unset .is_mobile for the
desktop PNV variant (which supposedly must follow the
"mobile" path here).

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190318165633.28924-1-ville.syrjala@linux.intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2019-03-20 16:57:00 +02:00
Chris Wilson c1d1746f6d drm/i915: Avoid reset lock in writing fence registers
The idea of taking the reset lock around writing the fence register was
to serialise the mmio write we also perform during the reset where those
registers get clobbered. However, the lock is overkill as write tearing
between reset and fence_update() is harmless; the final value of the
fence register is the same. A race between revoke_fences() and
fence_update() is also harmless at this point as on the fault path where
this is necessary, we acquire the reset lock to coordinate ourselves in
the upper layer.

The danger of acquiring the reset lock again in fence_update() is that
we may recurse from the shrinker along the i915_gem_fault() path.

<4> [125.739646] ============================================
<4> [125.739652] WARNING: possible recursive locking detected
<4> [125.739659] 5.0.0-rc6-ga6e4cbf00557-drmtip_223+ #1 Tainted: G     U
<4> [125.739666] --------------------------------------------
<4> [125.739672] gem_mmap_gtt/1017 is trying to acquire lock:
<4> [125.739679] 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x0/0x310 [i915]
<4> [125.739848]
but task is already holding lock:
<4> [125.739854] 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x192/0x310 [i915]
<4> [125.739918]
other info that might help us debug this:
<4> [125.739925]  Possible unsafe locking scenario:

<4> [125.739930]        CPU0
<4> [125.739934]        ----
<4> [125.739937]   lock(&dev_priv->gpu_error.reset_backoff_srcu);
<4> [125.739944]   lock(&dev_priv->gpu_error.reset_backoff_srcu);
<4> [125.739950]
 *** DEADLOCK ***

<4> [125.739958]  May be due to missing lock nesting notation

<4> [125.739966] 5 locks held by gem_mmap_gtt/1017:
<4> [125.739972]  #0: 00000000471f682c (&mm->mmap_sem){++++}, at: __do_page_fault+0x133/0x500
<4> [125.739987]  #1: 0000000026542685 (&dev->struct_mutex){+.+.}, at: i915_gem_fault+0x1f6/0x860 [i915]
<4> [125.740061]  #2: 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x192/0x310 [i915]
<4> [125.740126]  #3: 00000000c828eb4f (fs_reclaim){+.+.}, at: fs_reclaim_acquire.part.25+0x0/0x30
<4> [125.740140]  #4: 000000002d360d65 (shrinker_rwsem){++++}, at: shrink_slab+0x1cb/0x2c0
<4> [125.740151]
stack backtrace:
<4> [125.740159] CPU: 1 PID: 1017 Comm: gem_mmap_gtt Tainted: G     U            5.0.0-rc6-ga6e4cbf00557-drmtip_223+ #1
<4> [125.740170] Hardware name: Dell Inc.                 OptiPlex 745                 /0GW726, BIOS 2.3.1  05/21/2007
<4> [125.740180] Call Trace:
<4> [125.740189]  dump_stack+0x67/0x9b
<4> [125.740199]  __lock_acquire+0xc75/0x1b00
<4> [125.740209]  ? arch_tlb_finish_mmu+0x2a/0xa0
<4> [125.740216]  ? tlb_finish_mmu+0x1a/0x30
<4> [125.740222]  ? zap_page_range_single+0xe2/0x130
<4> [125.740230]  ? lock_acquire+0xa6/0x1c0
<4> [125.740237]  lock_acquire+0xa6/0x1c0
<4> [125.740296]  ? i915_clear_error_registers+0x280/0x280 [i915]
<4> [125.740357]  i915_reset_trylock+0x44/0x310 [i915]
<4> [125.740417]  ? i915_clear_error_registers+0x280/0x280 [i915]
<4> [125.740426]  ? lockdep_hardirqs_on+0xe0/0x1b0
<4> [125.740434]  ? _raw_spin_unlock_irqrestore+0x39/0x60
<4> [125.740499]  fence_update+0x218/0x470 [i915]
<4> [125.740571]  i915_vma_unbind+0xa6/0x550 [i915]
<4> [125.740640]  i915_gem_object_unbind+0xfa/0x190 [i915]
<4> [125.740711]  i915_gem_shrink+0x2dc/0x590 [i915]
<4> [125.740722]  ? ___preempt_schedule+0x16/0x18
<4> [125.740792]  ? i915_gem_shrinker_scan+0xc9/0x130 [i915]
<4> [125.740861]  i915_gem_shrinker_scan+0xc9/0x130 [i915]
<4> [125.740870]  do_shrink_slab+0x143/0x3f0
<4> [125.740878]  shrink_slab+0x228/0x2c0
<4> [125.740886]  shrink_node+0x167/0x450
<4> [125.740894]  do_try_to_free_pages+0xc4/0x340
<4> [125.740902]  try_to_free_pages+0xdc/0x2e0
<4> [125.740911]  __alloc_pages_nodemask+0x662/0x1110
<4> [125.740921]  ? reacquire_held_locks+0xb5/0x1b0
<4> [125.740928]  ? reacquire_held_locks+0xb5/0x1b0
<4> [125.740986]  ? i915_reset_trylock+0x192/0x310 [i915]
<4> [125.741045]  ? i915_memcpy_init_early+0x30/0x30 [i915]
<4> [125.741054]  pte_alloc_one+0x12/0x70
<4> [125.741060]  __pte_alloc+0x11/0xf0
<4> [125.741067]  apply_to_page_range+0x37e/0x440
<4> [125.741127]  remap_io_mapping+0x6c/0x100 [i915]
<4> [125.741196]  i915_gem_fault+0x5a9/0x860 [i915]
<4> [125.741204]  ? ptlock_alloc+0x15/0x30
<4> [125.741212]  __do_fault+0x2c/0xb0
<4> [125.741218]  __handle_mm_fault+0x8ee/0xfa0
<4> [125.741227]  handle_mm_fault+0x196/0x3a0
<4> [125.741235]  __do_page_fault+0x246/0x500
<4> [125.741243]  ? page_fault+0x8/0x30
<4> [125.741250]  page_fault+0x1e/0x30
<4> [125.741256] RIP: 0033:0x55d0cc456e12
<4> [125.741264] Code: b0 df ff ff 89 c2 8b 85 70 df ff ff 01 c2 8b 85 70 df ff ff 48 98 48 8d 0c 85 00 00 00 00 48 8b 85 e0 df ff ff 48 01 c8 f7 d2 <89> 10 83 85 70 df ff ff 01 81 bd 70 df ff ff ff 03 00 00 7e be 48
<4> [125.741280] RSP: 002b:00007ffc1bab7ab0 EFLAGS: 00010206
<4> [125.741287] RAX: 00007fc787cb6000 RBX: 0000000000000000 RCX: 0000000000000000
<4> [125.741295] RDX: 00000000ffffffff RSI: 0000000000005401 RDI: 0000000000000002
<4> [125.741303] RBP: 00007ffc1bab9b70 R08: 00007ffc1bab7920 R09: 000000000000001b
<4> [125.741310] R10: 7165722074736554 R11: 0000000000000246 R12: 000055d0cc454a80
<4> [125.741318] R13: 00007ffc1bab9f60 R14: 0000000000000000 R15: 0000000000000000

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109665
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190219122215.8941-4-chris@chris-wilson.co.uk
2019-02-20 16:40:13 +00:00
Chris Wilson 2caffbf117 drm/i915: Revoke mmaps and prevent access to fence registers across reset
Previously, we were able to rely on the recursive properties of
struct_mutex to allow us to serialise revoking mmaps and reacquiring the
FENCE registers with them being clobbered over a global device reset.
I then proceeded to throw out the baby with the bath water in order to
pursue a struct_mutex-less reset.

Perusing LWN for alternative strategies, the dilemma on how to serialise
access to a global resource on one side was answered by
https://lwn.net/Articles/202847/ -- Sleepable RCU:

    1  int readside(void) {
    2      int idx;
    3      rcu_read_lock();
    4	   if (nomoresrcu) {
    5          rcu_read_unlock();
    6	       return -EINVAL;
    7      }
    8	   idx = srcu_read_lock(&ss);
    9	   rcu_read_unlock();
    10	   /* SRCU read-side critical section. */
    11	   srcu_read_unlock(&ss, idx);
    12	   return 0;
    13 }
    14
    15 void cleanup(void)
    16 {
    17     nomoresrcu = 1;
    18     synchronize_rcu();
    19     synchronize_srcu(&ss);
    20     cleanup_srcu_struct(&ss);
    21 }

No more worrying about stop_machine, just an uber-complex mutex,
optimised for reads, with the overhead pushed to the rare reset path.

However, we do run the risk of a deadlock as we allocate underneath the
SRCU read lock, and the allocation may require a GPU reset, causing a
dependency cycle via the in-flight requests. We resolve that by declaring
the driver wedged and cancelling all in-flight rendering.

v2: Use expedited rcu barriers to match our earlier timing
characteristics.
v3: Try to annotate locking contexts for sparse
v4: Reduce selftest lock duration to avoid a reset deadlock with fences
v5: s/srcu/reset_backoff_srcu/
v6: Remove more stale comments

Testcase: igt/gem_mmap_gtt/hang
Fixes: eb8d0f5af4 ("drm/i915: Remove GPU reset dependence on struct_mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208153708.20023-2-chris@chris-wilson.co.uk
2019-02-08 16:47:32 +00:00
Chris Wilson 7ae1940014 drm/i915: Defer removing fence register tracking to rpm wakeup
Currently, we may simultaneously release the fence register from both
fence_update() and i915_gem_restore_fences(). This is dangerous, so
defer the bookkeeping entirely to i915_gem_restore_fences() when the
device is asleep.

Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208153708.20023-1-chris@chris-wilson.co.uk
2019-02-08 16:47:30 +00:00
Chris Wilson 21950ee7cc drm/i915: Pull i915_gem_active into the i915_active family
Looking forward, we need to break the struct_mutex dependency on
i915_gem_active. In the meantime, external use of i915_gem_active is
quite beguiling, little do new users suspect that it implies a barrier
as each request it tracks must be ordered wrt the previous one. As one
of many, it can be used to track activity across multiple timelines, a
shared fence, which fits our unordered request submission much better. We
need to steer external users away from the singular, exclusive fence
imposed by i915_gem_active to i915_active instead. As part of that
process, we move i915_gem_active out of i915_request.c into
i915_active.c to start separating the two concepts, and rename it to
i915_active_request (both to tie it to the concept of tracking just one
request, and to give it a longer, less appealing name).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk
2019-02-05 17:20:11 +00:00
Jani Nikula 739f3abdbf drm/i915: small isolated c99 types to kernel types switch
Mixed C99 and kernel types use is getting ugly. Prefer kernel types.

sed -i 's/\buint\(8\|16\|32\|64\)_t\b/u\1/g'

Minor checkpatch fixes sprinkled on top of the changed lines.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/14ed72e7f04c9340a057855c5950b54811f8a477.1547629303.git.jani.nikula@intel.com
2019-01-17 09:02:00 +02:00
Chris Wilson 538ef96b9d drm/i915/gem: Track the rpm wakerefs
Keep track of the temporary rpm wakerefs used for user access to the
device, so that we can cancel them upon release and clearly identify any
leaks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-10-chris@chris-wilson.co.uk
2019-01-14 16:18:13 +00:00
Chris Wilson 16e4dd0342 drm/i915: Markup paired operations on wakerefs
The majority of runtime-pm operations are bounded and scoped within a
function; these are easy to verify that the wakeref are handled
correctly. We can employ the compiler to help us, and reduce the number
of wakerefs tracked when debugging, by passing around cookies provided
by the various rpm_get functions to their rpm_put counterpart. This
makes the pairing explicit, and given the required wakeref cookie the
compiler can verify that we pass an initialised value to the rpm_put
(quite handy for double checking error paths).

For regular builds, the compiler should be able to eliminate the unused
local variables and the program growth should be minimal. Fwiw, it came
out as a net improvement as gcc was able to refactor rpm_get and
rpm_get_if_in_use together,

v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual
mark up for smaller more targeted patches.
v3: Mention the cookie in Returns

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk
2019-01-14 16:17:53 +00:00
Jani Nikula 2f80d7bd8d drm/i915: drop all drmP.h includes
Needs just a few additional includes here and there.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190108082709.3748-1-jani.nikula@intel.com
2019-01-09 10:26:36 +02:00
Lucas De Marchi cf819eff90 drm/i915: replace IS_GEN<N> with IS_GEN(..., N)
Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of
gen_mask to do the comparison. Now callers can pass then gen as a parameter,
so we don't require one macro for each gen.

The following spatch was used to convert the users of these macros:

@@
expression e;
@@
(
- IS_GEN2(e)
+ IS_GEN(e, 2)
|
- IS_GEN3(e)
+ IS_GEN(e, 3)
|
- IS_GEN4(e)
+ IS_GEN(e, 4)
|
- IS_GEN5(e)
+ IS_GEN(e, 5)
|
- IS_GEN6(e)
+ IS_GEN(e, 6)
|
- IS_GEN7(e)
+ IS_GEN(e, 7)
|
- IS_GEN8(e)
+ IS_GEN(e, 8)
|
- IS_GEN9(e)
+ IS_GEN(e, 9)
|
- IS_GEN10(e)
+ IS_GEN(e, 10)
|
- IS_GEN11(e)
+ IS_GEN(e, 11)
)

v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than
    using the bitmask

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com
2018-12-12 16:52:10 -08:00
Tvrtko Ursulin c56b89f16d drm/i915: Use INTEL_GEN everywhere
Coccinelle patch:

 @@
 identifier p;
 @@
 -INTEL_INFO(p)->gen
 +INTEL_GEN(p)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180208130606.15556-12-tvrtko.ursulin@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180209215847.6660-1-chris@chris-wilson.co.uk
2018-02-09 22:29:02 +00:00
Chris Wilson 5b364bec5c drm/i915: Flush ggtt writes through the old fenced vma before changing fences
This is a precautionary measure as I have no evidence to suggest we've
hit a bug here (I was hoping this might explain gdg's odd behaviour, but
alas), but given that we have a function to flush the ggtt writes it
seems prudent to use it prior to changing the fence register. Due to the
intrinsic nature of the GTT often operating as an independent mmio path,
we should not just rely on the write to the fence acting as a full flush
for GTT writes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180130164457.14037-1-chris@chris-wilson.co.uk
2018-01-31 10:49:16 +00:00
Chris Wilson a65adaf8a8 drm/i915: Track user GTT faulting per-vma
We don't wish to refault the entire object (other vma) when unbinding
one partial vma. To do this track which vma have been faulted into the
user's address space.

v2: Use a local vma_offset to tidy up a multiline unmap_mapping_range().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-3-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-09 17:07:29 +01:00
Chris Wilson 3bd4073524 drm/i915: Consolidate get_fence with pin_fence
Following the pattern now used for obj->mm.pages, use just pin_fence and
unpin_fence to control access to the fence registers. I.e. instead of
calling get_fence(); pin_fence(), we now just need to call pin_fence().
This will make it easier to reduce the locking requirements around
fence registers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-09 17:07:29 +01:00
Changbin Du 969b0950a1 drm/i915: Add interface to reserve fence registers for vGPU
In the past, vGPU alloc fence registers by walking through mm.fence_list
to find fence which pin_count = 0 and vma is empty. vGPU may not find
enough fence registers this way. Because a fence can be bind to vma even
though it is not in using. We have found such failure many times these
days.

An option to resolve this issue is that we can force-remove fence from
vma in this case.

This patch added two new api to the fence management code:
 - i915_reserve_fence() will try to find a free fence from fence_list
   and force-remove vma if need.
 - i915_unreserve_fence() reclaim a reserved fence after vGPU has
   finished.

With this change, the fence management is more clear to work with vGPU.
GVTg do not need remove fence from fence_list in private.

v3: (Chris)
  - Add struct_mutex lock assertion.
  - Only count for unpinned fence.

v2: (Chris)
  - Rename the new api for symmetry.
  - Add safeguard to ensure at least 1 fence remained for host display.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1504512061-5892-1-git-send-email-changbin.du@intel.com
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-04 16:34:59 +01:00
Chris Wilson 181df2d458 drm/i915: Take rpm wakelock for releasing the fence on unbind
Unbind the vma may happen at any time, outside of the normal GT wakeref.
As such it relies on having a wakeref of its own. However, we can forgo
clearing the register whilst the device is asleep and just mark it as
unused - so that when we do wake up the device, we will clear the unused
fence register (see i915_gem_restore_fences).

[22423.944631] WARNING: CPU: 3 PID: 26178 at drivers/gpu/drm/i915/intel_drv.h:1739 i915_vma_put_fence+0xf3/0x100 [i915]
[22423.946053] RPM wakelock ref not held during HW access
[22423.946056] Modules linked in: vgem(E) i915(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) x86_pkg_temp_thermal(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) intel_gtt(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) evdev(E) aesni_intel(E) aes_x86_64(E) crypto_simd(E) cryptd(E) glue_helper(E) sysimgblt(E) fb_sys_fops(E) prime_numbers(E) drm(E) efivars(E) mei_me(E) lpc_ich(E) mei(E) mfd_core(E) battery(E) video(E) acpi_pad(E) button(E) tpm_tis(E) tpm_tis_core(E) tpm(E) autofs4(E) i2c_i801(E) thermal(E) fan(E) i2c_designware_platform(E) i2c_designware_core(E)
[22423.946438] CPU: 2 PID: 26178 Comm: gem_concurrent_ Tainted: G            E   4.10.0+ #101
[22423.946513] Hardware name: ��������������������������������� ���������������������������������/���������������������������������, BIOS RYBDWi35.86A.0246.2
[22423.946600] Call Trace:
[22423.946641]  dump_stack+0x68/0x9f
[22423.946703]  __warn+0x107/0x130
[22423.946763]  warn_slowpath_fmt+0xa8/0xe0
[22423.946825]  ? __warn+0x130/0x130
[22423.946868]  ? free_hot_cold_page_list+0x53/0x70
[22423.946942]  ? mark_lock+0xcc/0x7f0
[22423.946997]  ? __lock_is_held+0x84/0x100
[22423.947115]  ? i915_vma_put_fence+0x64/0x100 [i915]
[22423.947224]  i915_vma_put_fence+0xf3/0x100 [i915]
[22423.947335]  i915_vma_unbind+0x4da/0x560 [i915]
[22423.947387]  ? rb_erase+0x812/0x8a0
[22423.947439]  ? kfree+0xa2/0xd0
[22423.947562]  i915_vma_close+0x159/0x180 [i915]
[22423.947674]  intel_ring_free+0x31/0x50 [i915]
[22423.947776]  i915_gem_context_free+0x1ff/0x3d0 [i915]
[22423.947887]  context_close+0x106/0x110 [i915]
[22423.947989]  context_idr_cleanup+0xc/0x10 [i915]
[22423.948041]  idr_for_each+0x14d/0x1d0
[22423.948158]  ? context_close+0x110/0x110 [i915]
[22423.948206]  ? get_from_free_list+0x70/0x70
[22423.948261]  ? __lock_is_held+0x84/0x100
[22423.948325]  ? __mutex_unlock_slowpath+0xd4/0x400
[22423.948448]  i915_gem_context_close+0x4b/0x90 [i915]
[22423.948544]  i915_driver_preclose+0x28/0x50 [i915]
[22423.948620]  drm_release+0x175/0x690 [drm]
[22423.948681]  ? fcntl_setlk+0x5e0/0x5e0
[22423.948746]  __fput+0x17d/0x300
[22423.948807]  ____fput+0x9/0x10
[22423.948859]  task_work_run+0xa7/0xe0
[22423.948924]  do_exit+0x4d2/0x13e0
[22423.948986]  ? mm_update_next_owner+0x320/0x320
[22423.949051]  ? __do_page_fault+0x209/0x5c0
[22423.949110]  ? mark_held_locks+0x23/0xc0
[22423.949166]  ? entry_SYSCALL_64_fastpath+0x5/0xb1
[22423.949232]  do_group_exit+0x93/0x160
[22423.949289]  SyS_exit_group+0x18/0x20
[22423.949350]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[22423.949403] RIP: 0033:0x7f9cc2e154c8
[22423.949484] RSP: 002b:00007ffd7e81b448 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[22423.949557] RAX: ffffffffffffffda RBX: ffffffff810ef1f0 RCX: 00007f9cc2e154c8
[22423.949617] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[22423.949677] RBP: ffff880367e9ff98 R08: 00000000000000e7 R09: ffffffffffffff88
[22423.949741] R10: 00007f9cc1d5c000 R11: 0000000000000246 R12: 00007f9cc30f6c30
[22423.949798] R13: 0000000000000000 R14: 00007f9cc30f6c20 R15: 0000000000000003
[22423.949868]  ? trace_hardirqs_off_caller+0xc0/0x110

v2: Move the rpm check down a layer so that we still perform the
vma/fence update required for the deferred mmio write on resume.
v3: Don't touch i915_gem_object_set_cache_level() and leave the rpm to
the low level routines (such as i915_vma_put_fence).
v4: vma may be null in fence_write, so extract drm_i915_private from
fence->i915

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170306092916.11623-3-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-03-06 14:38:18 +00:00
Chris Wilson f51455d442 drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE
Start converting over from the byte count to its semantic macro, either
we want to allocate the size of a physical page in main memory or we
want the size of a virtual page in the GTT. 4096 could mean either, but
PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve
code comprehension and future changes. In the future, we may want to use
variable GTT page sizes and so have the challenge of knowing which
hardcoded values were used to represent a physical page vs the virtual
page.

v2: Look for a few more 4096s to convert, discover IS_ALIGNED().
v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep
bdw stolen w/a as 4096 until we know better.
v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page
sizes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-10 20:54:32 +00:00
Chris Wilson cea84d16c3 drm/i915: Remove the rounding down of the gen4+ fence region
Restricting the fence to the end of the previous tile-row breaks access
to the final portion of the object. On gen2/3 we employed lazy fencing
to pad out the fence with scratch page to provide access to the tail,
and now we also pad out the object on gen4+ we can apply the same fix.

Fixes: af1a7301c7 ("drm/i915: Only fence tiled region of object.")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-5-chris@chris-wilson.co.uk
2017-01-10 08:12:21 +00:00
Chris Wilson 944397f04f drm/i915: Store required fence size/alignment for GGTT vma
The fence size/alignment is a combination of the vma size plus object
tiling parameters. Those parameters are rarely changed, making the fence
size/alignemnt roughly constant for the lifetime of the VMA. We can
simplify subsequent calculations by precalculating the size/alignment
required for GGTT vma taking fencing into account (with an update if we
do change the tiling or stride).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-4-chris@chris-wilson.co.uk
2017-01-10 08:12:21 +00:00
Chris Wilson 0d4e8f1dbc drm/i915: Replace WARNs in fence register writes with extensive asserts
All of these conditions are prechecked by i915_tiling_ok() before we
allow setting the tiling/stride on the object and so we should never
fail asserting those conditions before writing the register.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-3-chris@chris-wilson.co.uk
2017-01-10 08:12:20 +00:00
Chris Wilson 6649a0b650 drm/i915: Extract tile_row_size for fencing
Computing the tile row size of a tiled object (for use with fence
registers) is repeated, so extract it to a common helper.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-10 08:12:09 +00:00
Chris Wilson b1ed35d917 drm/i915: Revoke fenced GTT mmapings across GPU reset
The fence registers are clobbered by a GPU reset. If there is concurrent
user access to a fenced region via a GTT mmaping, the access will not be
fenced during the reset (until we restore the fences afterwards). In order
to prevent invalid access during the reset, before we clobber the fences
first we must invalidate the GTT mmapings. Access to the mmap will then
be forced to fault in the page, and in handling the fault, i915_gem_fault()
will take the struct_mutex and wait upon the reset to complete.

v2: Fix up commentary.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99274
Testcase: igt/gem_mmap_gtt/hang
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170104145110.1486-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-01-04 18:44:30 +00:00
Jani Nikula 9eebfdbff2 drm/i915: simplify check for I915G/I945G in bit 6 swizzling detection
Commit c9c4b6f6c2 ("drm/i915: fix swizzle detection for gen3") added a
complicated check for I915G/I945G. Pineview and other gen3 devices match
IS_MOBILE() anyway. Simplify.

Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1481627459-488-1-git-send-email-jani.nikula@intel.com
2016-12-14 15:18:52 +02:00
Jani Nikula 73f67aa8cc drm/i915: distinguish G33 and Pineview from each other
Pineview deserves to use its own platform enum (which was already added,
unused, previously). IS_G33() no longer matches Pineview, and gets
replaced by IS_G33() || IS_PINEVIEW() or equivalent. Pineview is no
longer an outlier among platform definitions.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1481143689-19672-1-git-send-email-jani.nikula@intel.com
2016-12-07 23:28:33 +02:00
Chris Wilson 49d73912cb drm/i915: Convert vm->dev backpointer to vm->i915
99% of the time we access i915_address_space->dev we want the i915
device and not the drm device, so let's store the drm_i915_private
backpointer instead. The only real complication here are the inlines
in i915_vma.h where drm_i915_private is not yet defined and so we have
to choose an alternate path for our asserts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161129095008.32622-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-29 11:38:00 +00:00
Tvrtko Ursulin 4362f4f6dd drm/i915: Use dev_priv in INTEL_INFO in i915_gem_fence_reg.c
Plus a small cascade of function prototype changes.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-17 13:56:06 +00:00
Joonas Lahtinen b42fe9ca0a drm/i915: Split out i915_vma.c
As a side product, had to split two other files;
- i915_gem_fence_reg.h
- i915_gem_object.h (only parts that needed immediate untanglement)

I tried to move code in as big chunks as possible, to make review
easier. i915_vma_compare was moved to a header temporarily.

v2:
- Use i915_gem_fence_reg.{c,h}

v3:
- Rebased

v4:
- Fix building when DEBUG_GEM is enabled by reordering a bit.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-11-11 14:34:54 +02:00