Commit Graph

38 Commits

Author SHA1 Message Date
Chris Wilson ea746f3659 drm/i915: Expand bool interruptible to pass flags to i915_wait_request()
We need finer control over wakeup behaviour during i915_wait_request(),
so expand the current bool interruptible to a bitmask.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-9-chris@chris-wilson.co.uk
2016-09-09 14:23:03 +01:00
Chris Wilson 3b4e896f14 drm/i915: Remove unused no-shrinker-steal
After removing the user of this wart, we can remove the wart entirely.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-10-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:39 +01:00
Chris Wilson 5cba5be6b6 drm/i915/shrinker: Wait before acquiring struct_mutex under oom
We can now wait for the GPU (all engines) to become idle without
requiring the struct_mutex. Inside the shrinker, we need to currently
take the struct_mutex in order to purge objects and to purge the objects
we need the GPU to be idle - causing a stall whilst we hold the
struct_mutex. We can hide most of that stall by performing the wait
before taking the struct_mutex and only doing essential waits for
new rendering on objects to be freed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-8-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:38 +01:00
Chris Wilson dcff85c844 drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
The principal motivation for this was to try and eliminate the
struct_mutex from i915_gem_suspend - but we still need to hold the mutex
current for the i915_gem_context_lost(). (The issue there is that there
may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and
struct_mutex via the stop_machine().) For the moment, enabling last
request tracking for the engine, allows us to do busyness checking and
waiting without requiring the struct_mutex - which is useful in its own
right.

As a side-effect of having a robust means for tracking engine busyness,
we can replace our other busyness heuristic, that of comparing against
the last submitted seqno. For paranoid reasons, we have a semi-ordered
check of that seqno inside the hangchecker, which we can now improve to
an ordered check of the engine's busyness (removing a locked xchg in the
process).

v2: Pass along "bool interruptible" as being unlocked we cannot rely on
i915->mm.interruptible being stable or even under our control.
v3: Replace check Ironlake i915_gpu_busy() with the common precalculated value

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:37 +01:00
Chris Wilson 0eafec6d32 drm/i915: Enable lockless lookup of request tracking via RCU
If we enable RCU for the requests (providing a grace period where we can
inspect a "dead" request before it is freed), we can allow callers to
carefully perform lockless lookup of an active request.

However, by enabling deferred freeing of requests, we can potentially
hog a lot of memory when dealing with tens of thousands of requests per
second - with a quick insertion of a synchronize_rcu() inside our
shrinker callback, that issue disappears.

v2: Currently, it is our responsibility to handle reclaim i.e. to avoid
hogging memory with the delayed slab frees. At the moment, we wait for a
grace period in the shrinker, and block for all RCU callbacks on oom.
Suggested alternatives focus on flushing our RCU callback when we have a
certain number of outstanding request frees, and blocking on that flush
after a second high watermark. (So rather than wait for the system to
run out of memory, we stop issuing requests - both are nondeterministic.)

Paul E. McKenney wrote:

Another approach is synchronize_rcu() after some largish number of
requests.  The advantage of this approach is that it throttles the
production of callbacks at the source.  The corresponding disadvantage
is that it slows things up.

Another approach is to use call_rcu(), but if the previous call_rcu()
is still in flight, block waiting for it.  Yet another approach is
the get_state_synchronize_rcu() / cond_synchronize_rcu() pair.  The
idea is to do something like this:

        cond_synchronize_rcu(cookie);
        cookie = get_state_synchronize_rcu();

You would of course do an initial get_state_synchronize_rcu() to
get things going.  This would not block unless there was less than
one grace period's worth of time between invocations.  But this
assumes a busy system, where there is almost always a grace period
in flight.  But you can make that happen as follows:

        cond_synchronize_rcu(cookie);
        cookie = get_state_synchronize_rcu();
        call_rcu(&my_rcu_head, noop_function);

Note that you need additional code to make sure that the old callback
has completed before doing a new one.  Setting and clearing a flag
with appropriate memory ordering control suffices (e.g,. smp_load_acquire()
and smp_store_release()).

v3: More comments on compiler and processor order of operations within
the RCU lookup and discover we can use rcu_access_pointer() here instead.

v4: Wrap i915_gem_active_get_rcu() to take the rcu_read_lock itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: "Goel, Akash" <akash.goel@intel.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-25-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:20:05 +01:00
Chris Wilson 573adb3962 drm/i915: Move obj->active:5 to obj->flags
We are motivated to avoid using a bitfield for obj->active for a couple
of reasons. Firstly, we wish to document our lockless read of obj->active
using READ_ONCE inside i915_gem_busy_ioctl() and that requires an
integral type (i.e. not a bitfield). Secondly, gcc produces abysmal code
when presented with a bitfield and that shows up high on the profiles of
request tracking (mainly due to excess memory traffic as it converts
the bitfield to a register and back and generates frequent AGI in the
process).

v2: BIT, break up a long line in compute the other engines, new paint
for i915_gem_object_is_active (now i915_gem_object_get_active).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-23-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:20:04 +01:00
Chris Wilson 3272db5313 drm/i915: Combine all i915_vma bitfields into a single set of flags
In preparation to perform some magic to speed up i915_vma_pin(), which
is among the hottest of hot paths in execbuf, refactor all the bitfields
accessed by i915_vma_pin() into a single unified set of flags.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-16-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:59 +01:00
Chris Wilson aa653a685d drm/i915: Be more careful when unbinding vma
When we call i915_vma_unbind(), we will wait upon outstanding rendering.
This will also trigger a retirement phase, which may update the object
lists. If, we extend request tracking to the VMA itself (rather than
keep it at the encompassing object), then there is a potential that the
obj->vma_list be modified for other elements upon i915_vma_unbind(). As
a result, if we walk over the object list and call i915_vma_unbind(), we
need to be prepared for that list to change.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-8-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:18 +01:00
Chris Wilson 15717de219 drm/i915: Count how many VMA are bound for an object
Since we may have VMA allocated for an object, but we interrupted their
binding, there is a disparity between have elements on the obj->vma_list
and being bound. i915_gem_obj_bound_any() does this check, but this is
not rigorously observed - add an explicit count to make it easier.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-7-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:17 +01:00
Chris Wilson 2a1d775201 drm/i915: Prefer list_first_entry_or_null
list_first_entry_or_null() can generate better code than using
if (!list_empty()) {ptr = list_first_entry()) ..., so put it to use.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-3-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469530913-17180-2-git-send-email-chris@chris-wilson.co.uk
2016-07-26 13:00:58 +01:00
Chris Wilson f8c417cdb1 drm/i915: Rename drm_gem_object_unreference in preparation for lockless free
Ultimately wraps kref_put(), so adopt its nomenclature for consistency
with other subsystems.

s/drm_gem_object_unreference/i915_gem_object_put/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-6-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-5-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:12 +01:00
Chris Wilson 25dc556a2a drm/i915: Wrap drm_gem_object_reference in i915_gem_object_get
Ultimately wraps kref_get(), so adopt its nomenclature for consistency
with other subsystems.

s/drm_gem_object_reference/i915_gem_object_get/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-4-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:11 +01:00
Chris Wilson 4f074a5393 drm/i915: Update ifdeffery for mutex->owner
In commit 7608a43d8f ("locking/mutexes: Use MUTEX_SPIN_ON_OWNER when
appropriate") the owner field in the mutex was updated from being
dependent upon CONFIG_SMP to using optimistic spin. Update our peek
function to suite.

Fixes:7608a43d8f2e ("locking/mutexes: Use MUTEX_SPIN_ON_OWNER...")
Reported-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1468244777-4888-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-12 11:54:27 +01:00
Chris Wilson 91c8a326a1 drm/i915: Convert dev_priv->dev backpointers to dev_priv->drm
Since drm_i915_private is now a subclass of drm_device we do not need to
chase the drm_i915_private->dev backpointer and can instead simply
access drm_i915_private->drm directly.

   text	   data	    bss	    dec	    hex	filename
1068757	   4565	    416	1073738	 10624a	drivers/gpu/drm/i915/i915.ko
1066949	   4565	    416	1071930	 105b3a	drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:
@@
struct drm_i915_private *d;
identifier i;
@@
(
- d->dev->i
+ d->drm.i
|
- d->dev
+ &d->drm
)

and for good measure the dev_priv->dev backpointer was removed entirely.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk
2016-07-05 11:58:45 +01:00
Chris Wilson bed50aea61 drm/i915/shrinker: Flush active on objects before counting
As we inspect obj->active to decide how many objects we can shrink (we
only shrink idle objects), it helps to flush the active lists first
in order to have a more accurate count of available objects.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-1-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:42:27 +01:00
Chris Wilson 6e5a5beb8e drm/i915: Split idling from forcing context switch
We only need to force a switch to the kernel context placeholder during
eviction. All other uses of i915_gpu_idle() just want to wait until
existing work on the GPU is idle. Rename i915_gpu_idle() to
i915_gem_wait_for_idle() to avoid any implications about "parking" the
context first.

v2: Tweak an error message if the wait fails for the ilk vtd w/a

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466776558-21516-6-git-send-email-chris@chris-wilson.co.uk
2016-06-24 15:03:14 +01:00
Chris Wilson c033666a94 drm/i915: Store a i915 backpointer from engine, and use it
text	   data	    bss	    dec	    hex	filename
6309351	3578714	 696320	10584385	 a18141	vmlinux
6308391	3578714	 696320	10583425	 a17d81	vmlinux

Almost 1KiB of code reduction.

v2: More s/INTEL_INFO()->gen/INTEL_GEN()/ and IS_GENx() conversions

   text	   data	    bss	    dec	    hex	filename
6304579	3578778	 696320	10579677	 a16edd	vmlinux
6303427	3578778	 696320	10578525	 a16a5d	vmlinux

Now over 1KiB!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462545621-30125-3-git-send-email-chris@chris-wilson.co.uk
2016-05-09 13:41:24 +01:00
Praveen Paneri ea9d9768a4 drm/i915: Add rpm get/put in oom and vmap notifier
i915_gem_shrink() will scan the bound list only if device is not
suspended but in OOM failure scenario it becomes absolutely necessary
to release as much memory as possible. Also in allocation failure from
vmap address space, it is incumbent on the Driver to reap all its
vmaps. So, adding rpm get/put in i915_gem_shrinker_oom() and
i915_gem_shrinker_vmap() to ensure shrinking of bound objects as well.

Signed-off-by: Praveen Paneri <praveen.paneri@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1462178429-13449-2-git-send-email-praveen.paneri@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-05-02 14:02:27 +01:00
Praveen Paneri 178a30c90a drm/i915: Unbind objects in shrinker only if device is runtime active
When the system is running low on memory, gem shrinker is invoked.
In this process objects will be unbounded from GTT and unbinding process
will require access to GTT(GTTADR) and also to fence register potentially.
That requires a resume of gfx device, if suspended, in the shrinker path.
Considering the power leakage due to intermediate resume, perform unbinding
operation only if device is already runtime active.

v2: Use newly implemented intel_runtime_pm_get_if_in_use (Chris)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Praveen Paneri <praveen.paneri@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1462178429-13449-1-git-send-email-praveen.paneri@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-05-02 14:02:09 +01:00
Chris Wilson 8ef8561f2c drm/i915: Move ioremap_wc tracking onto VMA
By tracking the iomapping on the VMA itself, we can share that area
between multiple users. Also by only revoking the iomapping upon
unbinding from the mappable portion of the GGTT, we can keep that iomap
across multiple invocations (e.g. execlists context pinning).

Note that by moving the iounnmap tracking to the VMA, we actually end up
fixing a leak of the iomapping in intel_fbdev.

v1.5: Rebase prompted by Tvrtko
v2: Drop dev_priv parameter, we can recover the i915_ggtt from the vma.
v3: Move handling of ioremap space exhaustion to vmap_purge and also
allow vmallocs to recover old iomaps. Add Tvrtko's kerneldoc.
v4: Fix a use-after-free in shrinker and rearrange i915_vma_iomap
v5: Back to i915_vm_to_ggtt
v6: Use i915_vma_pin_iomap and i915_vma_unpin_iomap to mark critical
sections and ensure the VMA cannot be reaped whilst mapped.
v7: Move i915_vma_iounmap so that consumers of the API are not tempted,
and add iomem annotations

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-5-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Chris Wilson 1bec9b0bda drm/i915/shrinker: Only shmemfs objects are backed by swap
Since we can only swap out shmemfs objects, those are the only ones that
can influence the ability of the shrinker to free pages. Currently, all
non-shmemfs objects have a raised pages_pin_count to protect them from
the shrinker, so this just makes the logic for can_release_pages()
clearer (and safer in future so that we don't over estimate our ability
to free up pages from future non-swappable objects).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461150592-27818-3-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-04-20 13:49:44 +01:00
Chris Wilson 1768d4550c drm/i915/shrinker: Report "unevictable" pages
Inside the shrinker we call can_release_pages() to indicate whether or
not we can make forward progress in freeing up memory by unbinding that
object. When adding our report to oom, we should be using the same
logic.

Whilst here, change the reporting from bytes to pages so that it looks
smaller to the user!, is consistent with the neighbouring oom report
itself which displays counts in pages, and makes the unsigned long
overflow less likely.

v2: Split oversized format string into two lines

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461150592-27818-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-04-20 13:47:43 +01:00
Chris Wilson ca61d4a425 drm/i915/shrinker: Only report objects with extra pinned pages as pinned
When iterating over the bound list, we expect all objects there to have
their pages pinned (by the bound VMA). So only report those objects with
additional pin count on their pages as "pinned". These should be those
objects used for display and hardware access.

Reported-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461150592-27818-1-git-send-email-chris@chris-wilson.co.uk
2016-04-20 13:44:19 +01:00
Chris Wilson eae2c43b12 drm/i915/shrinker: Restrict vmap purge to objects with vmaps
When called because we have run out of vmap address space, we only need
to recover objects that have vmappings and not all.

v2: Start using is_vmalloc_addr()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2016-04-11 17:12:16 +01:00
Chris Wilson 168cf367d7 drm/i915/shrinker: Refactor common uninterruptible locking
Both the oom and vmap notifier callbacks have a loop to acquire the
struct_mutex and set the device as uninterruptible, within a certain
time. Refactor the common code into a pair of functions.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459848145-24042-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-04-05 11:13:52 +01:00
Chris Wilson e87666b52f drm/i915/shrinker: Hook up vmap allocation failure notifier
If the core runs out of vmap address space, it will call a notifier in
case any driver can reap some of its vmaps. As i915.ko is possibily
holding onto vmap address space that could be recovered, hook into the
notifier chain and try and reap objects holding onto vmaps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Roman Pen <r.peniaev@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kahola <mika.kahola@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459777603-23618-4-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-04-05 11:13:15 +01:00
Chris Wilson 6f0ac2047a drm/i915/shrinker: Account for unshrinkable unbound pages
Since we only attempt to purge an object if can_release_pages() report
true, we should also only add it to the count of potential recoverable
pages when can_release_pages() is true.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Akash Goel <akash.goel@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459777603-23618-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-04-05 11:11:16 +01:00
Chris Wilson 1c7f4bca5a drm/i915: Rename vma->*_list to *_link for consistency
Elsewhere we have adopted the convention of using '_link' to denote
elements in the list (and '_list' for the actual list_head itself), and
that the name should indicate which list the link belongs to (and
preferrably not just where the link is being stored).

s/vma_link/obj_link/ (we iterate over obj->vma_list)
s/mm_list/vm_link/ (we iterate over vm->[in]active_list)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-02-26 13:15:39 +00:00
Imre Deak a8a4058925 drm/i915: Sanitize GEM shrinker init and clean-up
Factor out the common GEM shrinker clean-up code and call the shrinker
init function from the same function from where the corresponding
shrinker clean-up function is called. Also add sanity checking to the
shrinker and OOM registration calls.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453209992-25995-4-git-send-email-imre.deak@intel.com
2016-01-27 17:43:14 +02:00
Chris Wilson c1a415e261 drm/i915: Disable shrinker for non-swapped backed objects
If the system has no available swap pages, we cannot make forward
progress in the shrinker by releasing active pages, only by releasing
purgeable pages which are immediately reaped. Take total_swap_pages into
account when counting up available objects to be shrunk and subsequently
shrinking them. By doing so, we avoid unbinding objects that cannot be
shrunk and so wasting CPU cycles flushing those objects from the GPU to
the system and then immediately back again (as they will more than
likely be reused shortly after).

Based on a patch by Akash Goel.

v2: frontswap registers extra swap pages available for the system, so it
is already include in the count of available swap pages.

v3: Use get_nr_swap_pages() to query the currently available amount of
swap space. This should also stop us from shrinking the GPU buffers if
we ever run out of swap space. Though at that point, we would expect the
oom-notifier to be running and failing miserably...

Reported-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-mm@kvack.org
Cc: Akash Goel <akash.goel@intel.com>
Cc: sourab.gupta@intel.com
Link: http://patchwork.freedesktop.org/patch/msgid/1449244734-25733-2-git-send-email-chris@chris-wilson.co.uk
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-01-05 11:05:38 +01:00
Chris Wilson 5763ff04dc drm/i915: Avoid GPU stalls from kswapd
Exclude active GPU pages from the purview of the background shrinker
(kswapd), as these cause uncontrollable GPU stalls. Given that the
shrinker is rerun until the freelists are satisfied, we should have
opportunity in subsequent passes to recover the pages once idle. If the
machine does run out of memory entirely, we have the forced idling in the
oom-notifier as a means of releasing all the pages we can before an oom
is prematurely executed.

Note that this relies upon an up-front retire_requests to keep the
inactive list in shape, which was added in a previous patch, mostly as
execlist ctx pinning band-aids.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Add note about retire_requests.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-10-07 16:05:41 +02:00
Chris Wilson c9c0f5ea1b drm/i915: During shrink_all we only need to idle the GPU
We can forgo an evict-everything here as the shrinker operation itself
will unbind any vma as required. If we explicitly idle the GPU through a
switch to the default context, we not only create a request in an
illegal context (e.g. whilst shrinking during execbuf with a request
already allocated), but switching to the default context will not free
up the memory backing the active contexts - unless in the unlikely
situation that context had already been closed (and just kept arrive by
being the current context). The saving is near zero and the danger real.

To compensate for the loss of the forced retire, add a couple of
retire-requests to i915_gem_shirnk() - this should help free up any
transitive cache from the requests.

Note that the second retire_requests is for the benefit of the
hand-rolled execlist ctx active tracking: We need to manually kick
requests to get those unpinned again. Once that's fixed we can try to
remove this again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add summary of why we need a pile of retire_requests.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-10-07 16:05:39 +02:00
Chris Wilson 3abafa539d drm/i915: Add a tracepoint for the shrinker
Often it is very useful to know why we suddenly purge vast tracts of
memory and surprisingly up until now we didn't even have a tracepoint
for when we shrink our memory.

Note that there are slab_start/end tracepoints already, but those
don't cover the internal recursion when we directly call into our
shrinker code. Hence a separate tracepoint seems justified. Also note
that we don't really need a separate tracepoint for the actual amount
of pages freed since we already have an unbind tracpoint for that.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add a note that there's also slab_start/end and why they're
insufficient.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-10-07 16:05:38 +02:00
Chris Wilson 143875400b drm/i915: shrinker_control->nr_to_scan is now unsigned long
As the shrinker_control now passes us unsigned long targets, update our
shrinker functions to match.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-10-07 16:04:04 +02:00
Daniel Vetter 1f2449cdd6 drm/i915: Fix kerneldoc for i915_gem_shrink_all
I've botched this, so let's fix it.

Botched in

commit eb0b44adc0
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Mar 18 14:47:59 2015 +0100

    drm/i915: kerneldoc for i915_gem_shrinker.c

v2: Be a good citizen^Wmaintainer and add the proper commit citation.
Noticed by Jani.

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-10-07 16:03:05 +02:00
Chris Wilson f6234c1dee drm/i915: Simplify object is-pinned checking for shrinker
When looking for viable candidates to shrink, we only want objects that
are not pinned. However to do so we performed a double iteration over
the vma in the objects, first looking for the pin-count, then looking
for allocations. We can do both at once and be slightly more explicit in
our validity test.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10 10:58:34 +02:00
Daniel Vetter eb0b44adc0 drm/i915: kerneldoc for i915_gem_shrinker.c
And remove one bogus * from i915_gem_gtt.c since that's not a
kerneldoc there.

v2: Review from Chris:
- Clarify memory space to better distinguish from address space.
- Add note that shrink doesn't guarantee the freed memory and that
  users must fall back to shrink_all.
- Explain how pinning ties in with eviction/shrinker.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-03-20 11:48:16 +01:00
Daniel Vetter be6a037695 drm/i915: Extract i915_gem_shrinker.c
Two code changes:
- Extract i915_gem_shrinker_init.
- Inline i915_gem_object_is_purgeable since we open-code it everywhere
  else too.

This already has the benefit of pulling all the shrinker code
together, next patch adds a bit of kerneldoc.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-03-20 11:48:15 +01:00