Instead of waiting for 50ms, just wait until the next vblank, since
it's the minimum requirement. The whole infrastructure of FBC is based
on vblanks, so waiting for X vblanks instead of X milliseconds sounds
like the correct way to go. Besides, 50ms may be less than a vblank on
super slow modes that may or may not exist.
There are some small improvements in PC state residency (due to the
fact that we're now using 16ms for the common modes instead of 50ms),
but the biggest advantage is still the correctness of being
vblank-based instead of time-based.
v2:
- Rebase after changing the patch order.
- Update the commit message.
v3:
- Fix bogus vblank_get() instead of vblank_count() (Ville).
- Don't forget to call drm_crtc_vblank_{get,put} (Chris, Ville)
- Adjust the performance details on the commit message.
v4:
- Don't grab the FBC mutex just to grab the vblank (Maarten)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453406585-10233-1-git-send-email-paulo.r.zanoni@intel.com
Factor out common clean-up code for the GEM load time init function.
Also rename i915_gem_load() to i915_gem_load_init() to have a better
match with its new clean-up function.
No functional change.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453209992-25995-5-git-send-email-imre.deak@intel.com
Factor out the common GEM shrinker clean-up code and call the shrinker
init function from the same function from where the corresponding
shrinker clean-up function is called. Also add sanity checking to the
shrinker and OOM registration calls.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453209992-25995-4-git-send-email-imre.deak@intel.com
Swap the order of context & engine cleanup, so that contexts are cleaned
up first, and *then* engines. This is a more sensible order anyway, but
in particular has become necessary since the 'intel_ring_initialized()
must be simple and inline' patch, which now uses ring->dev as an
'initialised' flag, so it can now be NULL after engine teardown. This
in turn can cause a problem in the context code, which (used to) check
the ring->dev->struct_mutex -- causing a fault if ring->dev was NULL.
Also rename the cleanup function to reflect what it actually does
(cleanup engines, not a ringbuffer), and fix an annoying whitespace issue.
v2: Also make the fix in i915_load_modeset_init, not just in
i915_driver_unload (Chris Wilson)
v3: Had extra stuff in it.
v4: Reverted extra stuff (so we're back to v2).
Rebased and updated commentary above (Dave Gordon).
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1453405067-32890-3-git-send-email-david.s.gordon@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some of the HW registers are privileged and cannot be written to from
non-privileged batch buffers coming from userspace unless they are added to
the HW whitelist. This whitelist is maintained by HW and it is different from
SW whitelist. Userspace need write access to them to implement preemption
related WA.
The reason for using this approach is, the register bits that control
preemption granularity at the HW level are not context save/restored; so even
if we set these bits always in kernel they are going to change once the
context is switched out. We can consider making them non-privileged by
default but these registers also contain other chicken bits which should not
be allowed to be modified.
In the later revisions controlling bits are save/restored at context level but
in the existing revisions these are exported via other debug registers and
should be on the whitelist. This patch adds changes to provide HW with a list
of registers to be whitelisted. HW checks this list during execution and
provides access accordingly.
HW imposes a limit on the number of registers on whitelist and it is
per-engine. At this point we are only enabling whitelist for RCS and we don't
foresee any requirement for other engines.
The registers to be whitelisted are added using generic workaround list
mechanism, even these are only enablers for userspace workarounds. But by
sharing this mechanism we get some test assets without additional cost (Mika).
v2: rebase
v3: parameterize RING_FORCE_TO_NONPRIV() as _MMIO() should be limited to
i915_reg.h (Ville), drop inline for wa_ring_whitelist_reg (Mika).
v4: improvements suggested by Chris Wilson.
Clarify that this is HW whitelist and different from the one maintained in
driver. This list is engine specific but it gets initialized along with other
WA which is RCS specific thing, so make it clear that we are not doing any
cross engine setup during initialization.
Make HW whitelist count of each engine available in debugfs.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453412634-29238-2-git-send-email-arun.siluvery@linux.intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
At the moment execbuf ring selection is fully coupled to
internal ring ids which is not a good thing on its own.
This dependency is also spread between two source files and
not spelled out at either side which makes it hidden and
fragile.
This patch decouples this dependency by introducing an explicit
translation table of execbuf uAPI to ring id close to the only
call site (i915_gem_do_execbuffer).
This way we are free to change driver internal implementation
details without breaking userspace. All state relating to the
uAPI is now contained in, or next to, i915_gem_do_execbuffer.
As a side benefit, this patch decreases the compiled size
of i915_gem_do_execbuffer.
v2: Extract ring selection into eb_select_ring. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1452870770-13981-1-git-send-email-tvrtko.ursulin@linux.intel.com
Now that we've eliminated a lot of uses of ring->default_context,
we can eliminate the pointer itself.
All the engines share the same default intel_context, so we can just
keep a single reference to it in the dev_priv structure rather than one
in each of the engine[] elements. This make refcounting more sensible
too, as we now have a refcount of one for the one pointer, rather than
a refcount of one but multiple pointers.
From an idea by Chris Wilson.
v2: transform an extra instance of ring->default_context introduced by
42f1cae8c drm/i915: Restore inhibiting the load of the default context
That patch's commentary includes:
v2: Mark the global default context as uninitialized on GPU reset so
that the context-local workarounds are reloaded upon re-enabling
The code implementing that now also benefits from the replacement of
the multiple (per-ring) pointers to the default context with a single
pointer to the unique kernel context.
v4: Rebased, remove underused local (Nick Hoath)
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1453230175-19330-3-git-send-email-david.s.gordon@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There are a number of places where the driver needs a request, but isn't
working on behalf of any specific user or in a specific context. At
present, we associate them with the per-engine default context. A future
patch will abolish those per-engine context pointers; but we can already
eliminate a lot of the references to them, just by making the allocator
allow NULL as a shorthand for "an appropriate context for this ring",
which will mean that the callers don't need to know anything about how
the "appropriate context" is found (e.g. per-ring vs per-device, etc).
So this patch renames the existing i915_gem_request_alloc(), and makes
it local (static inline), and replaces it with a wrapper that provides
a default if the context is NULL, and also has a nicer calling
convention (doesn't require a pointer to an output parameter). Then we
change all callers to use the new convention:
OLD:
err = i915_gem_request_alloc(ring, user_ctx, &req);
if (err) ...
NEW:
req = i915_gem_request_alloc(ring, user_ctx);
if (IS_ERR(req)) ...
OLD:
err = i915_gem_request_alloc(ring, ring->default_context, &req);
if (err) ...
NEW:
req = i915_gem_request_alloc(ring, NULL);
if (IS_ERR(req)) ...
v4: Rebased
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Nick Hoath <nicholas.hoath@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453230175-19330-2-git-send-email-david.s.gordon@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This reverts commit 396e33ae20.
This commit was triggering some FIFO underrun warnings on ILK-IVB
platforms (but surprisingly not on HSW/BDW that share more or less the
same codepaths). These underruns were caught by the continuous
integration (CI) system and could be reproduced consistently when
running the basic acceptance tests (BAT) on the affected platforms.
Note that this revert will cause a visible regression for some
end-users; the "flicker when mouse moves between monitors in X" issue
that was reported before this patch was merged will now return. However
regressions that are visible to CI have higher priority since they
prevent proper testing of future patches on those platforms. Hopefully
we'll be able to figure out the cause of the underruns quickly and
remerge an improved version of this patch to fix the regression.
Cc: Daniel Vetter <daniel@ffwll.ch>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93640
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1453232584-8543-1-git-send-email-matthew.d.roper@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
LRC lifetime is well defined so we can cache the page pointing
to the object backing store in the context in order to avoid
walking over the object SG page list from the interrupt context
without the big lock held.
v2: Also cache the mapping. (Chris Wilson)
v3: Unmap on the error path.
v4: No need to cache the page. (Chris Wilson)
v5: No need to dirty the page on unpin. (Chris Wilson)
v6: kmap() cannot fail and use kmap_to_page to simplify unpin.
(Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1452877965-32042-1-git-send-email-tvrtko.ursulin@linux.intel.com
LRC code was calling GEM API like i915_gem_obj_ggtt_offset from
places where the struct_mutex cannot be grabbed (irq handlers).
To avoid that this patch caches some interesting bits and values
in the engine and context structures.
Some usages are also removed where they are not needed like a
few asserts which are either impossible or have been checked
already during engine initialization.
Side benefit is also that interrupt handlers and command
submission stop evaluating invariant conditionals, like what
Gen we are running on, on every interrupt and every command
submitted.
This patch deals with logical ring context id and descriptors
while subsequent patches will deal with the remaining issues.
v2:
* Cache the VMA instead of the address. (Chris Wilson)
* Incorporate Dave Gordon's good comments and function name.
v3:
* Extract ctx descriptor template to a function and group
functions dealing with ctx descriptor & co together near
top of the file. (Dave Gordon)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1452870629-13830-1-git-send-email-tvrtko.ursulin@linux.intel.com
Add a common function to return "on" or "off" string based on the
argument, and drop the local versions of it.
This is the onoff version of
commit 42a8ca4cb4
Author: Jani Nikula <jani.nikula@intel.com>
Date: Thu Aug 27 16:23:30 2015 +0300
drm/i915: add yesno utility function
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1452768814-29787-1-git-send-email-jani.nikula@intel.com
If we go into suspend with unclaimed access detected,
it would be nice to catch that access on a next suspend path.
So instead of just notifying about it, arm the unclaimed
mmio checks on suspend side.
We want to keep the asymmetry on resume, as if it was
on resume path, it was not driver that is responsible so
no point in arming mmio debugs.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1452261080-6979-2-git-send-email-mika.kuoppala@intel.com
We have done unclaimed register access check in normal
(mmio_debug=0) mode once per write. This adds probability
of finding the exact sequence where we did the bad access, but
also adds burden to each write.
As we have mmio_debug available for more fine grained analysis,
give up accuracy of detecting correct spot at the first occurrence
by doing the one shot detection and arming of mmio_debug in hangcheck
and in modeset. This removes the write path performance burden.
v2: Remove gratuitous DRM_DEBUG and return value, comments (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <przanoni@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450250808-14864-1-git-send-email-mika.kuoppala@intel.com
Currently interrupt code is the only place checking
for the unclaimed register access prior to actual register
macros using the same functionality. Rename the function
and make it return bool so that the possible error message
context is clear in the caller side. The motivation is to allow
usage of unclaimed detection on arbitrary places.
v2: rebase, s/access/mmio, s/dev/dev_priv
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450189512-30360-2-git-send-email-mika.kuoppala@intel.com
In addition to calculating final watermarks, let's also pre-calculate a
set of intermediate watermark values at atomic check time. These
intermediate watermarks are a combination of the watermarks for the old
state and the new state; they should satisfy the requirements of both
states which means they can be programmed immediately when we commit the
atomic state (without waiting for a vblank). Once the vblank does
happen, we can then re-program watermarks to the more optimal final
value.
v2: Significant rebasing/rewriting.
v3:
- Move 'need_postvbl_update' flag to CRTC state (Daniel)
- Don't forget to check intermediate watermark values for validity
(Maarten)
- Don't due async watermark optimization; just do it at the end of the
atomic transaction, after waiting for vblanks. We do want it to be
async eventually, but adding that now will cause more trouble for
Maarten's in-progress work. (Maarten)
- Don't allocate space in crtc_state for intermediate watermarks on
platforms that don't need it (gen9+).
- Move WaCxSRDisabledForSpriteScaling:ivb into intel_begin_crtc_commit
now that ilk_update_wm is gone.
v4:
- Add a wm_mutex to cover updates to intel_crtc->active and the
need_postvbl_update flag. Since we don't have async yet it isn't
terribly important yet, but might as well add it now.
- Change interface to program watermarks. Platforms will now expose
.initial_watermarks() and .optimize_watermarks() functions to do
watermark programming. These should lock wm_mutex, copy the
appropriate state values into intel_crtc->active, and then call
the internal program watermarks function.
v5:
- Skip intermediate watermark calculation/check during initial hardware
readout since we don't trust the existing HW values (and don't have
valid values of our own yet).
- Don't try to call .optimize_watermarks() on platforms that don't have
atomic watermarks yet. (Maarten)
v6:
- Rebase
v7:
- Further rebase
v8:
- A few minor indentation and line length fixes
v9:
- Yet another rebase since Maarten's patches reworked a bunch of the
code (wm_pre, wm_post, etc.) that this was previously based on.
v10:
- Move wm_mutex to dev_priv to protect against racing commits against
disjoint CRTC sets. (Maarten)
- Drop unnecessary clearing of cstate->wm.need_postvbl_update (Maarten)
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1452108870-24204-1-git-send-email-matthew.d.roper@intel.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Although we can do a good job of reading out hardware state, the
graphics firmware may have programmed the watermarks in a creative way
that doesn't match how i915 would have chosen to program them. We
shouldn't trust the firmware's watermark programming, but should rather
re-calculate how we think WM's should be programmed and then shove those
values into the hardware.
We can do this pretty easily by creating a dummy top-level state,
running it through the check process to calculate all the values, and
then just programming the watermarks for each CRTC.
v2: Move watermark sanitization after our BIOS fb reconstruction; the
watermark calculations that we do here need to look at pstate->fb,
which isn't setup yet in intel_modeset_setup_hw_state(), even
though we have an enabled & visible plane.
v3:
- Don't move 'active = optimal' watermark assignment; we just undo
that change in the next patch anyway. (Ville)
- Move atomic helper locking fix to separate patch. (Maarten)
v4:
- Grab connection_mutex before calling atomic helper to duplicate
state. The connector loop inside the helper will throw a WARN
if we don't hold something to protect the connector list (and the
helper itself doesn't try to lock the list).
- Make failure to calculate watermarks for inherited state a WARN()
since it probably indicates a serious problem in either our state
readout code or our watermark code for this platform.
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
On skylake when calculating plane visibility with the crtc in
dpms off mode the real cdclk may be different from what it would be
if the crtc was active. This may result in a WARN_ON(cdclk < crtc_clock)
from skl_max_scale. The fix is to keep a atomic_cdclk that would be true
if all crtc's were active.
This is required to get the same calculations done correctly regardless
of dpms mode.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1447945645-32005-12-git-send-email-maarten.lankhorst@linux.intel.com
Parallel modesets are still not allowed, but this will allow updating
a different crtc during a modeset if the clock is not changed.
Additionally when all pipes are DPMS off the cdclk will be lowered
to the minimum allowed.
Changes since v1:
- Add dev_priv->active_crtcs for tracking which crtcs are active.
- Rename min_cdclk to min_pixclk and move to dev_priv.
- Add a active_crtcs mask which is updated atomically.
- Add intel_atomic_state->modeset which is set on modesets.
- Commit new pixclk/active_crtcs right after state swap.
Changes since v2:
- Make the changes related to max_pixel_rate calculations more readable.
Changes since v3:
- Add cherryview and missing WARN_ON to readout.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Using __stringify(x) instead of #x adds support for macros as
a parameter and compile-time concatenation reduces the runtime
overhead.
Slightly increases the .text size but should not matter.
v2:
- Define I915_STATE_WARN_ON though I915_STATE_WARN
(Bikeshed inspiration by Chris)
v3:
- More specific commit message
v4:
- Do not directly pass arbitary string as format, instead
guard with "%s" (Dave)
Cc: Rob Clark <robdclark@gmail.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450441647-23924-3-git-send-email-joonas.lahtinen@linux.intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Otherwise usage in the i915 debug macros yields problems due to
i915_drv.h <-> i915_trace.h <-> intel_drv.h include loops.
v2:
- Document not-so-obvious need for linux/cache.h (Chris)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450436898-20408-2-git-send-email-joonas.lahtinen@linux.intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
commit 344df9809f ("drm/i915/skl: Disable coarse power gating up until F0")
failed to take into account that the same workaround is used in guc
when forcewake is sampled.
Wrap the condition check inside a macro and use it in both places
to fix the guc side scope.
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450286318-6854-1-git-send-email-mika.kuoppala@intel.com
Limit busywaiting only to the request currently being processed by the
GPU. If the request is not currently being processed by the GPU, there
is a very low likelihood of it being completed within the 2 microsecond
spin timeout and so we will just be wasting CPU cycles.
v2: Check for logical inversion when rebasing - we were incorrectly
checking for this request being active, and instead busywaiting for
when the GPU was not yet processing the request of interest.
v3: Try another colour for the seqno names.
v4: Another colour for the function names.
v5: Remove the forced coherency when checking for the active request. On
reflection and plenty of recent experimentation, the issue is not a
cache coherency problem - but an irq/seqno ordering problem (timing issue).
Here, we do not need the w/a to force ordering of the read with an
interrupt.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449833608-22125-4-git-send-email-chris@chris-wilson.co.uk
As we mark the preallocated objects as bound, we should also flag them
correctly as being map-and-fenceable (if appropriate!) so that later
users do not get confused and try and rebind the pinned vma in order to
get a map-and-fenceable binding.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Goel, Akash" <akash.goel@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: drm-intel-fixes@lists.freedesktop.org
Link: http://patchwork.freedesktop.org/patch/msgid/1448029000-10616-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In some cases we want to check whether we hold an RPM wakelock reference
for the whole duration of a sequence. To achieve this add a new RPM
atomic sequence counter that we increment any time the wakelock refcount
drops to zero. Check whether the sequence number stays the same during
the atomic section and that we hold the wakelock at the beginning of the
section.
Motivated by Chris.
v2-v3:
- unchanged
v4:
- swap the order of atomic_read() and assert_rpm_wakelock_held() in
assert_rpm_atomic_begin() to avoid race
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450203038-5150-10-git-send-email-imre.deak@intel.com
Atm, we assert that the device is not suspended until the point when the
device is truly put to a suspended state. This is fine, but we can catch
more problems if we check that RPM refcount is non-zero. After that one
drops to zero we shouldn't access the device any more, even if the actual
device suspend may be delayed. Change assert_rpm_wakelock_held()
accordingly to check for a non-zero RPM refcount in addition to the
current device-not-suspended check.
For the new asserts to work we need to annotate every place explicitly in
the code where we expect that the device is powered. The places where we
only assume this, but may not hold an RPM reference:
- driver load
We assume the device to be powered until we enable RPM. Make this
explicit by taking an RPM reference around the load function.
- system and runtime sudpend/resume handlers
These handlers are called when the RPM reference becomes 0 and know the
exact point after which the device can get powered off. Disable the
RPM-reference-held check for their duration.
- the IRQ, hangcheck and RPS work handlers
These handlers are flushed in the system/runtime suspend handler
before the device is powered off, so it's guaranteed that they won't
run while the device is powered off even though they don't hold any
RPM reference. Disable the RPM-reference-held check for their duration.
In all these cases we still check that the device is not suspended.
These explicit annotations also have the positive side effect of
documenting our assumptions better.
This caught additional WARNs from the atomic modeset path, those should
be fixed separately.
v2:
- remove the redundant HAS_RUNTIME_PM check (moved to patch 1) (Ville)
v3:
- use a new dedicated RPM wakelock refcount to also catch cases where
our own RPM get/put functions were not called (Chris)
- assert also that the new RPM wakelock refcount is 0 in the RPM
suspend handler (Chris)
- change the assert error message to be more meaningful (Chris)
- prevent false assert errors and check that the RPM wakelock is 0 in
the RPM resume handler too
- prevent false assert errors in the hangcheck work too
- add a device not suspended assert check to the hangcheck work
v4:
- rename disable/enable_rpm_asserts to disable/enable_rpm_wakeref_asserts
and wakelock_count to wakeref_count
- disable the wakeref asserts in the IRQ handlers and RPS work too
- update/clarify commit message
v5:
- mark places we plan to change to use proper RPM refcounting with
separate DISABLE/ENABLE_RPM_WAKEREF_ASSERTS aliases (Chris)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1450227139-13471-1-git-send-email-imre.deak@intel.com
The RVDA and RVDS (raw VBT data address and size) fields of the ASLE
mailbox may specify an alternate location for VBT instead of mailbox #4.
Use the alternate location if available and valid, falling back to
mailbox #4 otherwise.
v2: Update debug logging (Ville)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Mika Kahola <mika.kahola@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450178280-28020-1-git-send-email-jani.nikula@intel.com
In the future the VBT might not be in mailbox #4 of the ACPI OpRegion,
thus unavailable in i915_opregion, so add a separate file for the VBT.
v2: Drop the locking as unneeded (Chris)
v3: Rebase
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450178232-27780-1-git-send-email-jani.nikula@intel.com
Make the validation function a boolean operating on a buffer of given
size, removing the extra pointer dances.
Move the OpRegion based VBT validation to intel_opregion_setup(), only
initializing opregion->vbt if it's valid.
v2: move logging about valid VBT to opregion setup too (Ville)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450178175-27420-1-git-send-email-jani.nikula@intel.com
Here are the patchset to add get_eld op to audio component for
communicating more directly between i915 and HD-audio.
Currently, the HDMI/DP audio status and ELD are notified and obtained
via the hardware-level communication over HD-audio unsolicited event
and verbs although the graphics driver holds the exactly same
information. As we already have a notification via audio component,
this is another step forward; namely, the audio driver may fetch
directly the audio status and ELD via the new component op.
The commits are based on Dave's latest drm-next branch.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJWaXTVAAoJEGwxgFQ9KSmkdZIQAKg2NQN1CGOIb+ce80UmJ9ap
myZH42aZXytPrMAhHQCw2mvf8aL18EKyQcGv+anLdFM+AqlzZvH/exfOblkr88lg
ZMGr0bXVNa2Lfyrgt9blUwH58uETQZC4P3BrI0cPqIJBdPagDxnxoaV1e/21g/9p
0a2bhiz0gn4OFb83vpi5pL4hGv+BQwwlmkOujcVg7yxR7ylnYI419NM9Z+Lbmfq7
p5jId6Q3EwBw6vpWryOI2TElM3VDThoOGCOtkfmZx6o4fDZ0bdl8CYVCgRFwZZCe
kk01Caa+5+CW88MlJ1VX6gLy0WRlPY0AFreCWKgy5HCUNqew9ruhUeMj4+C1oHpj
ui/79ULLRN2hmu2rvU8lZb0ClihXkDCBN8p89j6o2I+y1aIcUtxvY9Srg5w2tVBe
Ue+OSB3lA4rdnuSjxZiaPf+V4rozIyNJHRjo6xNdY0zuScB4lw9Bh7IYXmj8B8OW
k3LklToj4ZGeyCgfcTQwztAh7fFEXUb1wN+lLqCt3b9688zvMYTQlJ8ZdtK+t188
3DNz9QjjPd4DcxLypl1VpM2Xv3AhuFfugq0oEuQq9bXs7qtj+iLmSWWdmhUNaVWb
Qot21vJEHDii6jtoLdbVMTEZTWyr2nXUfUNFJpUgitif2UhqqgecnR16Fi05pjTv
+Th/GvjddrQ0oe9DwVGY
=NShN
-----END PGP SIGNATURE-----
Merge tag 'drm-i915-get-eld' of tiwai/sound into drm-intel-next-queued
Add get_eld audio component for i915/HD-audio
Currently, the HDMI/DP audio status and ELD are notified and obtained
via the hardware-level communication over HD-audio unsolicited event
and verbs although the graphics driver holds the exactly same
information. As we already have a notification via audio component,
this is another step forward; namely, the audio driver may fetch
directly the audio status and ELD via the new component op.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
In various places, a single page of a (regular) GEM object is mapped into
CPU address space and updated. In each such case, either the page or the
the object should be marked dirty, to ensure that the modifications are
not discarded if the object is evicted under memory pressure.
The typical sequence is:
va = kmap_atomic(i915_gem_object_get_page(obj, pageno));
*(va+offset) = ...
kunmap_atomic(va);
Here we introduce i915_gem_object_get_dirty_page(), which performs the
same operation as i915_gem_object_get_page() but with the side-effect
of marking the returned page dirty in the pagecache. This will ensure
that if the object is subsequently evicted (due to memory pressure),
the changes are written to backing store rather than discarded.
Note that it works only for regular (shmfs-backed) GEM objects, but (at
least for now) those are the only ones that are updated in this way --
the objects in question are contexts and batchbuffers, which are always
shmfs-backed.
Separate patches deal with the cases where whole objects are (or may
be) dirtied.
v3: Mark two more pages dirty in the page-boundary-crossing
cases of the execbuffer relocation code [Chris Wilson]
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1449773486-30822-2-git-send-email-david.s.gordon@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch adds a reverse mapping from a digital port number to
intel_encoder object containing the corresponding intel_digital_port.
It simplifies the query of the encoder a lot.
Note that, even if it's a valid digital port, the dig_port_map[] might
point still to NULL -- usually it implies a DP MST port. Due to this
fact, the NULL check in each place has no WARN_ON() and just skips the
port. Once when the situation changes in future, we might introduce
WARN_ON() for a more strict check.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The cherryview device shares many characteristics with the valleyview
device. When support was added to the driver for cherryview, the
corresponding device info structure included .is_valleyview = 1.
This is not correct and leads to some confusion.
This patch changes .is_valleyview to .is_cherryview in the cherryview
device info structure and simplifies the IS_CHERRYVIEW macro.
Then where appropriate, instances of IS_VALLEYVIEW are replaced with
IS_VALLEYVIEW || IS_CHERRYVIEW or equivalent.
v2: Use IS_VALLEYVIEW || IS_CHERRYVIEW instead of defining a new macro.
Also add followup patches to fix issues discovered during the first
review. (Ville)
v3: Fix some style issues and one gen check. Remove CRT related changes
as CRT is not supported on CHV. (Imre, Ville)
v4: Make a few more optimizations. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Wayne Boyer <wayne.boyer@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449692975-14803-1-git-send-email-wayne.boyer@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Userspace can pass in an offset that it presumes the object is located
at. The kernel will then do its utmost to fit the object into that
location. The assumption is that userspace is handling its own object
locations (for example along with full-ppgtt) and that the kernel will
rarely have to make space for the user's requests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
v2: Fixed incorrect eviction found by Michal Winiarski - fix suggested by Chris
Wilson. Fixed incorrect error paths causing crash found by Michal Winiarski.
(Not published externally)
v3: Rebased because of trivial conflict in object_bind_to_vm. Fixed eviction
to allow eviction of soft-pinned objects when another soft-pinned object used
by a subsequent execbuffer overlaps reported by Michal Winiarski.
(Not published externally)
v4: Moved soft-pinned objects to the front of ordered_vmas so that they are
pinned first after an address conflict happens to avoid repeated conflicts in
rare cases (Suggested by Chris Wilson). Expanded comment on
drm_i915_gem_exec_object2.offset to cover this new API.
v5: Added I915_PARAM_HAS_EXEC_SOFTPIN parameter for detecting this capability
(Kristian). Added check for multiple pinnings on eviction (Akash). Made sure
buffers are not considered misplaced without the user specifying
EXEC_OBJECT_SUPPORTS_48B_ADDRESS. User must assume responsibility for any
addressing workarounds. Updated object2.offset field comment again to clarify
NO_RELOC case (Chris). checkpatch cleanup.
v6: Trivial rebase on latest drm-intel-nightly
v7: Catch attempts to pin above the max virtual address size and return
EINVAL (Tvrtko). Decouple EXEC_OBJECT_SUPPORTS_48B_ADDRESS and
EXEC_OBJECT_PINNED flags, user must pass both flags in any attempt to pin
something at an offset above 4GB (Chris, Daniel Vetter).
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Zou Nanhai <nanhai.zou@intel.com>
Cc: Kristian Høgsberg <hoegsberg@gmail.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Acked-by: PDT
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449575707-20933-1-git-send-email-thomas.daniel@intel.com
Let's introduce ULT and ULX Kabylake definitions and start
using it for a propper DDI buffer translation.
v2: Remove extra white space. (Paulo)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
This reverts commit 6d65ba943a.
Mika Kuoppala traced down a use-after-free crash in module unload to
this commit, because ring->last_context is leaked beyond when the
context gets destroyed. Mika submitted a quick fix to patch that up in
the context destruction code, but that's too much of a hack.
The right fix is instead for the ring to hold a full reference onto
it's last context, like we do for legacy contexts.
Since this is causing a regression in BAT it gets reverted before we
can close this.
Cc: Nick Hoath <nicholas.hoath@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alex Dai <yu.dai@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93248
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Use the first retired request on a new context to unpin
the old context. This ensures that the hw context remains
bound until it has been written back to by the GPU.
Now that the context is pinned until later in the request/context
lifecycle, it no longer needs to be pinned from context_queue to
retire_requests.
This fixes an issue with GuC submission where the GPU might not
have finished writing back the context before it is unpinned. This
results in a GPU hang.
v2: Moved the new pin to cover GuC submission (Alex Dai)
Moved the new unpin to request_retire to fix coverage leak
v3: Added switch to default context if freeing a still pinned
context just in case the hw was actually still using it
v4: Unwrapped context unpin to allow calling without a request
v5: Only create a switch to idle context if the ring doesn't
already have a request pending on it (Alex Dai)
Rename unsaved to dirty to avoid double negatives (Dave Gordon)
Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
Split out per engine cleanup from context_free as it
was getting unwieldy
Corrected locking (Dave Gordon)
v6: Removed some bikeshedding (Mika Kuoppala)
Added explanation of the GuC hang that this fixes (Daniel Vetter)
v7: Removed extra per request pinning from ring reset code (Alex Dai)
Added forced ring unpin/clean in error case in context free (Alex Dai)
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Issue: VIZ-4277
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alex Dai <yu.dai@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>