Commit Graph

634253 Commits

Author SHA1 Message Date
Chris Wilson 27745e829a drm/i915/execlists: Use a local lock for dfs_link access
Avoid requiring struct_mutex for exclusive access to the temporary
dfs_link inside the i915_dependency as not all callers may want to touch
struct_mutex. So rather than force them to take a highly contended
lock, introduce a local lock for the execlists schedule operation.

Reported-by: David Weinehall <david.weinehall@linux.intel.com>
Fixes: 9a151987d7 ("drm/i915: Add execution priority boosting for mmioflips")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161116152721.11053-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-11-16 21:12:39 +00:00
Praveen Paneri 85ee17ebee drm/i915/bxt: Broxton decoupled MMIO
Decoupled MMIO is an alternative way to access forcewake domain
registers, which requires less cycles for a single read/write and
avoids frequent software forcewake.
This certainly gives advantage over the forcewake as this new
mechanism “decouples” CPU cycles and allow them to complete even
when GT is in a CPD (frequency change) or C6 state.

This can co-exist with forcewake and we will continue to use forcewake
as appropriate. E.g. 64-bit register writes to avoid writing 2 dwords
separately and land into funny situations.

v2:
- Moved platform check out of the function and got rid of duplicate
 functions to find out decoupled power domain (Chris)
- Added a check for forcewake already held and skipped decoupled
 access (Chris)
- Skipped writing 64 bit registers through decoupled MMIO (Chris)

v3:
- Improved commit message with more info on decoupled mmio (Tvrtko)
- Changed decoupled operation to enum and used u32 instead of
 uint_32 data type for register offset (Tvrtko)
- Moved HAS_DECOUPLED_MMIO to device info (Tvrtko)
- Added lookup table for converting fw_engine to pd_engine (Tvrtko)
- Improved __gen9_decoupled_read and __gen9_decoupled_write
 routines (Tvrtko)

v4:
- Fixed alignment and variable names (Chris)
- Write GEN9_DECOUPLED_REG0_DW1 register in just one go (Zhe Wang)

v5:
- Changed HAS_DECOUPLED_MMIO() argument name to dev_priv (Tvrtko)
- Sanitize info->had_decoupled_mmio at init (Chris)

Signed-off-by: Zhe Wang <zhe1.wang@intel.com>
Signed-off-by: Praveen Paneri <praveen.paneri@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479230360-22395-1-git-send-email-praveen.paneri@intel.com
2016-11-16 09:33:10 +00:00
Maarten Lankhorst 5eff503b9d drm/i915/gen9+: Kill off hw_ddb from intel_crtc.
dev_priv->hw_ddb is only used by skl_update_crtcs, but the ddb
allocation for each pipe is calculated in crtc_state.

We can rid of the global member by looking at crtc_state.
Do this by saving all active old ddb allocations from the old crtc_state
in an array, and then point them to the new allocation every time we update
a crtc.

This will allow us to keep track of the intermediate ddb allocations,
which is what hw_ddb was previously used for. With hw_ddb gone all
SKL-style watermark values are properly maintained only in crtc_state.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478609742-13603-5-git-send-email-maarten.lankhorst@linux.intel.com
[mlankhorst: Reword commit message.]
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
2016-11-15 13:13:47 +01:00
Maarten Lankhorst 512b552798 drm/i915/gen9+: Preserve old allocation from crtc_state.
This is the last bit required for making nonblocking modesets work
correctly. The state in intel_crtc->hw_ddb is updated in the
nonblocking part of a nonblocking commit.

This means that even attempting a commit before a nonblocking modeset
completes will fail, because intel_crtc->hw_ddb still has stale values.
The stale values are 0 if the crtc is being enabled resulting in a
failure during atomic check, but it may also result in double use of
ddb allocations.

Fix this by explicitly copying the ddb allocation from the old state.
This has to be done explicitly, because a modeset that doesn't change
active pipes, or a modeset converted to a fastset will will clear the
current state.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478609742-13603-4-git-send-email-maarten.lankhorst@linux.intel.com
[mlankhorst: Reword commit message.]
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
2016-11-15 11:23:11 +01:00
Maarten Lankhorst e62929b3f6 drm/i915/gen9+: Program watermarks as a separate step during evasion, v3.
The watermark updates for SKL style watermarks are no longer done
in the plane callbacks, but are now called in a separate watermark
update function that's called during the same vblank evasion,
before the plane updates.

This also gets rid of the global skl_results, which was required for
keeping track of the current atomic commit.

Changes since v1:
- Move line unwrap to correct patch. (Lyude)
- Make sure we don't regress ILK watermarks. (Matt)
- Rephrase commit message. (Matt)
Changes since v2:
- Fix disable watermark check to use the correct way to determine single
  step watermark support.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lyude <cpaul@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478609742-13603-3-git-send-email-maarten.lankhorst@linux.intel.com
[mlankhorst: Small whitespace fix in skl_initial_wm]
2016-11-15 11:23:11 +01:00
Maarten Lankhorst ccf010fb94 drm/i915: Add an atomic evasion step to watermark programming, v4.
Allow the driver to write watermarks during atomic evasion.
This will make it possible to write the watermarks in a cleaner
way on gen9+.

intel_atomic_state is not used here yet, but will be used when
we program all watermarks as a separate step during evasion.

This also writes linetime all the time, while before it was only
done during plane updates. This looks like this could be a bugfix,
but I'm not sure what it affects.

Changes since v1:
- Add comment about atomic evasion to commit message.
- Unwrap I915_WRITE call. (Lyude)
Changes since v2:
- Rename atomic_evade_watermarks to atomic_update_watermarks. (Ville)
- Add line wraps where appropriate, fix grammar in commit message. (Matt)
Changes since v3:
- Actually fix commit message. (Matt)
- Line wrap calls to watermark update functions. (Matt)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478609742-13603-2-git-send-email-maarten.lankhorst@linux.intel.com
2016-11-15 11:23:11 +01:00
Chris Wilson 9a151987d7 drm/i915: Add execution priority boosting for mmioflips
Commit 6b5e90f58c ("drm/i915/scheduler: Boost priorities for flips")
added priority boosting for the modern atomic pageflips (and modesets),
but we should do the same for existing users of mmioflips (we don't yet
need to consider csflips as they are not used by execlists and so do not
have any support for a scheduler).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161115092249.18356-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-11-15 10:04:01 +00:00
Chris Wilson 6b5e90f58c drm/i915/scheduler: Boost priorities for flips
Boost the priority of any rendering required to show the next pageflip
as we want to avoid missing the vblank by being delayed by invisible
workload. We prioritise avoiding jank and jitter in the GUI over
starving background tasks.

v2: Descend dma_fence_array when boosting priorities.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-10-chris@chris-wilson.co.uk
2016-11-14 21:01:22 +00:00
Chris Wilson 9f792ebafe drm/i915: Store the execution priority on the context
In order to support userspace defining different levels of importance to
different contexts, and in particular the preferred order of execution,
store a priority value on each context. By default, the kernel's
context, which is used for idling and other background tasks, is given
minimum priority (i.e. all user contexts will execute before the kernel).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-9-chris@chris-wilson.co.uk
2016-11-14 21:01:22 +00:00
Chris Wilson 20311bd350 drm/i915/scheduler: Execute requests in order of priorities
Track the priority of each request and use it to determine the order in
which we submit requests to the hardware via execlists.

The priority of the request is determined by the user (eventually via
the context) but may be overridden at any time by the driver. When we set
the priority of the request, we bump the priority of all of its
dependencies to match - so that a high priority drawing operation is not
stuck behind a background task.

When the request is ready to execute (i.e. we have signaled the submit
fence following completion of all its dependencies, including third
party fences), we put the request into a priority sorted rbtree to be
submitted to the hardware. If the request is higher priority than all
pending requests, it will be submitted on the next context-switch
interrupt as soon as the hardware has completed the current request. We
do not currently preempt any current execution to immediately run a very
high priority request, at least not yet.

One more limitation, is that this is first implementation is for
execlists only so currently limited to gen8/gen9.

v2: Replace recursive priority inheritance bumping with an iterative
depth-first search list.
v3: list_next_entry() for walking lists
v4: Explain how the dfs solves the recursion problem with PI.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-8-chris@chris-wilson.co.uk
2016-11-14 21:01:21 +00:00
Chris Wilson 52e5420907 drm/i915/scheduler: Record all dependencies upon request construction
The scheduler needs to know the dependencies of each request for the
lifetime of the request, as it may choose to reschedule the requests at
any time and must ensure the dependency tree is not broken. This is in
additional to using the fence to only allow execution after all
dependencies have been completed.

One option was to extend the fence to support the bidirectional
dependency tracking required by the scheduler. However the mismatch in
lifetimes between the submit fence and the request essentially meant
that we had to build a completely separate struct (and we could not
simply reuse the existing waitqueue in the fence for one half of the
dependency tracking). The extra dependency tracking simply did not mesh
well with the fence, and keeping it separate both keeps the fence
implementation simpler and allows us to extend the dependency tracking
into a priority tree (whilst maintaining support for reordering the
tree).

To avoid the additional allocations and list manipulations, the use of
the priotree is disabled when there are no schedulers to use it.

v2: Create a dedicated slab for i915_dependency.
    Rename the lists.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-7-chris@chris-wilson.co.uk
2016-11-14 21:00:28 +00:00
Chris Wilson 0de9136dbb drm/i915/scheduler: Signal the arrival of a new request
The start of the scheduler, add a hook into request submission for the
scheduler to see the arrival of new requests and prepare its runqueues.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-6-chris@chris-wilson.co.uk
2016-11-14 21:00:26 +00:00
Chris Wilson 663f71e73f drm/i915: Remove engine->execlist_lock
The execlist_lock is now completely subsumed by the engine->timeline->lock,
and so we can remove the redundant layer of locking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-5-chris@chris-wilson.co.uk
2016-11-14 21:00:25 +00:00
Chris Wilson d55ac5bf97 drm/i915: Defer transfer onto execution timeline to actual hw submission
Defer the transfer from the client's timeline onto the execution
timeline from the point of readiness to the point of actual submission.
For example, in execlists, a request is finally submitted to hardware
when the hardware is ready, and only put onto the hardware queue when
the request is ready. By deferring the transfer, we ensure that the
timeline is maintained in retirement order if we decide to queue the
requests onto the hardware in a different order than fifo.

v2: Rebased onto distinct global/user timeline lock classes.
v3: Play with the position of the spin_lock().
v4: Nesting finally resolved with distinct sw_fence lock classes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-4-chris@chris-wilson.co.uk
2016-11-14 21:00:23 +00:00
Chris Wilson 23902e49c9 drm/i915: Split request submit/execute phase into two
In order to support deferred scheduling, we need to differentiate
between when the request is ready to run (i.e. the submit fence is
signaled) and when the request is actually run (a new execute fence).
This is typically split between the request itself wanting to wait upon
others (for which we use the submit fence) and the CPU wanting to wait
upon the request, for which we use the execute fence to be sure the
hardware is ready to signal completion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-3-chris@chris-wilson.co.uk
2016-11-14 21:00:22 +00:00
Chris Wilson bb89485e99 drm/i915: Create distinct lockclasses for execution vs user timelines
In order to simplify the lockdep annotation, as they become more complex
in the future with deferred execution and multiple paths through the
same functions, create a separate lockclass for the user timeline and
the hardware execution timeline.

We should only ever be locking the user timeline and the execution
timeline in parallel so we only need to create two lock classes, rather
than a separate class for every timeline.

v2: Rename the lock classes to be more consistent with other lockdep.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-2-chris@chris-wilson.co.uk
2016-11-14 21:00:21 +00:00
Chris Wilson 556b748710 drm/i915: Give each sw_fence its own lockclass
Localise the static struct lock_class_key to the caller of
i915_sw_fence_init() so that we create a lock_class instance for each
unique sw_fence rather than all sw_fences sharing the same
lock_class. This eliminate some lockdep false positive when using fences
from within fence callbacks.

For the relatively small number of fences currently in use [2], this adds
160 bytes of unused text/code when lockdep is disabled. This seems
quite high, but fully reducing it via ifdeffery is also quite ugly.
Removing the #fence strings saves 72 bytes with just a single #ifdef.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-1-chris@chris-wilson.co.uk
2016-11-14 21:00:20 +00:00
Ville Syrjälä e3c566dfe4 drm/i915: Remove some duplicated plane swapping logic
On pre-gen4 we connect plane A to pipe B and vice versa to get an FBC
capable plane feeding the LVDS port by default. We have the logic for
the plane swapping duplicated in many places. Let's remove a bit of the
duplication by having the crtc look up the thing from the primary plane.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478616439-10150-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-14 20:27:55 +02:00
Ville Syrjälä c99f53f7ca drm/i915: Simplify DP port limited color range bit platform checks
Instead of checking for everything not supporting the limited color
range bit in the DP port register, let's check for the one thing
that does have it (g4x).

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479145447-12907-3-git-send-email-ville.syrjala@linux.intel.com
2016-11-14 20:27:54 +02:00
Ville Syrjälä 0037071d8a drm/i915: Kill dp_encoder_is_mst
dp_encoder_is_mst flag in the crtc state can be replaced by
intel_crtc_has_type(..., INTEL_OUTPUT_DP_MST). Let's do that.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479145447-12907-2-git-send-email-ville.syrjala@linux.intel.com
2016-11-14 20:27:54 +02:00
Ville Syrjälä 4ea7be2b56 drm/i915: Add horizontal mirroring support for CHV pipe B planes
The primary and sprite planes on CHV pipe B support horizontal
mirroring. Expose it to the world.

Sadly the hardware ignores the mirror bit when the rotate bit is
set, so we'll have to reject the 180+X case.

v2: Drop the BIT()
v3: Pass dev_priv instead of dev to IS_CHERRYVIEW()

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479142440-25283-4-git-send-email-ville.syrjala@linux.intel.com
2016-11-14 19:58:48 +02:00
Ville Syrjälä df0cd455e7 drm/i915: Clean up rotation DSPCNTR/DVSCNTR/etc. setup
Move the plane control register rotation setup away from the
coordinate munging code. This will result in neater looking
code once we add reflection support for CHV.

v2: Drop the BIT(), drop some usless parens,

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479142440-25283-3-git-send-email-ville.syrjala@linux.intel.com
2016-11-14 19:58:26 +02:00
Ville Syrjälä f22aa14352 drm/i915: Use & instead if == to check for rotations
Using == to check for 180 degree rotation only works as long as the
reflection bits aren't set. That will change soon enough for CHV, so
let's stop doing things the wrong way.

v2: Drop the BIT()

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479142440-25283-2-git-send-email-ville.syrjala@linux.intel.com
2016-11-14 19:57:42 +02:00
Paulo Zanoni 5697d60f6e drm/i915/fbc: convert intel_fbc.c to use INTEL_GEN()
Because it's shorter, easier to read, newer and cooler. And I don't
think anybody else has pending FBC patches right now, so the conflicts
should be minimal.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478883461-20201-8-git-send-email-paulo.r.zanoni@intel.com
2016-11-14 14:00:29 -02:00
Paulo Zanoni 4f8f225151 drm/i915/fbc: use drm_atomic_get_existing_crtc_state when appropriate
Use drm_atomic_get_existing_crtc_state() instead of looping through
the CRTC states and checking if the FBC CRTC is there.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478883461-20201-6-git-send-email-paulo.r.zanoni@intel.com
2016-11-14 14:00:29 -02:00
Paulo Zanoni f7e9b004b8 drm/i915/fbc: inline intel_fbc_can_choose()
It only has two checks now, so it makes sense to just move the code to
the caller.

Also take this opportunity to make no_fbc_reason make more sense: now
we'll only list "no suitable CRTC for FBC" instead of maybe giving a
reason why the last CRTC we checked was not selected, and we'll more
consistently set the reason (e.g., if no primary planes are visible).

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478883461-20201-5-git-send-email-paulo.r.zanoni@intel.com
2016-11-14 14:00:29 -02:00
Paulo Zanoni ee2be30997 drm/i915/fbc: extract intel_fbc_can_enable()
Extract that part of the code to a new function and call this function
only once during intel_fbc_choose_crtc() instead of once per plane.
Those checks are independent from planes/CRTCs.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478883461-20201-4-git-send-email-paulo.r.zanoni@intel.com
2016-11-14 14:00:29 -02:00
Paulo Zanoni ba67fab02c drm/i915/fbc: replace a loop with drm_atomic_get_existing_crtc_state()
Much simpler. Thanks to Ville for pointing this.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478883461-20201-3-git-send-email-paulo.r.zanoni@intel.com
2016-11-14 14:00:29 -02:00
Paulo Zanoni 03e39104d9 drm/i915/fbc: move the intel_fbc_can_choose() call out of the loop
We can just call it earlier, so do it. The goal of the loop is to get
the plane's CRTC state, and we don't need it in order to call
intel_fbc_can_choose().

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478883461-20201-2-git-send-email-paulo.r.zanoni@intel.com
2016-11-14 14:00:29 -02:00
Chris Wilson 4d1368f7fa drm/i915: Fix test on inputs for vma_compare()
When supplying a view to vma_compare() it is required that the supplied
i915_address_space is the global GTT. I tested the VMA instead (which is
the current position in the rbtree and maybe from any address space).

(This reapplies commit a44342acde ("drm/i915: Fix test on inputs for
vma_compare()") as it was lost in the vma split)

Reported-by: Matthew Auld <matthew.auld@intel.com>
Tested-by: Matthew Auld <matthew.auld@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98579
Fixes: db6c2b4151 ("drm/i915: Store the vma in an rbtree...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161103200852.23431-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Fixes: b42fe9ca0a ("drm/i915: Split out i915_vma.c")
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-14 15:59:21 +00:00
Paulo Zanoni 2ef32dee97 drm/i915/gen9: fix the WM memory bandwidth WA for Y tiling cases
The previous spec version said "double Ytile planes minimum lines",
and I interpreted this as referring to what the spec calls "Y tile
minimum", but in fact it was referring to what the spec calls "Minimum
Scanlines for Y tile". I noticed that Mahesh Kumar had a different
interpretation, so I sent and email to the spec authors and got
clarification on the correct meaning. Also, BSpec was updated and
should be clear now.

Fixes: ee3d532fcb ("drm/i915/gen9: unconditionally apply the memory bandwidth WA")
Cc: stable@vger.kernel.org
Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478636531-6081-1-git-send-email-paulo.r.zanoni@intel.com
2016-11-14 13:20:22 -02:00
Ville Syrjälä 7a17995a3d drm/i915: Assume non-DP++ port if dvo_port is HDMI and there's no AUX ch specified in the VBT
My heuristic for detecting type 1 DVI DP++ adaptors based on the VBT
port information apparently didn't survive the reality of buggy VBTs.
In this particular case we have a machine with a natice HDMI port, but
the VBT tells us it's a DP++ port based on its capabilities.

The dvo_port information in VBT does claim that we're dealing with a
HDMI port though, but we have other machines which do the same even
when they actually have DP++ ports. So that piece of information alone
isn't sufficient to tell the two apart.

After staring at a bunch of VBTs from various machines, I have to
conclude that the only other semi-reliable clue we can use is the
presence of the AUX channel in the VBT. On this particular machine
AUX channel is specified as zero, whereas on some of the other machines
which listed the DP++ port as HDMI have a non-zero AUX channel.

I've also seen VBTs which have dvo_port a DP but have a zero AUX
channel. I believe those we need to treat as DP ports, so we'll limit
the AUX channel check to just the cases where dvo_port is HDMI.

If we encounter any more serious failures with this heuristic I think
we'll have to have to throw it out entirely. But that could mean that
there is a risk of type 1 DVI dongle users getting greeted by a
black screen, so I'd rather not go there unless absolutely necessary.

v2: Remove the duplicate PORT_A check (Daniel)
    Fix some typos in the commit message

Cc: Daniel Otero <daniel.otero@outlook.com>
Cc: stable@vger.kernel.org
Tested-by: Daniel Otero <daniel.otero@outlook.com>
Fixes: d61992565b ("drm/i915: Determine DP++ type 1 DVI adaptor presence based on VBT")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97994
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478884464-14251-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-11-14 16:29:46 +02:00
Jani Nikula c007fb4a38 drm/i915: rename preliminary_hw_support to alpha_support
The term "preliminary hardware support" has always caused confusion both
among users and developers. It has always been about preliminary driver
support for new hardware, and not so much about preliminary hardware. Of
course, initially both the software and hardware are in early stages,
but the distinction becomes more clear when the user picks up production
hardware and an older kernel to go with it, with just the early support
we had for the hardware at the time the kernel was released. The user
has to specifically enable the alpha quality *driver* support for the
hardware in that specific kernel version.

Rename preliminary_hw_support to alpha_support to emphasize that the
module parameter, config option, and flag are about software, not about
hardware. Improve the language in help texts and debug logging as well.

This appears to be a good time to do the change, as there are currently
no platforms with preliminary^W alpha support.

Cc: Rob Clark <robdclark@gmail.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477909108-18696-1-git-send-email-jani.nikula@intel.com
2016-11-14 15:33:27 +02:00
Joonas Lahtinen d2ad3ae4ec drm/i915: Update i915_driver_load kerneldoc
Update i915_driver_load kerneldoc to match code.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478784994-2494-2-git-send-email-joonas.lahtinen@linux.intel.com
2016-11-14 14:30:34 +02:00
Chris Wilson 2b3c83176e drm/i915: Stop skipping the final clflush back to system pages
When we release the shmem backing storage, we make sure that the pages
are coherent with the cpu cache. However, our clflush routine was
skipping the flush as the object had no pages at release time. Fix this by
explicitly flushing the sg_table we are decoupling.

Fixes: 03ac84f183 ("drm/i915: Pass around sg_table to get_pages/put_pages backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161111145809.9701-2-chris@chris-wilson.co.uk
2016-11-11 16:21:52 +00:00
Chris Wilson 9caa34aa93 drm/i915: Only wait upon the execution timeline when unlocked
In order to walk the list of all timelines, we currently require the
struct_mutex. We are sometimes called prior to the struct_mutex being
taken by the caller (i.e !I915_WAIT_LOCKED) in which case we can only
trust the global execution timelines (as these are owned by the device).
This means in the unlocked phase we can only wait upon the currently
executing requests and not all queued.

[  175.743243] general protection fault: 0000 [#1] SMP
[  175.743263] Modules linked in: nls_iso8859_1 intel_rapl x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iwlwifi aesni_intel aes_x86_64 lrw snd_soc_rt5640 gf128mul snd_soc_rl6231 snd_soc_core glue_helper snd_compress snd_pcm_dmaengine snd_hda_codec_hdmi ablk_helper snd_hda_codec_realtek cryptd snd_hda_codec_generic serio_raw cfg80211 snd_hda_intel snd_hda_codec ir_lirc_codec snd_hda_core lirc_dev snd_hwdep snd_pcm lpc_ich mei_me mei snd_seq_midi shpchp snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer rc_rc6_mce acpi_als nuvoton_cir kfifo_buf rc_core snd industrialio snd_soc_sst_acpi soundcore snd_soc_sst_match i2c_designware_platform 8250_dw i2c_designware_core dw_dmac spi_pxa2xx_platform mac_hid acpi_pad parport_pc ppdev lp parport
[  175.743509]  autofs4 i915 e1000e psmouse ptp pps_core xhci_pci ehci_pci ahci xhci_hcd ehci_hcd libahci video sdhci_acpi sdhci i2c_hid hid
[  175.743560] CPU: 2 PID: 2386 Comm: wtdg_monitor.sh Tainted: G     U          4.9.0-rc4-nightly+ #2
[  175.743581] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0358.2016.0606.1423 06/06/2016
[  175.743603] task: ffff88024509ba80 task.stack: ffffc9007bd18000
[  175.743618] RIP: 0010:[<ffffffffa01af29b>]  [<ffffffffa01af29b>] i915_gem_wait_for_idle+0x3b/0x140 [i915]
[  175.743660] RSP: 0000:ffffc9007bd1b9b8  EFLAGS: 00010297
[  175.743674] RAX: ffff88024489d248 RBX: 0000000000000000 RCX: 0000000000000000
[  175.743691] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880244898000
[  175.743708] RBP: ffffc9007bd1b9f0 R08: 0000000000000000 R09: 0000000000000001
[  175.743724] R10: 00000028eaf42792 R11: 0000000000000001 R12: dead000000000100
[  175.743741] R13: dead000000000148 R14: ffffc9007bd1ba5f R15: 0000000000000005
[  175.743758] FS:  00007f2638330700(0000) GS:ffff880256d00000(0000) knlGS:0000000000000000
[  175.743777] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  175.743791] CR2: 00007f885c8cea40 CR3: 00000002416b5000 CR4: 00000000003406e0
[  175.743808] Stack:
[  175.743816]  ffff88024489d248 000000004509ba80 ffff880244898000 ffff88024509ba80
[  175.743840]  00000000ffff8b69 ffffc9007bd1ba5f ffffc9007bd1ba5e ffffc9007bd1ba28
[  175.743863]  ffffffffa01b661d 00000000ffffffff 0000000000000000 ffff880244898000
[  175.743886] Call Trace:
[  175.743906]  [<ffffffffa01b661d>] i915_gem_shrinker_lock_uninterruptible.constprop.5+0x5d/0xc0 [i915]
[  175.743937]  [<ffffffffa01b6cd0>] i915_gem_shrinker_oom+0x30/0x1b0 [i915]
[  175.743955]  [<ffffffff8109ca79>] notifier_call_chain+0x49/0x70
[  175.743971]  [<ffffffff8109cd9d>] __blocking_notifier_call_chain+0x4d/0x70
[  175.743988]  [<ffffffff8109cdd6>] blocking_notifier_call_chain+0x16/0x20
[  175.744005]  [<ffffffff811885dc>] out_of_memory+0x22c/0x480
[  175.744020]  [<ffffffff81205542>] __alloc_pages_slowpath+0x851/0x8ec
[  175.744037]  [<ffffffff8118ca51>] __alloc_pages_nodemask+0x2c1/0x310
[  175.744054]  [<ffffffff811d8ea8>] alloc_pages_current+0x88/0x120
[  175.744070]  [<ffffffff811833a4>] __page_cache_alloc+0xb4/0xc0
[  175.744086]  [<ffffffff811865ca>] filemap_fault+0x29a/0x500
[  175.744101]  [<ffffffff81299aa6>] ext4_filemap_fault+0x36/0x50
[  175.744117]  [<ffffffff811b3d4a>] __do_fault+0x6a/0xe0
[  175.744131]  [<ffffffff811b97ee>] handle_mm_fault+0xd0e/0x1330
[  175.744147]  [<ffffffff8106738c>] __do_page_fault+0x23c/0x4d0
[  175.744162]  [<ffffffff81067650>] do_page_fault+0x30/0x80
[  175.744177]  [<ffffffff817ffbe8>] page_fault+0x28/0x30
[  175.744191] Code: 41 57 41 56 41 55 41 54 53 48 83 ec 10 4c 8b a7 48 52 00 00 89 75 d4 48 89 45 c8 49 39 c4 74 78 4d 8d 6c 24 48 41 bf 05 00 00 00 <49> 8b 5d 00 48 85 db 74 50 8b 83 20 01 00 00 85 c0 74 15 48 8b
[  175.744320] RIP  [<ffffffffa01af29b>] i915_gem_wait_for_idle+0x3b/0x140 [i915]
[  175.744351]  RSP <ffffc9007bd1b9b8>

Fixes: 80b204bce8 ("drm/i915: Enable multiple timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161111145809.9701-1-chris@chris-wilson.co.uk
2016-11-11 16:21:52 +00:00
Tvrtko Ursulin 514e1d6480 drm/i915: Convert i915_drv.c to INTEL_GEN
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478270568-7902-2-git-send-email-tvrtko.ursulin@linux.intel.com
2016-11-11 14:58:27 +00:00
Tvrtko Ursulin b7f05d4ae0 drm/i915: Pass dev_priv to INTEL_INFO everywhere apart from the gen use
After this patch only conversion of INTEL_INFO(p)->gen to
INTEL_GEN(dev_priv) remains before the __I915__ macro can
be removed.

v2: Tidy vlv_compute_wm. (David Weinehall)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-11-11 14:58:26 +00:00
Tvrtko Ursulin 4805fe82c0 drm/i915: Further assorted dev_priv cleanups
A small selection of macros which can only accept dev_priv from
now on and a resulting trickle of fixups.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-11-11 14:58:26 +00:00
Tvrtko Ursulin 56b857a5e3 drm/i915: More assorted dev_priv cleanups
A small selection of macros which can only accept dev_priv from
now on and a resulting trickle of fixups.

v2: Keep original order. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-11-11 14:58:26 +00:00
Tvrtko Ursulin 0031fb9685 drm/i915: Assorted dev_priv cleanups
A small selection of macros which can only accept dev_priv from
now on and a resulting trickle of fixups.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-11-11 14:58:26 +00:00
Joonas Lahtinen b42fe9ca0a drm/i915: Split out i915_vma.c
As a side product, had to split two other files;
- i915_gem_fence_reg.h
- i915_gem_object.h (only parts that needed immediate untanglement)

I tried to move code in as big chunks as possible, to make review
easier. i915_vma_compare was moved to a header temporarily.

v2:
- Use i915_gem_fence_reg.{c,h}

v3:
- Rebased

v4:
- Fix building when DEBUG_GEM is enabled by reordering a bit.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-11-11 14:34:54 +02:00
Daniel Vetter 8be8f4a9a9 Merge tag 'gvt-next-kvmgt-framework' of https://github.com/01org/gvt-linux into drm-intel-next-queued
Zhenyu Wang writes:

gvt-next-kvmgt-framework

This adds initial KVMGT framework based on GVT-g MPT(Mediated Passthrough) interface.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-11-10 19:07:30 +01:00
Tvrtko Ursulin 0c40ce130e drm/i915: Trim the object sg table
At the moment we allocate enough sg table entries assuming we
will not be able to do any coalescing. But since in practice
we most often can, and more so very effectively, this ends up
wasting a lot of memory.

A simple and effective way of trimming the over-allocated
entries is to copy the table over to a new one allocated to the
exact size.

Experiments on my freshly logged and idle desktop (KDE) showed
that by doing this we can save approximately 1 MiB of RAM, or
when running a typical benchmark like gl_manhattan I have
even seen a 6 MiB saving.

More complicated techniques such as only copying the last used
page and freeing the rest are left to the reader.

v2:
 * Update commit message.
 * Use temporary sg_table on stack. (Chris Wilson)

v3:
 * Commit message update.
 * Comment added.
 * Replace memcpy with copy assignment.
   (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478704423-7447-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-11-10 09:29:25 +00:00
Jike Song f30437c5e7 drm/i915/gvt: add KVMGT support
KVMGT is the MPT implementation based on VFIO/KVM. It provides
a kvmgt_mpt ops to gvt for vGPU access mediation, e.g. to
mediate and emulate the MMIO accesses, to inject interrupts
to vGPU user, to intercept the GTT writing and replace it with
DMA-able address, to write-protect guest PPGTT table for
shadowing synchronization, etc. This patch provides the MPT
implementation for GVT, not yet functional due to theabsence
of mdev.

It's built as kvmgt.ko, depends on vfio.ko, kvm.ko and mdev.ko,
and being required by i915.ko. To not introduce hard dependency
in i915.ko, we used indirect symbol reference. But that means
users have to include kvmgt.ko into init ramdisk if their
i915.ko is included.

Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-11-10 15:45:39 +08:00
Jike Song 9ec1e66b80 drm/i915/gvt: refactor intel_gvt_io_emulation_ops to be intel_gvt_ops
There are currently 4 methods in intel_gvt_io_emulation_ops
to emulate CFG/MMIO reading/writing for intel vGPU. A possibly
better scope is: add 3 more methods for vgpu create/destroy/reset
respectively, and rename the ops to 'intel_gvt_ops', then pass
it to the MPT module (say the future kvmgt) to use: they are
all methods for external usage.

Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-11-10 15:45:15 +08:00
Jike Song 7b3343b7e8 drm/i915/gvt: allow several MPT methods to be NULL
Hypervisors are different, the MPT ops is a only superset of
all possibly supported hypervisors. There might be other way
out of the MPT to achieve same target. e.g. vfio-based kvmgt
won't provide map_gfn_to_mfn method to establish guest EPT
mapping for aperture, since it will be done in QEMU/KVM, MMIO
is also trapped elsewhere, etc.

Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-11-10 15:45:14 +08:00
Jike Song 40df6ea07a drm/i915/gvt: introduce host_init/host_exit to MPT
GVT host needs init/exit hooks to do some initialization/cleanup
work, e.g.: vfio mdev host device register/unregister.

Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-11-10 15:45:13 +08:00
Jike Song 8f89743bdd drm/i915/gvt: remove obsolete code for old kvmgt opregion
Current GVT contains some obsolete logic originally cooked to
support the old, non-vfio kvmgt, which is actually workarounds.
We don't support that anymore, so it's safe to remove it and
make a better framework.

Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-11-10 15:45:12 +08:00
Zhenyu Wang 1f31c82948 drm/i915/gvt: add intel vgpu types support
By providing predefined vGPU types, users can choose which type a vgpu
to create and use, without specifying detailed parameters.

Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
2016-11-10 15:44:54 +08:00