Commit Graph

2423 Commits

Author SHA1 Message Date
Chris Wilson a575c67617 drm/i915: Recreate vmapping even when the object is pinned
Sometimes we know we are the only user of the bo, but since we take a
protective pin_pages early on, an attempt to change the vmap on the
object is denied because it is busy. i915_gem_object_pin_map() cannot
tell from our single pin_count if the operation is safe. Instead we must
pass that information down from the caller in the manner of
I915_MAP_OVERRIDE.

This issue has existed from the introduction of the mapping, but was
never noticed as the only place where this conflict might happen is for
cached kernel buffers (such as allocated by i915_gem_batch_pool_get()).
Until recently there was only a single user (the cmdparser) so no
conflicts ever occurred. However, we now use it to allocate batches for
different operations (using MAP_WC on !llc for writes) in addition to the
existing shadow batch (using MAP_WB for reads).

We could either keep both mappings cached, or use a different write
mechanism if we detect a MAP_WB already exists (i.e. clflush
afterwards), but as we haven't seen this issue in the wild (it requires
hitting the GPU reloc path in addition to the cmdparser) for simplicity
just allow the mappings to be recreated.

v2: Include the i915_MAP_OVERRIDE bit in the enum so the compiler knows
about all the valid values.

Fixes: 7dd4f6729f ("drm/i915: Async GPU relocation processing")
Testcase: igt/gem_lut_handle # byt, completely by accident
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170828104631.8606-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-08-29 10:39:08 +01:00
Praveen Paneri 5654a1623c drm/i915: Fix FBC cfb stride programming for non X-tiled FB
When FBC is enabled for linear, legacy Y-tiled and Yf-tiled
surfaces on gen9, the cfb stride must be programmed by SW as

cfb_stride = ceiling[(at least plane width in pixels)/
		     (32 * compression limit factor)] * 8

v2: Minor fix for a build error
v3: Fixed subject, register name and platform check (Ville)
v4: Added WA details in comment (Paulo)
v5:
 - Read modified reg write to preserve other bit values (Paulo)
 - Store modified stride value in reg_params (Paulo)
 - Keep GLK out of the WA (Paulo)
v6:
 - added additional field in reg_params for gen9_wa_cfb_stride (Paulo)
 - Used appropriate bit mask while writing the register (Paulo)
v7 (from Paulo):
 - Fix coding style and spacing issues.
 - Mask the old values before writing.
 - Bikeshed comments and unnecessary checks.

Signed-off-by: Praveen Paneri <praveen.paneri@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1502389833-32621-1-git-send-email-praveen.paneri@intel.com
2017-08-25 21:13:33 -03:00
Jani Nikula cc9985893a drm/i915/bios: throw away high level child device union
All the child device config fields, including legacy, are now available
in the same struct, so use it for everything.

As this change touches plenty of code with "p_child", rename them to
"child" while at it. Also do some simple unification and constification
where not intrusive. This in the name of avoiding extra cleanup churn
for the same lines as here.

No functional changes.

Cc: Animesh Manna <animesh.manna@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/103300a9ae8629624619fc8df2c533e745cc5a78.1503600621.git.jani.nikula@intel.com
2017-08-25 16:18:20 +03:00
Chris Wilson 66df1014ef drm/i915: Keep a small stash of preallocated WC pages
We use WC pages for coherent writes into the ppGTT on !llc
architectures. However, to create a WC page requires a stop_machine(),
i.e. is very slow. To compensate we currently keep a per-vm cache of
recently freed pages, but we still see the slow startup of new contexts.
We can amoritize that cost slightly by allocating WC pages in small
batches (PAGEVEC_SIZE == 14) and since creating a WC page implies a
stop_machine() there is no penalty for keeping that stash global.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822173828.5932-1-chris@chris-wilson.co.uk
2017-08-22 19:43:00 +01:00
Daniel Vetter a42894ebb5 drm/i915: Update DRIVER_DATE to 20170818
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-08-18 22:40:45 +02:00
Chris Wilson d1b48c1e71 drm/i915: Replace execbuf vma ht with an idr
This was the competing idea long ago, but it was only with the rewrite
of the idr as an radixtree and using the radixtree directly ourselves,
along with the realisation that we can store the vma directly in the
radixtree and only need a list for the reverse mapping, that made the
patch performant enough to displace using a hashtable. Though the vma ht
is fast and doesn't require any extra allocation (as we can embed the node
inside the vma), it does require a thread for resizing and serialization
and will have the occasional slow lookup. That is hairy enough to
investigate alternatives and favour them if equivalent in peak performance.
One advantage of allocating an indirection entry is that we can support a
single shared bo between many clients, something that was done on a
first-come first-serve basis for shared GGTT vma previously. To offset
the extra allocations, we create yet another kmem_cache for them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-5-chris@chris-wilson.co.uk
2017-08-18 11:59:02 +01:00
Chris Wilson f2f5c0610f drm/i915: Don't use MI_STORE_DWORD_IMM on Sandybridge/vcs
MI_STORE_DWORD_IMM just doesn't work on the video decode engine under
Sandybridge, so refrain from using it. Then switch the selftests over to
using the now common test prior to using MI_STORE_DWORD_IMM.

Fixes: 7dd4f6729f ("drm/i915: Async GPU relocation processing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.13-rc1+
Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-08-18 11:55:02 +01:00
Jani Nikula ab3595bc4f drm/i915/opregion: let user specify override VBT via firmware load
Sometimes it would be most enlightening to debug systems by replacing
the VBT to be used. For example, in the referenced bug the BIOS provides
different VBT depending on the boot mode (UEFI vs. legacy). It would be
interesting to try the failing boot mode with the VBT from the working
boot, and see if that makes a difference.

Add a module parameter to load the VBT using the firmware loader, not
unlike the EDID firmware mechanism.

As a starting point for experimenting, one can pick up the BIOS provided
VBT from /sys/kernel/debug/dri/0/i915_opregion/i915_vbt.

v2: clarify firmware load return value check (Bob)

v3: kfree the loaded firmware blob

References: https://bugs.freedesktop.org/show_bug.cgi?id=97822#c83
Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817115209.25912-1-jani.nikula@intel.com
2017-08-17 16:34:54 +03:00
Chris Wilson b805014897 drm/i915: Handle full s64 precision for wait-ioctl
The wait-ioctl is optionally supplied a timeout with nanosecond
precision in a s64 field. We use nsecs_to_jiffies64() to convert that
into the jiffies consumed by the scheduler, but internally
nsecs_to_jiffies64() does not guard against overflow (as it's purpose is
for use by the scheduler and not drivers!). So we must guard against the
overflow ourselves, and in the process note that we may then return
much earlier than the timeout selected by the user, so don't report
ETIME unless we do hit the timeout. (Woe betold us though if the user
waits for a year (32bit) and the request is still not complete!)

v2: Refine overflow detection (to not include an overffow itself)

Reported-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811105731.9482-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-08-15 16:25:25 +01:00
Chris Wilson b8f55be644 drm/i915: Split obj->cache_coherent to track r/w
Another month, another story in the cache coherency saga. This time, we
come to the realisation that i915_gem_object_is_coherent() has been
reporting whether we can read from the target without requiring a cache
invalidate; but we were using it in places for testing whether we could
write into the object without requiring a cache flush. So split the
tracking into two, one to decide before reads, one after writes.

See commit e27ab73d17 ("drm/i915: Mark CPU cache as dirty on every
transition for CPU writes") for the previous entry in this saga.

v2: Be verbose
v3: Remove unused function (i915_gem_object_is_coherent)
v4: Fix inverted coherency check prior to execbuf (from v2)
v5: Add comment for nasty code where we are optimising on gcc's behalf.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101109
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101555
Testcase: igt/kms_mmap_write_crc
Testcase: igt/kms_pwrite_crc
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Tested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811111116.10373-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-08-15 15:46:57 +01:00
Daniel Vetter baa68f6e5b Merge tag 'gvt-next-2017-08-15' of https://github.com/01org/gvt-linux into drm-intel-next-queued
gvt-next-2017-08-15

gvt update for 4.14
- MMIO save/restore optimization (Changbin)
- Split workload scan vs. dispatch for more parallel exec (Ping)
- vGPU full 48bit ppgtt support (Joonas, Tina)
- vGPU hw id expose for perf (Zhenyu)
- other misc cleanup and fixes

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815023940.skhjfcsyrao7axqi@zhen-hp.sh.intel.com
2017-08-15 12:52:39 +02:00
Tina Zhang 8a4ab66f38 drm/i915: Enable guest i915 full ppgtt functionality
Enable the guest i915 full ppgtt functionality when host can provide this
capability. vgt_caps is introduced to guest i915 driver to get the vgpu
capabilities from the device model. VGT_CPAS_FULL_PPGTT is one of the
capabilities type to let guest i915 dirver know that the guest i915 full
ppgtt is supported by device model.

Notice that the minor version of pvinfo isn't bumped because of this
vgt_caps introduction, due to older guest would be broken by simply
increasing the pvinfo version. Although the pvinfo minor version doesn't
increase, the compatibility won't be blocked. The compatibility is ensured
by checking the value of caps field in pvinfo. Zero means no full ppgtt
support and BIT(2) means this feature is provided.

Changes since v1:
- Use u32 instead of uint32_t (Joonas)
- Move VGT_CAPS_FULL_PPGTT introduction to this patch and use #define
  instead of enum (Joonas)
- Rewrite the vgpu full ppgtt capability checking logic. (Joonas)
- Some coding style refine. (Joonas)

Changes since v2:
- Divide the whole patch set into two separate patch series, with one
  patch in i915 side to check guest i915 full ppgtt capability and enable
  it when this capability is supported by the device model, and the other
  one in gvt side which fixs the blocking issue and enables the device
  model to provide the capability to guest. And this patch focuses on guest
  i915 side. (Joonas)
- Change the title from "introduce vgt_caps to pvinfo" to
  "Enable guest i915 full ppgtt functionality". (Tina)

Change since v3:
- Add some comments about pvinfo caps and version. (Joonas)

Change since v4:
- Tested by Tina Zhang.

Change since v5:
- Add limitation about supporting 32bit full ppgtt.

Change since v6:
- Change the fallback to 48bit full ppgtt if i915.ppgtt_enable=2. (Zhenyu)

Change in v9:
- Remove the fixme comment due to no plan for 32bit full ppgtt
  support. (Zhenyu)
- Reorder the patch-set to fix compiling issue with git-bisect. (Zhenyu)
- Add print log when forcing guest 48bit full ppgtt. (Zhenyu)

v10:
- Update against Joonas's has_full_ppgtt and has_full_48bit_ppgtt disconnect
  change. (Zhenyu)

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> # in v2
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-15 10:12:48 +08:00
Daniel Vetter 9db529aac9 drm/i915: More surgically unbreak the modeset vs reset deadlock
There's no reason to entirely wedge the gpu, for the minimal deadlock
bugfix we only need to unbreak/decouple the atomic commit from the gpu
reset. The simplest way to fix that is by replacing the
unconditional fence wait a the top of commit_tail by a wait which
completes either when the fences are done (normal case, or when a
reset doesn't need to touch the display state). Or when the gpu reset
needs to force-unblock all pending modeset states.

The lesser source of deadlocks is when we try to pin a new framebuffer
and run into a stall. There's a bunch of places this can happen, like
eviction, changing the caching mode, acquiring a fence on older
platforms. And we can't just break the depency loop and keep going,
the only way would be to break out and restart. But the problem with
that approach is that we must stall for the reset to complete before
we grab any locks, and with the atomic infrastructure that's a bit
tricky. The only place is the ioctl code, and we don't want to insert
code into e.g. the BUSY ioctl. Hence for that problem just create a
critical section, and if any code is in there, wedge the GPU. For the
steady-state this should never be a problem.

Note that in both cases TDR itself keeps working, so from a userspace
pov this trickery isn't observable. Users themselvs might spot a short
glitch while the rendering is catching up again, but that's still
better than pre-TDR where we've thrown away all the rendering,
including innocent batches. Also, this fixes the regression TDR
introduced of making gpu resets deadlock-prone when we do need to
touch the display.

One thing I noticed is that gpu_error.flags seems to use both our own
wait-queue in gpu_error.wait_queue, and the generic wait_on_bit
facilities. Not entirely sure why this inconsistency exists, I just
picked one style.

A possible future avenue could be to insert the gpu reset in-between
ongoing modeset changes, which would avoid the momentary glitch. But
that's a lot more work to implement in the atomic commit machinery,
and given that we only need this for pre-g4x hw, of questionable
utility just for the sake of polishing gpu reset even more on those
old boxes. It might be useful for other features though.

v2: Rebase onto 4.13 with a s/wait_queue_t/struct wait_queue_entry/.

v3: Really emabarrassing fixup, I checked the wrong bit and broke the
unbreak/wakeup logic.

v4: Also handle deadlocks in pin_to_display.

v5: Review from Michel:
- Fixup the BUILD_BUG_ON
- Don't forget about the overlay

Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v2)
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170808080828.23650-3-daniel.vetter@ffwll.ch
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
2017-08-14 17:03:36 +02:00
Rodrigo Vivi f761bef2f3 drm/i915: Introduce intel_hpd_pin function.
The idea is to have an unique place to decide the pin-port
per platform.

So let's create this function now without any functional
change. Just adding together code from hdmi and dp together.

v2: Add missing pin for port A.
v3: Fix typo on subject.
    Avoid behaviour change so add WARN_ON and return
    if port A on HDMI. (by DK).

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811182650.14327-2-rodrigo.vivi@intel.com
2017-08-11 11:53:47 -07:00
Rodrigo Vivi 256cfdde42 drm/i915: Simplify hpd pin to port
We will soon need to make that pin port association per
platform, so let's try to simplify it beforehand.

Also we are moving the backwards port to pin
here as well so let's use a standardized way.

One extra possibility here would be to add a
MISSING_CASE along with PORT_NONE, but I don't want
to change this behaviour for now.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811182650.14327-1-rodrigo.vivi@intel.com
2017-08-11 11:53:31 -07:00
Rodrigo Vivi 23247d715d drm/i915: Fix PCH names for KBP and CNP.
No functional change.

KBP was based on SPT and spec wasn't clear about the full name.
There was the initial point of the "Point" confusion.

Later the split with Coffee Lake and Cannon Lake both using CNP
and also some uncertainty from the specs we had at that time
made us to propagated the mistake along.

So, let's fix this now and avoid propagating these wrong
"points".

Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170731185220.758-1-rodrigo.vivi@intel.com
2017-08-03 12:29:56 -07:00
Lionel Landwerlin f89823c212 drm/i915/perf: Implement I915_PERF_ADD/REMOVE_CONFIG interface
The motivation behind this new interface is expose at runtime the
creation of new OA configs which can be used as part of the i915 perf
open interface. This will enable the kernel to learn new configs which
may be experimental, or otherwise not part of the core set currently
available through the i915 perf interface.

v2: Drop DRM_ERROR for userspace errors (Matthew)
    Add padding to userspace structure (Matthew)
    s/guid/uuid/ (Matthew)

v3: Use u32 instead of int to iterate through registers (Matthew)

v4: Lock access to dynamic config list (Lionel)

v5: by Matthew:
    Fix uninitialized error values
    Fix incorrect unwiding when opening perf stream
    Use kmalloc_array() to store register
    Use uuid_is_valid() to valid config uuids
    Declare ioctls as write only
    Check padding members are set to 0
    by Lionel:
    Return ENOENT rather than EINVAL when trying to remove non
    existing config

v6: by Chris:
    Use ref counts for OA configs
    Store UUID in drm_i915_perf_oa_config rather then using pointer
    Shuffle fields of drm_i915_perf_oa_config to avoid padding

v7: by Chris
    Rename uapi pointers fields to end with '_ptr'

v8: by Andrzej, Marek, Sebastian
    Update register whitelisting
    by Lionel
    Add more register names for documentation
    Allow configuration programming in non-paranoid mode
    Add support for value filter for a couple of registers already
    programmed in other part of the kernel

v9: Documentation fix (Lionel)
    Allow writing WAIT_FOR_RC6_EXIT only on Gen8+ (Andrzej)

v10: Perform read access_ok() on register pointers (Lionel)

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Andrzej Datczuk <andrzej.datczuk@intel.com>
Reviewed-by: Andrzej Datczuk <andrzej.datczuk@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170803165812.2373-2-lionel.g.landwerlin@intel.com
2017-08-03 18:19:53 +01:00
Lionel Landwerlin 701f8231a2 drm/i915/perf: prune OA configs
In the following commit we'll introduce loadable userspace
configs. This change reworks how configurations are handled in the
perf driver and retains only the test configurations in kernel space.

We now store the test config in dev_priv and resolve the id only once
when opening the perf stream. The OA config is then handled through a
pointer to the structure holding the configuration details.

v2: Rework how test configs are handled (Lionel)

v3: Use u32 to hold number of register (Matthew)

v4: Removed unused dev_priv->perf.oa.current_config variable (Matthew)

v5: Lock device when accessing exclusive_stream (Lionel)

v6: Ensure OACTXCONTROL is always reprogrammed (Lionel)

v7: Switch a couple of index variable from int to u32 (Matthew)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170803165812.2373-3-lionel.g.landwerlin@intel.com
2017-08-03 18:18:05 +01:00
Daniel Vetter d0604a24d4 drm/i915: Update DRIVER_DATE to 20170731
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-31 10:08:11 +02:00
Paulo Zanoni 525a4f9382 drm/i915/fbc: add comments to the FBC auxiliary structs
I wrote this code an year and a half ago and I couldn't exactly
remember the main differences of these two structures when reviewing a
new FBC patch. Add some comments to help explain what's the purpose of
each struct.

For the record, the original commits are:
 b183b3f143 ("drm/i915/fbc: introduce struct intel_fbc_reg_params")
 aaf78d276b ("drm/i915/fbc: introduce struct intel_fbc_state_cache")

Cc: Praveen Paneri <praveen.paneri@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20170714193822.12121-1-paulo.r.zanoni@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:58 +02:00
Chris Wilson 535275d323 drm/i915: Squelch reset messages during selftests
During our selftests, we try reseting the GPU tens of thousands of
times, flooding the dmesg with our reset spam drowning out any potential
warnings. Add an option to i915_reset()/i915_reset_engine() to specify a
quiet reset for selftesting.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170721123238.16428-19-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:57 +02:00
Imre Deak b2891eb253 drm/i915/hsw+: Add has_fuses power well attribute
The pattern of a power well backing a set of fuses whose initialization
we need to wait for during power well enabling is common to all GEN9+
platforms. Adding support for this to the HSW power well enable helper
allows us to use the HSW/BDW power well code for GEN9+ as well in a
follow-up patch.

v2:
- Use an enum for power gates instead of raw numbers. (Ville)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170711204236.5618-6-imre.deak@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:53 +02:00
Imre Deak 001bd2cb17 drm/i915/hsw, bdw: Add irq_pipe_mask, has_vga power well attributes
The pattern of a power well backing a set of pipe IRQ or VGA
functionality applies to all HSW+ platforms. Using power well attributes
instead of platform checks to decide whether to init/reset pipe IRQs and
VGA correspondingly is cleaner and it allows us to unify the HSW/BDW and
GEN9+ power well code in follow-up patches.

Also use u8 for pipe_mask in related helpers to match the type in the
power well struct.

v2:
- Use u8 instead of u32 for irq_pipe_mask. (Ville)

v3:
- Use u8 for pipe_mask in related helpers too for clarity.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170712155413.29839-1-imre.deak@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:53 +02:00
Imre Deak b5565a2efc drm/i915/bxt, glk: Give a proper name to the power well struct phy field
Follow-up patches will add new fields to the i915_power_well struct that
are specific to the hsw_power_well_ops helpers. Prepare for this by
changing the generic 'data' field to a union of platform specific
structs.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1499352040-8819-8-git-send-email-imre.deak@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:51 +02:00
Imre Deak 438b8dc457 drm/i915: Unify power well ID enums
Atm, the power well IDs are defined in separate platform specific enums,
which isn't ideal for the following reasons:
- the IDs are used by helpers like lookup_power_well() in a platform
  independent way
- the always-on power well is used by multiple platforms and so needs
  now separate IDs, although these IDs refer to the same thing

To make things more consistent use a single enum instead of the two
separate ones, listing the IDs per platform (or set of very similar
platforms like all GEN9/10). Replace the separate always-on power
well IDs with a single ID.

While at it also add a note clarifying the distinction between regular
power wells that follow a common programming pattern and custom ones
that are programmed in some other way. The IDs for regular power wells
need to stay fixed, since they also define the request and state HW flag
positions in their corresponding power well control register(s).

v2:
- Add comment about id to req,status bit mapping to the enum. (Rodrigo)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170711204236.5618-1-imre.deak@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:49 +02:00
Chris Wilson 77b25a972b drm/i915: Make i915_gem_context_mark_guilty() safe for unlocked updates
Since we make call i915_gem_context_mark_guilty() concurrently when
resetting different engines in parallel, we need to make sure that our
updates are safe for the unlocked access.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170721123238.16428-12-chris@chris-wilson.co.uk
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:47 +02:00
Daniel Vetter 64282ea2d2 Merge airlied/drm-next into drm-intel-next-queued
Resync with upstream to avoid git getting too badly confused. Also, we
have a conflict with the drm_vblank_cleanup removal, which cannot be
resolved by simply taking our side. Bake that in properly.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:33:49 +02:00
Daniel Vetter af05559854 Merge airlied/drm-next into drm-misc-next
I need this to be able to apply the deferred fbdev setup patches, I
need the relevant prep work that landed through the drm-intel tree.

Also squash in conflict fixup from Laurent Pinchart.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-26 13:43:33 +02:00
Daniel Vetter 8b5d27b911 drm/i915: Remove intel_flip_work infrastructure
This gets rid of all the interactions between the legacy flip code and
the modeset code. Yay!

This highlights an ommission in the atomic paths, where we fail to
apply a boost to the pending rendering when we miss the target vblank.
But the existing code is still dead and can be removed.

v2: Note that the boosting doesn't work in atomic (Chris).

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170720175754.30751-7-daniel.vetter@ffwll.ch
2017-07-20 22:45:36 +02:00
Daniel Vetter afa8ce5b30 drm/i915: Nuke legacy flip queueing code
Just a very minimal patch to nuke that code. Lots of the flip
interrupt handling stuff is still around.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170719125502.25696-2-daniel.vetter@ffwll.ch
2017-07-20 10:45:27 +02:00
Chris Wilson 3b19f16a55 drm/i915: Drain the device workqueue on unload
Workers on the i915->wq may rearm themselves so for completeness we need
to replace our flush_workqueue() with a call to drain_workqueue() before
unloading the device.

v2: Reinforce the drain_workqueue with an preceding rcu_barrier() as a
few of the tasks that need to be drained may first be armed by RCU.

References: https://bugs.freedesktop.org/show_bug.cgi?id=101627
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170718134124.14832-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-07-19 13:19:24 +01:00
Daniel Vetter 58947144af drm/i915: Update DRIVER_DATE to 20170717
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-17 09:06:19 +02:00
Kumar, Mahesh 6ea593c029 drm/i915: Addition wrapper for fixed16.16 operation
This patch introduce addition wrapper for fixed point 16.16 operations.
Which will be used by later patches to avoid direct member variables
access of fixed_16_16_t structure.

add_fixed16 : takes 2 fixed_16_16_t variable & returns fixed_16_16_t
add_fixed16_u32 : takes fixed_16_16_t & u32 variable & returns fixed_16_16_t

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170705143154.32132-5-mahesh1.kumar@intel.com
2017-07-13 16:38:39 +02:00
Kumar, Mahesh eac2cb81fb drm/i915: cleanup fixed-point wrappers naming
This patch make naming of fixed-point wrappers consistent
operation_<any_post_operation>_<1st operand>_<2nd operand>
also shorten the name for fixed_16_16 to fixed16

s/u32_to_fixed_16_16/u32_to_fixed16
s/fixed_16_16_to_u32/fixed16_to_u32
s/fixed_16_16_to_u32_round_up/fixed16_to_u32_round_up
s/min_fixed_16_16/min_fixed16
s/max_fixed_16_16/max_fixed16
s/mul_u32_fixed_16_16/mul_u32_fixed16
s/fixed_16_16_div/div_fixed16

Changes Since V1:
 - Split the patch in more logical patches (Maarten)
Changes Since V2:
 - Rebase

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170705143154.32132-4-mahesh1.kumar@intel.com
2017-07-13 16:38:22 +02:00
Kumar, Mahesh eed02a7b53 drm/i915: Always perform internal fixed16 division in 64 bits
This patch combines fixed_16_16_div & fixed_16_16_div_u64 wrappers.
And new fixed_16_16_div wrapper always performs division operation in
u64 internally, to avoid any data loss which was happening in earlier
version of wrapper.
earlier wrapper was converting u32 to fixed16 in 32 bit so we were
losing 16-MSB data.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170705143154.32132-3-mahesh1.kumar@intel.com
[mlankhorst: Fix typo in commit message.]
2017-07-13 16:37:20 +02:00
Kumar, Mahesh 07ab976d19 drm/i915: take-out common clamping code of fixed16 wrappers
This patch creates a new function for clamping u64 to fixed16.
And make use of this function in other fixed16 wrappers.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170705143154.32132-2-mahesh1.kumar@intel.com
2017-07-13 16:37:02 +02:00
Daniel Vetter 953152253e main drm pull for v4.13
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZYseIAAoJEAx081l5xIa+85kP/0zKzKKVzZXSXG2TAGb5jNfk
 Ex+TELG8tWk9KBxA7lEE5c0WEsnP79cNoXZLQu8wlUzO8+kwQK5Bz0zgNUkpSuo1
 RthwdsxBQX1++UxB+HoSG+dOa7hkKVqlgQR3z9qyhsBXzetkJV0DoYcpMV0A1EWd
 6Jzt+AvCShVkcW+21LqHPlc5EIVewrDMoA3oU6aYCLhyAOUTVvvQB2ML8YApH7TM
 JrSrzCFHTrQEBbGUrZQhzR0sZzZzk9byntb/I/mdVbHeCyIHiL8sC4PfWSOyyazm
 GkPnA8G3aFAY9haBRz9jG/VBr1yVb0mCBjkWQ1lGfIAOCDDSc+d7PDXdG+i4AewK
 jZheXlrDIdGgmJLy4W3rdEqJvdf7UQHZOs8594OL19l4+FxCTrol1JSHSMeavCvr
 8bUNil9Jb/ONU/wmp+q55U0k4TCTyerUA7gKnuaJAwBvd4n78/PKmQnbrWinDyJc
 GQXp6zESk9bKt5DXSnVZuVf4POTzpuAsQkkfX1V2y145EHTQYfS3jLENWqEjyZUy
 QtKCHZvRkJfGaFU4Pr+vBo9Iu1GlA5OiOv08QadldTT4OxUI0T6yaLDobHCQfKPE
 sc3wCuCM+/dAnqoKDcGC4hAmF8zDdO0kw65P2m7uC6T9Jm1G35CioKbzo+fzUhuL
 fg5TBpbp2Wwe2oPA5iBm
 =2S5N
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.13' into drm-intel-next-queued

Resync with the main drm-next pull request for 4.13. What we really
need is to fully resync with pending drm-misc, but that's not yet
possible due to the still ongoing merge window.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2017-07-10 21:56:39 +02:00
Daniel Vetter 666b7cdc69 drm/i915: Drop FBDEV #ifdev in mst code
Since

commit a03fdcb186
Author: Archit Taneja <architt@codeaurora.org>
Date:   Wed Aug 5 12:28:57 2015 +0530

    drm: Add top level Kconfig option for DRM fbdev emulation

this is properly handled using dummy functions. This essentially
undoes

commit 7296c849bf
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jul 22 20:10:28 2014 +1000

    drm/i915: fix build without fbde

v2: We also need to drop the #ifdef from headers. Seems like a small
price to pay for slightly cleaner code.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170704151833.17304-3-daniel.vetter@ffwll.ch
2017-07-06 10:00:33 +02:00
Manasi Navare c99a259b4b drm/i915/edp: Add a T12 panel delay quirk to fix DP AUX CH timeouts
This patch fixes the DP AUX CH timeouts observed during CI IGT
tests thus fixing the CI failures. This is done by adding a
quirk for a particular PCI device that requires the panel power
cycle delay (T12) to be set to 800ms which is 300msecs more than
the minimum value specified in the eDP spec. So a quirk is
implemented for that specific PCI device.

v4:
* Add Bugzilla links for FDO bugs in the commit message (Ville, Jani)
v3:
* Change some comments, specify the delay as 800 * 10 (Ville)
v2:
* Change the function and variable names to from PPS_T12_
to _T12 since it is a T12 delay (Clint)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101144
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101154
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101167
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101515
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Clinton Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Clinton Taylor <clinton.a.taylor@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1498840428-23176-1-git-send-email-manasi.d.navare@intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-07-04 16:57:44 +03:00
Daniel Vetter 2c4b851933 drm/i915: Update DRIVER_DATE to 20170703
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-03 08:40:41 +02:00
Chris Wilson 7b92c1bd05 drm/i915: Avoid keeping waitboost active for signaling threads
Once a client has requested a waitboost, we keep that waitboost active
until all clients are no longer waiting. This is because we don't
distinguish which waiter deserves the boost. However, with the advent of
fence signaling, the signaler threads appear as waiters to the RPS
interrupt handler. So instead of using a single boolean to track when to
keep the waitboost active, use a counter of all outstanding waitboosted
requests.

At this point, I have removed all vestiges of the rate limiting on
clients. Whilst this means that compositors should remain more fluid,
it also means that boosts are more prevalent. See commit b29c19b645
("drm/i915: Boost RPS frequency for CPU stalls") for a longer discussion
on the pros and cons of both approaches.

A drawback of this implementation is that it requires constant request
submission to keep the waitboost trimmed (as it is now cancelled when the
request is completed). This will be fine for a busy system, but near
idle the boosts may be kept for longer than desired (effectively tens of
vblanks worstcase) and there is a reliance on rc6 instead.

v2: Remove defunct rps.client_lock

Reported-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170628123548.9236-1-chris@chris-wilson.co.uk
2017-06-28 15:23:12 +01:00
Ville Syrjälä c5e855d078 drm/i915: Always use 9 bits of the LPC bridge device ID for PCH detection
Make the code less confusiong by always using the top 9 bits of the
LPC bridge device ID to detect the PCH type. We need to add a bit of
new code for WPT, and we need to adjust the KBP ID as well. All the
other pre-CNP IDs are fine as is.

The virtualization cases I think are fine. These P2X and P3X IDs
actually just look like the old PIIX4 and PIIX3 IDs to me. Not sure
why they're not called PIIX3/4 though. The qemu one has a comment
saying the full ID is 0x2918 which is fine with 9 bits.

v2: Keep the CNP ID as 0xa300 (DK)

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170621174944.23306-1-ville.syrjala@linux.intel.com
2017-06-22 19:08:35 +03:00
Ville Syrjälä 243dec586f drm/i915: Document that PPT==CPT and WPT==LPT
For our purposes PPT is equivalent to CPT, and WPT is equivalent to
LPT. Document that fact.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620130310.13245-4-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2017-06-22 19:08:35 +03:00
Dave Airlie 305b9eddee Merge tag 'drm-intel-next-2017-06-19' of git://anongit.freedesktop.org/git/drm-intel into drm-next
Final pile of features for 4.13

New uabi:
- batch bo in first slot, for faster execbuf assembly in userspace
  (Chris Wilson)
- (sub)slice getparam, needed for mesa perf support (Robert Bragg)

First pile of patches for cnl/cfl support, maintained by Rodrigo but
with lots of contributions from others. Still incomplete since public
review still ongoing.

Features/refactoring:
- Make execbuf faster (Chris Wilson), a pile of series to make execbuf
  buffer handling have fewer passes, use less list walking, postpone
  more work to async workers and shuffle buffers less, all to make the
  common case much faster (in some cases at least).
- cold boot support for glk dsi (Madhav Chauhan)
- Clean up pipe A quirk and related old platform hacks (Ville)
- perf sampling support for kbl/glk (Lionel)
- perf cleanups (Robert Bragg)
- wire atomic state to backlight code, to avoid pipe lookup hacks
  (Maarten)
- reduce request waiting latency/overhead to remove the spinning and
  associated cpu cycle wasting (Chris)
- fix 90/270 rotation wm computation (Ville)
- new ddb allocation algo for skl (Kumar Mahesh)
- fix regression due to system suspend optimiazatino (Imre)
- the usual pile of small cleanups and refactors all over

GVT updates contained in this tag:
- optimization for per-VM mmio save/restore (Changbin)
- optimization for mmio hash table (Changbin)
- scheduler optimization with event (Ping)
- vGPU reset refinement (Fred)
- other misc refactor and cleanups, etc.

* tag 'drm-intel-next-2017-06-19' of git://anongit.freedesktop.org/git/drm-intel: (170 commits)
  drm/i915: Update DRIVER_DATE to 20170619
  drm/i915/cfl: Introduce Coffee Lake workarounds.
  drm/i915: Store 9 bits of PCI Device ID for platforms with a LP PCH
  drm/i915: Stash a pointer to the obj's resv in the vma
  drm/i915: Async GPU relocation processing
  drm/i915: Allow execbuffer to use the first object as the batch
  drm/i915: Wait upon userptr get-user-pages within execbuffer
  drm/i915: First try the previous execbuffer location
  drm/i915: Store a persistent reference for an object in the execbuffer cache
  drm/i915: Eliminate lots of iterations over the execobjects array
  drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocations
  drm/i915: Pass vma to relocate entry
  drm/i915: Store a direct lookup from object handle to vma
  drm/i915: Fix retrieval of hangcheck stats
  drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty
  drm/i915: Mark CPU cache as dirty on every transition for CPU writes
  drm/i915: Make i915_vma_destroy() static
  drm/i915: Actually attach the tv_format property to the SDVO connector
  Revert "drm/i915/skl: New ddb allocation algorithm"
  drm/i915/glk: Add cold boot sequence for GLK DSI
  ...
2017-06-21 08:55:22 +10:00
Michel Thierry 702c8f8e5d drm/i915: Add engine reset count to error state
Driver maintains count of how many times a given engine is reset, useful to
capture this in error state also. It gives an idea of how engine is coping
up with the workloads it is executing before this error state.

A follow-up patch will provide this information in debugfs.

v2: s/engine_reset/reset_engine/ (Chris)
    Define count as unsigned int (Tvrtko)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615201828.23144-7-michel.thierry@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620095751.13127-7-chris@chris-wilson.co.uk
2017-06-20 21:00:22 +01:00
Michel Thierry a1ef70e144 drm/i915: Add support for per engine reset recovery
This change implements support for per-engine reset as an initial, less
intrusive hang recovery option to be attempted before falling back to the
legacy full GPU reset recovery mode if necessary. This is only supported
from Gen8 onwards.

Hangchecker determines which engines are hung and invokes error handler to
recover from it. Error handler schedules recovery for each of those engines
that are hung. The recovery procedure is as follows,
 - identifies the request that caused the hang and it is dropped
 - force engine to idle: this is done by issuing a reset request
 - reset the engine
 - re-init the engine to resume submissions.

If engine reset fails then we fall back to heavy weight full gpu reset
which resets all engines and reinitiazes complete state of HW and SW.

v2: Rebase.
v3: s/*engine_reset*/*reset_engine*/; freeze engine and irqs before
calling i915_gem_reset_engine (Chris).
v4: Rebase, modify i915_gem_reset_prepare to use a ring mask and
reuse the function for reset_engine.
v5: intel_reset_engine_start/cancel instead of request/unrequest_reset.
v6: Clean up reset_engine function to not require mutex, i.e. no need to call
revoke/restore_fences and _retire_requests (Chris).
v7: Remove leftovers from v5, i.e. no need to disable irq, hold
forcewake or wakeup the handoff bit (Chris).
v8: engine_retire_requests should be (and it was) static; explain that
we have to re-init the engine after reset, which is why the init_hw call
is needed; check reset-in-progress flag (Chris).
v9: Rebase, include code to pass the active request to gem_reset_engine
(as it is already done in full reset). Remove unnecessary
intel_reset_engine_start/cancel, these are executed as part of the
reset.
v10: Rebase, use the right I915_RESET_ENGINE flag.
v11: Fixup to call reset_finish_engine even on error.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615201828.23144-6-michel.thierry@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620095751.13127-6-chris@chris-wilson.co.uk
2017-06-20 21:00:16 +01:00
Michel Thierry 142bc7d99b drm/i915: Modify error handler for per engine hang recovery
This is a preparatory patch which modifies error handler to do per engine
hang recovery. The actual patch which implements this sequence follows
later in the series. The aim is to prepare existing recovery function to
adapt to this new function where applicable (which fails at this point
because core implementation is lacking) and continue recovery using legacy
full gpu reset.

A helper function is also added to query the availability of engine
reset. A subsequent patch will add the capability to query which type
of reset is present (engine -> full -> no-reset) via the get-param
ioctl.

It has been decided that the error events that are used to notify user of
reset will only be sent in case if full chip reset. In case of just
single (or multiple) engine resets, userspace won't be notified by these
events.

Note that this implementation of engine reset is for i915 directly
submitting to the ELSP, where the driver manages the hang detection,
recovery and resubmission. With GuC submission these tasks are shared
between driver and firmware; i915 will still responsible for detecting a
hang, and when it does it will have to request GuC to reset that Engine and
remind the firmware about the outstanding submissions. This will be
added in different patch.

v2: rebase, advertise engine reset availability in platform definition,
add note about GuC submission.
v3: s/*engine_reset*/*reset_engine*/. (Chris)
Handle reset as 2 level resets, by first going to engine only and fall
backing to full/chip reset as needed, i.e. reset_engine will need the
struct_mutex.
v4: Pass the engine mask to i915_reset. (Chris)
v5: Rebase, update selftests.
v6: Rebase, prepare for mutex-less reset engine.
v7: Pass reset_engine mask as a function parameter, and iterate over the
engine mask for reset_engine. (Chris)
v8: Use i915.reset >=2 in has_reset_engine; remove redundant reset
logging; add a reset-engine-in-progress flag to prevent concurrent
resets, and avoid dual purposing of reset-backoff. (Chris)
v9: Support reset of different engines in parallel (Chris)
v10: Handle reset-engine flag locking better (Chris)
v11: Squash in reporting of per-engine-reset availability.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ian Lister <ian.lister@intel.com>
Signed-off-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615201828.23144-4-michel.thierry@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620095751.13127-5-chris@chris-wilson.co.uk
2017-06-20 21:00:11 +01:00
Chris Wilson 1acfc104cd drm/i915: Enable rcu-only context lookups
Whilst the contents of the context is still protected by the big
struct_mutex, this is not much of an improvement. It is just one tiny
step towards reducing our BKL.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620110547.15947-3-chris@chris-wilson.co.uk
2017-06-20 17:13:54 +01:00
Chris Wilson 5f09a9c8ab drm/i915: Allow contexts to be unreferenced locklessly
If we move the actual cleanup of the context to a worker, we can allow
the final free to be called from any context and avoid undue latency in
the caller.

v2: Negotiate handling the delayed contexts free by flushing the
workqueue before calling i915_gem_context_fini() and performing the final
free of the kernel context directly
v3: Flush deferred frees before new context allocations

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620110547.15947-2-chris@chris-wilson.co.uk
2017-06-20 17:13:47 +01:00
Chris Wilson 829a0af29f drm/i915: Group all the global context information together
Create a substruct to hold all the global context state under
drm_i915_private.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620110547.15947-1-chris@chris-wilson.co.uk
2017-06-20 17:13:40 +01:00