OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Maarten Lankhorst	7397489399	drm/i915: Fix modeset handling during gpu reset, v5. This function would call drm_modeset_lock_all, while the suspend/resume functions already have their own locking. Fix this by factoring out __intel_display_resume, and calling the atomic helpers for duplicating atomic state and disabling all crtc's during suspend. Changes since v1: - Deal with -EDEADLK right after lock_all and clean up calls to hw readout. - Always take all modeset locks so updates during gpu reset are blocked. Changes since v2: - Fix deadlock in intel_update_primary_planes. - Move WARN_ON(EDEADLK) to __intel_display_resume. - pctx -> ctx - only call __intel_display_resume on success in intel_display_resume. Changes since v3: - Rebase on top of dev_priv -> dev change. - Use drm_modeset_lock_all_ctx instead of drm_modeset_lock_all. Changes since v4 [by vsyrjala]: - Deal with skip_intermediate_wm - Update comment w.r.t. mode_config.mutex vs. ->detect() - Rebase due to INTEL_GEN() etc. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: `e2c8b8701e` ("drm/i915: Use atomic helpers for suspend, v2.") Cc: stable@vger.kernel.org Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470428910-12125-2-git-send-email-ville.syrjala@linux.intel.com	2016-08-05 23:28:27 +03:00
Ville Syrjälä	a168f5b3f1	drm/i915: Don't mark PCH underrun reporting as disabled for transcoder B/C on LPT-H Marking PCH transcoder FIFO underrun reporting as disabled for transcoder B/C on LPT-H will block us from enabling the south error interrupt. So let's only mark transcoder A underrun reporting as disabled initially. This is a little tricky to hit since you need a machine with LPT-H, and the BIOS must enable either pipe B or C at boot. Then i915 would mark the "transcoder B/C" underrun reporting as disabled and never enable it again, meaning south interrupts would never get enabled either. The only other interrupt in there is actually the poison interrupt which, if we could ever trigger it, would just result in a little error in dmesg. Here's the resulting change in SDEIMR on my HSW when I boot it with multiple displays attached: - (0x000c4004): 0xf115ffff + (0x000c4004): 0xf114ffff My previous attempt [1] tried to fix this a little differently, but Daniel requested I do this instead. [1] https://lists.freedesktop.org/archives/intel-gfx/2015-November/081420.html Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470416417-15021-1-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-08-05 20:00:17 +03:00
Ville Syrjälä	09fa8bb909	drm/i915: Add some curly braces intel_enable_pipe() looks rather confusing when one side doesn't have the curly braces, and the other one does. And what's even worse, there's another if-else inside the braceless side. Let's put braces around it to make it clear which branch goes where. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470418894-1249-1-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-08-05 20:41:34 +03:00
Chris Wilson	5a198b8c53	drm/i915: Do not overwrite the request with zero on reallocation When using RCU lookup for the request, commit `0eafec6d32` ("drm/i915: Enable lockless lookup of request tracking via RCU"), we acknowledge that we may race with another thread that could have reallocated the request. In order for the first thread not to blow up, the second thread must not clear the request completed before overwriting it. In the RCU lookup, we allow for the engine/seqno to be replaced but we do not allow for it to be zeroed. The choice we make is to either add extra checking to the RCU lookup, or embrace the inherent races (as intended). It is more complicated as we need to manually clear everything we depend upon being zero initialised, but we benefit from not emiting the memset() to clear the entire frequently allocated structure (that memset turns up in throughput profiles). And at the same time, the lookup remains flexible for future adjustments. v2: Old style LRC requires another variable to be initialize. (The danger inherent in not zeroing everything.) v3: request->batch also needs to be cleared v4: signaling.tsk is no long used unset, but pid still exists Fixes: `0eafec6d32` ("drm/i915: Enable lockless lookup of request...") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: "Goel, Akash" <akash.goel@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470731014-6894-2-git-send-email-chris@chris-wilson.co.uk	2016-08-09 10:17:27 +01:00
Chris Wilson	edf6b76f64	drm/i915: Add smp_rmb() to busy ioctl's RCU dance In the debate as to whether the second read of active->request is ordered after the dependent reads of the first read of active->request, just give in and throw a smp_rmb() in there so that ordering of loads is assured. v2: Explain the manual smp_rmb() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470731014-6894-1-git-send-email-chris@chris-wilson.co.uk	2016-08-09 10:17:26 +01:00
Chris Wilson	87b723a16d	drm/i915: Don't check for idleness before retiring after a GPU hang When we force the cleanup after a GPU hang, we want to retire all requests, or else we may leak them if truly wedged (and the GPU never advances again). Converting to the active request helpers had the issue of doing the check against busyness before reporting the request, so if we claim the GPU had hung but this engine hadn't we could potential skip the request cleanup - triggering the self-check BUG. Fixes: `dcff85c844` ("drm/i915: Enable i915_gem_wait_for_idle() ...") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470728222-10243-3-git-send-email-chris@chris-wilson.co.uk	2016-08-09 09:11:59 +01:00
Chris Wilson	385384a82c	drm/i915: Wrap the protected active RCU dereference in a helper As we do the lockdep protected RCU lookup in a couple of places, refactor that code to a common helper i915_gem_active_raw(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470728222-10243-2-git-send-email-chris@chris-wilson.co.uk	2016-08-09 09:11:59 +01:00
Chris Wilson	2e7ba01494	drm/i915: Remove unused i915_gem_active_peek_rcu() This was originally introduced to be used by the busy-ioctl, but in the end busy ioctl performed a different dance. Since there are no users, and no likely users, remove an unwanted chunk of the API. Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470728222-10243-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-08-09 09:11:39 +01:00
Daniel Vetter	c5b7e97b27	drm/i915: Update DRIVER_DATE to 20160808 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-08-08 09:37:31 +02:00
Matthew Auld	cb7f27601c	drm/i915: fix aliasing_ppgtt leak In i915_ggtt_cleanup_hw we need to remember to free aliasing_ppgtt. This fixes the following kmemleak message: unreferenced object 0xffff880213cca000 (size 8192): comm "modprobe", pid 1298, jiffies 4294745402 (age 703.930s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff817c808e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff8121f9c2>] kmem_cache_alloc_trace+0x142/0x1d0 [<ffffffffa06d11ef>] i915_gem_init_ggtt+0x10f/0x210 [i915] [<ffffffffa06d71bb>] i915_gem_init+0x5b/0xd0 [i915] [<ffffffffa069749a>] i915_driver_load+0x97a/0x1460 [i915] [<ffffffffa06a26ef>] i915_pci_probe+0x4f/0x70 [i915] [<ffffffff81423015>] local_pci_probe+0x45/0xa0 [<ffffffff81424463>] pci_device_probe+0x103/0x150 [<ffffffff81515e6c>] driver_probe_device+0x22c/0x440 [<ffffffff81516151>] __driver_attach+0xd1/0xf0 [<ffffffff8151379c>] bus_for_each_dev+0x6c/0xc0 [<ffffffff8151555e>] driver_attach+0x1e/0x20 [<ffffffff81514fa3>] bus_add_driver+0x1c3/0x280 [<ffffffff81516aa0>] driver_register+0x60/0xe0 [<ffffffff8142297c>] __pci_register_driver+0x4c/0x50 [<ffffffffa013605b>] 0xffffffffa013605b Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Fixes: `b18b6bde30` ("drm/i915/bdw: Free PPGTT struct") Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470420280-21417-1-git-send-email-matthew.auld@intel.com	2016-08-05 22:39:32 +02:00
Daniel Vetter	437c30874c	drm/i915: Update comment before i915_spin_request ~jiffie and a few usecs is 3 orders of magnitude different. A bit much. This was changed in commit `ca5b721e23` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Dec 11 11:32:58 2015 +0000 drm/i915: Limit the busy wait on requests to 5us not 10ms! But probably missed the comment since the change was non-local to the comment. v2: Polish comment more (Chris). Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470413484-23775-1-git-send-email-daniel.vetter@ffwll.ch	2016-08-05 22:30:56 +02:00
Rodrigo Vivi	4194c088df	drm/i915: Use drm official vblank_no_hw_counter callback. No functional change. Instead of defining a new empty function let's use what is available on drm. It gets cleaner, and easy to read, and understand. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2016-08-05 09:40:07 -07:00
Rodrigo Vivi	4e9121e6b4	drm/i915: Fix copy_to_user usage for pipe_crc Copy to user return the number of bytes it couldn't write and zero on success. So any number different than 0 should be considered a fault, not only when it doesn't write the full size. v2: fixed the inverted logic. (Ville) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2016-08-05 09:39:47 -07:00
Ville Syrjälä	19e0b4cab9	Revert "drm/i915: Track active streams also for DP SST" This reverts commit `f64425a82b`. active_streams will get totally out of whack with SST unless we sync up with the hw state at readout, obviously! We don't yet do that, so now the WARNs fire all the time. Let's revert :( Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470413142-26402-1-git-send-email-ville.syrjala@linux.intel.com References: https://bugs.freedesktop.org/show_bug.cgi?id=95472#c14 Acked-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-08-05 19:20:31 +03:00
Matthew Auld	575e3ccbce	drm/i915: fix WaInsertDummyPushConstPs As pointed out by Chris Harris, we are using the wrong WA name, it should in fact be WaToEnableHwFixForPushConstHWBug, also it should be applied from C0 onwards for both BXT and KBL. Fixes: `7b9005cd45` ("drm/i915: Add WaInsertDummyPushConstP for bxt and kbl") Cc: Chris Harris <chris.harris@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reported-by: Chris Harris <chris.harris@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470127013-29653-1-git-send-email-matthew.auld@intel.com	2016-08-05 13:16:55 +03:00
Chris Wilson	209b3f7ed0	drm/i915: Assert that the request hasn't been retired With all callers now not playing tricks with dropping the struct_mutex between waiting and retiring, we can assert that the request is ready to be retired. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-19-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:44 +01:00
Chris Wilson	3e510a8e65	drm/i915: Repack fence tiling mode and stride into a single integer In the previous commit, we moved the obj->tiling_mode out of a bitfield and into its own integer so that we could safely use READ_ONCE(). Let us now repair some of that damage by sharing the tiling_mode with its companion, the fence stride. v2: New magic Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-18-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:43 +01:00
Chris Wilson	deeb1519b6	drm/i915: Document and reject invalid tiling modes Through the GTT interface to the fence registers, we can only handle linear, X and Y tiling. The more esoteric tiling patterns are ignored. Document that the tiling ABI only supports upto Y tiling, and reject any attempts to set a tiling mode other than NONE, X or Y. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-17-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:42 +01:00
Chris Wilson	9ad3676148	drm/i915: Remove locking for get_tiling Since we are not concerned with userspace racing itself with set-tiling (the order is indeterminant even if we take a lock), then we can safely read back the single obj->tiling_mode and do the static lookup of swizzle mode without having to take a lock. get-tiling is reasonably frequent due to the back-channel passing around of tiling parameters in DRI2/DRI3. v2: Make tiling_mode a full unsigned int so that we can trivially use it with READ_ONCE(). Separating it out into manual control over the flags field was too noisy for a simple patch. Note that we could use the lower bits of obj->stride for the tiling mode. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-16-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:42 +01:00
Chris Wilson	e883d73503	drm/i915: Remove pinned check from madvise ioctl We don't need to incur the overhead of checking whether the object is pinned prior to changing its madvise. If the object is pinned, the madvise will not take effect until it is unpinned and so we cannot free the pages being pointed at by hardware. Marking a pinned object with allocated pages as DONTNEED will not trigger any undue warnings. The check is therefore superfluous, and by removing it we can remove a linear walk over all the vma the object has. Still despite it being an overzealous check, that error code is part of the current ABI and so we must proceed with caution. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-15-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:41 +01:00
Chris Wilson	c21724cc4d	drm/i915: Reduce locking inside swfinish ioctl We only need to take the struct_mutex if the object is pinned to the display engine and so requires checking for clflush. (The race with userspace pinning the object to a framebuffer is irrelevant.) v2: Use access once for compiler hints (or not as it is a bitfield) v3: READ_ONCE, obj->pin_display is not a bitfield anymore v4: Don't be creative with goto. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-14-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:41 +01:00
Chris Wilson	3fdc13c7a3	drm/i915: Remove (struct_mutex) locking for busy-ioctl By applying the same logic as for wait-ioctl, we can query whether a request has completed without holding struct_mutex. The biggest impact system-wide is removing the flush_active and the contention that causes. Testcase: igt/gem_busy Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Akash Goel <akash.goel@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-13-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:40 +01:00
Chris Wilson	033d549b81	drm/i915: Remove (struct_mutex) locking for wait-ioctl With a bit of care (and leniency) we can iterate over the object and wait for previous rendering to complete with judicial use of atomic reference counting. The ABI requires us to ensure that an active object is eventually flushed (like the busy-ioctl) which is guaranteed by our management of requests (i.e. everything that is submitted to hardware is flushed in the same request). All we have to do is ensure that we can detect when the requests are complete for reporting when the object is idle (without triggering ETIME), locklessly - this is handled by i915_gem_active_wait_unlocked(). The impact of this is actually quite small - the return to userspace following the wait was already lockless and so we don't see much gain in latency improvement upon completing the wait. What we do achieve here is completing an already finished wait without hitting the struct_mutex, our hold is quite short and so we are typically just a victim of contention rather than a cause - but it is still one less contention point! v2: Break up a long line. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-12-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:40 +01:00
Chris Wilson	258a5edee0	drm/i915: Do a nonblocking wait first in pread/pwrite If we try and read or write to an active request, we first must wait upon the GPU completing that request. Let's do that without holding the mutex (and so allow someone else to access the GPU whilst we wait). Upon completion, we will acquire the mutex and only then start the operation (i.e. we do not rely on state from before the initial wait). v2: Repaint the goto labels v3: Move the tracepoints back to the start of the ioctls Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-11-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:39 +01:00
Chris Wilson	3b4e896f14	drm/i915: Remove unused no-shrinker-steal After removing the user of this wart, we can remove the wart entirely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-10-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:39 +01:00
Chris Wilson	f3f6184c5f	drm/i915: Tidy generation of the GTT mmap offset If we make the observation that mmap-offsets are only released when we free an object, we can then deduce that the shrinker only creates free space in the mmap arena indirectly by flushing the request list and freeing expired objects. If we combine this with the lockless vma-manager and lockless idling, we can avoid taking our big struct_mutex until we need to actually free the requests. One side-effect is that we defer the madvise checking until we need the pages (i.e. the fault handler). This brings us into line with the other delayed checks (and madvise in general). v2: s/ret/err/ and use if (!err) rather than if (ret == 0) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-9-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:38 +01:00
Chris Wilson	5cba5be6b6	drm/i915/shrinker: Wait before acquiring struct_mutex under oom We can now wait for the GPU (all engines) to become idle without requiring the struct_mutex. Inside the shrinker, we need to currently take the struct_mutex in order to purge objects and to purge the objects we need the GPU to be idle - causing a stall whilst we hold the struct_mutex. We can hide most of that stall by performing the wait before taking the struct_mutex and only doing essential waits for new rendering on objects to be freed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-8-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:38 +01:00
Chris Wilson	307dc25bf6	drm/i915: Simplify do_idling() (Ironlake vt-d w/a) Now that we pass along the expected interruptible nature for the wait-for-idle, we do not need to modify the global i915->mm.interruptible for this single call. (Only the immediate call to i915_gem_wait_for_idle() takes the interruptible status as the other action, dma_map_sg(), is independent of i915.ko) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-7-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:37 +01:00
Chris Wilson	dcff85c844	drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex The principal motivation for this was to try and eliminate the struct_mutex from i915_gem_suspend - but we still need to hold the mutex current for the i915_gem_context_lost(). (The issue there is that there may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and struct_mutex via the stop_machine().) For the moment, enabling last request tracking for the engine, allows us to do busyness checking and waiting without requiring the struct_mutex - which is useful in its own right. As a side-effect of having a robust means for tracking engine busyness, we can replace our other busyness heuristic, that of comparing against the last submitted seqno. For paranoid reasons, we have a semi-ordered check of that seqno inside the hangchecker, which we can now improve to an ordered check of the engine's busyness (removing a locked xchg in the process). v2: Pass along "bool interruptible" as being unlocked we cannot rely on i915->mm.interruptible being stable or even under our control. v3: Replace check Ironlake i915_gpu_busy() with the common precalculated value Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:37 +01:00
Chris Wilson	90f4fcd56b	drm/i915: Remove forced stop ring on suspend/unload Before suspending (or unloading), we would first wait upon all rendering to be completed and then disable the rings. This later step is a remanent from DRI1 days when we did not use request tracking for all operations upon the ring. Now that we are sure we are waiting upon the very last operation by the engine, we can forgo clobbering the ring registers, though we do keep the assert that the engine is indeed idle before sleeping. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-5-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:36 +01:00
Chris Wilson	f826ee21e5	drm/i915/userptr: Remove superfluous interruptible=false on waiting Inside the kthread context, we can't be interrupted by signals so touching the mm.interruptible flag is pointless and wait-request now consumes EIO itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-4-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:36 +01:00
Chris Wilson	8a3b3d576c	drm/i915: Convert non-blocking userptr waits for requests over to using RCU We can completely avoid taking the struct_mutex around the non-blocking waits by switching over to the RCU request management (trading the mutex for a RCU read lock and some complex atomic operations). The improvement is that we gain further contention reduction, and overall the code become simpler due to the reduced mutex dancing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-3-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:35 +01:00
Chris Wilson	b8f9096d6a	drm/i915: Convert non-blocking waits for requests over to using RCU We can completely avoid taking the struct_mutex around the non-blocking waits by switching over to the RCU request management (trading the mutex for a RCU read lock and some complex atomic operations). The improvement is that we gain further contention reduction, and overall the code become simpler due to the reduced mutex dancing. v2: Move i915_gem_fault tracepoint back to the start of the function, before the unlocked wait. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-2-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:34 +01:00
Chris Wilson	2467658e2d	drm/i915: Introduce i915_gem_active_wait_unlocked() It is useful to be able to wait on pending rendering without grabbing the struct_mutex. We can do this by using the i915_gem_active_get_rcu() primitive to acquire a reference to the pending request without requiring struct_mutex, just the RCU read lock, and then call i915_wait_request(). v2: Rebase onto new i915_gem_active_get_unlocked() semantics that take the RCU read lock on behalf of the caller. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-1-git-send-email-chris@chris-wilson.co.uk	2016-08-05 10:54:33 +01:00
Daniel Vetter	94558e265b	Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued Backmerge the 4.8 pull request state from Dave - conflicts were getting out of hand, and Chris has some patches which outright don't apply without everything merged together again. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2016-08-05 10:36:15 +02:00
Ville Syrjälä	5ac9056753	drm/i915: Fix iboost setting for SKL Y/U DP DDI buffer translation entry 2 The spec was recently fixed to have the correct iboost setting for the SKL Y/U DP DDI buffer translation table entry 2. Update our tables to match. Cc: David Weinehall <david.weinehall@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470140517-13011-1-git-send-email-ville.syrjala@linux.intel.com Cc: stable@vger.kernel.org Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>	2016-08-05 09:40:35 +03:00
Matt Roper	055c3ff69d	drm/i915/gen9: Give one extra block per line for SKL plane WM calculations The bspec was updated a couple weeks ago to add an extra block per line to plane watermark calculations for linear pixel formats. Bspec update 115327 description: "Gen9+ - Updated the plane blocks per line calculation for linear cases. Adds +1 for all linear cases to handle the non-block aligned stride cases." Cc: Lyude <cpaul@redhat.com> Cc: drm-intel-fixes@lists.freedesktop.org Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470344880-27394-1-git-send-email-matthew.d.roper@intel.com Reviewed-by: Lyude <cpaul@redhat.com>	2016-08-04 16:46:49 -07:00
Chris Wilson	ad778f8967	drm/i915: Export our request as a dma-buf fence on the reservation object If the GEM objects being rendered with in this request have been exported via dma-buf to a third party, hook ourselves into the dma-buf reservation object so that the third party can serialise with our rendering via the dma-buf fences. Testcase: igt/prime_busy Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-26-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:20:06 +01:00
Chris Wilson	0eafec6d32	drm/i915: Enable lockless lookup of request tracking via RCU If we enable RCU for the requests (providing a grace period where we can inspect a "dead" request before it is freed), we can allow callers to carefully perform lockless lookup of an active request. However, by enabling deferred freeing of requests, we can potentially hog a lot of memory when dealing with tens of thousands of requests per second - with a quick insertion of a synchronize_rcu() inside our shrinker callback, that issue disappears. v2: Currently, it is our responsibility to handle reclaim i.e. to avoid hogging memory with the delayed slab frees. At the moment, we wait for a grace period in the shrinker, and block for all RCU callbacks on oom. Suggested alternatives focus on flushing our RCU callback when we have a certain number of outstanding request frees, and blocking on that flush after a second high watermark. (So rather than wait for the system to run out of memory, we stop issuing requests - both are nondeterministic.) Paul E. McKenney wrote: Another approach is synchronize_rcu() after some largish number of requests. The advantage of this approach is that it throttles the production of callbacks at the source. The corresponding disadvantage is that it slows things up. Another approach is to use call_rcu(), but if the previous call_rcu() is still in flight, block waiting for it. Yet another approach is the get_state_synchronize_rcu() / cond_synchronize_rcu() pair. The idea is to do something like this: cond_synchronize_rcu(cookie); cookie = get_state_synchronize_rcu(); You would of course do an initial get_state_synchronize_rcu() to get things going. This would not block unless there was less than one grace period's worth of time between invocations. But this assumes a busy system, where there is almost always a grace period in flight. But you can make that happen as follows: cond_synchronize_rcu(cookie); cookie = get_state_synchronize_rcu(); call_rcu(&my_rcu_head, noop_function); Note that you need additional code to make sure that the old callback has completed before doing a new one. Setting and clearing a flag with appropriate memory ordering control suffices (e.g,. smp_load_acquire() and smp_store_release()). v3: More comments on compiler and processor order of operations within the RCU lookup and discover we can use rcu_access_pointer() here instead. v4: Wrap i915_gem_active_get_rcu() to take the rcu_read_lock itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: "Goel, Akash" <akash.goel@intel.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-25-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:20:05 +01:00
Chris Wilson	00e60f2659	drm/i915: Move i915_gem_object_wait_rendering() Just move it earlier so that we can use the companion nonblocking version in a couple of more callsites without having to add a forward declaration. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-24-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:20:05 +01:00
Chris Wilson	573adb3962	drm/i915: Move obj->active:5 to obj->flags We are motivated to avoid using a bitfield for obj->active for a couple of reasons. Firstly, we wish to document our lockless read of obj->active using READ_ONCE inside i915_gem_busy_ioctl() and that requires an integral type (i.e. not a bitfield). Secondly, gcc produces abysmal code when presented with a bitfield and that shows up high on the profiles of request tracking (mainly due to excess memory traffic as it converts the bitfield to a register and back and generates frequent AGI in the process). v2: BIT, break up a long line in compute the other engines, new paint for i915_gem_object_is_active (now i915_gem_object_get_active). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-23-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:20:04 +01:00
Chris Wilson	5748b6a1f4	drm/i915: Use dev_priv consistently through the intel_frontbuffer interface Rather than a mismash of struct drm_device dev and struct drm_i915_private dev_priv being used freely within a function, be consistent and only pass along dev_priv. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-22-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:20:03 +01:00
Chris Wilson	faf5bf0ad6	drm/i915: Use atomics to manipulate obj->frontbuffer_bits The individual bits inside obj->frontbuffer_bits are protected by each plane->mutex, but the whole bitfield may be accessed by multiple KMS operations simultaneously and so the RMW need to be under atomics. However, for updating the single field we do not need to mandate that it be under the struct_mutex, one more step towards its removal as the de facto BKL. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-21-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:20:03 +01:00
Chris Wilson	b5add9591c	drm/i915: Make fb_tracking.lock a spinlock We only need a very lightweight mechanism here as the locking is only used for co-ordinating a bitfield. v2: Move the cheap unlikely tests into the caller v3: Move the kerneldoc into the header (now separated out into intel_fronbuffer.h for better kerneldoc and readability) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtien <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-20-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:20:02 +01:00
Chris Wilson	5d723d7afd	drm/i915: Separate intel_frontbuffer into its own header In view of adding inline functions into the intel_frontbuffer section, we first split the header into its own file so that we can integrate it more easily with kerneldoc. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-19-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:20:01 +01:00
Chris Wilson	de895082f7	drm/i915: Remove highly confusing i915_gem_obj_ggtt_pin() Since i915_gem_obj_ggtt_pin() is an idiom breaking curry function for i915_gem_object_ggtt_pin(), spare us the confusion and remove it. Removing it now simplifies later patches to change the i915_vma_pin() (and friends) interface. v2: Add a redundant GEM_BUG_ON(!view) to i915_gem_obj_lookup_or_create_ggtt_vma() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-18-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:20:00 +01:00
Chris Wilson	305bc234a8	drm/i915: Make i915_vma_pin() small and inline Not only is i915_vma_pin() called for every single object on every single execbuf, it is usually a simple increment as the VMA is already bound for execution by the GPU. Rearrange the tests for unbound and pin_count overflow so that we can do the increment and test very cheaply and compact enough to inline the operation into execbuf. The trick used is to note that we can check for an overflow bit (keeping space available for it inside the flags) at the same time as checking the binding bits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-17-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:20:00 +01:00
Chris Wilson	3272db5313	drm/i915: Combine all i915_vma bitfields into a single set of flags In preparation to perform some magic to speed up i915_vma_pin(), which is among the hottest of hot paths in execbuf, refactor all the bitfields accessed by i915_vma_pin() into a single unified set of flags. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-16-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:19:59 +01:00
Chris Wilson	59bfa1248e	drm/i915: Start passing around i915_vma from execbuffer During execbuffer we look up the i915_vma in order to reserve them in the VM. However, we then do a double lookup of the vma in order to then pin them, all because we lack the necessary interfaces to operate on i915_vma - so introduce i915_vma_pin()! v2: Tidy parameter lists to remove one level of redirection in the hot path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-15-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:19:58 +01:00
Chris Wilson	20dfbde463	drm/i915: Wrap vma->pin_count accessors with small inline helpers In the next few patches, the VMA pinning API is overhauled and to reduce the churn we pull out the update to the accessors into a prep patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-14-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:19:58 +01:00

1 2 3 4 5 ...

605506 Commits All Branches Search

605506 Commits

All Branches