OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Chris Wilson	80debff8d9	drm/i915: Consolidate #ifdef CONFIG_INTEL_IOMMU We depend on intel_iommu_gfx_mapped for various workarounds, but that is only available under an #ifdef CONFIG_INTEL_IOMMU. Refactor all the cut-and-paste ifdefs to a common routine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170525121612.2190-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-05-25 21:51:49 +01:00
Michal Hocko	2098105ec6	drm: drop drm_[cm]alloc* helpers Now that drm_[cm]alloc* helpers are simple one line wrappers around kvmalloc_array and drm_free_large is just kvfree alias we can drop them and replace by their native forms. This shouldn't introduce any functional change. Changes since v1 - fix typo in drivers/gpu//drm/etnaviv/etnaviv_gem.c - noticed by 0day build robot Suggested-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Michal Hocko <mhocko@suse.com>drm: drop drm_[cm]alloc* helpers [danvet: Fixup vgem which grew another user very recently.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517122312.GK18247@dhcp22.suse.cz	2017-05-18 17:22:39 +02:00
Chris Wilson	c5cf9a9147	drm/i915: Create a kmem_cache to allocate struct i915_priolist from The i915_priolist are allocated within an atomic context on a path where we wish to minimise latency. If we use a dedicated kmem_cache, we have the advantage of a local freelist from which to service new requests that should keep the latency impact of an allocation small. Though currently we expect the majority of requests to be at default priority (and so hit the preallocate priolist), once userspace starts using priorities they are likely to use many fine grained policies improving the utilisation of a private slab. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-9-chris@chris-wilson.co.uk	2017-05-17 13:38:12 +01:00
Chris Wilson	6c067579e6	drm/i915: Split execlist priority queue into rbtree + linked list All the requests at the same priority are executed in FIFO order. They do not need to be stored in the rbtree themselves, as they are a simple list within a level. If we move the requests at one priority into a list, we can then reduce the rbtree to the set of priorities. This should keep the height of the rbtree small, as the number of active priorities can not exceed the number of active requests and should be typically only a few. Currently, we have ~2k possible different priority levels, that may increase to allow even more fine grained selection. Allocating those in advance seems a waste (and may be impossible), so we opt for allocating upon first use, and freeing after its requests are depleted. To avoid the possibility of an allocation failure causing us to lose a request, we preallocate the default priority (0) and bump any request to that priority if we fail to allocate it the appropriate plist. Having a request (that is ready to run, so not leading to corruption) execute out-of-order is better than leaking the request (and its dependency tree) entirely. There should be a benefit to reducing execlists_dequeue() to principally using a simple list (and reducing the frequency of both rbtree iteration and balancing on erase) but for typical workloads, request coalescing should be small enough that we don't notice any change. The main gain is from improving PI calls to schedule, and the explicit list within a level should make request unwinding simpler (we just need to insert at the head of the list rather than the tail and not have to make the rbtree search more complicated). v2: Avoid use-after-free when deleting a depleted priolist v3: Michał found the solution to handling the allocation failure gracefully. If we disable all priority scheduling following the allocation failure, those requests will be executed in fifo and we will ensure that this request and its dependencies are in strict fifo (even when it doesn't realise it is only a single list). Normal scheduling is restored once we know the device is idle, until the next failure! Suggested-by: Michał Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-8-chris@chris-wilson.co.uk	2017-05-17 13:38:09 +01:00
Chris Wilson	77f0d0e925	drm/i915/execlists: Pack the count into the low bits of the port.request add/remove: 1/1 grow/shrink: 5/4 up/down: 391/-578 (-187) function old new delta execlists_submit_ports 262 471 +209 port_assign.isra - 136 +136 capture 6344 6359 +15 reset_common_ring 438 452 +14 execlists_submit_request 228 238 +10 gen8_init_common_ring 334 341 +7 intel_engine_is_idle 106 105 -1 i915_engine_info 2314 2290 -24 __i915_gem_set_wedged_BKL 485 411 -74 intel_lrc_irq_handler 1789 1604 -185 execlists_update_context 294 - -294 The most important change there is the improve to the intel_lrc_irq_handler and excclist_submit_ports (net improvement since execlists_update_context is now inlined). v2: Use the port_api() for guc as well (even though currently we do not pack any counters in there, yet) and hide all port->request_count inside the helpers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-5-chris@chris-wilson.co.uk	2017-05-17 13:38:06 +01:00
Chris Wilson	0ce8178808	drm/i915: Redefine ptr_pack_bits() and friends Rebrand the current (pointer \| bits) pack/unpack utility macros as explicit bit twiddling for PAGE_SIZE so that we can use the more flexible underlying macros for different bits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-4-chris@chris-wilson.co.uk	2017-05-17 13:38:04 +01:00
Chris Wilson	991bfc64db	drm/i915: Make ptr_unpack_bits() more function-like ptr_unpack_bits() is a function-like macro, as such it is meant to be replaceable by a function. In this case, we should be passing in the out-param as a pointer. Bizarrely this does affect code generation: function old new delta i915_gem_object_pin_map 409 389 -20 An improvement(?) in this case, but one can't help wonder what strict-aliasing optimisations we are preventing. The generated code looks identical in using ptr_unpack_bits (no extra motions to stack, the pointer and bits appear to be kept in registers), the difference appears to be code ordering and with a reorder it is able to use smaller forward jumps. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-3-chris@chris-wilson.co.uk	2017-05-17 13:38:03 +01:00
Linus Torvalds	de4d195308	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main changes are: - Debloat RCU headers - Parallelize SRCU callback handling (plus overlapping patches) - Improve the performance of Tree SRCU on a CPU-hotplug stress test - Documentation updates - Miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits) rcu: Open-code the rcu_cblist_n_lazy_cbs() function rcu: Open-code the rcu_cblist_n_cbs() function rcu: Open-code the rcu_cblist_empty() function rcu: Separately compile large rcu_segcblist functions srcu: Debloat the <linux/rcu_segcblist.h> header srcu: Adjust default auto-expediting holdoff srcu: Specify auto-expedite holdoff time srcu: Expedite first synchronize_srcu() when idle srcu: Expedited grace periods with reduced memory contention srcu: Make rcutorture writer stalls print SRCU GP state srcu: Exact tracking of srcu_data structures containing callbacks srcu: Make SRCU be built by default srcu: Fix Kconfig botch when SRCU not selected rcu: Make non-preemptive schedule be Tasks RCU quiescent state srcu: Expedite srcu_schedule_cbs_snp() callback invocation srcu: Parallelize callback handling kvm: Move srcu_struct fields to end of struct kvm rcu: Fix typo in PER_RCU_NODE_PERIOD header comment rcu: Use true/false in assignment to bool rcu: Use bool value directly ...	2017-05-10 10:30:46 -07:00
Chris Wilson	4797948071	drm/i915: Squash repeated awaits on the same fence Track the latest fence waited upon on each context, and only add a new asynchronous wait if the new fence is more recent than the recorded fence for that context. This requires us to filter out unordered timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the absence of a universal identifier, we have to use our own i915->mm.unordered_timeline token. v2: Throw around the debug crutches v3: Inline the likely case of the pre-allocation cache being full. v4: Drop the pre-allocation support, we can lose the most recent fence in case of allocation failure -- it just means we may emit more awaits than strictly necessary but will not break. v5: Trim allocation size for leaf nodes, they only need an array of u32 not pointers. v6: Create mock_timeline to tidy selftest writing v7: s/intel_timeline_sync_get/intel_timeline_sync_is_later/ (Tvrtko) v8: Prune the stale sync points when we idle. v9: Include a small benchmark in the kselftests v10: Separate the idr implementation into its own compartment. (Tvrkto) v11: Refactor igt_sync kselftests to avoid deep nesting (Tvrkto) v12: __sync_leaf_idx() to assert that p->height is 0 when checking leaves v13: kselftests to investigate struct i915_syncmap itself (Tvrtko) v14: Foray into ascii art graphs v15: Take into account that the random lookup/insert does 2 prng calls, not 1, when benchmarking, and use for_each_set_bit() (Tvrtko) v16: Improved ascii art Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170503093924.5320-4-chris@chris-wilson.co.uk	2017-05-03 11:08:48 +01:00
Chris Wilson	9431282832	drm/i915: Mark up clflushes as belonging to an unordered timeline 2 clflushes on two different objects are not ordered, and so do not belong to the same timeline (context). Either we use a unique context for each, or we reserve a special global context to mean unordered. Ideally, we would reserve 0 to mean unordered (DMA_FENCE_NO_CONTEXT) to have the same semantics everywhere. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170503093924.5320-1-chris@chris-wilson.co.uk	2017-05-03 11:08:45 +01:00
Dave Airlie	73ba2d5c2b	Merge tag 'drm-intel-next-fixes-2017-04-27' of git://anongit.freedesktop.org/git/drm-intel into drm-next drm/i915 and gvt fixes for drm-next/v4.12 * tag 'drm-intel-next-fixes-2017-04-27' of git://anongit.freedesktop.org/git/drm-intel: drm/i915: Confirm the request is still active before adding it to the await drm/i915: Avoid busy-spinning on VLV_GLTC_PW_STATUS mmio drm/i915/selftests: Allocate inode/file dynamically drm/i915: Fix system hang with EI UP masked on Haswell drm/i915: checking for NULL instead of IS_ERR() in mock selftests drm/i915: Perform link quality check unconditionally during long pulse drm/i915: Fix use after free in lpe_audio_platdev_destroy() drm/i915: Use the right mapping_gfp_mask for final shmem allocation drm/i915: Make legacy cursor updates more unsynced drm/i915: Apply a cond_resched() to the saturated signaler drm/i915: Park the signaler before sleeping drm/i915/gvt: fix a bounds check in ring_id_to_context_switch_event() drm/i915/gvt: Fix PTE write flush for taking runtime pm properly drm/i915/gvt: remove some debug messages in scheduler timer handler drm/i915/gvt: add mmio init for virtual display drm/i915/gvt: use directly assignment for structure copying drm/i915/gvt: remove redundant ring id check which cause significant CPU misprediction drm/i915/gvt: remove redundant platform check for mocs load/restore drm/i915/gvt: Align render mmio list to cacheline drm/i915/gvt: cleanup some too chatty scheduler message	2017-04-29 05:50:27 +10:00
Joonas Lahtinen	ea117b8ddc	drm/i915: Reset ILK during GEM sanitization ILK should survive a reset without display corruption. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-28 12:11:59 +03:00
Joonas Lahtinen	f2e4d76ec2	drm/i915: Eliminate HAS_HW_CONTEXTS HAS_HW_CONTEXTS is misleading condition for GPU reset and CCID, replace it with Gen specific (to be updated in next patches). HAS_HW_CONTEXTS in i915_l3_write is bogus because each HAS_L3_DPF match also has .has_hw_contexts = 1 set. This leads to us being able to get rid of the property completely. v2: - Keep the checks at Gen6 for no functional change (Ville) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-04-28 12:11:59 +03:00
Chris Wilson	bdb57b8dca	drm/i915: Use the right mapping_gfp_mask for final shmem allocation Many sightings report the greater prevalence of allocation failures. This is all due to the incorrect use of mapping_gfp_constraint(), so remove it in favour of just querying the mapping_gfp_mask() which are the exact gfp_t we wanted in the first place. We still do expect a higher chance of reporting ENOMEM, as that is the intention of using __GFP_NORETRY -- to fail rather than oom after having reclaimed from our bo caches, and having done a direct\|kswapd reclaim pass. Reported-by: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100594 Fixes: `24f8e00a8a` ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170405221514.23251-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit `b268d9fe0f`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-04-26 16:28:08 +03:00
Ingo Molnar	58d30c36d4	Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Documentation updates. - Miscellaneous fixes. - Parallelize SRCU callback handling (plus overlapping patches). Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-04-23 11:12:44 +02:00
Dave Airlie	856ee92e86	Linux 4.11-rc7 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJY881cAAoJEHm+PkMAQRiGG4UH+wa2z6Qet36Uc4nXFZuSMYrO ErUWs1QpTDDv4a+LE4fgyMvM3j9XqtpfQLy1n70jfD14IqPBhHe4gytasAf+8lg1 YvddFx0Yl3sygVu3dDBNigWeVDbfwepW59coN0vI5nrMo+wrei8aVIWcFKOxdMuO n72u9vuhrkEnLJuQk7SF+t4OQob9McXE3s7QgyRopmlKhKo7mh8On7K2BRI5uluL t0j5kZM0a43EUT5rq9xR8f5pgtyfTMG/FO2MuzZn43MJcZcyfmnOP/cTSIvAKA5U 1i12lxlokYhURNUe+S6jm8A47TrqSRSJxaQJZRlfGJksZ0LJa8eUaLDCviBQEoE= =6QWZ -----END PGP SIGNATURE----- Merge tag 'v4.11-rc7' into drm-next Backmerge Linux 4.11-rc7 from Linus tree, to fix some conflicts that were causing problems with the rerere cache in drm-tip.	2017-04-19 11:07:14 +10:00
Paul E. McKenney	5f0d5a3ae7	mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU A group of Linux kernel hackers reported chasing a bug that resulted from their assumption that SLAB_DESTROY_BY_RCU provided an existence guarantee, that is, that no block from such a slab would be reallocated during an RCU read-side critical section. Of course, that is not the case. Instead, SLAB_DESTROY_BY_RCU only prevents freeing of an entire slab of blocks. However, there is a phrase for this, namely "type safety". This commit therefore renames SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU in order to avoid future instances of this sort of confusion. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <linux-mm@kvack.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> [ paulmck: Add comments mentioning the old name, as requested by Eric Dumazet, in order to help people familiar with the old name find the new one. ] Acked-by: David Rientjes <rientjes@google.com>	2017-04-18 11:42:36 -07:00
Chris Wilson	e22d8e3c69	drm/i915: Treat WC a separate cache domain When discussing a new WC mmap, we based the interface upon the assumption that GTT was fully coherent. How naive! Commits `3b5724d702` ("drm/i915: Wait for writes through the GTT to land before reading back") and `ed4596ea99` ("drm/i915/guc: WA to address the Ringbuffer coherency issue") demonstrate that writes through the GTT are indeed delayed and may be overtaken by direct WC access. To be safe, if userspace is mixing WC mmaps with other potential GTT access (pwrite, GTT mmaps) it should use set_domain(WC). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96563 Testcase: igt/gem_pwrite/small-gtt* Testcase: igt/drv_selftest/coherency Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170412110111.26626-2-chris@chris-wilson.co.uk	2017-04-12 12:35:17 +01:00
Chris Wilson	ef74921bc6	drm/i915: Combine write_domain flushes to a single function In the next patch, we will introduce a new cache domain for differentiating between GTT access and direct WC access. This will require us to include WC in our write_domain flushes. Rather than duplicate a third function, combine the existing two into one and flushing WC writes will then be automatically handled as well. v2: Be smarter and clearer by passing in the write domains to flush (Joonas) v3: One missed ~ in v2 conversion Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170412110111.26626-1-chris@chris-wilson.co.uk	2017-04-12 12:35:16 +01:00
Chris Wilson	1d39f28170	drm/i915: Rename intel_engine_cs.exec_id to uabi_id We want to refer to the index of the engine consistently throughout the userspace ABI. We already have such an index through the execbuffer engine specifier, that needs to be able to refer to each engine specifically, so rename it the index to uabi_id to reflect its generality beyond execbuf. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170411124306.15448-1-chris@chris-wilson.co.uk	2017-04-11 14:31:39 +01:00
Sagar Arun Kamble	63987bfebd	drm/i915: Suspend GuC prior to GPU Reset during GEM suspend i915 is currently doing a full GPU reset at the end of i915_gem_suspend() followed by GuC suspend in i915_drm_suspend(). This GPU reset clobbers the GuC, causing the suspend request to then fail, leaving the GuC in an undefined state. We need to tell the GuC to suspend before we do the direct intel_gpu_reset(). v2: Commit message update. (Chris, Daniele) Fixes: `1c777c5d1d` ("drm/i915/hsw: Fix GPU hang during resume from S3-devices state") Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1491387710-20553-1-git-send-email-sagar.a.kamble@intel.com Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (cherry picked from commit `fd08923384`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-04-11 13:25:12 +03:00
Chris Wilson	f2be9d6833	drm/i915: Insert cond_resched() into i915_gem_free_objects As we may have very many objects to free, check to see if the task needs to be rescheduled whilst freeing them. Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170407102552.5781-4-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-04-07 13:37:41 +01:00
Chris Wilson	5ad08be7e3	drm/i915: Break up long runs of freeing objects Before freeing the next batch of objects from the worker, check if the worker's timeslice has expired and if so, defer the next batch to the next invocation of the worker. Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170407102552.5781-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-04-07 13:37:41 +01:00
Joonas Lahtinen	e92075ff7d	drm/i915: Simplify shrinker locking By using the same structure for both interruptible and uninterruptible locking in shrinker code, combined with the information that mm.interruptible is only being written to, the code can be greatly simplified. Also removing the i915_gem_ prefix from the locking functions so that nobody in their wildest dreams considers exporting them. Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1491562175-27680-1-git-send-email-joonas.lahtinen@linux.intel.com	2017-04-07 14:33:39 +03:00
Chris Wilson	17b93c40e7	drm/i915: Drain any freed objects prior to hibernation As we call into the shrinker during freeze, we may have freed more objects since we idled during i915_gem_suspend. Make sure we flush the i915_gem_free_objects worker prior to saving the unwanted pages into the hibernation image. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170407102552.5781-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-04-07 12:26:00 +01:00
Chris Wilson	d0aa301ae5	drm/i915: The shrinker already acquires struct_mutex, so call it unlocked The shrinker is prepared to be called unlocked (and at other times with struct_mutex held for DIRECT_RECLAIM) so we can skip acquiring the struct_mutex prior to calling the shrinker during freeze. This improves our ability to shrink as we can be more aggressive when we know the caller isn't holding struct_mutex. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170407102552.5781-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-04-07 12:25:11 +01:00
Chris Wilson	b268d9fe0f	drm/i915: Use the right mapping_gfp_mask for final shmem allocation Many sightings report the greater prevalence of allocation failures. This is all due to the incorrect use of mapping_gfp_constraint(), so remove it in favour of just querying the mapping_gfp_mask() which are the exact gfp_t we wanted in the first place. We still do expect a higher chance of reporting ENOMEM, as that is the intention of using __GFP_NORETRY -- to fail rather than oom after having reclaimed from our bo caches, and having done a direct\|kswapd reclaim pass. Reported-by: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100594 Fixes: `24f8e00a8a` ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170405221514.23251-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-04-06 13:02:02 +01:00
Sagar Arun Kamble	fd08923384	drm/i915: Suspend GuC prior to GPU Reset during GEM suspend i915 is currently doing a full GPU reset at the end of i915_gem_suspend() followed by GuC suspend in i915_drm_suspend(). This GPU reset clobbers the GuC, causing the suspend request to then fail, leaving the GuC in an undefined state. We need to tell the GuC to suspend before we do the direct intel_gpu_reset(). v2: Commit message update. (Chris, Daniele) Fixes: `1c777c5d1d` ("drm/i915/hsw: Fix GPU hang during resume from S3-devices state") Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1491387710-20553-1-git-send-email-sagar.a.kamble@intel.com Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-06 11:10:01 +01:00
Tvrtko Ursulin	7a3ee5deb4	drm/i915: Remove user-triggerable WARN from i915_gem_object_create Since this can be triggered by simply attempting a huge object, a WARN_ON is not appropriate. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170330163130.24141-1-tvrtko.ursulin@linux.intel.com	2017-04-04 09:39:13 +01:00
Chris Wilson	25112b64b3	drm/i915: Wait for all engines to be idle as part of i915_gem_wait_for_idle() Make i915_gem_wait_for_idle() be a little heavier in order to try and guarantee that the GPU is indeed idle (by checking each engine individually is idle, i.e. all writes are complete and the rings stopped) after waiting for in-flight requests to be completed. v2: And return the final error. v3: Break the wait_for() out from under the WARN -- the macro expansion is hideous and unreadable in the warning message v4: If wait_for_engine() fails the result is catastrophic, mark the device as wedged and wait for the repair team. References: https://bugs.freedesktop.org/show_bug.cgi?id=98836 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170330145041.9005-4-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-31 12:07:17 +01:00
Chris Wilson	72022a705e	drm/i915: Move retire-requests into i915_gem_wait_for_idle() As we now distinguish everywhere that can call i915_gem_retire_requests() following a successful wait_for_idle, we can remove the duplication by moving that call into i915_gem_wait_for_idle() itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170330145041.9005-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-31 12:03:46 +01:00
Chris Wilson	8490ae207f	drm/i915: Suppress busy status for engines if wedged If the driver is wedged, HW state may be very inconsistent and report that it is still busy, even though we have stopped using it. This can lead to a double ERROR rather than a graceful cleanup after wedging. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170330145041.9005-2-chris@chris-wilson.co.uk	2017-03-30 17:58:44 +01:00
Chris Wilson	2c170af76c	drm/i915: Do request retirement before marking engines as wedged As we declare an engine as wedged, we mark all of its active requests as in error. However, we don't want to mark successfully completed requests as in error, which requires us to retire those requests first. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170330145041.9005-1-chris@chris-wilson.co.uk	2017-03-30 17:56:21 +01:00
Oscar Mateo	b8991403ea	drm/i915/guc: Take enable_guc_loading check out of GEM core code The should happen as soon as possible, but always within the logic that depends on it (and not interrupting the top-level driver control flow). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1490720027-23234-1-git-send-email-oscar.mateo@intel.com	2017-03-30 13:11:40 +03:00
Chris Wilson	e2a2aa36a5	drm/i915: Check we have an wake device before flushing GTT writes We can assume that if the device is asleep then all pending GTT writes will have been posted, and so we can defer the flush from i915_gem_object_flush_gtt_write_domain() [ 1957.462568] WARNING: CPU: 0 PID: 6132 at drivers/gpu/drm/i915/intel_drv.h:1742 fwtable_read32+0x123/0x150 [i915] [ 1957.462582] RPM wakelock ref not held during HW access [ 1957.462583] Modules linked in: i915 intel_gtt drm_kms_helper prime_numbers [ 1957.462607] CPU: 0 PID: 6132 Comm: gem_concurrent_ Tainted: G U 4.11.0-rc1+ #464 [ 1957.462619] Hardware name: / , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 [ 1957.462630] Call Trace: [ 1957.462646] dump_stack+0x4d/0x6f [ 1957.462657] __warn+0xc1/0xe0 [ 1957.462667] warn_slowpath_fmt+0x4a/0x50 [ 1957.462709] fwtable_read32+0x123/0x150 [i915] [ 1957.462750] i915_gem_object_flush_gtt_write_domain+0x43/0x70 [i915] [ 1957.462791] i915_gem_object_set_to_cpu_domain+0x46/0xa0 [i915] [ 1957.462831] i915_gem_set_domain_ioctl+0x15d/0x220 [i915] [ 1957.462843] drm_ioctl+0x1d7/0x440 [ 1957.462885] ? i915_gem_obj_prepare_shmem_write+0x1d0/0x1d0 [i915] [ 1957.462896] ? pick_next_task_fair+0x436/0x440 [ 1957.462906] ? mntput+0x1f/0x30 [ 1957.462915] do_vfs_ioctl+0x8f/0x5c0 [ 1957.462925] ? __schedule+0x16f/0x5f0 [ 1957.462935] ? ____fput+0x9/0x10 [ 1957.462943] SyS_ioctl+0x3c/0x70 [ 1957.462952] entry_SYSCALL_64_fastpath+0x17/0x98 [ 1957.462961] RIP: 0033:0x7fc542179ca7 [ 1957.462968] RSP: 002b:00007ffeef12ff98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 1957.462982] RAX: ffffffffffffffda RBX: 00007ffeef1301d0 RCX: 00007fc542179ca7 [ 1957.462990] RDX: 00007ffeef12ffd0 RSI: 00000000400c645f RDI: 0000000000000003 [ 1957.462999] RBP: 0000000000000003 R08: 000055f433bc7c40 R09: 000000000000002c [ 1957.463006] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000018 [ 1957.463015] R13: 000055f432c89d20 R14: 000055f432c87690 R15: 0000000000000000 Fixes: `3b5724d702` ("drm/i915: Wait for writes through the GTT to land before reading back") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170323150053.28582-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-27 12:48:44 +01:00
Oscar Mateo	3950bf3dbf	drm/i915/guc: Add onion teardown to the GuC setup Starting with intel_guc_loader, down to intel_guc_submission and finally to intel_guc_log. v2: - Null execbuf client outside guc_client_free (Daniele) - Assert if things try to get allocated twice (Daniele/Joonas) - Null guc->log.buf_addr when destroyed (Daniele) - Newline between returning success and error labels (Joonas) - Remove some unnecessary comments (Joonas) - Keep guc_log_create_extras naming convention (Joonas) - Helper function guc_log_has_extras (Joonas) - No need for separate relay_channel create/destroy. It's just another extra. - No need to nullify guc->log.flush_wq when destroyed (Joonas) - Hoist the check for has_extras out of guc_log_create_extras (Joonas) - Try to do i915_guc_log_register/unregister calls (kind of) symmetric (Daniele) - Make sure initel_guc_fini is not called before init is ever called (Daniele) v3: - Remove unnecessary parenthesis (Joonas) - Check for logs enabled on debugfs registration - Rebase on top of Tvrtko's "Fix request re-submission after reset" v4: - Rebased - Comment around enabling/disabling interrupts inside GuC logging (Joonas) Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-23 14:57:36 +02:00
Chris Wilson	40149f00fb	drm/i915: Actually pass the reclaim gfp_t along to shmemfs! Words cannot describe the embarrassment at creating a new gfp_t relaim to only prevent the oomkiller but allow direct\|kswapd reclaim, and then not use it in the shmem_read_mapping_page_gfp(). Fixes: `24f8e00a8a` ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170322223447.7493-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-23 09:30:20 +00:00
Chris Wilson	24f8e00a8a	drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations Since gfx allocations tend to be large, unmovable and disposable, report the allocation failure back to userspace as an ENOMEM rather than incur the oomkiller. We have already tried to make room by purging our own cached gfx objects, and the oomkiller doesn't attribute ownership of gfx objects so will likely pick the wrong candidate. Instead, let userspace see the ENOMEM. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170322110521.29930-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-03-22 19:26:46 +00:00
Chris Wilson	54ec12af2f	drm/i915: Skip force-wake for uncached mmio flush of GGTT writes The trick of using an uncached mmio read to ensure that the GGTT writes are flushed does not require us to do the forcewake dance, so avoid it in the hope of reducing the frequency that we do keep the device forced awake. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170318104257.694-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-03-20 10:45:49 +00:00
Chris Wilson	be062fa427	drm/i915: Initialise i915_gem_object_create_from_data() directly Use pagecache_write to avoid shmemfs clearing the pages prior to us immediately overwriting them with our data. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170317194648.12468-2-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-03-17 22:55:56 +00:00
Chris Wilson	ce8ff099c4	drm/i915: i915_gem_object_create_from_data() doesn't require struct_mutex Both object creation and backing storage page allocation do not require struct_mutex, so do not require the caller to take it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170317194648.12468-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-03-17 22:54:06 +00:00
Chris Wilson	2e8f9d3229	drm/i915: Restore engine->submit_request before unwedging When we wedge the device, we override engine->submit_request with a nop to ensure that all in-flight requests are marked in error. However, igt would like to unwedge the device to test -EIO handling. This requires us to flush those in-flight requests and restore the original engine->submit_request. v2: Use a vfunc to unify enabling request submission to engines v3: Split new vfunc to a separate patch. v4: Make the wait interruptible -- the third party fences we wait upon may be indefinitely broken, so allow the reset to be aborted. Fixes: `821ed7df6e` ("drm/i915: Update reset path to fix incomplete requests") Testcase: igt/gem_eio Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v3 Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-3-chris@chris-wilson.co.uk	2017-03-16 17:17:14 +00:00
Chris Wilson	8c185ecaf4	drm/i915: Split I915_RESET_IN_PROGRESS into two flags I915_RESET_IN_PROGRESS is being used for both signaling the requirement to i915_mutex_lock_interruptible() to avoid taking the struct_mutex and to instruct a waiter (already holding the struct_mutex) to perform the reset. To allow for a little more coordination, split these two meaning into a couple of distinct flags. I915_RESET_BACKOFF tells i915_mutex_lock_interruptible() not to acquire the mutex and I915_RESET_HANDOFF tells the waiter to call i915_reset(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-1-chris@chris-wilson.co.uk	2017-03-16 17:17:10 +00:00
Arkadiusz Hiler	6cd5a72c35	drm/i915/guc: Simplify intel_guc_init_hw() Current version of intel_guc_init_hw() does a lot: - cares about submission - loads huc - implement WA This change offloads some of the logic to intel_uc_init_hw(), which now cares about the above. v2: rename guc_hw_reset and fix typo in define name (M. Wajdeczko) v3: rename once again v4: remove spurious comments and add some style (J. Lahtinen) v5: flow changes, got rid of dead checks (M. Wajdeczko) v6: rebase v7: rebase & onion teardown (J. Lahtinen) Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Arkadiusz Hiler	882d1db09b	drm/i915/uc: Rename intel_?uc_{setup, load}() to _init_hw() GuC historically has two "startup" functions called _init() and _setup() Then HuC came with it's _init() and _load(). This commit renames intel_guc_setup() and intel_huc_load() to *uc_init_hw() as they called from the i915_gem_init_hw(). The aim is to be consistent in that entry points called during particular driver init phases (e.g. init_hw) are all suffixed by that phase. When reading the leaf functions, it should be clear at what stage during the driver load it is called and therefore what operations are legal at that point. Also, since the functions start with intel_guc and intel_huc they take appropiate structure. v2: commit message update (Chris Wilson) v3: change taken parameters to be more "semantic" (M. Wajdeczko) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Kenneth Graunke	0f5418e564	drm/i915: Drop support for I915_EXEC_CONSTANTS_* execbuf parameters. This patch makes the I915_PARAM_HAS_EXEC_CONSTANTS getparam return 0 (indicating the optional feature is not supported), and makes execbuf always return -EINVAL if the flags are used. Apparently, no userspace ever shipped which used this optional feature: I checked the git history of Mesa, xf86-video-intel, libva, and Beignet, and there were zero commits showing a use of these flags. Kernel commit `72bfa19c8d` apparently introduced the feature prematurely. According to Chris, the intention was to use this in cairo-drm, but "the use was broken for gen6", so I don't think it ever happened. 'relative_constants_mode' has always been tracked per-device, but this has actually been wrong ever since hardware contexts were introduced, as the INSTPM register is saved (and automatically restored) as part of the render ring context. The software per-device value could therefore get out of sync with the hardware per-context value. This meant that using them is actually unsafe: a client which tried to use them could damage the state of other clients, causing the GPU to interpret their BO offsets as absolute pointers, leading to bogus memory reads. These flags were also never ported to execlist mode, making them no-ops on Gen9+ (which requires execlists), and Gen8 in the default mode. On Gen8+, userspace can write these registers directly, achieving the same effect. On Gen6-7.5, it likely makes sense to extend the command parser to support them. I don't think anyone wants this on Gen4-5. Based on a patch by Dave Gordon. v3: Return -ENODEV for the getparam, as this is what we do for other obsolete features. Suggested by Chris Wilson. Cc: stable@vger.kernel.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92448 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170215093446.21291-1-kenneth@whitecape.org Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170313170433.26843-1-chris@chris-wilson.co.uk (cherry picked from commit `ef0f411f51`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-14 12:28:25 +02:00
Chris Wilson	4565bf58d4	drm/i915: Disable engine->irq_tasklet around resets When we restart the engines, and we have active requests, a request on the first engine may complete and queue a request to the second engine before we try to restart the second engine. That queueing of the request may race with the engine to restart, and so may corrupt the current state. Disabling the engine->irq_tasklet prevents the two paths from writing into ELSP simultaneously (and modifyin the execlists_port[] at the same time). Include fixup `1d309634bc` ("drm/i915: Kill the tasklet then disable") Fixes: `821ed7df6e` ("drm/i915: Update reset path to fix incomplete requests") Testcase: igt/gem_exec_fence/await-hang Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170208143033.11651-3-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/20170313165958.13970-2-chris@chris-wilson.co.uk (cherry picked from commit `1f7b847d72`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-14 12:26:44 +02:00
Chris Wilson	da9a796f54	drm/i915: Split GEM resetting into 3 phases Currently we do a reset prepare/finish around the call to reset the GPU, but it looks like we need a later stage after the hw has been reinitialised to allow GEM to restart itself. Start by splitting the 2 GEM phases into 3: prepare - before the reset, check if GEM recovered, then stop GEM reset - after the reset, update GEM bookkeeping finish - after the re-initialisation following the reset, restart GEM Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170208143033.11651-2-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/20170313165958.13970-1-chris@chris-wilson.co.uk (cherry picked from commit `d802709313`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-14 12:11:49 +02:00
Chris Wilson	7f5f95d8ac	drm/i915: Move whole object to CPU domain for coherent shmem access If the object is coherent, we can simply update the cache domain on the whole object rather than calculate the before/after clflushes. The advantage is that we then get correct tracking of ellided flushes when changing coherency later. Testcase: igt/gem_pwrite_snooped Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170310000942.11661-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-13 11:16:09 +00:00
Chris Wilson	4e6fdafa7a	drm/i915: Use pagecache write to prepopulate shmemfs from pwrite-ioctl Before we instantiate/pin the backing store for our use, we can prepopulate the shmemfs filp efficiently using a write into the pagecache. We avoid the penalty of instantiating all the pages, important if the user is just writing to a few and never uses the object on the GPU, and using a direct write into shmemfs allows it to avoid the cost of retrieving a page (mostly the clear-before-use, but in theory we could curtail swapin) before it is overwritten. This can be extended later to provide additional specialisation for other backends (other than shmemfs). For now it provides a defense against very large write-only allocations from exhausting all of system memory. v2: Smelling fixes. Fixes: `fe115628d5` ("drm/i915: Implement pwrite without struct-mutex") References: https://bugs.freedesktop.org/show_bug.cgi?id=99107 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170307120338.7277-2-chris@chris-wilson.co.uk (cherry picked from commit `7c55e2c577`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-09 10:46:07 +02:00
Chris Wilson	0d9dc306e1	drm/i915: Store a permanent error in obj->mm.pages Once the object has been truncated, it is unrecoverable. To facilitate detection of this state store the error in obj->mm.pages. This is required for the next patch which should be applied to v4.10 (via stable), so we also need to mark this patch for backporting. In that regard, let's consider this to be a fix/improvement too. v2: Avoid dereferencing the ERR_PTR when freeing the object. Fixes: `1233e2db19` ("drm/i915: Move object backing storage manipulation to its own locking") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Link: http://patchwork.freedesktop.org/patch/msgid/20170307132031.32461-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit `4e5462ee84`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-09 10:45:58 +02:00
Chris Wilson	89cf83d4e0	drm/i915: Squelch any ktime/jiffie rounding errors for wait-ioctl We wait upon jiffies, but report the time elapsed using a high-resolution timer. This discrepancy can lead to us timing out the wait prior to us reporting the elapsed time as complete. This restores the squelching lost in commit `e95433c73a` ("drm/i915: Rearrange i915_wait_request() accounting with callers"). Fixes: `e95433c73a` ("drm/i915: Rearrange i915_wait_request() accounting with callers") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/20170216125441.30923-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit `c1d2061b28`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-03-09 10:42:44 +02:00
Chris Wilson	03d1cac6ee	drm/i915: Avoiding recursing on ww_mutex inside shrinker We have to avoid taking ww_mutex inside the shrinker as we use it as a plain mutex type and so need to avoid recursive deadlocks: [ 602.771969] ================================= [ 602.771970] [ INFO: inconsistent lock state ] [ 602.771973] 4.10.0gpudebug+ #122 Not tainted [ 602.771974] --------------------------------- [ 602.771975] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. [ 602.771978] kswapd0/40 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 602.771979] (reservation_ww_class_mutex){+.+.?.}, at: [<ffffffffa054680a>] i915_gem_object_wait+0x39a/0x410 [i915] [ 602.772020] {RECLAIM_FS-ON-W} state was registered at: [ 602.772024] mark_held_locks+0x76/0x90 [ 602.772026] lockdep_trace_alloc+0xb8/0xc0 [ 602.772028] __kmalloc_track_caller+0x5d/0x130 [ 602.772031] krealloc+0x89/0xb0 [ 602.772033] reservation_object_reserve_shared+0xaf/0xd0 [ 602.772055] i915_gem_do_execbuffer.isra.35+0x1413/0x18b0 [i915] [ 602.772075] i915_gem_execbuffer2+0x10e/0x1d0 [i915] [ 602.772078] drm_ioctl+0x291/0x480 [ 602.772079] do_vfs_ioctl+0x695/0x6f0 [ 602.772081] SyS_ioctl+0x3c/0x70 [ 602.772084] entry_SYSCALL_64_fastpath+0x18/0xad [ 602.772085] irq event stamp: 5197423 [ 602.772088] hardirqs last enabled at (5197423): [<ffffffff8116751d>] kfree+0xdd/0x170 [ 602.772091] hardirqs last disabled at (5197422): [<ffffffff811674f9>] kfree+0xb9/0x170 [ 602.772095] softirqs last enabled at (5190992): [<ffffffff8107bfe1>] __do_softirq+0x221/0x280 [ 602.772097] softirqs last disabled at (5190575): [<ffffffff8107c294>] irq_exit+0x64/0xc0 [ 602.772099] other info that might help us debug this: [ 602.772100] Possible unsafe locking scenario: [ 602.772101] CPU0 [ 602.772101] ---- [ 602.772102] lock(reservation_ww_class_mutex); [ 602.772104] <Interrupt> [ 602.772105] lock(reservation_ww_class_mutex); [ 602.772107] * DEADLOCK * [ 602.772109] 2 locks held by kswapd0/40: [ 602.772110] #0: (shrinker_rwsem){++++..}, at: [<ffffffff811337b5>] shrink_slab.constprop.62+0x35/0x280 [ 602.772116] #1: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0553957>] i915_gem_shrinker_lock+0x27/0x60 [i915] [ 602.772141] stack backtrace: [ 602.772144] CPU: 2 PID: 40 Comm: kswapd0 Not tainted 4.10.0gpudebug+ #122 [ 602.772145] Hardware name: LENOVO 42433ZG/42433ZG, BIOS 8AET64WW (1.44 ) 07/26/2013 [ 602.772147] Call Trace: [ 602.772151] dump_stack+0x68/0xa1 [ 602.772153] print_usage_bug+0x1d4/0x1f0 [ 602.772155] mark_lock+0x390/0x530 [ 602.772157] ? print_irq_inversion_bug+0x200/0x200 [ 602.772159] __lock_acquire+0x405/0x1260 [ 602.772181] ? i915_gem_object_wait+0x39a/0x410 [i915] [ 602.772183] lock_acquire+0x60/0x80 [ 602.772205] ? i915_gem_object_wait+0x39a/0x410 [i915] [ 602.772207] mutex_lock_nested+0x69/0x760 [ 602.772229] ? i915_gem_object_wait+0x39a/0x410 [i915] [ 602.772231] ? kfree+0xdd/0x170 [ 602.772253] ? i915_gem_object_wait+0x163/0x410 [i915] [ 602.772255] ? trace_hardirqs_on_caller+0x18d/0x1c0 [ 602.772256] ? trace_hardirqs_on+0xd/0x10 [ 602.772278] i915_gem_object_wait+0x39a/0x410 [i915] [ 602.772300] i915_gem_object_unbind+0x5e/0x130 [i915] [ 602.772323] i915_gem_shrink+0x22d/0x3d0 [i915] [ 602.772347] i915_gem_shrinker_scan+0x3f/0x80 [i915] [ 602.772349] shrink_slab.constprop.62+0x1ad/0x280 [ 602.772352] shrink_node+0x52/0x80 [ 602.772355] kswapd+0x427/0x5c0 [ 602.772358] kthread+0x122/0x130 [ 602.772360] ? try_to_free_pages+0x270/0x270 [ 602.772362] ? kthread_stop+0x70/0x70 [ 602.772365] ret_from_fork+0x2e/0x40 v2: Add commentary about the pruning being opportunistic Reported-by: Jan Nordholz <jckn@gmx.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99977#c10 Fixes: `e54ca97747` ("drm/i915: Remove completed fences after a wait") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170308132629.7987-1-chris@chris-wilson.co.uk	2017-03-08 20:42:17 +00:00
Daniel Vetter	7ffe939dd9	Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued Backmerge drm-next to get at all the good stuff in drm-misc. We need that because: - drm_connector_list_iter conversion for i915 needs the core patches. - Maarten's patches to use the new atomic state iterators also need the core patches. - We need the new link status property to complete the DP retraining work, merging through 2 branches wasn't a good idea and we had to partially backtrack. - Chris needs reservation_object_trylock and we want to roll out kref_read everywhere. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2017-03-08 10:54:45 +01:00
Dave Airlie	2e16101780	Merge tag 'drm-intel-next-2017-03-06' of git://anongit.freedesktop.org/git/drm-intel into drm-next 4 weeks worth of stuff since I was traveling&lazy: - lspcon improvements (Imre) - proper atomic state for cdclk handling (Ville) - gpu reset improvements (Chris) - lots and lots of polish around fences, requests, waiting and everything related all over (both gem and modeset code), from Chris - atomic by default on gen5+ minus byt/bsw (Maarten did the patch to flip the default, really this is a massive joint team effort) - moar power domains, now 64bit (Ander) - big pile of in-kernel unit tests for various gem subsystems (Chris), including simple mock objects for i915 device and and the ggtt manager. - i915_gpu_info in debugfs, for taking a snapshot of the current gpu state. Same thing as i915_error_state, but useful if the kernel didn't notice something is stick. From Chris. - bxt dsi fixes (Umar Shankar) - bxt w/a updates (Jani) - no more struct_mutex for gem object unreference (Chris) - some execlist refactoring (Tvrtko) - color manager support for glk (Ander) - improve the power-well sync code to better take over from the firmware (Imre) - gem tracepoint polish (Tvrtko) - lots of glk fixes all around (Ander) - ctx switch improvements (Chris) - glk dsi support&fixes (Deepak M) - dsi fixes for vlv and clanups, lots of them (Hans de Goede) - switch to i915.ko types in lots of our internal modeset code (Ander) - byt/bsw atomic wm update code, yay (Ville) * tag 'drm-intel-next-2017-03-06' of git://anongit.freedesktop.org/git/drm-intel: (432 commits) drm/i915: Update DRIVER_DATE to 20170306 drm/i915: Don't use enums for hardware engine id drm/i915: Split breadcrumbs spinlock into two drm/i915: Refactor wakeup of the next breadcrumb waiter drm/i915: Take reference for signaling the request from hardirq drm/i915: Add FIFO underrun tracepoints drm/i915: Add cxsr toggle tracepoint drm/i915: Add VLV/CHV watermark/FIFO programming tracepoints drm/i915: Add plane update/disable tracepoints drm/i915: Kill level 0 wm hack for VLV/CHV drm/i915: Workaround VLV/CHV sprite1->sprite0 enable underrun drm/i915: Sanitize VLV/CHV watermarks properly drm/i915: Only use update_wm_{pre,post} for pre-ilk platforms drm/i915: Nuke crtc->wm.cxsr_allowed drm/i915: Compute proper intermediate wms for vlv/cvh drm/i915: Skip useless watermark/FIFO related work on VLV/CHV when not needed drm/i915: Compute vlv/chv wms the atomic way drm/i915: Compute VLV/CHV FIFO sizes based on the PM2 watermarks drm/i915: Plop vlv/chv fifo sizes into crtc state drm/i915: Plop vlv wm state into crtc_state ...	2017-03-08 12:41:47 +10:00
Chris Wilson	7c55e2c577	drm/i915: Use pagecache write to prepopulate shmemfs from pwrite-ioctl Before we instantiate/pin the backing store for our use, we can prepopulate the shmemfs filp efficiently using a write into the pagecache. We avoid the penalty of instantiating all the pages, important if the user is just writing to a few and never uses the object on the GPU, and using a direct write into shmemfs allows it to avoid the cost of retrieving a page (mostly the clear-before-use, but in theory we could curtail swapin) before it is overwritten. This can be extended later to provide additional specialisation for other backends (other than shmemfs). For now it provides a defense against very large write-only allocations from exhausting all of system memory. v2: Smelling fixes. Fixes: `fe115628d5` ("drm/i915: Implement pwrite without struct-mutex") References: https://bugs.freedesktop.org/show_bug.cgi?id=99107 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170307120338.7277-2-chris@chris-wilson.co.uk	2017-03-07 21:26:59 +00:00
Chris Wilson	4e5462ee84	drm/i915: Store a permanent error in obj->mm.pages Once the object has been truncated, it is unrecoverable. To facilitate detection of this state store the error in obj->mm.pages. This is required for the next patch which should be applied to v4.10 (via stable), so we also need to mark this patch for backporting. In that regard, let's consider this to be a fix/improvement too. v2: Avoid dereferencing the ERR_PTR when freeing the object. Fixes: `1233e2db19` ("drm/i915: Move object backing storage manipulation to its own locking") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Link: http://patchwork.freedesktop.org/patch/msgid/20170307132031.32461-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-07 21:25:38 +00:00
Chris Wilson	0542524944	drm/i915: Generalise wait for execlists to be idle The code to check for execlists completion is generic, so move it to intel_engine_cs.c, where we can reuse the new intel_engine_is_idle(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170303121947.20482-2-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-03-03 13:08:15 +00:00
Chris Wilson	c8659efac5	drm/i915: Drop spinlocks around adding to the client request list Adding to the tail of the client request list as the only other user is in the throttle ioctl that iterates forwards over the list. It only needs protection against deletion of a request as it reads it, it simply won't see a new request added to the end of the list, or it would be too early and rejected. We can further reduce the number of spinlocks required when throttling by removing stale requests from the client_list as we throttle. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170302122525.19675-1-chris@chris-wilson.co.uk	2017-03-02 22:33:41 +00:00
Chris Wilson	c998e8a0f4	drm/i915: Hold rpm during GEM suspend in driver unload/suspend i915_gem_suspend() tries to access the device to ensure it is idle and all writes from the device are flushed to memory. It assumed is already held the runtime pm wakeref, but we should explicitly acquire it for our access to be safe. [ 619.926287] WARNING: CPU: 3 PID: 9353 at drivers/gpu/drm/i915/intel_drv.h:1750 gen6_write32+0x23e/0x2a0 [i915] [ 619.926300] RPM wakelock ref not held during HW access [ 619.926311] Modules linked in: vgem x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_codec coretemp snd_hwdep crct10dif_pclmul snd_hda_core crc32_pclmul snd_pcm mei_me mei lpc_ich ghash_clmulni_intel i915(-) sdhci_pci sdhci mmc_core e1000e ptp pps_core prime_numbers [last unloaded: snd_hda_intel] [ 619.926578] CPU: 3 PID: 9353 Comm: drv_module_relo Tainted: G U 4.10.0-CI-Trybot_609+ #1 [ 619.926585] Hardware name: LENOVO 42962WU/42962WU, BIOS 8DET56WW (1.26 ) 12/01/2011 [ 619.926592] Call Trace: [ 619.926609] dump_stack+0x67/0x92 [ 619.926625] __warn+0xc6/0xe0 [ 619.926640] warn_slowpath_fmt+0x4a/0x50 [ 619.926726] gen6_write32+0x23e/0x2a0 [i915] [ 619.926801] gen6_mm_switch+0x38/0x70 [i915] [ 619.926871] i915_switch_context+0xec/0xa10 [i915] [ 619.926942] i915_gem_switch_to_kernel_context+0x13c/0x2b0 [i915] [ 619.927019] i915_gem_suspend+0x2b/0x180 [i915] [ 619.927079] i915_driver_unload+0x22/0x200 [i915] [ 619.927093] ? __this_cpu_preempt_check+0x13/0x20 [ 619.927105] ? trace_hardirqs_on_caller+0xe7/0x200 [ 619.927118] ? trace_hardirqs_on+0xd/0x10 [ 619.927128] ? _raw_spin_unlock_irqrestore+0x3d/0x60 [ 619.927192] i915_pci_remove+0x14/0x20 [i915] [ 619.927205] pci_device_remove+0x34/0xb0 [ 619.927219] device_release_driver_internal+0x158/0x210 [ 619.927234] driver_detach+0x3b/0x80 [ 619.927245] bus_remove_driver+0x53/0xd0 [ 619.927256] driver_unregister+0x27/0x50 [ 619.927267] pci_unregister_driver+0x25/0xa0 [ 619.927351] i915_exit+0x1a/0xb1a [i915] [ 619.927362] SyS_delete_module+0x193/0x1e0 [ 619.927378] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 619.927386] RIP: 0033:0x7f82b46c5d37 [ 619.927393] RSP: 002b:00007ffdb6f610d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0 [ 619.927408] RAX: ffffffffffffffda RBX: ffffffff81481ff3 RCX: 00007f82b46c5d37 [ 619.927415] RDX: 0000000000000001 RSI: 0000000000000800 RDI: 000000000224f558 [ 619.927422] RBP: ffffc90001187f88 R08: 0000000000000000 R09: 00007ffdb6f61100 [ 619.927428] R10: 000000000224f4e0 R11: 0000000000000246 R12: 0000000000000000 [ 619.927435] R13: 00007ffdb6f612b0 R14: 0000000000000000 R15: 0000000000000000 [ 619.927451] ? __this_cpu_preempt_check+0x13/0x20 or [ 641.646590] WARNING: CPU: 1 PID: 8913 at drivers/gpu/drm/i915/intel_drv.h:1750 intel_runtime_pm_get_noresume+0x8b/0x90 [i915] [ 641.646595] RPM wakelock ref not held during HW access [ 641.646600] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec snd_hwdep crct10dif_pclmul snd_hda_core crc32_pclmul ghash_clmulni_intel snd_pcm mei_me mei i915(-) r8169 mii prime_numbers i2c_hid [last unloaded: snd_hda_intel] [ 641.646825] CPU: 1 PID: 8913 Comm: drv_module_relo Tainted: G U 4.10.0-CI-Trybot_609+ #1 [ 641.646836] Hardware name: TOSHIBA SATELLITE P50-C/06F4 , BIOS 1.20 10/08/2015 [ 641.646843] Call Trace: [ 641.646857] dump_stack+0x67/0x92 [ 641.646869] __warn+0xc6/0xe0 [ 641.646880] warn_slowpath_fmt+0x4a/0x50 [ 641.646893] ? __this_cpu_preempt_check+0x13/0x20 [ 641.646904] ? trace_hardirqs_on_caller+0xe7/0x200 [ 641.646957] intel_runtime_pm_get_noresume+0x8b/0x90 [i915] [ 641.647022] __i915_add_request+0x423/0x540 [i915] [ 641.647080] i915_gem_switch_to_kernel_context+0x148/0x2b0 [i915] [ 641.647145] i915_gem_suspend+0x2b/0x180 [i915] [ 641.647189] i915_driver_unload+0x22/0x200 [i915] [ 641.647200] ? __this_cpu_preempt_check+0x13/0x20 [ 641.647210] ? trace_hardirqs_on_caller+0xe7/0x200 [ 641.647220] ? trace_hardirqs_on+0xd/0x10 [ 641.647231] ? _raw_spin_unlock_irqrestore+0x3d/0x60 [ 641.647276] i915_pci_remove+0x14/0x20 [i915] [ 641.647293] pci_device_remove+0x34/0xb0 [ 641.647307] device_release_driver_internal+0x158/0x210 [ 641.647321] driver_detach+0x3b/0x80 [ 641.647330] bus_remove_driver+0x53/0xd0 [ 641.647338] driver_unregister+0x27/0x50 [ 641.647348] pci_unregister_driver+0x25/0xa0 [ 641.647415] i915_exit+0x1a/0xb1a [i915] [ 641.647429] SyS_delete_module+0x193/0x1e0 [ 641.647444] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 641.647453] RIP: 0033:0x7fc622bd2d37 [ 641.647463] RSP: 002b:00007ffff8ffb5c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0 [ 641.647475] RAX: ffffffffffffffda RBX: ffffffff81481ff3 RCX: 00007fc622bd2d37 [ 641.647480] RDX: 0000000000000001 RSI: 0000000000000800 RDI: 0000000000d49118 [ 641.647485] RBP: ffffc90000997f88 R08: 0000000000000000 R09: 00007ffff8ffb5f0 [ 641.647491] R10: 0000000000d490a0 R11: 0000000000000246 R12: 0000000000000000 [ 641.647498] R13: 00007ffff8ffb7a0 R14: 0000000000000000 R15: 0000000000000000 [ 641.647510] ? __this_cpu_preempt_check+0x13/0x20 v2: Keep holding rpm until the end to cover i915_gem_sanitize() as well. Fixes: `5ab57c7020` ("drm/i915: Flush logical context image out to memory upon suspend") Fixes: `1c777c5d1d` ("drm/i915/hsw: Fix GPU hang during resume from S3-devices state") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170302083029.19576-1-chris@chris-wilson.co.uk Reviewed-by: Imre Deak <imre.deak@intel.com> Cc: <stable@vger.kernel.org> # v4.9+	2017-03-02 12:44:08 +00:00
Chris Wilson	67b807a892	drm/i915: Delay disabling the user interrupt for breadcrumbs A significant cost in setting up a wait is the overhead of enabling the interrupt. As we disable the interrupt whenever the queue of waiters is empty, if we are frequently waiting on alternating batches, we end up re-enabling the interrupt on a frequent basis. We do want to disable the interrupt during normal operations as under high load it may add several thousand interrupts/s - we have been known in the past to occupy whole cores with our interrupt handler after accidentally leaving user interrupts enabled. As a compromise, leave the interrupt enabled until the next IRQ, or the system is idle. This gives a small window for a waiter to keep the interrupt active and not be delayed by having to re-enable the interrupt. v2: Restore hangcheck/missed-irq detection for continuations v3: Be more careful restoring the hangcheck timer after reset v4: Be more careful restoring the fake irq after reset (if required!) v5: Redo changes to intel_engine_wakeup() v6: Factor out __intel_engine_wakeup() v7: Improve commentary for declaring a missed wakeup Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-4-chris@chris-wilson.co.uk	2017-02-27 21:57:23 +00:00
Dave Jiang	11bac80004	mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf ->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. [arnd@arndb.de: fix ARM build] Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-24 17:46:54 -08:00
Kenneth Graunke	ef0f411f51	drm/i915: Drop support for I915_EXEC_CONSTANTS_* execbuf parameters. This patch makes the I915_PARAM_HAS_EXEC_CONSTANTS getparam return 0 (indicating the optional feature is not supported), and makes execbuf always return -EINVAL if the flags are used. Apparently, no userspace ever shipped which used this optional feature: I checked the git history of Mesa, xf86-video-intel, libva, and Beignet, and there were zero commits showing a use of these flags. Kernel commit `72bfa19c8d` apparently introduced the feature prematurely. According to Chris, the intention was to use this in cairo-drm, but "the use was broken for gen6", so I don't think it ever happened. 'relative_constants_mode' has always been tracked per-device, but this has actually been wrong ever since hardware contexts were introduced, as the INSTPM register is saved (and automatically restored) as part of the render ring context. The software per-device value could therefore get out of sync with the hardware per-context value. This meant that using them is actually unsafe: a client which tried to use them could damage the state of other clients, causing the GPU to interpret their BO offsets as absolute pointers, leading to bogus memory reads. These flags were also never ported to execlist mode, making them no-ops on Gen9+ (which requires execlists), and Gen8 in the default mode. On Gen8+, userspace can write these registers directly, achieving the same effect. On Gen6-7.5, it likely makes sense to extend the command parser to support them. I don't think anyone wants this on Gen4-5. Based on a patch by Dave Gordon. v3: Return -ENODEV for the getparam, as this is what we do for other obsolete features. Suggested by Chris Wilson. Cc: stable@vger.kernel.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92448 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170215093446.21291-1-kenneth@whitecape.org Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-02-24 17:51:05 +00:00
Chris Wilson	754c9fd576	drm/i915: Protect the request->global_seqno with the engine->timeline lock A request is assigned a global seqno only when it is on the hardware execution queue. The global seqno can be used to maintain a list of requests on the same engine in retirement order, for example for constructing a priority queue for waiting. Prior to its execution, or if it is subsequently removed in the event of preemption, its global seqno is zero. As both insertion and removal from the execution queue may operate in IRQ context, it is not guarded by the usual struct_mutex BKL. Instead those relying on the global seqno must be prepared for its value to change between reads. Only when the request is complete can the global seqno be stable (due to the memory barriers on submitting the commands to the hardware to write the breadcrumb, if the HWS shows that it has passed the global seqno and the global seqno is unchanged after the read, it is indeed complete). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-9-chris@chris-wilson.co.uk	2017-02-23 14:49:32 +00:00
Dave Airlie	94000cc329	Linux 4.10-rc8 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJYoM2fAAoJEHm+PkMAQRiGr9MH/izEAMri7rJ0QMc3ejt+WmD0 8pkZw3+MVn71z6cIEgpzk4QkEWJd5rfhkETCeCp7qQ9V6cDW1FDE9+0OmPjiphDt nnzKs7t7skEBwH5Mq5xygmIfkv+Z0QGHZ20gfQWY3F56Uxo+ARF88OBHBLKhqx3v 98C7YbMFLKBslKClA78NUEIdx0UfBaRqerlERx0Lfl9aoOrbBS6WI3iuREiylpih 9o7HTrwaGKkU4Kd6NdgJP2EyWPsd1LGalxBBjeDSpm5uokX6ALTdNXDZqcQscHjE RmTqJTGRdhSThXOpNnvUJvk9L442yuNRrVme/IqLpxMdHPyjaXR3FGSIDb2SfjY= =VMy8 -----END PGP SIGNATURE----- Merge tag 'v4.10-rc8' into drm-next Linux 4.10-rc8 Backmerge Linus rc8 to fix some conflicts, but also to avoid pulling it in via a fixes pull from someone.	2017-02-23 12:10:12 +10:00
Chris Wilson	d59b21ec6f	drm/i915: Remove 'retire' parameter from intel_fb_obj_flush Setting retire=true is identical to using origin=ORIGIN_CS, so make the same simplification to intel_fb_obj_flush() as already employed for intel_fb_obj_invalidate(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-6-chris@chris-wilson.co.uk	2017-02-22 12:12:17 +00:00
Chris Wilson	57822dc6b9	drm/i915: Perform object clflushing asynchronously Flushing the cachelines for an object is slow, can be as much as 100ms for a large framebuffer. We currently do this under the struct_mutex BKL on execution or on pageflip. But now with the ability to add fences to obj->resv for both flips and execbuf (and we naturally wait on the fence before CPU access), we can move the clflush operation to a workqueue and signal a fence for completion, thereby doing the work asynchronously and not blocking the driver or its clients. v2: Introduce i915_gem_clflush.h and use a new name, split out some extras into separate patches. Suggested-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-5-chris@chris-wilson.co.uk	2017-02-22 12:12:15 +00:00
Chris Wilson	f6aaba4dfb	drm/i915: Skip clflushes for all non-page backed objects Generalise the skip for physical and stolen objects by skipping anything we do not have a valid address for inside the sg. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-4-chris@chris-wilson.co.uk	2017-02-22 12:12:14 +00:00
Chris Wilson	5a97bcc69c	drm/i915: Amalgamate flushing of display objects We have three different paths by which userspace wants to flush the display plane (i.e. objects with obj->pin_display). Use a common helper to identify those paths and to simplify a later change. v2: Include the conditional in the name, i915_gem_object_flush_if_display Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-3-chris@chris-wilson.co.uk	2017-02-22 12:12:13 +00:00
Chris Wilson	e59dc17211	drm/i915: Move cpu_cache_is_coherent() to header For use in the next patch, take the current is-coherent helper and add it to i915_gem_object.h Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-2-chris@chris-wilson.co.uk	2017-02-22 12:12:12 +00:00
Chris Wilson	208b84a375	drm/i915: Remove change_domain tracepoint The change_domain tracepoint has been inaccurate for a few years - it doesn't fully capture the domains, especially with userspace bypassing them. It is defunct, misleading and time to be removed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-1-chris@chris-wilson.co.uk	2017-02-22 12:12:11 +00:00
Chris Wilson	e54ca97747	drm/i915: Remove completed fences after a wait If we wait upon the full (i.e. all shared fences, or upon an exclusive fence) reservation object successfully, we know that all fences beneath it have been signaled, so long as no new fences were added whilst we slept. If the reservation_object remains the same, as detected by its seqcount, we can then reap all the fences upon completion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170217151304.16665-6-chris@chris-wilson.co.uk	2017-02-17 15:31:15 +00:00
Chris Wilson	581ab1fe52	drm/i915: Unwind conversion to i915_gem_phys_ops on failure The physical object is treated as permanently pinned. If we fail to take this initial pin during i915_gem_object_attach_phys() we need to revert it back to an ordinary shmemfs object before reporting the failure. v2: git-add Reported-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215163900.11606-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-02-16 20:31:13 +00:00
Chris Wilson	c1d2061b28	drm/i915: Squelch any ktime/jiffie rounding errors for wait-ioctl We wait upon jiffies, but report the time elapsed using a high-resolution timer. This discrepancy can lead to us timing out the wait prior to us reporting the elapsed time as complete. This restores the squelching lost in commit `e95433c73a` ("drm/i915: Rearrange i915_wait_request() accounting with callers"). Fixes: `e95433c73a` ("drm/i915: Rearrange i915_wait_request() accounting with callers") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/20170216125441.30923-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-02-16 20:31:13 +00:00
Chris Wilson	636deb5b22	drm/i915: Pass timeout==0 on to i915_gem_object_wait_fence() The i915_gem_object_wait_fence() uses an incoming timeout=0 to query whether the current fence is busy or idle, without waiting. This can be used by the wait-ioctl to implement a busy query. Fixes: `e95433c73a` ("drm/i915: Rearrange i915_wait_request() accounting with callers") Testcase: igt/gem_wait/basic-busy-write-all Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/20170212215344.16600-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit `d892e9398e`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-02-16 11:59:13 +02:00
Chris Wilson	ec62ed3e1d	drm/i915: Restore context and pd for ringbuffer submission after reset Following a reset, the context and page directory registers are lost. However, the queue of requests that we resubmit after the reset may depend upon them - the registers are restored from a context image, but that restore may be inhibited and may simply be absent from the request if it was in the middle of a sequence using the same context. If we prime the CCID/PD registers with the first request in the queue (even for the hung request), we prevent invalid memory access for the following requests (and continually hung engines). v2: Magic BIT(8), reserved for future use but still appears unused. v3: Some commentary on handling innocent vs guilty requests v4: Add a wait for PD_BASE fetch. The reload appears to be instant on my Ivybridge, but this bit probably exists for a reason. Fixes: `821ed7df6e` ("drm/i915: Update reset path to fix incomplete requests") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170207152437.4252-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> (cherry picked from commit `c0dcb203fb`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-02-16 11:59:11 +02:00
Chris Wilson	d892e9398e	drm/i915: Pass timeout==0 on to i915_gem_object_wait_fence() The i915_gem_object_wait_fence() uses an incoming timeout=0 to query whether the current fence is busy or idle, without waiting. This can be used by the wait-ioctl to implement a busy query. Fixes: `e95433c73a` ("drm/i915: Rearrange i915_wait_request() accounting with callers") Testcase: igt/gem_wait/basic-busy-write-all Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/20170212215344.16600-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-02-14 09:38:42 +00:00
Chris Wilson	170594502c	drm/i915: Test coherency of and barriers between cache domains Write into an object using WB, WC, GTT, and GPU paths and make sure that our internal API is sufficient to ensure coherent reads and writes. v2: Avoid invalid free upon allocation error Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-21-chris@chris-wilson.co.uk	2017-02-13 20:45:45 +00:00
Chris Wilson	8335fd65ce	drm/i915: Add selftests for object allocation, phys The phys object is a rarely used device (only very old machines require a chunk of physically contiguous pages for a few hardware interactions). As such, it is not exercised by CI and to combat that we want to add a test that exercises the phys object on all platforms. v2: Always set err on error paths and not rely on inheriting the err. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-17-chris@chris-wilson.co.uk	2017-02-13 20:45:42 +00:00
Chris Wilson	44653988ef	drm/i915: Create a fake object for testing huge allocations We would like to be able to exercise huge allocations even on memory constrained devices. To do this we create an object that allocates only a few pages and remaps them across its whole range - each page is reused multiple times. We can therefore pretend we are rendering into a much larger object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-9-chris@chris-wilson.co.uk	2017-02-13 20:45:34 +00:00
Chris Wilson	66d9cb5d80	drm/i915: Mock the GEM device for self-testing A simulacrum of drm_i915_private to let us pretend interactions with the device. v2: Tidy init error paths Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-6-chris@chris-wilson.co.uk	2017-02-13 20:45:28 +00:00
Chris Wilson	935a2f776a	drm/i915: Add some selftests for sg_table manipulation Start exercising the scattergather lists, especially looking at iteration after coalescing. v2: Comment on the peculiarity of table construction (i.e. why this sg_table might be interesting). v3: Added one __func__ to identify expect_pfn_sg() v4: Loop until we have crossed the chain boundary (forcing sg_table to do multiple allocations) before squelching a potential ENOMEM from oom. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-2-chris@chris-wilson.co.uk	2017-02-13 20:45:23 +00:00
Chris Wilson	2ae557388d	drm/i915: Clear the last_retired_context following a hang/reset Following a hang and reset, we know that the engine is idle and all context state has been saved or lost. Consequently, we know that the engine is no longer referencing the last context and we can relinquish our tracking. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tursulin@ursulin.net> Link: http://patchwork.freedesktop.org/patch/msgid/20170212172002.23072-5-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-02-13 11:19:12 +00:00
Chris Wilson	fe3288b5da	drm/i915: Park the breadcrumbs signaler across a GPU reset The signal threads may be running concurrently with the GPU reset. The completion from the GPU run asynchronous with the reset and two threads may see different snapshots of the state, and the signaler may mark a request as complete as we try to reset it. We don't tolerate 2 different views of the same state and complain if we try to mark a request as failed if it is already complete. Disable the signal threads during reset to prevent this conflict (even though the conflict implies that the state we resetting to is invalid, we have already made our decision!). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99733 References: https://bugs.freedesktop.org/show_bug.cgi?id=99671 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170212172002.23072-4-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-02-13 11:18:28 +00:00
Chris Wilson	1d309634bc	drm/i915: Kill the tasklet then disable Disabling the tasklet leaves it if scheduled on the ready to run list until it is re-enabled. This will leave the ksoftird thread spinning until satisfied. To prevent this situation on starting the GPU reset, we want to kill the tasklet first and then disable. The same problem will arise when a tasklet is scheduled from another device, so a better solution is required for the general case. Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: `1f7b847d72` ("drm/i915: Disable engine->irq_tasklet around resets") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170212172002.23072-3-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-02-13 11:18:23 +00:00
Chris Wilson	c00122f33f	drm/i915: Assert that the active request hasn't been signaled As the request is not complete, it should not be signaled. Assert that this is true before we process the request for a reset. References: https://bugs.freedesktop.org/show_bug.cgi?id=99671 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170212172002.23072-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-02-13 11:06:35 +00:00
Chris Wilson	8c12d12159	drm/i915: Move the irq_barrier for reset earlier into reset_prepare When updating the bookkeeping following the reset, we need the seqno to be coherent on the CPU prior to trusting its result for deciding whether any request is completed. We need the irq_barrier before we start making these decisions, i.e. in reset_prepare. References: https://bugs.freedesktop.org/show_bug.cgi?id=99733 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170210185214.23463-1-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>	2017-02-10 21:10:51 +00:00
Chris Wilson	c4d4c1c66b	drm/i915: Flush the freed object queue on device release As dmabufs may live beyond the PCI device removal, we need to flush the freed object worker on device release, and include a warning in case there is a leak. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170210163523.17533-3-chris@chris-wilson.co.uk	2017-02-10 19:10:05 +00:00
Daniel Vetter	51a831a772	Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued Chris Wilson needs the new drm_driver->release callback to make sure the shiny new dma-buf testcases don't oops the driver on unload. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2017-02-10 16:27:24 +01:00
Chris Wilson	1f7b847d72	drm/i915: Disable engine->irq_tasklet around resets When we restart the engines, and we have active requests, a request on the first engine may complete and queue a request to the second engine before we try to restart the second engine. That queueing of the request may race with the engine to restart, and so may corrupt the current state. Disabling the engine->irq_tasklet prevents the two paths from writing into ELSP simultaneously (and modifyin the execlists_port[] at the same time). Fixes: `821ed7df6e` ("drm/i915: Update reset path to fix incomplete requests") Testcase: igt/gem_exec_fence/await-hang Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170208143033.11651-3-chris@chris-wilson.co.uk	2017-02-08 16:24:42 +00:00
Chris Wilson	d802709313	drm/i915: Split GEM resetting into 3 phases Currently we do a reset prepare/finish around the call to reset the GPU, but it looks like we need a later stage after the hw has been reinitialised to allow GEM to restart itself. Start by splitting the 2 GEM phases into 3: prepare - before the reset, check if GEM recovered, then stop GEM reset - after the reset, update GEM bookkeeping finish - after the re-initialisation following the reset, restart GEM Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170208143033.11651-2-chris@chris-wilson.co.uk	2017-02-08 16:24:42 +00:00
Chris Wilson	20a8a74ad9	drm/i915: Move calling engine->init_hw() to its own function Just a simple refactor to move a loop and some support code out of i915_gem_init_hw(). This is in preparation for avoiding a race between the tasklet writing to ELSP whilst simultaneously being written by engine->init_hw() following a GPU reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170208143033.11651-1-chris@chris-wilson.co.uk	2017-02-08 16:24:42 +00:00
Chris Wilson	519d524981	drm/i915: i915_gem_shrink_all() needs an awake device Since to unbind an object, we may need a powered up device to access the GTT entries, we only shrink bound objects if awake. Callers to i915_gem_shrink_all() had to take this into account and take the rpm wakeref, but we can move this wakeref into the shrink_all itself for convenience and making the function live up to its name. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170208104710.18089-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-02-08 16:24:42 +00:00
Chris Wilson	83bf6d55c1	drm/i915: Remove overzealous fence warn on runtime suspend The goal of the WARN was to catch when we are still actively using the fence as we go into the runtime suspend. However, the reg->pin_count is too coarse as it does not distinguish between exclusive ownership of the fence register from activity. I've not improved on the WARN, nor have we captured this WARN in an exact igt, but it is showing up regularly in the wild: [ 1915.935332] WARNING: CPU: 1 PID: 10861 at drivers/gpu/drm/i915/i915_gem.c:2022 i915_gem_runtime_suspend+0x116/0x130 [i915] [ 1915.935383] WARN_ON(reg->pin_count)[ 1915.935399] Modules linked in: snd_hda_intel i915 drm_kms_helper vgem netconsole scsi_transport_iscsi fuse vfat fat x86_pkg_temp_thermal coretemp intel_cstate intel_uncore snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mei_me mei serio_raw intel_rapl_perf intel_pch_thermal soundcore wmi acpi_pad i2c_algo_bit syscopyarea sysfillrect sysimgblt fb_sys_fops drm r8169 mii video [last unloaded: drm_kms_helper] [ 1915.935785] CPU: 1 PID: 10861 Comm: kworker/1:0 Tainted: G U W 4.9.0-rc5+ #170 [ 1915.935799] Hardware name: LENOVO 80MX/Lenovo E31-80, BIOS DCCN34WW(V2.03) 12/01/2015 [ 1915.935822] Workqueue: pm pm_runtime_work [ 1915.935845] ffffc900044fbbf0 ffffffffac3220bc ffffc900044fbc40 0000000000000000 [ 1915.935890] ffffc900044fbc30 ffffffffac059bcb 000007e6044fbc60 ffff8801626e3198 [ 1915.935937] ffff8801626e0000 0000000000000002 ffffffffc05e5d4e 0000000000000000 [ 1915.935985] Call Trace: [ 1915.936013] [<ffffffffac3220bc>] dump_stack+0x4f/0x73 [ 1915.936038] [<ffffffffac059bcb>] __warn+0xcb/0xf0 [ 1915.936060] [<ffffffffac059c4f>] warn_slowpath_fmt+0x5f/0x80 [ 1915.936158] [<ffffffffc052d916>] i915_gem_runtime_suspend+0x116/0x130 [i915] [ 1915.936251] [<ffffffffc04f1c74>] intel_runtime_suspend+0x64/0x280 [i915] [ 1915.936277] [<ffffffffac0926f1>] ? dequeue_entity+0x241/0xbc0 [ 1915.936298] [<ffffffffac36bb85>] pci_pm_runtime_suspend+0x55/0x180 [ 1915.936317] [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0 [ 1915.936339] [<ffffffffac4514e2>] __rpm_callback+0x32/0x70 [ 1915.936356] [<ffffffffac451544>] rpm_callback+0x24/0x80 [ 1915.936375] [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0 [ 1915.936392] [<ffffffffac45222d>] rpm_suspend+0x12d/0x680 [ 1915.936415] [<ffffffffac69f6d7>] ? _raw_spin_unlock_irq+0x17/0x30 [ 1915.936435] [<ffffffffac0810b8>] ? finish_task_switch+0x88/0x220 [ 1915.936455] [<ffffffffac4534bf>] pm_runtime_work+0x6f/0xb0 [ 1915.936477] [<ffffffffac074353>] process_one_work+0x1f3/0x4d0 [ 1915.936501] [<ffffffffac074678>] worker_thread+0x48/0x4e0 [ 1915.936523] [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0 [ 1915.936542] [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0 [ 1915.936559] [<ffffffffac07a2c9>] kthread+0xd9/0xf0 [ 1915.936580] [<ffffffffac07a1f0>] ? kthread_park+0x60/0x60 [ 1915.936600] [<ffffffffac69fe62>] ret_from_fork+0x22/0x30 In the case the register is pinned, it should be present and we will need to invalidate them to be restored upon resume as we cannot expect the owner of the pin to call get_fence prior to use after resume. Fixes: `7c108fd8fe` ("drm/i915: Move fence cancellation to runtime suspend") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98804 Reported-by: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Imre Deak <imre.deak@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Link: http://patchwork.freedesktop.org/patch/msgid/20170203125717.8431-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit `e0ec3ec698`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-02-08 13:27:27 +02:00
Chris Wilson	e3818697e1	drm/i915: Flush untouched framebuffers before display on !llc On a non-llc system, the objects are created with .cache_level = CACHE_NONE and so the transition to uncached for scanout is a no-op. However, if the object was never written to, it will still be in the CPU domain (having been zeroed out by shmemfs). Those cachelines need to be flushed prior to display. Reported-and-tested-by: Vito Caputo Fixes: `a6a7cc4b7d` ("drm/i915: Always flush the dirty CPU cache when pinning the scanout") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Link: http://patchwork.freedesktop.org/patch/msgid/20170109111932.6342-1-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (cherry picked from commit `69aeafeae9`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-02-08 13:10:24 +02:00
Chris Wilson	c0dcb203fb	drm/i915: Restore context and pd for ringbuffer submission after reset Following a reset, the context and page directory registers are lost. However, the queue of requests that we resubmit after the reset may depend upon them - the registers are restored from a context image, but that restore may be inhibited and may simply be absent from the request if it was in the middle of a sequence using the same context. If we prime the CCID/PD registers with the first request in the queue (even for the hung request), we prevent invalid memory access for the following requests (and continually hung engines). v2: Magic BIT(8), reserved for future use but still appears unused. v3: Some commentary on handling innocent vs guilty requests v4: Add a wait for PD_BASE fetch. The reload appears to be instant on my Ivybridge, but this bit probably exists for a reason. Fixes: `821ed7df6e` ("drm/i915: Update reset path to fix incomplete requests") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170207152437.4252-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-02-07 21:34:58 +00:00
Chris Wilson	e0ec3ec698	drm/i915: Remove overzealous fence warn on runtime suspend The goal of the WARN was to catch when we are still actively using the fence as we go into the runtime suspend. However, the reg->pin_count is too coarse as it does not distinguish between exclusive ownership of the fence register from activity. I've not improved on the WARN, nor have we captured this WARN in an exact igt, but it is showing up regularly in the wild: [ 1915.935332] WARNING: CPU: 1 PID: 10861 at drivers/gpu/drm/i915/i915_gem.c:2022 i915_gem_runtime_suspend+0x116/0x130 [i915] [ 1915.935383] WARN_ON(reg->pin_count)[ 1915.935399] Modules linked in: snd_hda_intel i915 drm_kms_helper vgem netconsole scsi_transport_iscsi fuse vfat fat x86_pkg_temp_thermal coretemp intel_cstate intel_uncore snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mei_me mei serio_raw intel_rapl_perf intel_pch_thermal soundcore wmi acpi_pad i2c_algo_bit syscopyarea sysfillrect sysimgblt fb_sys_fops drm r8169 mii video [last unloaded: drm_kms_helper] [ 1915.935785] CPU: 1 PID: 10861 Comm: kworker/1:0 Tainted: G U W 4.9.0-rc5+ #170 [ 1915.935799] Hardware name: LENOVO 80MX/Lenovo E31-80, BIOS DCCN34WW(V2.03) 12/01/2015 [ 1915.935822] Workqueue: pm pm_runtime_work [ 1915.935845] ffffc900044fbbf0 ffffffffac3220bc ffffc900044fbc40 0000000000000000 [ 1915.935890] ffffc900044fbc30 ffffffffac059bcb 000007e6044fbc60 ffff8801626e3198 [ 1915.935937] ffff8801626e0000 0000000000000002 ffffffffc05e5d4e 0000000000000000 [ 1915.935985] Call Trace: [ 1915.936013] [<ffffffffac3220bc>] dump_stack+0x4f/0x73 [ 1915.936038] [<ffffffffac059bcb>] __warn+0xcb/0xf0 [ 1915.936060] [<ffffffffac059c4f>] warn_slowpath_fmt+0x5f/0x80 [ 1915.936158] [<ffffffffc052d916>] i915_gem_runtime_suspend+0x116/0x130 [i915] [ 1915.936251] [<ffffffffc04f1c74>] intel_runtime_suspend+0x64/0x280 [i915] [ 1915.936277] [<ffffffffac0926f1>] ? dequeue_entity+0x241/0xbc0 [ 1915.936298] [<ffffffffac36bb85>] pci_pm_runtime_suspend+0x55/0x180 [ 1915.936317] [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0 [ 1915.936339] [<ffffffffac4514e2>] __rpm_callback+0x32/0x70 [ 1915.936356] [<ffffffffac451544>] rpm_callback+0x24/0x80 [ 1915.936375] [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0 [ 1915.936392] [<ffffffffac45222d>] rpm_suspend+0x12d/0x680 [ 1915.936415] [<ffffffffac69f6d7>] ? _raw_spin_unlock_irq+0x17/0x30 [ 1915.936435] [<ffffffffac0810b8>] ? finish_task_switch+0x88/0x220 [ 1915.936455] [<ffffffffac4534bf>] pm_runtime_work+0x6f/0xb0 [ 1915.936477] [<ffffffffac074353>] process_one_work+0x1f3/0x4d0 [ 1915.936501] [<ffffffffac074678>] worker_thread+0x48/0x4e0 [ 1915.936523] [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0 [ 1915.936542] [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0 [ 1915.936559] [<ffffffffac07a2c9>] kthread+0xd9/0xf0 [ 1915.936580] [<ffffffffac07a1f0>] ? kthread_park+0x60/0x60 [ 1915.936600] [<ffffffffac69fe62>] ret_from_fork+0x22/0x30 In the case the register is pinned, it should be present and we will need to invalidate them to be restored upon resume as we cannot expect the owner of the pin to call get_fence prior to use after resume. Fixes: `7c108fd8fe` ("drm/i915: Move fence cancellation to runtime suspend") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98804 Reported-by: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Imre Deak <imre.deak@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Link: http://patchwork.freedesktop.org/patch/msgid/20170203125717.8431-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-02-07 14:37:37 +00:00
Chris Wilson	4e64e5539d	drm: Improve drm_mm search (and fix topdown allocation) with rbtrees The drm_mm range manager claimed to support top-down insertion, but it was neither searching for the top-most hole that could fit the allocation request nor fitting the request to the hole correctly. In order to search the range efficiently, we create a secondary index for the holes using either their size or their address. This index allows us to find the smallest hole or the hole at the bottom or top of the range efficiently, whilst keeping the hole stack to rapidly service evictions. v2: Search for holes both high and low. Rename flags to mode. v3: Discover rb_entry_safe() and use it! v4: Kerneldoc for enum drm_mm_insert_mode. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Alexandre Courbot <gnurou@gmail.com> Cc: Eric Anholt <eric@anholt.net> Cc: Sinclair Yeh <syeh@vmware.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> # vmwgfx Reviewed-by: Lucas Stach <l.stach@pengutronix.de> #etnaviv Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170202210438.28702-1-chris@chris-wilson.co.uk	2017-02-03 11:10:32 +01:00
Chris Wilson	69aeafeae9	drm/i915: Flush untouched framebuffers before display on !llc On a non-llc system, the objects are created with .cache_level = CACHE_NONE and so the transition to uncached for scanout is a no-op. However, if the object was never written to, it will still be in the CPU domain (having been zeroed out by shmemfs). Those cachelines need to be flushed prior to display. Reported-and-tested-by: Vito Caputo Fixes: `a6a7cc4b7d` ("drm/i915: Always flush the dirty CPU cache when pinning the scanout") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Link: http://patchwork.freedesktop.org/patch/msgid/20170109111932.6342-1-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-02-01 10:54:17 +00:00
Chris Wilson	241455172b	drm/i915: Reset the gpu on takeover The GPU may be in an unknown state following resume and module load. The previous occupant may have left contexts loaded, or other dangerous state, which can cause an immediate GPU hang for us. The only save course of action is to reset the GPU prior to using it - similarly to how we reset the GPU prior to unload (before a second user may be affected by our leftover state). We need to reset the GPU very early in our load/resume sequence so that any stale HW pointers are revoked prior to any resource allocations we make (that may conflict). A reset should only be a couple of milliseconds on a slow device, a cost we should easily be able to absorb into our initialisation times. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170124110135.6418-2-chris@chris-wilson.co.uk	2017-01-24 12:29:48 +00:00
Chris Wilson	e0216b762a	drm/i915: Treat an error from i915_vma_instance() as unlikely When pinning into the global GTT, an error from creating the VMA is unlikely, so mark it so. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170119192659.31789-4-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-01-21 10:32:21 +00:00
Chris Wilson	befedbb7e2	drm/i915: Use common LRU inactive vma bumping for unpin_from_display Now that i915_gem_object_bump_inactive_ggtt() exists, also make use of it for the LRU bumping from i915_gem_object_unpin_from_display() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170119192659.31789-2-chris@chris-wilson.co.uk Reviewed-by: Jonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-01-21 10:31:44 +00:00
Chris Wilson	d65415df07	drm/i915: Do an unlocked wait before set-cache-level ioctl Since a change in cache level is likely to trigger an unbind, avoid waiting under the mutex by preemptively doing an unlocked wait. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170119082211.21257-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-01-21 09:19:17 +00:00
Chris Wilson	718659a630	drm/i915: Rename some warts in the VMA API Whilst writing testcases to exercise the VMA API, some oddities came to light, such as i915_gem_obj_lookup_or_create(). Joonas suggested i915_vma_instance() as a neat replacement, so rename them, move them to i915_vma.c and add some kerneldoc as a sugary bonus. s/i915_gem_obj_to_vma/i915_vma_lookup/ s/i915_gem_obj_lookup_or_create_vma/i915_vma_instance/ Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-2-chris@chris-wilson.co.uk	2017-01-19 10:15:26 +00:00
Mika Kuoppala	71895a0858	drm/i915: Add comment how we treat hung contexts Explain in a comment how and why we treat hung context like we do. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484668747-9120-7-git-send-email-mika.kuoppala@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-01-18 10:47:32 +00:00
Chris Wilson	0e178aef8f	drm/i915: Detect a failed GPU reset+recovery If we can't recover the GPU after the reset, mark it as wedged to cancel the outstanding tasks and to prevent new users from trying to use the broken GPU. v2: Check the same ring is hung again before declaring the reset broken. v3: use engine_stalled (Mika) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484668747-9120-6-git-send-email-mika.kuoppala@intel.com	2017-01-18 10:47:26 +00:00
Mika Kuoppala	61da536204	drm/i915: Tidy up engine reset logic Split engine reset for engine and request specific parts. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484668747-9120-5-git-send-email-mika.kuoppala@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-01-18 10:44:51 +00:00
Mika Kuoppala	bf2f04366c	drm/i915: Introduce engine_stalled helper Move the engine stalled/pardoned check into a helper function. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484668747-9120-4-git-send-email-mika.kuoppala@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-01-18 10:44:50 +00:00
Mika Kuoppala	211b12afe6	drm/i915: Cleanup request skip decision Since we now only skip banned contexts, preventing the skip of default contexts is no longer sensible. For a similar argument as before 'commit `7ec73b7e36` ("drm/i915: Only skip requests once a context is banned")' we end up with an inconsistent API if we only mark future execbufs from the default ctx as banned but fail to mark those currently executing as failed. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484668747-9120-3-git-send-email-mika.kuoppala@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-01-18 10:44:50 +00:00
Mika Kuoppala	36193acd54	drm/i915: Introduce engine_skip_context Add a new function for skipping all pending requests for a context in order to make engine reset flow more readable. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484668747-9120-2-git-send-email-mika.kuoppala@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-01-18 10:44:47 +00:00
Chris Wilson	4c96554365	drm/i915: Move engine reset preparation to i915_gem_reset_prepare() Now that we have prepare/finish routines for the GEM reset, move the disabling of the engine->irq_tasklet into them to reduce repetition. The device irq enable/disable is split out to ensure it is run first and last always (even if the GPU reset fails). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484668747-9120-1-git-send-email-mika.kuoppala@intel.com	2017-01-18 10:44:20 +00:00
Chris Wilson	f131e3562e	drm/i915: Skip switch to kernel context if already done Some engines are never user or already sitting idle in the kernel context and for those we can skip flushing the current context for i915_gem_switch_to_kernel_context(). We used to perform this optimisation but that was removed for convenience of converting over to multiple timelines and handling the pending request queues. From the perspective of writing selftests, reducing the number of background operations on the engines makes defining assertions easier. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170114162334.10271-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-01-16 12:21:20 +00:00
Chris Wilson	47a8e3f6ae	drm/i915: Eliminate superfluous i915_ggtt_view_normal Since commit `058d88c433` ("drm/i915: Track pinned VMA"), there is only one user of i915_ggtt_view_normal rodate. Just treat NULL as no special view in pin_to_display() like everywhere else. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-7-chris@chris-wilson.co.uk	2017-01-14 16:18:36 +00:00
Chris Wilson	8bab1193c1	drm/i915: Convert i915_ggtt_view to use an anonymous union Reading the ggtt_views is much more pleasant without the extra characters from specifying the union (i.e. ggtt_view.partial rather than ggtt_view.params.partial). To make this work inside i915_vma_compare() with only a single memcmp requires us to ensure that there are no uninitialised bytes within each branch of the union (we make sure the structs are packed) and we need to store the size of each branch. v4: Rewrite changelog and add comments explaining the assert. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-5-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-01-14 16:18:03 +00:00
Chris Wilson	3bf4d57519	drm/i915: Stop clearing i915_ggtt_view As we now use a compact memcmp in i915_vma_compare(), we can forgo clearing the entire view and only set the precise parameters used in this view. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-4-chris@chris-wilson.co.uk	2017-01-14 16:17:45 +00:00
Chris Wilson	e4621b73b6	drm/i915: Fix phys pwrite for struct_mutex-less operation Since commit `fe115628d5` ("drm/i915: Implement pwrite without struct-mutex") the lowlevel pwrite calls are now called without the protection of struct_mutex, but pwrite_phys was still asserting that it held the struct_mutex and later tried to drop and relock it. Fixes: `fe115628d5` ("drm/i915: Implement pwrite without struct-mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> Link: http://patchwork.freedesktop.org/patch/msgid/20170106152240.5793-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (cherry picked from commit `10466d2a59`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-01-12 10:15:44 +02:00
Chris Wilson	f51455d442	drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE Start converting over from the byte count to its semantic macro, either we want to allocate the size of a physical page in main memory or we want the size of a virtual page in the GTT. 4096 could mean either, but PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve code comprehension and future changes. In the future, we may want to use variable GTT page sizes and so have the challenge of knowing which hardcoded values were used to represent a physical page vs the virtual page. v2: Look for a few more 4096s to convert, discover IS_ALIGNED(). v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep bdw stolen w/a as 4096 until we know better. v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page sizes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-01-10 20:54:32 +00:00
Chris Wilson	2a20d6f858	drm/i915: Rename i915_gem_engine_cleanup() to engine_set_wedged() It has been some time since i915_gem_engine_cleanup was only called from the module unload path, and now it is only called when the GPU is wedged. Mika complained that the name is confusing, especially in light of the existence of i915_gem_cleanup_engines(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170110172246.27297-5-chris@chris-wilson.co.uk	2017-01-10 20:49:32 +00:00
Chris Wilson	3cd9442f66	drm/i915: Mark all incomplete requests as -EIO when wedged Similarly to a normal reset, after we mark the GPU as wedged (completely fubar and no more requests can be executed), set the error status on all the in flight requests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170110172246.27297-4-chris@chris-wilson.co.uk	2017-01-10 20:49:31 +00:00
Chris Wilson	3c1b284759	drm/i915: Set an error status for a resubmitted request Let userspace know if its request was resubmitted due to it being executed at the time of a global reset. In this case, the reset was for a guilty request on another engine, and this request was an innocent victim that will be re-executed upon restarting. However, since it was running at the time of the reset, we can not guarantee that it suffered no ill-effects from the reset (e.g. some context state may be lost, or some self-modifying fragment shaders will be restarted from the final state not their initial state), to let userspace know that it has been corrupted set a special value on the fence->error, -EAGAIN. If the request does hang on resubmission, the error will be overwritten with -EIO. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170110172246.27297-3-chris@chris-wilson.co.uk	2017-01-10 20:49:30 +00:00
Chris Wilson	c0d5f32c50	drm/i915: Set guilty-flag on fence after detecting a hang The struct dma_fence carries a status field exposed to userspace by sync_file. This is inspected after the fence is signaled and can convey whether or not the request completed successfully, or in our case if we detected a hang during the request (signaled via -EIO in SYNC_IOC_FILE_INFO). v2: Mark all cancelled requests as failed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170110172246.27297-2-chris@chris-wilson.co.uk	2017-01-10 20:49:30 +00:00
Chris Wilson	2edc6e0df1	drm/i915: Consolidate reset_request() Always reset the requests of the guilty context, including the hung request that we tell the hardware to skip. This should help if the reprogram fails entirely, but more importantly makes the guilty path more uniform (and simplifies the subsequent patch to tweak the cancelled requests). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170110172246.27297-1-chris@chris-wilson.co.uk	2017-01-10 20:49:29 +00:00
Daniel Vetter	05adb1850a	Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued Pull in latest drm-next from Dave Airlie to get at all the drm-misc goodies, specifically: - dma_fence error state handling rework (Chris needs that for error recovery) - crc support locking changes (Tomeu's i915 crc patches need that). Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2017-01-10 17:03:46 +01:00
Chris Wilson	8201c1fad4	drm/i915: Clip the partial view against the object not vma The VMA is later clipped against the vm_area_struct before insertion of the faulting PTE so we are free to create the partial view as we desire. If we use the object as the extents rather than the area, this partial can then be used for other areas. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170110095633.6612-2-chris@chris-wilson.co.uk	2017-01-10 11:59:24 +00:00
Chris Wilson	2d4281bb93	drm/i915: Extract compute_partial_view() In order to reuse the partial view for selftesting, extract the common function for computing the view. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170110095633.6612-1-chris@chris-wilson.co.uk	2017-01-10 11:59:24 +00:00
Chris Wilson	91d4e0aa92	drm/i915: Move ggtt fence/alignment to i915_gem_tiling.c Rename i915_gem_get_ggtt_size() and i915_gem_get_ggtt_alignment() to i915_gem_fence_size() and i915_gem_fence_alignment() respectively to better match usage. Similarly move the pair of functions into i915_gem_tiling.c next to the fence restrictions. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-6-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-01-10 08:12:22 +00:00
Chris Wilson	944397f04f	drm/i915: Store required fence size/alignment for GGTT vma The fence size/alignment is a combination of the vma size plus object tiling parameters. Those parameters are rarely changed, making the fence size/alignemnt roughly constant for the lifetime of the VMA. We can simplify subsequent calculations by precalculating the size/alignment required for GGTT vma taking fencing into account (with an update if we do change the tiling or stride). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-4-chris@chris-wilson.co.uk	2017-01-10 08:12:21 +00:00
Chris Wilson	5b30694b47	drm/i915: Align GGTT sizes to a fence tile row Ensure the view occupies the full tile row so that reads/writes into the VMA do not escape (via fenced detiling) into neighbouring objects - we will pad the object with scratch pages to satisfy the fence. This applies the lazy-tiling we employed on gen2/3 to gen4+. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-2-chris@chris-wilson.co.uk	2017-01-10 08:12:20 +00:00
Chris Wilson	6649a0b650	drm/i915: Extract tile_row_size for fencing Computing the tile row size of a tiled object (for use with fence registers) is repeated, so extract it to a common helper. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-01-10 08:12:09 +00:00
Dave Airlie	5c37daf5dd	Merge tag 'drm-intel-next-2017-01-09' of git://anongit.freedesktop.org/git/drm-intel into drm-next More 4.11 stuff, holidays edition (i.e. not much): - docs and cleanups for shared dpll code (Ander) - some kerneldoc work (Chris) - fbc by default on gen9+ too, yeah! (Paulo) - fixes, polish and other small things all over gem code (Chris) - and a few small things on top Plus a backmerge, because Dave was enjoying time off too. * tag 'drm-intel-next-2017-01-09' of git://anongit.freedesktop.org/git/drm-intel: (275 commits) drm/i915: Update DRIVER_DATE to 20170109 drm/i915: Drain freed objects for mmap space exhaustion drm/i915: Purge loose pages if we run out of DMA remap space drm/i915: Fix phys pwrite for struct_mutex-less operation drm/i915: Simplify testing for am-I-the-kernel-context? drm/i915: Use range_overflows() drm/i915: Use fixed-sized types for stolen drm/i915: Use phys_addr_t for the address of stolen memory drm/i915: Consolidate checks for memcpy-from-wc support drm/i915: Only skip requests once a context is banned drm/i915: Move a few more utility macros to i915_utils.h drm/i915: Clear ret before unbinding in i915_gem_evict_something() drm/i915/guc: Exclude the upper end of the Global GTT for the GuC drm/i915: Move a few utility macros into a separate header drm/i915/execlists: Reorder execlists register enabling drm/i915: Assert that we do create the deferred context drm/i915: Assert all timeline requests are gone before fini drm/i915: Revoke fenced GTT mmapings across GPU reset drm/i915: enable FBC on gen9+ too drm/i915: actually drive the BDW reserved IDs ...	2017-01-10 08:02:09 +10:00
Linus Torvalds	2fd8774c79	Merge branch 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb fixes from Konrad Rzeszutek Wilk: "This has one fix to make i915 work when using Xen SWIOTLB, and a feature from Geert to aid in debugging of devices that can't do DMA outside the 32-bit address space. The feature from Geert is on top of v4.10 merge window commit (specifically you pulling my previous branch), as his changes were dependent on the Documentation/ movement patches. I figured it would just easier than me trying than to cherry-pick the Documentation patches to satisfy git. The patches have been soaking since 12/20, albeit I updated the last patch due to linux-next catching an compiler error and adding an Tested-and-Reported-by tag" * 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: swiotlb: Export swiotlb_max_segment to users swiotlb: Add swiotlb=noforce debug option swiotlb: Convert swiotlb_force from int to enum x86, swiotlb: Simplify pci_swiotlb_detect_override()	2017-01-06 10:53:21 -08:00
Konrad Rzeszutek Wilk	7453c549f5	swiotlb: Export swiotlb_max_segment to users So they can figure out what is the optimal number of pages that can be contingously stitched together without fear of bounce buffer. We also expose an mechanism for sub-users of SWIOTLB API, such as Xen-SWIOTLB to set the max segment value. And lastly if swiotlb=force is set (which mandates we bounce buffer everything) we set max_segment so at least we can bounce buffer one 4K page instead of a giant 512KB one for which we may not have space. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reported-and-Tested-by: Juergen Gross <jgross@suse.com>	2017-01-06 13:00:01 -05:00
Chris Wilson	b42a13d918	drm/i915: Drain freed objects for mmap space exhaustion As we now use a deferred free queue for objects, simply retiring the active objects is not enough to immediately free them and recover their mmap space - we must now also drain the freed object list. Fixes: `fbbd37b36f` ("drm/i915: Move object release to a freelist + worker" Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> Link: http://patchwork.freedesktop.org/patch/msgid/20170106152240.5793-3-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-01-06 16:54:47 +00:00
Chris Wilson	10466d2a59	drm/i915: Fix phys pwrite for struct_mutex-less operation Since commit `fe115628d5` ("drm/i915: Implement pwrite without struct-mutex") the lowlevel pwrite calls are now called without the protection of struct_mutex, but pwrite_phys was still asserting that it held the struct_mutex and later tried to drop and relock it. Fixes: `fe115628d5` ("drm/i915: Implement pwrite without struct-mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> Link: http://patchwork.freedesktop.org/patch/msgid/20170106152240.5793-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-01-06 16:53:44 +00:00
Chris Wilson	984ff29f74	drm/i915: Simplify testing for am-I-the-kernel-context? The kernel context (dev_priv->kernel_context) is unique in that it is not associated with any user filp - it is the only one with ctx->file_priv == NULL. This is a simpler test than comparing it against dev_priv->kernel_context which involves some pointer dancing. In checking that this is true, we notice that the gvt context is allocating itself a i915_hw_ppgtt it doesn't use and not flagging that its file_priv should be invalid. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-5-chris@chris-wilson.co.uk	2017-01-06 16:02:12 +00:00
Chris Wilson	7ec73b7e36	drm/i915: Only skip requests once a context is banned If we skip before banning, we have an inconsistent interface between execbuf still queueing valid request but those requests already queued being cancelled. If we only cancel the pending requests once we stop accepting new requests, the user interface is more consistent. Reported-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Fixes: `821ed7df6e` ("drm/i915: Update reset path to fix incomplete requests") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170105170059.344-1-chris@chris-wilson.co.uk	2017-01-05 21:06:20 +00:00
Chris Wilson	40b326eefe	drm/i915: Move a few utility macros into a separate header In order to defeat some circular dependencies between headers to allow use of e.g. range_overflows() in a header, move the simple independent macros into their own header. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170105153023.30575-4-chris@chris-wilson.co.uk	2017-01-05 15:34:45 +00:00
Chris Wilson	b1ed35d917	drm/i915: Revoke fenced GTT mmapings across GPU reset The fence registers are clobbered by a GPU reset. If there is concurrent user access to a fenced region via a GTT mmaping, the access will not be fenced during the reset (until we restore the fences afterwards). In order to prevent invalid access during the reset, before we clobber the fences first we must invalidate the GTT mmapings. Access to the mmap will then be forced to fault in the page, and in handling the fault, i915_gem_fault() will take the struct_mutex and wait upon the reset to complete. v2: Fix up commentary. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99274 Testcase: igt/gem_mmap_gtt/hang Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170104145110.1486-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-01-04 18:44:30 +00:00
Daniel Vetter	a402eae64d	Linux 4.10-rc2 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJYaYNlAAoJEHm+PkMAQRiGtCUH/18PMUJpHqRKjxL3Yscw+QZC RmGlD/hwBRLUSgiTCfURNGKP4QZv2kQW7BGsGC72oL01lmxozsU72ixUIO+wXzDY K2b0OOKGZZWzFtaVm7Qs+5JhHAEKZcT046mLD8sjJuqkrFAhmNLKdwHjihKBEkm9 J3s2tpdXdN0x/Uyga/GY9khEYIrvLPeBoKSz+JXcQKdC0iq3/+PMpWnN47QCNScr 7azojkJkj/rs2cqVdOi7Wbh6PSqIvPsl8E3qJefpaVJF/IQaU1pFdy5g8kYm4V7T fr6HgIbuN4EQWdN/5cgKrUdpQyV7D8iYx02klk4R8WgfS0QMYoUcsg+XsTd02TI= =OhGe -----END PGP SIGNATURE----- Merge tag 'v4.10-rc2' into drm-intel-next-queued Backmerge Linux 4.10-rc2 to resync with our -fixes cherry-picks. I've done the backmerge directly because Dave is on vacation. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2017-01-04 11:35:18 +01:00
Chris Wilson	2471eb5fb6	drm/i915: Prevent timeline updates whilst performing reset As the fence may be signaled concurrently from an interrupt on another device, it is possible for the list of requests on the timeline to be modified as we walk it. Take both (the context's timeline and the global timeline) locks to prevent such modifications. Fixes: `80b204bce8` ("drm/i915: Enable multiple timelines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161223145804.6605-10-chris@chris-wilson.co.uk (cherry picked from commit `00c25e3f40`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-01-03 11:41:57 +02:00
Chris Wilson	64d1461ce0	drm/i915: Silence allocation failure during sg_trim() As trimming the sg table is merely an optimisation that gracefully fails if we cannot allocate a new table, we do not need to report the failure either. Fixes: `0c40ce130e` ("drm/i915: Trim the object sg table") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161223145804.6605-4-chris@chris-wilson.co.uk (cherry picked from commit `8bfc478fa4`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-01-03 11:41:48 +02:00
Chris Wilson	c3f923b554	drm/i915: Don't clflush before release phys object When we teardown the backing storage for the phys object, we copy from the coherent contiguous block back to the shmemfs object, clflushing as we go. Trying to clflush the invalid sg beforehand just oops and would be redundant (due to it already being coherent, and clflushed afterwards). Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161223145804.6605-3-chris@chris-wilson.co.uk (cherry picked from commit `e5facdf964`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-01-03 11:41:38 +02:00
Chris Wilson	6095868a27	drm/i915: Complete kerneldoc for struct i915_gem_context The existing kerneldoc was outdated, so time for a refresh. v2: Use single line kdoc, mention functions for manipulation Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161231112012.29263-3-chris@chris-wilson.co.uk	2016-12-31 11:41:47 +00:00
Chris Wilson	00c25e3f40	drm/i915: Prevent timeline updates whilst performing reset As the fence may be signaled concurrently from an interrupt on another device, it is possible for the list of requests on the timeline to be modified as we walk it. Take both (the context's timeline and the global timeline) locks to prevent such modifications. Fixes: `80b204bce8` ("drm/i915: Enable multiple timelines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161223145804.6605-10-chris@chris-wilson.co.uk	2016-12-23 16:08:10 +00:00
Chris Wilson	8bfc478fa4	drm/i915: Silence allocation failure during sg_trim() As trimming the sg table is merely an optimisation that gracefully fails if we cannot allocate a new table, we do not need to report the failure either. Fixes: `0c40ce130e` ("drm/i915: Trim the object sg table") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161223145804.6605-4-chris@chris-wilson.co.uk	2016-12-23 16:07:32 +00:00
Chris Wilson	e5facdf964	drm/i915: Don't clflush before release phys object When we teardown the backing storage for the phys object, we copy from the coherent contiguous block back to the shmemfs object, clflushing as we go. Trying to clflush the invalid sg beforehand just oops and would be redundant (due to it already being coherent, and clflushed afterwards). Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161223145804.6605-3-chris@chris-wilson.co.uk	2016-12-23 16:07:20 +00:00
Chris Wilson	bdeb978506	drm/i915: Repeat flush of idle work during suspend The idle work handler is self-arming - if it detects that it needs to run again it will queue itself from its work handler. Take greater care when trying to drain the idle work, and double check that it is flushed. The free worker has a similar issue where it is armed by an RCU task which may be running concurrently with us. This should hopefully help with the sporadic WARN_ON(dev_priv->gt.awake) from i915_gem_suspend. v2: Reuse drain_freed_objects. v3: Don't try to flush the freed objects from the shrinker, as it may be underneath the struct_mutex already. v4: do while and comment upon the excess rcu_barrier in drain_freed_objects Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161223145804.6605-2-chris@chris-wilson.co.uk	2016-12-23 16:07:11 +00:00
Chris Wilson	28f412e02b	drm/i915: Break after walking all GGTT vma in bump_inactive_ggtt Since commit `db6c2b4151` ("drm/i915: Store the vma in an rbtree under the object") the vma are once again sorted into GGTT first, then ppGTT so that the typical case of walking the GGTT vma can stop as soon as we find a non-ppGTT. Apply that optimisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161223145804.6605-1-chris@chris-wilson.co.uk	2016-12-23 16:06:56 +00:00
Dave Airlie	4a401ceeef	Merge tag 'drm-intel-next-fixes-2016-12-22' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes First set of i915 fixes for code in next. * tag 'drm-intel-next-fixes-2016-12-22' of git://anongit.freedesktop.org/git/drm-intel: drm/i915: skip the first 4k of stolen memory on everything >= gen8 drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping drm/i915: Fix use after free in logical_render_ring_init drm/i915: disable PSR by default on HSW/BDW drm/i915: Fix setting of boost freq tunable drm/i915: tune down the fast link training vs boot fail drm/i915: Reorder phys backing storage release drm/i915/gen9: Fix PCODE polling during SAGV disabling drm/i915/gen9: Fix PCODE polling during CDCLK change notification drm/i915/dsi: Fix chv_exec_gpio disabling the GPIOs it is setting drm/i915/dsi: Fix swapping of MIPI_SEQ_DEASSERT_RESET / MIPI_SEQ_ASSERT_RESET drm/i915/dsi: Do not clear DPOUNIT_CLOCK_GATE_DISABLE from vlv_init_display_clock_gating drm/i915: drop the struct_mutex when wedged or trying to reset	2016-12-23 05:28:02 +10:00
Chris Wilson	abb0deacb5	drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping If we at first do not succeed with attempting to remap our physical pages using a coalesced scattergather list, try again with one scattergather entry per page. This should help with swiotlb as it uses a limited buffer size and only searches for contiguous chunks within its buffer aligned up to the next boundary - i.e. we may prematurely cause a failure as we are unable to utilize the unused space between large chunks and trigger an error such as: i915 0000:00:02.0: swiotlb buffer is full (sz: 1630208 bytes) Reported-by: Juergen Gross <jgross@suse.com> Tested-by: Juergen Gross <jgross@suse.com> Fixes: `871dfbd67d` ("drm/i915: Allow compaction upto SWIOTLB max segment size") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> Link: http://patchwork.freedesktop.org/patch/msgid/20161219124346.550-1-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (cherry picked from commit `d766ef5300`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2016-12-20 16:30:47 +02:00
Chris Wilson	057f803ff1	drm/i915: Reorder phys backing storage release In commit `a4f5ea64f0` ("drm/i915: Refactor object page API"), I reordered the object->pages teardown to be more friendly wrt to a separate obj->mm.lock. However, I overlooked the phys object and left it with a dangling use-after-free of its phys_handle. Move the allocation of the phys handle to get_pages and it release to put_pages to prevent the invalid access and to improve symmetry. v2: Add commentary about always aligning to page size. Testcase: igt/drv_selftest/objects Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: `a4f5ea64f0` ("drm/i915: Refactor object page API") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161207133411.8028-1-chris@chris-wilson.co.uk (cherry picked from commit `dbb4351bab`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2016-12-20 16:29:26 +02:00
Chris Wilson	c2dc6cc946	drm/i915: Add a test that we terminate the trimmed sgtable as expected In commit `0c40ce130e` ("drm/i915: Trim the object sg table"), we expect to copy exactly orig_st->nents across and allocate the table thusly. The copy loop should therefore end with the new_sg being NULL. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161219124346.550-2-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2016-12-20 12:31:52 +00:00
Chris Wilson	d766ef5300	drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping If we at first do not succeed with attempting to remap our physical pages using a coalesced scattergather list, try again with one scattergather entry per page. This should help with swiotlb as it uses a limited buffer size and only searches for contiguous chunks within its buffer aligned up to the next boundary - i.e. we may prematurely cause a failure as we are unable to utilize the unused space between large chunks and trigger an error such as: i915 0000:00:02.0: swiotlb buffer is full (sz: 1630208 bytes) Reported-by: Juergen Gross <jgross@suse.com> Tested-by: Juergen Gross <jgross@suse.com> Fixes: `871dfbd67d` ("drm/i915: Allow compaction upto SWIOTLB max segment size") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> Link: http://patchwork.freedesktop.org/patch/msgid/20161219124346.550-1-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-12-20 12:30:56 +00:00
Chris Wilson	e8a9c58fcd	drm/i915: Unify active context tracking between legacy/execlists/guc The requests conversion introduced a nasty bug where we could generate a new request in the middle of constructing a request if we needed to idle the system in order to evict space for a context. The request to idle would be executed (and waited upon) before the current one, creating a minor havoc in the seqno accounting, as we will consider the current request to already be completed (prior to deferred seqno assignment) but ring->last_retired_head would have been updated and still could allow us to overwrite the current request before execution. We also employed two different mechanisms to track the active context until it was switched out. The legacy method allowed for waiting upon an active context (it could forcibly evict any vma, including context's), but the execlists method took a step backwards by pinning the vma for the entire active lifespan of the context (the only way to evict was to idle the entire GPU, not individual contexts). However, to circumvent the tricky issue of locking (i.e. we cannot take struct_mutex at the time of i915_gem_request_submit(), where we would want to move the previous context onto the active tracker and unpin it), we take the execlists approach and keep the contexts pinned until retirement. The benefit of the execlists approach, more important for execlists than legacy, was the reduction in work in pinning the context for each request - as the context was kept pinned until idle, it could short circuit the pinning for all active contexts. We introduce new engine vfuncs to pin and unpin the context respectively. The context is pinned at the start of the request, and only unpinned when the following request is retired (this ensures that the context is idle and coherent in main memory before we unpin it). We move the engine->last_context tracking into the retirement itself (rather than during request submission) in order to allow the submission to be reordered or unwound without undue difficultly. And finally an ulterior motive for unifying context handling was to prepare for mock requests. v2: Rename to last_retired_context, split out legacy_context tracking for MI_SET_CONTEXT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-3-chris@chris-wilson.co.uk	2016-12-18 16:18:50 +00:00
Matthew Auld	966d5bf5eb	drm/i915: convert to using range_overflows Convert some of the obvious hand-rolled ranged overflow sanity checks to our shiny new range_overflows macro. Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-4-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-12-16 21:22:12 +00:00
Jan Kara	1a29d85eb0	mm: use vmf->address instead of of vmf->virtual_address Every single user of vmf->virtual_address typed that entry to unsigned long before doing anything with it so the type of virtual_address does not really provide us any additional safety. Just use masked vmf->address which already has the appropriate type. Link: http://lkml.kernel.org/r/1479460644-25076-3-git-send-email-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-14 16:04:09 -08:00
Chris Wilson	dbb4351bab	drm/i915: Reorder phys backing storage release In commit `a4f5ea64f0` ("drm/i915: Refactor object page API"), I reordered the object->pages teardown to be more friendly wrt to a separate obj->mm.lock. However, I overlooked the phys object and left it with a dangling use-after-free of its phys_handle. Move the allocation of the phys handle to get_pages and it release to put_pages to prevent the invalid access and to improve symmetry. v2: Add commentary about always aligning to page size. Testcase: igt/drv_selftest/objects Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: `a4f5ea64f0` ("drm/i915: Refactor object page API") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161207133411.8028-1-chris@chris-wilson.co.uk	2016-12-12 12:24:36 +00:00
Jani Nikula	73f67aa8cc	drm/i915: distinguish G33 and Pineview from each other Pineview deserves to use its own platform enum (which was already added, unused, previously). IS_G33() no longer matches Pineview, and gets replaced by IS_G33() \|\| IS_PINEVIEW() or equivalent. Pineview is no longer an outlier among platform definitions. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1481143689-19672-1-git-send-email-jani.nikula@intel.com	2016-12-07 23:28:33 +02:00
Jani Nikula	c0f86832e3	drm/i915: rename BROADWATER and CRESTLINE to I965G and I965GM, respectively Add more consistency to our naming. Pineview remains the outlier. Keep using code names for gen5+. v2: rebased Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1481105584-23033-1-git-send-email-jani.nikula@intel.com	2016-12-07 15:18:33 +02:00
Chris Wilson	85fd4f58d7	drm/i915: Mark all non-vma being inserted into the address spaces We need to distinguish between full i915_vma structs and simple drm_mm_nodes when considering eviction (i.e. we must be careful not to treat a mere drm_mm_node as a much larger i915_vma causing memory corruption, if we are lucky). To do this, color these not-a-vma with -1 (I915_COLOR_UNEVICTABLE). v2...v200: New name for -1. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161205142941.21965-1-chris@chris-wilson.co.uk	2016-12-05 20:49:17 +00:00
Chris Wilson	ce1135c7de	drm/i915: Complete requests in nop_submit_request Since the submit/execute split in commit `d55ac5bf97` ("drm/i915: Defer transfer onto execution timeline to actual hw submission") the global seqno advance was deferred until the submit_request callback. After wedging the GPU, we were installing a nop_submit_request handler (to avoid waking up the dead hw) but I had missed converting this over to the new scheme. Under the new scheme, we have to explicitly call i915_gem_submit_request() from the submit_request handler to mark the request as on the hardware. If we don't the request is always pending, and any waiter will continue to wait indefinitely and hangcheck will not be able to resolve the lockup. References: https://bugs.freedesktop.org/show_bug.cgi?id=98748 Testcase: igt/gem_eio/in-flight Fixes: `d55ac5bf97` ("drm/i915: Defer transfer onto execution timeline to actual hw submission") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161122144121.7379-3-chris@chris-wilson.co.uk (cherry picked from commit `3dcf93f7f2`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2016-12-05 10:38:01 +02:00
Tvrtko Ursulin	cb15d9f8c3	drm/i915: More GEM init dev_priv cleanup Simplifies the code to pass the right parameter in. v2: Commit message. (Joonas Lahtinen) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-12-01 18:01:16 +00:00
Tvrtko Ursulin	bf9e8429ab	drm/i915: Make various init functions take dev_priv Like GEM init, GUC init, MOCS init and context creation. Enables them to lose dev_priv locals. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-12-01 18:01:15 +00:00
Tvrtko Ursulin	12d79d7828	drm/i915: Make GEM object create and create from data take dev_priv Makes all GEM object constructors consistent. v2: Fix compilation in GVT code. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v1)	2016-12-01 18:01:08 +00:00
Tvrtko Ursulin	187685cb90	drm/i915: Make GEM object alloc/free and stolen created take dev_priv Where it is more appropriate and also to be consistent with the direction of the driver. v2: Leave out object alloc/free inlining. (Joonas Lahtinen) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-12-01 18:00:15 +00:00
Chris Wilson	49d73912cb	drm/i915: Convert vm->dev backpointer to vm->i915 99% of the time we access i915_address_space->dev we want the i915 device and not the drm device, so let's store the drm_i915_private backpointer instead. The only real complication here are the inlines in i915_vma.h where drm_i915_private is not yet defined and so we have to choose an alternate path for our asserts. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161129095008.32622-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-11-29 11:38:00 +00:00
Chris Wilson	20e4933c47	drm/i915: Stop the machine as we install the wedged submit_request handler In order to prevent a race between the old callback submitting an incomplete request and i915_gem_set_wedged() installing its nop handler, we must ensure that the swap occurs when the machine is idle (stop_machine). v2: move context lost from out of BKL. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161122144121.7379-4-chris@chris-wilson.co.uk	2016-11-22 17:42:19 +00:00
Chris Wilson	3dcf93f7f2	drm/i915: Complete requests in nop_submit_request Since the submit/execute split in commit `d55ac5bf97` ("drm/i915: Defer transfer onto execution timeline to actual hw submission") the global seqno advance was deferred until the submit_request callback. After wedging the GPU, we were installing a nop_submit_request handler (to avoid waking up the dead hw) but I had missed converting this over to the new scheme. Under the new scheme, we have to explicitly call i915_gem_submit_request() from the submit_request handler to mark the request as on the hardware. If we don't the request is always pending, and any waiter will continue to wait indefinitely and hangcheck will not be able to resolve the lockup. References: https://bugs.freedesktop.org/show_bug.cgi?id=98748 Testcase: igt/gem_eio/in-flight Fixes: `d55ac5bf97` ("drm/i915: Defer transfer onto execution timeline to actual hw submission") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161122144121.7379-3-chris@chris-wilson.co.uk	2016-11-22 17:42:18 +00:00
Chris Wilson	d9e9da64c4	drm/i915: Don't deref context->file_priv ERR_PTR upon reset When a user context is closed, it's file_priv backpointer is replaced by ERR_PTR(-EBADF); be careful not to chase this invalid pointer after a hang and a GPU reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Fixes: `b083a0870c` ("drm/i915: Add per client max context ban limit") Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161122144121.7379-1-chris@chris-wilson.co.uk	2016-11-22 17:42:17 +00:00
Mika Kuoppala	bc1d53c647	drm/i915: Wipe hang stats as an embedded struct Bannable property, banned status, guilty and active counts are properties of i915_gem_context. Make them so. v2: rebase Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1479309634-28574-1-git-send-email-mika.kuoppala@intel.com	2016-11-21 14:36:56 +02:00
Mika Kuoppala	b083a0870c	drm/i915: Add per client max context ban limit If we have a bad client submitting unfavourably across different contexts, creating new ones, the per context scoring of badness doesn't remove the root cause, the offending client. To counter, keep track of per client context bans. Deny access if client is responsible for more than 3 context bans in it's lifetime. v2: move ban check to context create ioctl (Chris) v3: add commentary about hangs needed to reach client ban (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>	2016-11-21 14:36:40 +02:00
Mika Kuoppala	841021713a	drm/i915: Add bannable context parameter Now when driver has per context scoring of 'hanging badness' and also subsequent hangs during short windows are allowed, if there is progress made in between, it does not make sense to expose a ban timing window as a context parameter anymore. Let the scoring be the sole indicator for ban policy and substitute ban period context parameter as a boolean to get/set context bannable property. v2: allow non root to opt into being banned (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>	2016-11-21 14:36:40 +02:00
Mika Kuoppala	e5e1fc47ea	drm/i915: Use request retirement as context progress As hangcheck score was removed, the active decay of score was removed also. This removed feature for hangcheck to detect if the gpu client was accidentally or maliciously causing intermittent hangs. Reinstate the scoring as a per context property, so that if one context starts to act unfavourably, ban it. v2: ban_period_secs as a gate to score check (Chris) v3: decay in proper spot. scores as tunables (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>	2016-11-21 14:36:40 +02:00
Mika Kuoppala	3fe3b030bd	drm/i915: Decouple hang detection from hangcheck period Hangcheck state accumulation has gained more steps along the years, like head movement and more recently the subunit inactivity check. As the subunit sampling is only done if the previous state check showed inactivity, we have added more stages (and time) to reach a hang verdict. Asymmetric engine states led to different actual weight of 'one hangcheck unit' and it was demonstrated in some hangs that due to difference in stages, simpler engines were accused falsely of a hang as their scoring was much more quicker to accumulate above the hang treshold. To completely decouple the hangcheck guilty score from the hangcheck period, convert hangcheck score to a rough period of inactivity measurement. As these are tracked as jiffies, they are meaningful also across reset boundaries. This makes finding a guilty engine more accurate across multi engine activity scenarios, especially across asymmetric engines. We lose the ability to detect cross batch malicious attempts to hinder the progress. Plan is to move this functionality to be part of context banning which is more natural fit, later in the series. v2: use time_before macros (Chris) reinstate the pardoning of moving engine after hc (Chris) v3: avoid global state for per engine stall detection (Chris) v4: take timeline last retirement into account (Chris) v5: do debug print on pardoning, split out retirement timestamp (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>	2016-11-21 14:36:40 +02:00
Chris Wilson	05c348377d	drm/i915: Skip final clflush if LLC is coherent If the LLC is coherent with the object, we do not need to worry about whether main memory and cache mismatch when we hand the object back to the system. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161118211747.25197-2-chris@chris-wilson.co.uk	2016-11-18 22:33:49 +00:00
Chris Wilson	a6a7cc4b7d	drm/i915: Always flush the dirty CPU cache when pinning the scanout Currently we only clflush the scanout if it is in the CPU domain. Also flush if we have a pending CPU clflush. We also want to treat the dirtyfb path similar, and flush any pending writes there as well. v2: Only send the fb flush message if flushing the dirt on flip v3: Make flush-for-flip and dirtyfb look more alike since they serve similar roles as end-of-frame marker. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v2 Link: http://patchwork.freedesktop.org/patch/msgid/20161118211747.25197-1-chris@chris-wilson.co.uk	2016-11-18 22:33:22 +00:00
Chris Wilson	b17993b7b2	drm/i915: Don't touch NULL sg on i915_gem_object_get_pages_gtt() error On the DMA mapping error path, sg may be NULL (it has already been marked as the last scatterlist entry), and we should avoid dereferencing it again. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `e227330223` ("drm/i915: avoid leaking DMA mappings") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/20161114112930.2033-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2016-11-18 20:51:53 +00:00
Chris Wilson	5b8c8aec8e	drm/i915: Move frontbuffer CS write tracking from ggtt vma to object I tried to avoid having to track the write for every VMA by only tracking writes to the ggtt. However, for the purposes of frontbuffer tracking this is insufficient as we need to invalidate around writes not just to the the ggtt but all aliased ppgtt views of the framebuffer. By moving the critical section to the object and only doing so for framebuffer writes we can reduce the tracking even further by only watching framebuffers and not vma. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161116190704.5293-1-chris@chris-wilson.co.uk Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-11-18 11:15:59 +00:00
Matthew Auld	ea84aa776f	drm/i915: don't leak global_timeline We need to clean up the global_timeline in i915_gem_load_cleanup. v2: don't forget about the struct_mutex, and also WARN_ON if we have any remaining timelines before purging the global_timeline. v3: it might be a good idea to first remove the global_timeline...duh! Fixes: `73cb97010d` ("drm/i915: Combine seqno + tracking into a global timeline struct") Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1479415087-13216-1-git-send-email-matthew.auld@intel.com Link: http://patchwork.freedesktop.org/patch/msgid/20161117210411.14044-2-chris@chris-wilson.co.uk	2016-11-17 21:05:37 +00:00
Chris Wilson	c4c29d7b59	drm/i915: Demote i915_gem_open() debugging from DRIVER to USER We use DRM_DEBUG() when reporting on user actions, to try and keep intentional errors out of the CI dmesg. Demote the debug from i915_gem_open() similarly so that it is only apparent with drm.debug & 1 like its brethren. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20161109104507.21228-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2016-11-17 14:23:20 +00:00
Tvrtko Ursulin	275a991c03	drm/i915: dev_priv cleanup in i915_gem_gtt.c Started with removing INTEL_INFO(dev) and cascaded into a quite big trickle of function prototype changes. Still, I think it is for the better. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-11-17 13:56:12 +00:00
Tvrtko Ursulin	4362f4f6dd	drm/i915: Use dev_priv in INTEL_INFO in i915_gem_fence_reg.c Plus a small cascade of function prototype changes. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-11-17 13:56:06 +00:00
Tvrtko Ursulin	c6be607abc	drm/i915: dev_priv and a small cascade of cleanups in i915_gem.c Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-11-17 13:55:26 +00:00
Chris Wilson	6b5e90f58c	drm/i915/scheduler: Boost priorities for flips Boost the priority of any rendering required to show the next pageflip as we want to avoid missing the vblank by being delayed by invisible workload. We prioritise avoiding jank and jitter in the GUI over starving background tasks. v2: Descend dma_fence_array when boosting priorities. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-10-chris@chris-wilson.co.uk	2016-11-14 21:01:22 +00:00
Chris Wilson	20311bd350	drm/i915/scheduler: Execute requests in order of priorities Track the priority of each request and use it to determine the order in which we submit requests to the hardware via execlists. The priority of the request is determined by the user (eventually via the context) but may be overridden at any time by the driver. When we set the priority of the request, we bump the priority of all of its dependencies to match - so that a high priority drawing operation is not stuck behind a background task. When the request is ready to execute (i.e. we have signaled the submit fence following completion of all its dependencies, including third party fences), we put the request into a priority sorted rbtree to be submitted to the hardware. If the request is higher priority than all pending requests, it will be submitted on the next context-switch interrupt as soon as the hardware has completed the current request. We do not currently preempt any current execution to immediately run a very high priority request, at least not yet. One more limitation, is that this is first implementation is for execlists only so currently limited to gen8/gen9. v2: Replace recursive priority inheritance bumping with an iterative depth-first search list. v3: list_next_entry() for walking lists v4: Explain how the dfs solves the recursion problem with PI. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-8-chris@chris-wilson.co.uk	2016-11-14 21:01:21 +00:00
Chris Wilson	52e5420907	drm/i915/scheduler: Record all dependencies upon request construction The scheduler needs to know the dependencies of each request for the lifetime of the request, as it may choose to reschedule the requests at any time and must ensure the dependency tree is not broken. This is in additional to using the fence to only allow execution after all dependencies have been completed. One option was to extend the fence to support the bidirectional dependency tracking required by the scheduler. However the mismatch in lifetimes between the submit fence and the request essentially meant that we had to build a completely separate struct (and we could not simply reuse the existing waitqueue in the fence for one half of the dependency tracking). The extra dependency tracking simply did not mesh well with the fence, and keeping it separate both keeps the fence implementation simpler and allows us to extend the dependency tracking into a priority tree (whilst maintaining support for reordering the tree). To avoid the additional allocations and list manipulations, the use of the priotree is disabled when there are no schedulers to use it. v2: Create a dedicated slab for i915_dependency. Rename the lists. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-7-chris@chris-wilson.co.uk	2016-11-14 21:00:28 +00:00
Chris Wilson	663f71e73f	drm/i915: Remove engine->execlist_lock The execlist_lock is now completely subsumed by the engine->timeline->lock, and so we can remove the redundant layer of locking. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-5-chris@chris-wilson.co.uk	2016-11-14 21:00:25 +00:00
Chris Wilson	bb89485e99	drm/i915: Create distinct lockclasses for execution vs user timelines In order to simplify the lockdep annotation, as they become more complex in the future with deferred execution and multiple paths through the same functions, create a separate lockclass for the user timeline and the hardware execution timeline. We should only ever be locking the user timeline and the execution timeline in parallel so we only need to create two lock classes, rather than a separate class for every timeline. v2: Rename the lock classes to be more consistent with other lockdep. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-2-chris@chris-wilson.co.uk	2016-11-14 21:00:21 +00:00
Chris Wilson	2b3c83176e	drm/i915: Stop skipping the final clflush back to system pages When we release the shmem backing storage, we make sure that the pages are coherent with the cpu cache. However, our clflush routine was skipping the flush as the object had no pages at release time. Fix this by explicitly flushing the sg_table we are decoupling. Fixes: `03ac84f183` ("drm/i915: Pass around sg_table to get_pages/put_pages backend") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161111145809.9701-2-chris@chris-wilson.co.uk	2016-11-11 16:21:52 +00:00
Chris Wilson	9caa34aa93	drm/i915: Only wait upon the execution timeline when unlocked In order to walk the list of all timelines, we currently require the struct_mutex. We are sometimes called prior to the struct_mutex being taken by the caller (i.e !I915_WAIT_LOCKED) in which case we can only trust the global execution timelines (as these are owned by the device). This means in the unlocked phase we can only wait upon the currently executing requests and not all queued. [ 175.743243] general protection fault: 0000 [#1] SMP [ 175.743263] Modules linked in: nls_iso8859_1 intel_rapl x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iwlwifi aesni_intel aes_x86_64 lrw snd_soc_rt5640 gf128mul snd_soc_rl6231 snd_soc_core glue_helper snd_compress snd_pcm_dmaengine snd_hda_codec_hdmi ablk_helper snd_hda_codec_realtek cryptd snd_hda_codec_generic serio_raw cfg80211 snd_hda_intel snd_hda_codec ir_lirc_codec snd_hda_core lirc_dev snd_hwdep snd_pcm lpc_ich mei_me mei snd_seq_midi shpchp snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer rc_rc6_mce acpi_als nuvoton_cir kfifo_buf rc_core snd industrialio snd_soc_sst_acpi soundcore snd_soc_sst_match i2c_designware_platform 8250_dw i2c_designware_core dw_dmac spi_pxa2xx_platform mac_hid acpi_pad parport_pc ppdev lp parport [ 175.743509] autofs4 i915 e1000e psmouse ptp pps_core xhci_pci ehci_pci ahci xhci_hcd ehci_hcd libahci video sdhci_acpi sdhci i2c_hid hid [ 175.743560] CPU: 2 PID: 2386 Comm: wtdg_monitor.sh Tainted: G U 4.9.0-rc4-nightly+ #2 [ 175.743581] Hardware name: /NUC5i7RYB, BIOS RYBDWi35.86A.0358.2016.0606.1423 06/06/2016 [ 175.743603] task: ffff88024509ba80 task.stack: ffffc9007bd18000 [ 175.743618] RIP: 0010:[<ffffffffa01af29b>] [<ffffffffa01af29b>] i915_gem_wait_for_idle+0x3b/0x140 [i915] [ 175.743660] RSP: 0000:ffffc9007bd1b9b8 EFLAGS: 00010297 [ 175.743674] RAX: ffff88024489d248 RBX: 0000000000000000 RCX: 0000000000000000 [ 175.743691] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880244898000 [ 175.743708] RBP: ffffc9007bd1b9f0 R08: 0000000000000000 R09: 0000000000000001 [ 175.743724] R10: 00000028eaf42792 R11: 0000000000000001 R12: dead000000000100 [ 175.743741] R13: dead000000000148 R14: ffffc9007bd1ba5f R15: 0000000000000005 [ 175.743758] FS: 00007f2638330700(0000) GS:ffff880256d00000(0000) knlGS:0000000000000000 [ 175.743777] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 175.743791] CR2: 00007f885c8cea40 CR3: 00000002416b5000 CR4: 00000000003406e0 [ 175.743808] Stack: [ 175.743816] ffff88024489d248 000000004509ba80 ffff880244898000 ffff88024509ba80 [ 175.743840] 00000000ffff8b69 ffffc9007bd1ba5f ffffc9007bd1ba5e ffffc9007bd1ba28 [ 175.743863] ffffffffa01b661d 00000000ffffffff 0000000000000000 ffff880244898000 [ 175.743886] Call Trace: [ 175.743906] [<ffffffffa01b661d>] i915_gem_shrinker_lock_uninterruptible.constprop.5+0x5d/0xc0 [i915] [ 175.743937] [<ffffffffa01b6cd0>] i915_gem_shrinker_oom+0x30/0x1b0 [i915] [ 175.743955] [<ffffffff8109ca79>] notifier_call_chain+0x49/0x70 [ 175.743971] [<ffffffff8109cd9d>] __blocking_notifier_call_chain+0x4d/0x70 [ 175.743988] [<ffffffff8109cdd6>] blocking_notifier_call_chain+0x16/0x20 [ 175.744005] [<ffffffff811885dc>] out_of_memory+0x22c/0x480 [ 175.744020] [<ffffffff81205542>] __alloc_pages_slowpath+0x851/0x8ec [ 175.744037] [<ffffffff8118ca51>] __alloc_pages_nodemask+0x2c1/0x310 [ 175.744054] [<ffffffff811d8ea8>] alloc_pages_current+0x88/0x120 [ 175.744070] [<ffffffff811833a4>] __page_cache_alloc+0xb4/0xc0 [ 175.744086] [<ffffffff811865ca>] filemap_fault+0x29a/0x500 [ 175.744101] [<ffffffff81299aa6>] ext4_filemap_fault+0x36/0x50 [ 175.744117] [<ffffffff811b3d4a>] __do_fault+0x6a/0xe0 [ 175.744131] [<ffffffff811b97ee>] handle_mm_fault+0xd0e/0x1330 [ 175.744147] [<ffffffff8106738c>] __do_page_fault+0x23c/0x4d0 [ 175.744162] [<ffffffff81067650>] do_page_fault+0x30/0x80 [ 175.744177] [<ffffffff817ffbe8>] page_fault+0x28/0x30 [ 175.744191] Code: 41 57 41 56 41 55 41 54 53 48 83 ec 10 4c 8b a7 48 52 00 00 89 75 d4 48 89 45 c8 49 39 c4 74 78 4d 8d 6c 24 48 41 bf 05 00 00 00 <49> 8b 5d 00 48 85 db 74 50 8b 83 20 01 00 00 85 c0 74 15 48 8b [ 175.744320] RIP [<ffffffffa01af29b>] i915_gem_wait_for_idle+0x3b/0x140 [i915] [ 175.744351] RSP <ffffc9007bd1b9b8> Fixes: `80b204bce8` ("drm/i915: Enable multiple timelines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161111145809.9701-1-chris@chris-wilson.co.uk	2016-11-11 16:21:52 +00:00
Tvrtko Ursulin	0031fb9685	drm/i915: Assorted dev_priv cleanups A small selection of macros which can only accept dev_priv from now on and a resulting trickle of fixups. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>	2016-11-11 14:58:26 +00:00
Joonas Lahtinen	b42fe9ca0a	drm/i915: Split out i915_vma.c As a side product, had to split two other files; - i915_gem_fence_reg.h - i915_gem_object.h (only parts that needed immediate untanglement) I tried to move code in as big chunks as possible, to make review easier. i915_vma_compare was moved to a header temporarily. v2: - Use i915_gem_fence_reg.{c,h} v3: - Rebased v4: - Fix building when DEBUG_GEM is enabled by reordering a bit. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com	2016-11-11 14:34:54 +02:00
Tvrtko Ursulin	0c40ce130e	drm/i915: Trim the object sg table At the moment we allocate enough sg table entries assuming we will not be able to do any coalescing. But since in practice we most often can, and more so very effectively, this ends up wasting a lot of memory. A simple and effective way of trimming the over-allocated entries is to copy the table over to a new one allocated to the exact size. Experiments on my freshly logged and idle desktop (KDE) showed that by doing this we can save approximately 1 MiB of RAM, or when running a typical benchmark like gl_manhattan I have even seen a 6 MiB saving. More complicated techniques such as only copying the last used page and freeing the rest are left to the reader. v2: * Update commit message. * Use temporary sg_table on stack. (Chris Wilson) v3: * Commit message update. * Comment added. * Replace memcpy with copy assignment. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478704423-7447-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-11-10 09:29:25 +00:00
Chris Wilson	d0da48cf92	drm/i915: Remove chipset flush after cache flush We always flush the chipset prior to executing with the GPU, so we can skip the flush during ordinary domain management. This should help mitigate some of the potential performance regressions, but likely trivial, from doing the flush unconditionally before execbuf introduced in commit `dcd79934b0` ("drm/i915: Unconditionally flush any chipset buffers before execbuf") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20161106130001.9509-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-11-08 11:04:04 +00:00
Imre Deak	31ab49abde	drm/i915: Add assert for no pending GPU requests during suspend/resume in LR mode During resume we will reset the SW/HW tracking for each ring head/tail pointers and so are not prepared to replay any pending requests (as opposed to GPU reset time). Add an assert for this both to the suspend and the resume code. v2: - Check for ELSP port idle already during suspend and check !gt.awake during resume. (Chris) v3: - Move the !gt.awake check to i915_gem_resume(). v4: - s/intel_lr_engines_idle/intel_execlists_idle/ (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-4-git-send-email-imre.deak@intel.com	2016-11-07 14:48:05 +02:00
Imre Deak	0cb5670baa	drm/i915: Make sure engines are idle during GPU idling in LR mode We assume that the GPU is idle once receiving the seqno via the last request's user interrupt. In execlist mode the corresponding context completed interrupt can be delayed though and until this latter interrupt arrives we consider the request to be pending on the ELSP submit port. This can cause a problem during system suspend where this last request will be seen by the resume code as still pending. Such pending requests are normally replayed after a GPU reset, but during resume we reset both SW and HW tracking of the ring head/tail pointers, so replaying the pending request with its stale tail pointer will leave the ring in an inconsistent state. A subsequent request submission can lead then to the GPU executing from uninitialized area in the ring behind the above stale tail pointer. Fix this by making sure any pending request on the ELSP port is completed before suspending. I used a polling wait since the completion time I measured was <1ms and since normally we only need to wait during system suspend. GPU idling during runtime suspend is scheduled with a delay (currently 50-100ms) after the retirement of the last request at which point the context completed interrupt must have arrived already. The chance of this bug was increased by commit `1c777c5d1d` Author: Imre Deak <imre.deak@intel.com> Date: Wed Oct 12 17:46:37 2016 +0300 drm/i915/hsw: Fix GPU hang during resume from S3-devices state but it could happen even without the explicit GPU reset, since we disable interrupts afterwards during the suspend sequence. v2: - Do an unlocked poll-wait first. (Chris) v3-4: - s/intel_lr_engines_idle/intel_execlists_idle/ and move i915.enable_execlists check to the new helper. (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98470 Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-3-git-send-email-imre.deak@intel.com	2016-11-07 14:48:05 +02:00
Imre Deak	93c97dc17f	drm/i915: Avoid early GPU idling due to race with new request There is a small race where a new request can be submitted and retired after the idle worker started to run which leads to idling the GPU too early. Fix this by deferring the idling to the pending instance of the worker. This scenario was pointed out by Chris. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-2-git-send-email-imre.deak@intel.com	2016-11-07 14:48:04 +02:00
Chris Wilson	767a222e47	drm/i915: Limit Valleyview and earlier to only using mappable scanout Valleyview appears to be limited to only scanning out from the first 512MiB of the Global GTT. Lets presume that this behaviour was inherited from the display block copied from g4x (not Ironlake) and all earlier generations are similarly affected, though testing suggests different symptoms. For simplicity, impose that these platforms must scanout from the mappable region. (For extra simplicity, use HAS_GMCH_DISPLAY even though this catches Cherryview which does not appear to be limited to the low aperture for its scanout.) v2: Use HAS_GMCH_DISPLAY() to more clearly convey my intent about limiting this workaround to the old style of display engine. v3: Update changelog to reflect testing by Ville Syrjälä v4: Include the changes to the comments as well Reported-by: Luis Botello <luis.botello.ortega@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98036 Fixes: `2efb813d53` ("drm/i915: Fallback to using unmappable memory for scanout") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Akash Goel <akash.goel@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.9-rc1+ Link: http://patchwork.freedesktop.org/patch/msgid/20161107110128.28762-1-chris@chris-wilson.co.uk Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com	2016-11-07 12:25:50 +00:00
Chris Wilson	0ef723cbce	drm/i915: Round tile chunks up for constructing partial VMAs When we split a large object up into chunks for GTT faulting (because we can't fit the whole object into the aperture) we have to align our cuts with the fence registers. Each partial VMA must cover a complete set of tile rows or the offset into each partial VMA is not aligned with the whole image. Currently we enforce a minimum size on each partial VMA, but this minimum size itself was not aligned to the tile row causing distortion. Reported-by: Andreas Reis <andreas.reis@gmail.com> Reported-by: Chris Clayton <chris2553@googlemail.com> Reported-by: Norbert Preining <preining@logic.at> Tested-by: Norbert Preining <preining@logic.at> Tested-by: Chris Clayton <chris2553@googlemail.com> Fixes: `03af84fe7f` ("drm/i915: Choose partial chunksize based on tile row size") Fixes: `a61007a83a` ("drm/i915: Fix partial GGTT faulting") # enabling patch Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98402 Testcase: igt/gem_mmap_gtt/medium-copy-odd Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.9-rc1+ Link: http://patchwork.freedesktop.org/patch/msgid/20161107105443.27855-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-11-07 11:39:59 +00:00
Chris Wilson	2c3a3f44dc	drm/i915: Fix pages pin counting around swizzle quirk commit `bc0629a767` ("drm/i915: Track pages pinned due to swizzling quirk") fixed one problem, but revealed a whole lot more. The root cause of the pin count mismatch for the swizzle quirk (for L-shaped memory on gen3/4) was that we were incrementing the pages_pin_count upon getting the backing pages but then overwriting the pages_pin_count to set it to 1 afterwards. With a little bit of adjustment to satisfy the GEM_BUG_ON sanitychecks, the fix is to replace the explicit atomic_set with an atomic_inc. v2: Consistently use atomics (not mix atomics and helpers) within the lowlevel get_pages routines. This makes the atomic operations much clearer. Fixes: `1233e2db19` ("drm/i915: Move object backing storage manipulation") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161104103001.27643-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-11-04 11:55:39 +00:00

... 2 3 4 5 6 ...

1924 Commits