OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Chris Wilson	f6322eddaf	drm/i915/preemption: Allow preemption between submission ports Sometimes we need to boost the priority of an in-flight request, which may lead to the situation where the second submission port then contains a higher priority context than the first and so we need to inject a preemption event. To do so we must always check inside execlists_dequeue() whether there is a priority inversion between the ports themselves as well as the head of the priority sorted queue, and we cannot just skip dequeuing if the queue is empty. As Michał noted, this doesn't simply extend to handling more than 2-port submission, as we may need to reorder within the array of executing requests which themselves are lower priority than the first. A task for later! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180222142229.14517-1-chris@chris-wilson.co.uk Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2018-02-23 16:37:40 +00:00
Chris Wilson	e61e0f51ba	drm/i915: Rename drm_i915_gem_request to i915_request We want to de-emphasize the link between the request (dependency, execution and fence tracking) from GEM and so rename the struct from drm_i915_gem_request to i915_request. That is we may implement the GEM user interface on top of requests, but they are an abstraction for tracking execution rather than an implementation detail of GEM. (Since they are not tied to HW, we keep the i915 prefix as opposed to intel.) In short, the spatch: @@ @@ - struct drm_i915_gem_request + struct i915_request A corollary to contracting the type name, we also harmonise on using 'rq' shorthand for local variables where space if of the essence and repetition makes 'request' unwieldy. For globals and struct members, 'request' is still much preferred for its clarity. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2018-02-21 20:57:22 +00:00
Chris Wilson	3ceda3a4a8	drm/i915: Hold rpm wakeref for printing the engine's register state When dumping the engine, we print out the current register values. This requires the rpm wakeref. If the device is alseep, we can assume the engine is asleep (and the register state is uninteresting) so skip and only acquire the rpm wakeref if the device is already awake. Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180212102415.24246-1-chris@chris-wilson.co.uk	2018-02-12 13:33:33 +00:00
Chris Wilson	74d00d28a1	drm/i915: Don't wake the device up to check if the engine is asleep If the entire device is powered off, we can safely assume that the engine is also asleep (and idle). Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: `a091d4ee93` ("drm/i915: Hold a wakeref for probing the ring registers") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180212093928.6005-1-chris@chris-wilson.co.uk	2018-02-12 10:59:01 +00:00
Chris Wilson	8e47b4b65b	drm/i915: Remove redundant check on execlists interrupt Since commit `4a118ecbe9` ("drm/i915: Filter out spurious execlists context-switch interrupts") we probe execlists->active, and no longer have to peek at the execlist interrupt to determine if the tasklet still needs to be run to drain the ELSP. References: `4a118ecbe9` ("drm/i915: Filter out spurious execlists context-switch interrupts") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180208151224.16285-1-chris@chris-wilson.co.uk	2018-02-08 18:30:01 +00:00
Chris Wilson	d637637491	drm/i915: Only allocate preempt context when required If we remove some hardcoded assumptions about the preempt context having a fixed id, reserved from use by normal user contexts, we may only allocate the i915_gem_context when required. Then the subsequent decisions on using preemption reduce to having the preempt context available. v2: Include an assert that we don't allocate the preempt context twice. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-3-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>	2018-02-08 07:30:16 +00:00
Tvrtko Ursulin	ae504be2e0	drm/i915: Downgrade incorrect engine constructor usage warnings to development Render engine constructor helpers must only be called from the render engine constructors, but there is no need to burden the production binaries with warnings which can only be triggered during development. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180119100005.9072-1-tvrtko.ursulin@linux.intel.com	2018-01-22 17:15:20 +00:00
Tvrtko Ursulin	b86aa4458a	drm/i915/icl: Gen11 render context size Gen11 removes the Resource Streamer, which frees up a big chunk of the context image. BSpec indicates 12544 DWORDs (13 pages), plus one page for PPHWSP. Please notice that, when looking at the BSpec context image table, the right filter has to be applied as some rows are excluded for specific GENs. Also, some rows apply per-subslice (for the calculation above, we have supposed I915_MAX_SUBSLICES = 8). v2: Rebase. v3: Use the right size as per the BSpec. v4: - Rebased on top of the default context size (Rodrigo) - Clarify in the commit message where the subslice calculation comes from. v5: s/12538/12544/ (Daniele) BSpec: 18907 Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Ben Widawsky <benjamin.widawsky@intel.com> (older version) Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1515711307-28979-2-git-send-email-oscar.mateo@intel.com	2018-01-19 18:13:33 -02:00
Oscar Mateo	7ab4adbd92	drm/i915: Return a default RCS context size Instead of returning whatever size the latest GEN used. This is because context sizes for new GENs can go up or down, but the only safe thing to do for missing cases is to use the largest known one, whatever that is. Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1515711307-28979-1-git-send-email-oscar.mateo@intel.com	2018-01-19 18:09:47 -02:00
Chris Wilson	99e48bf98d	drm/i915: Lock out execlist tasklet while peeking inside for busy-stats In order to prevent a race condition where we may end up overaccounting the active state and leaving the busy-stats believing the GPU is 100% busy, lock out the tasklet while we reconstruct the busy state. There is no direct spinlock guard for the execlists->port[], so we need to utilise tasklet_disable() as a synchronous barrier to prevent it, the only writer to execlists->port[], from running at the same time as the enable. Fixes: `4900727d35` ("drm/i915/pmu: Reconstruct active state on starting busy-stats") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180115092041.13509-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2018-01-15 12:18:46 +00:00
Chris Wilson	4900727d35	drm/i915/pmu: Reconstruct active state on starting busy-stats We have a hole in our busy-stat accounting if the pmu is enabled during a long running batch, the pmu will not start accumulating busy-time until the next context switch. This then fails tests that are only sampling a single batch. v2: Count each active port just once (context in/out events are only on the first and last assignment to a port). v3: Avoid hardcoding knowledge of 2 submission ports Fixes: `30e17b7847` ("drm/i915: Engine busy time tracking") Testcase: igt/perf_pmu/busy-start Testcase: igt/perf_pmu/busy-double-start Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180111073031.14614-1-chris@chris-wilson.co.uk	2018-01-11 14:57:27 +00:00
Kenneth Graunke	ab062639ed	drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake. Geminilake requires the 3D driver to select whether barriers are intended for compute shaders, or tessellation control shaders, by whacking a "Barrier Mode" bit in SLICE_COMMON_ECO_CHICKEN1 when switching pipelines. Failure to do this properly can result in GPU hangs. Unfortunately, this means it needs to switch mid-batch, so only userspace can properly set it. To facilitate this, the kernel needs to whitelist the register. The workarounds page currently tags this as applying to Broxton only, but that doesn't make sense. The documentation for the register it references says the bit userspace is supposed to toggle only exists on Geminilake. Empirically, the Mesa patch to toggle this bit appears to fix intermittent GPU hangs in tessellation control shader barrier tests on Geminilake; we haven't seen those hangs on Broxton. v2: Mention WA #0862 in the comment (it doesn't have a name). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180105085905.9298-1-kenneth@whitecape.org	2018-01-05 09:42:33 -08:00
Chris Wilson	c1bf272857	drm/i915: Show HWSP in intel_engine_dump() Looking at a CI failure with an ominous line of [ 362.550715] hangcheck current seqno ffffff6b, last ffffff8c, hangcheck ffffff6b [6016 ms], inflight 118 with no apparent cause for the seqno to be negative, left me wondering if someone had scribbled over the HWSP. So include the HWSP in the engine dump to see if there are more signs of random scribbling. v2: Fix row pointer, i is now incremented by 8 so doesn't need scaling by 8, and we don't need to keep volatile here as the status_page isn't marked up as volatile itself. v3: Use hexdump, with suppression of identical lines. (Tvrtko) Which results in HWSP: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 * 00000040 00000001 00000000 00000018 00000002 00000001 00000000 00000018 00000000 00000060 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000003 00000080 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 * 000000c0 00000002 00000000 00000000 00000000 00000000 00000000 00000000 00000000 000000e0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 * instead of 128 lines of mostly 0s. v4: Tidy up the locals Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171222182521.18106-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-12-22 19:02:52 +00:00
Rafael Antognolli	a2b1658857	drm/i915: Implement WaDisableEarlyEOT. There seems to be another clock gating issue which the workaround is described as: "WA: Set 0xE4F0[1] = 1 to disable Early EOT of thread." Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171216001117.14232-2-rafael.antognolli@intel.com	2017-12-19 13:43:26 -08:00
Chris Wilson	a0cf579080	drm/i915: Show IPEIR and IPEHR in the engine dump A useful bit of information for inspecting GPU stalls from intel_engine_dump() are the error registers, IPEIR and IPEHR. v2: Fixup gen changes in register offsets (Tvrtko) v3: Old FADDR location as well v4: Use I915_READ64_2x32 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171218123914.19027-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-12-18 13:04:34 +00:00
Chris Wilson	d7dc4131eb	drm/i915: Don't check #active_requests from i915_gem_wait_for_idle() i915_gem_wait_for_idle() is called from inside the shrinker, to ensure that we drain the last resources from the GPU in dire circumstances (OOM). As we may allocate whilst building a request, it is then possible to hit the shrinker with a request under construction, and so we must account for the incomplete request whilst waiting. In particular, we preincrement (in reserve_engine) the i915->gt.active_requests counter and mark the GPU as busy, therefore we can not use that counter for shortcircuiting the wait-for-idle. [ 950.859024] GEM_BUG_ON(i915->gt.active_requests) [ 950.859041] WARNING: CPU: 2 PID: 2178 at drivers/gpu/drm/i915/i915_gem.c:3615 i915_gem_wait_for_idle.part.56+0x166/0x4e0 [ 950.859041] Modules linked in: ccm tun fuse nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_mangle iptable_security iptable_raw arc4 iwldvm mac80211 snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic snd_hda_intel snd_hda_codec btusb snd_hda_core btrtl btbcm iwlwifi snd_hwdep btintel bluetooth snd_seq snd_seq_device snd_pcm ecdh_generic x86_pkg_temp_thermal tpm_infineon coretemp tpm_tis crc32_pclmul wmi_bmof crc32c_intel iTCO_wdt hp_wmi snd_timer iTCO_vendor_support sparse_keymap tpm_tis_core mei_me cfg80211 [ 950.859082] snd joydev tpm mei rfkill pcspkr wmi soundcore lpc_ich hp_accel lis3lv02d input_polldev binfmt_misc e1000e ptp serio_raw pps_core [ 950.859094] CPU: 2 PID: 2178 Comm: gem_exec_nop Tainted: G U 4.15.0-rc2+ #900 [ 950.859102] Hardware name: Hewlett-Packard HP ProBook 6360b/1620, BIOS 68SCF Ver. B.42 12/29/2010 [ 950.859107] task: c5119cb4 task.stack: f3ccb8d8 [ 950.859112] EIP: i915_gem_wait_for_idle.part.56+0x166/0x4e0 [ 950.859113] EFLAGS: 00010296 CPU: 2 [ 950.859114] EAX: 00000024 EBX: f36c1888 ECX: f777a044 EDX: 00000007 [ 950.859115] ESI: f36c1888 EDI: edd53958 EBP: edd53970 ESP: edd53938 [ 950.859116] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 950.859117] CR0: 80050033 CR2: b7f39000 CR3: 2f2b3000 CR4: 000406d0 [ 950.859118] Call Trace: [ 950.859125] ? drm_printk+0x70/0x70 [ 950.859129] i915_gem_wait_for_idle+0x18/0x30 [ 950.859133] i915_gem_shrink+0x360/0x410 [ 950.859138] ? vmpressure+0xa8/0xf0 [ 950.859142] ? ktime_get+0x4a/0x100 [ 950.859147] i915_gem_shrink_all+0x21/0x40 [ 950.859151] i915_gem_shrinker_oom+0x23/0x130 [ 950.859156] notifier_call_chain+0x4e/0x70 [ 950.859160] __blocking_notifier_call_chain+0x2f/0x60 [ 950.859164] blocking_notifier_call_chain+0x11/0x20 [ 950.859169] out_of_memory+0x207/0x280 [ 950.859174] __alloc_pages_nodemask+0xd47/0xe60 [ 950.859179] new_slab+0x32d/0x450 [ 950.859183] ___slab_alloc.constprop.81+0x358/0x4e0 [ 950.859189] ? i915_sw_fence_await_dma_fence+0x53/0x160 [ 950.859193] ? __slab_free+0x1fe/0x310 [ 950.859197] ? native_sched_clock+0x1e/0xc0 [ 950.859201] ? i915_gem_request_alloc+0xcf/0x510 [ 950.859205] ? sched_clock+0x9/0x10 [ 950.859209] __slab_alloc.constprop.80+0x29/0x40 [ 950.859212] ? __slab_alloc.constprop.80+0x29/0x40 [ 950.859216] kmem_cache_alloc_trace+0x160/0x1a0 [ 950.859220] ? i915_sw_fence_await_dma_fence+0x53/0x160 [ 950.859224] i915_sw_fence_await_dma_fence+0x53/0x160 [ 950.859229] i915_gem_request_await_dma_fence+0x1eb/0x390 [ 950.859233] i915_gem_request_await_object+0xee/0x230 [ 950.859239] i915_gem_do_execbuffer+0xc16/0x1200 [ 950.859246] ? irqtime_account_irq+0x3e/0xc0 [ 950.859251] ? irq_exit+0x4f/0xb0 [ 950.859257] ? smp_apic_timer_interrupt+0x5f/0x110 [ 950.859261] ? apic_timer_interrupt+0x35/0x3c [ 950.859266] i915_gem_execbuffer2_ioctl+0x212/0x440 [ 950.859270] ? apic_timer_interrupt+0x35/0x3c [ 950.859274] ? i915_gem_do_execbuffer+0x1200/0x1200 [ 950.859279] ? insn_get_seg_base+0x1b/0x50 [ 950.859283] ? i915_gem_do_execbuffer+0x1200/0x1200 [ 950.859287] drm_ioctl_kernel+0x51/0xa0 [ 950.859291] drm_ioctl+0x2a3/0x350 [ 950.859294] ? i915_gem_do_execbuffer+0x1200/0x1200 [ 950.859300] ? sched_clock+0x9/0x10 [ 950.859303] ? drm_getunique+0x70/0x70 [ 950.859308] do_vfs_ioctl+0x7d/0x640 [ 950.859311] ? native_sched_clock+0x1e/0xc0 [ 950.859315] ? sched_clock+0x9/0x10 [ 950.859319] ? sched_clock_cpu+0x13/0x120 [ 950.859323] SyS_ioctl+0x4e/0x80 [ 950.859326] do_fast_syscall_32+0x75/0x250 [ 950.859331] ? irq_exit+0x4f/0xb0 [ 950.859334] entry_SYSENTER_32+0x47/0x71 [ 950.859338] EIP: 0xb7f81d11 [ 950.859339] EFLAGS: 00000296 CPU: 2 [ 950.859340] EAX: ffffffda EBX: 00000003 ECX: 40406469 EDX: bfde4c20 [ 950.859340] ESI: 00000003 EDI: 40406469 EBP: 00000003 ESP: bfde4b38 [ 950.859341] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b [ 950.859343] Code: e8 30 60 01 00 83 c4 10 83 c3 04 39 f3 75 e0 8b 45 d8 8b 80 14 37 00 00 85 c0 74 13 68 dd 33 e4 c0 68 49 6f e3 c0 e8 4a 55 be ff <0f> ff 5e 5f b8 fe ff ff 3f bb 0a 00 00 00 e8 b7 14 c4 ff 8b 15 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171212132148.8124-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-12-13 11:15:38 +00:00
Chris Wilson	d5acadfe7d	drm/i915: Stop showing seqno info from debugfs/i915_interrupt_info Since the seqno information shown from i915_interrupt_info is just a small subset of i915_engine_info, remove it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171209104418.4223-2-chris@chris-wilson.co.uk	2017-12-11 09:14:48 +00:00
Chris Wilson	2d8d1afb4d	drm/i915: Add is-wedged flag to intel_engine_dump() Comparing the state tested by intel_engine_is_idle() and printed by intel_engine_dump(), the only bit not shown is whether or not the device is wedged. Add that little bit of information to the pretty printer so that if the engine fails to idle we can see why. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171208012303.25504-5-chris@chris-wilson.co.uk	2017-12-08 18:48:38 +00:00
Chris Wilson	528dd16a7c	drm/i915: Include the global reset count for intel_engine_dump() Since a global reset affects the engine, include that along side the per-engine reset counter when pretty printing the engine state in intel_engine_dump(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171208012303.25504-4-chris@chris-wilson.co.uk	2017-12-08 18:48:37 +00:00
Chris Wilson	832265d38c	drm/i915: Include engine state on detecting a missed breadcrumb/seqno Now that we have a common engine state pretty printer, we can use that instead of the adhoc information printed when we miss a breadcrumb. v2: Rearrange intel_engine_disarm_breadcrumbs() to avoid calling intel_engine_dump() under the rb spinlock (Mika) and to pretty-print the error state early so that we include the full list of waiters. v3: Pass missed breadcrumb msg to pretty-printer as the header v4: Preserve DRM_DEBUG_DRIVER filtering. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171208012303.25504-3-chris@chris-wilson.co.uk	2017-12-08 18:48:36 +00:00
Chris Wilson	0db18b17c8	drm/i915: Make engine state pretty-printer header configurable Pass in a format string (and args) to specify the header to be emitted along with the engine state when pretty-printing. This allows the header to be emitted inside the drm_printer stream, so sharing the same prefix and output characteristics (e.g. debug level and filtering). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171208012303.25504-2-chris@chris-wilson.co.uk	2017-12-08 18:48:34 +00:00
Chris Wilson	e8a70cab25	drm/i915: Use snprintf to avoid line-break when pretty-printing engines When printing the execlist ports, we first print the ELSP header then follow it with the pretty-printed request. Since switching to drm_printer and show the output via printk, it automatically appends a newline to each call (unlike the old seq_printf output). To avoid the unwanted line break, construct the ELSP request header in a temporary buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171208012303.25504-1-chris@chris-wilson.co.uk	2017-12-08 18:48:34 +00:00
Valtteri Rantala	7436830c8d	drm/i915/glk: Apply WaProgramL3SqcReg1DefaultForPerf for GLK too Testing the texture read performance shows that the same tuning for the SQ credits is needed on GLK as on BXT/APL. This has been also confirmed by Altug from the HW team. V4: Rebase + fix Signed-off-by: Valtteri Rantala <valtteri.rantala@intel.com> Reviewed-by: David Weinehall <david.weinehall@linux.intel.com> (v1) Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1511880305-12166-1-git-send-email-valtteri.rantala@intel.com	2017-11-30 13:32:51 +02:00
Tvrtko Ursulin	cf669b4e9f	drm/i915: Consolidate checks for engine stats availability Sagar noticed the check can be consolidated between the engine stats implementation and the PMU. My first choice was a static inline helper but that got into include ordering mess quickly fast so I went with a macro instead. At some point we should perhaps looking into taking out the non-ringubffer bits from intel_ringbuffer.h into a new intel_engine.h or something. v2: Use engine->flags. (Chris Wilson) v3: Rebase and mark GuC as not yet supported. (Chris Wilson) v4: Move flag setting to intel_engines_reset_default_submission. (Chris Wilson) v5: Move flag setting to logical_ring_setup. v6: intel_engines_reset_default_submission is the wrong place to set the flag - it needs to be in execlists_set_default_submission. (Sagar) v7: Flag setting in logical_ring_setup is not required. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v6) Link: https://patchwork.freedesktop.org/patch/msgid/20171129102805.22690-1-tvrtko.ursulin@linux.intel.com	2017-11-29 12:29:40 +00:00
Tvrtko Ursulin	30e17b7847	drm/i915: Engine busy time tracking Track total time requests have been executing on the hardware. We add new kernel API to allow software tracking of time GPU engines are spending executing requests. Both per-engine and global API is added with the latter also being exported for use by external users. v2: * Squashed with the internal API. * Dropped static key. * Made per-engine. * Store time in monotonic ktime. v3: Moved stats clearing to disable. v4: * Comments. * Don't export the API just yet. v5: Whitespace cleanup. v6: * Rename ref to active. * Drop engine aggregate stats for now. * Account initial busy period after enabling stats. v7: * Rebase. v8: * Move context in notification after the notifier. (Chris Wilson) v9: In cases where stats tracking is getting disabled while there is an active context on an engine, add up the current value to the total. This also implies we don't clear the total when tracking is disabled any longer. There is no real need to do so because we define the stats as relative while enabled, meaning comparison between two samples while tracking is enabled is the valid usage. However, when busy stats will later be plugged into the perf PMU API, it is beneficial to not reset the total, since the PMU core likes to do some counter disable/enable cycles on startup, and while doing so during a single long context executing on an engine we would lose some accuracy and so make unit testing more difficult than needs to be. v10: * Fix accounting for preemption. v11: * Rebase for i915_modparams.enable_execlists removal. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-5-tvrtko.ursulin@linux.intel.com	2017-11-22 11:25:02 +00:00
Tvrtko Ursulin	b46a33e271	drm/i915/pmu: Expose a PMU interface for perf queries From: Chris Wilson <chris@chris-wilson.co.uk> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com> From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> The first goal is to be able to measure GPU (and invidual ring) busyness without having to poll registers from userspace. (Which not only incurs holding the forcewake lock indefinitely, perturbing the system, but also runs the risk of hanging the machine.) As an alternative we can use the perf event counter interface to sample the ring registers periodically and send those results to userspace. Functionality we are exporting to userspace is via the existing perf PMU API and can be exercised via the existing tools. For example: perf stat -a -e i915/rcs0-busy/ -I 1000 Will print the render engine busynnes once per second. All the performance counters can be enumerated (perf list) and have their unit of measure correctly reported in sysfs. v1-v2 (Chris Wilson): v2: Use a common timer for the ring sampling. v3: (Tvrtko Ursulin) * Decouple uAPI from i915 engine ids. * Complete uAPI defines. * Refactor some code to helpers for clarity. * Skip sampling disabled engines. * Expose counters in sysfs. * Pass in fake regs to avoid null ptr deref in perf core. * Convert to class/instance uAPI. * Use shared driver code for rc6 residency, power and frequency. v4: (Dmitry Rogozhkin) * Register PMU with .task_ctx_nr=perf_invalid_context * Expose cpumask for the PMU with the single CPU in the mask * Properly support pmu->stop(): it should call pmu->read() * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE) * Introduce refcounting of event subscriptions. * Make pmu.busy_stats a refcounter to avoid busy stats going away with some deleted event. * Expose cpumask for i915 PMU to avoid multiple events creation of the same type followed by counter aggregation by perf-stat. * Track CPUs getting online/offline to migrate perf context. If (likely) cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be needed to see effect of CPU status tracking. * End result is that only global events are supported and perf stat works correctly. * Deny perf driver level sampling - it is prohibited for uncore PMU. v5: (Tvrtko Ursulin) * Don't hardcode number of engine samplers. * Rewrite event ref-counting for correctness and simplicity. * Store initial counter value when starting already enabled events to correctly report values to all listeners. * Fix RC6 residency readout. * Comments, GPL header. v6: * Add missing entry to v4 changelog. * Fix accounting in CPU hotplug case by copying the approach from arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin) v7: * Log failure message only on failure. * Remove CPU hotplug notification state on unregister. v8: * Fix error unwind on failed registration. * Checkpatch cleanup. v9: * Drop the energy metric, it is available via intel_rapl_perf. (Ville Syrjälä) * Use HAS_RC6(p). (Chris Wilson) * Handle unsupported non-engine events. (Dmitry Rogozhkin) * Rebase for intel_rc6_residency_ns needing caller managed runtime pm. * Drop HAS_RC6 checks from the read callback since creating those events will be rejected at init time already. * Add counter units to sysfs so perf stat output is nicer. * Cleanup the attribute tables for brevity and readability. v10: * Fixed queued accounting. v11: * Move intel_engine_lookup_user to intel_engine_cs.c * Commit update. (Joonas Lahtinen) v12: * More accurate sampling. (Chris Wilson) * Store and report frequency in MHz for better usability from perf stat. * Removed metrics: queued, interrupts, rc6 counters. * Sample engine busyness based on seqno difference only for less MMIO (and forcewake) on all platforms. (Chris Wilson) v13: * Comment spelling, use mul_u32_u32 to work around potential GCC issue and somne code alignment changes. (Chris Wilson) v14: * Rebase. v15: * Rebase for RPS refactoring. v16: * Use the dynamic slot in the CPU hotplug state machine so that we are free to setup our state as multi-instance. Previously we were re-using the CPUHP_AP_PERF_X86_UNCORE_ONLINE slot which is neither used as multi-instance, nor owned by our driver to start with. * Register the CPU hotplug handlers after the PMU, otherwise the callback will get called before the PMU is initialized which can end up in perf_pmu_migrate_context with an un-initialized base. * Added workaround for a probable bug in cpuhp core. v17: * Remove workaround for the cpuhp bug. v18: * Rebase for drm_i915_gem_engine_class getting upstream before us. v19: * Rebase. (trivial) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-2-tvrtko.ursulin@linux.intel.com	2017-11-22 11:24:57 +00:00
Chris Wilson	93c6e966b4	drm/i915: Remove i915.semaphores modparam Having disabled the broken semaphores on Sandybridge, there is no need for a modparam any more, so remove it in favour of a simple HAS_LEGACY_SEMAPHORES() guard. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-5-chris@chris-wilson.co.uk	2017-11-20 21:59:09 +00:00
Chris Wilson	af9ff6c70d	drm/i915: Move debugfs/i915_semaphore_status to i915_engine_info As the semaphores is just part of the engine, include it with the general pretty printer universally used for debugging. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-4-chris@chris-wilson.co.uk	2017-11-20 21:59:08 +00:00
Chris Wilson	79e6770cb1	drm/i915: Remove obsolete ringbuffer emission for gen8+ Since removing the module parameter to force selection of ringbuffer emission for gen8, the code is defunct. Remove it. To put the difference into perspective, a couple of microbenchmarks (bdw i7-5557u, 20170324): ring execlists exec continuous nops on all rings: 1.491us 2.223us exec sequential nops on each ring: 12.508us 53.682us single nop + sync: 9.272us 30.291us vblank_mode=0 glxgears: ~11000fps ~9000fps Since the earlier submission, gen8 ringbuffer submission has fallen further and further behind in features. So while ringbuffer may hold the throughput crown, in terms of interactive latency, execlists is much better. Alas, we have no convenient metrics for such, other than demonstrating things we can do with execlists but can not using legacy ringbuffer submission. We have made a few improvements to lowlevel execlists throughput, and ringbuffer currently panics on boot! (bdw i7-5557u, 20171026): ring execlists exec continuous nops on all rings: n/a 1.921us exec sequential nops on each ring: n/a 44.621us single nop + sync: n/a 21.953us vblank_mode=0 glxgears: n/a ~18500fps References: https://bugs.freedesktop.org/show_bug.cgi?id=87725 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Once-upon-a-time-Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-2-chris@chris-wilson.co.uk	2017-11-20 21:54:58 +00:00
Chris Wilson	fb5c551ad5	drm/i915: Remove i915.enable_execlists module parameter Execlists and legacy ringbuffer submission are no longer feature comparable (execlists now offer greater functionality that should overcome their performance hit) and obsoletes the unsafe module parameter, i.e. comparing the two modes of execution is no longer useful, so remove the debug tool. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #i915_perf.c Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-1-chris@chris-wilson.co.uk	2017-11-20 21:53:59 +00:00
Sagar Arun Kamble	c6dce8f140	drm/i915: Update execlists tasklet naming intel_lrc_irq_handler and i915_guc_irq_handler are HW submission related tasklet functions. Name them with "submission_tasklet" suffix and remove intel/i915 prefix as they are static. Also rename irq_tasklet as just tasklet for clarity. v2: s/_bh/_tasklet (Chris) Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-2-git-send-email-sagar.a.kamble@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-16 15:01:31 +00:00
Chris Wilson	70a84f3c60	drm/i915: Unconditionally apply the Broxton register workaround set Having removed the preproduction Broxton support (see commit `0102ba1fd8` ("drm/i915: Add early BXT sdv to the list of preproduction machines")), we know we then always need the production Broxton workaround set and do not need a predicate upon revision. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114134340.5439-2-chris@chris-wilson.co.uk	2017-11-14 15:16:31 +00:00
Chris Wilson	f3e2b2c540	drm/i915: Remove pre-production Broxton register workarounds We've begun excluding pre-production Broxton machines since commit `0102ba1fd8` ("drm/i915: Add early BXT sdv to the list of preproduction machines"), now remove the list of workaround register values for those early machines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170927093325.24206-1-chris@chris-wilson.co.uk Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114134340.5439-1-chris@chris-wilson.co.uk	2017-11-14 15:16:30 +00:00
Chris Wilson	34991bd48c	drm/i915: Unify SLICE_UNIT_LEVEL_CLKGATE w/a for cnl gem_workarounds reports that the SLICE_UNIT_LEVEL_CLKGATE write isn't sticking. Commit `0a60797a0e` ("drm/i915: Implement ReadHitWriteOnlyDisable.") presumes that SLICE_UNIT_LEVEL_CLKGATE is a masked register in the context image, but commit `90007bca61` ("drm/i915/cnl: Introduce initial Cannonlake Workarounds.") lists it as an ordering unmasked register. The masked write will be losing the default settings if we trust the original commit. That gem_workarounds reports the value is lost entirely is more worrying though -- but it clearly suggests that it is not a masked register in the context image, so unify both w/a to use the original rmw. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103705 Fixes: `0a60797a0e` ("drm/i915: Implement ReadHitWriteOnlyDisable.") References: `90007bca61` ("drm/i915/cnl: Introduce initial Cannonlake Workarounds.") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171111100336.11020-1-chris@chris-wilson.co.uk Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-11-14 15:16:18 +00:00
Michel Thierry	ce453b3e42	drm/i915: Clear per-engine fault register as early as possible From gen6, the hardware tracks address lookup failures and we should clear those registers upon startup to prevent false positives. However, this was happening before we have the engines defined (intel_uncore_init()) and the for_each_engine loop was just a nop. The earliest we can call this is inside intel_engines_init_mmio(). Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171111004448.12360-1-michel.thierry@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-13 19:03:13 +00:00
Chris Wilson	7c2fa7faf1	drm/i915: Stop caching the "golden" renderstate As we now record the default HW state and so only emit the "golden" renderstate once to prepare the HW, there is no advantage in keeping the renderstate batch around as it will never be used again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-8-chris@chris-wilson.co.uk	2017-11-10 17:23:22 +00:00
Chris Wilson	d2b4b97933	drm/i915: Record the default hw state after reset upon load Take a copy of the HW state after a reset upon module loading by executing a context switch from a blank context to the kernel context, thus saving the default hw state over the blank context image. We can then use the default hw state to initialise any future context, ensuring that each starts with the default view of hw state. v2: Unmap our default state from the GTT after stealing it from the context. This should stop us from accidentally overwriting it via the GTT (and frees up some precious GTT space). Testcase: igt/gem_ctx_isolation Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-7-chris@chris-wilson.co.uk	2017-11-10 17:23:10 +00:00
Chris Wilson	ae6c457478	drm/i915: Force the switch to the i915->kernel_context In the next few patches, we will have a hard requirement that we emit a context-switch to the perma-pinned i915->kernel_context (so that we can save the HW state using that context-switch). As the first context itself may be classed as a kernel context, we want to be explicit in our comparison. For an extra-layer of finesse, we can check the last unretired context on the engine; as well as the last retired context when idle. v2: verbose verbosity v3: Always force the switch, even when the engine is idle, and update the assert that this happens before suspend. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1 Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-2-chris@chris-wilson.co.uk	2017-11-10 17:20:25 +00:00
Tvrtko Ursulin	1803fcbca2	drm/i915: Define an engine class enum for the uABI We want to be able to report back to userspace details about an engine's class, and in return for userspace to be able to request actions regarding certain classes of engines. To isolate the uABI from any variations between hw generations, we define an abstract class for the engines and internally map onto the hw. v2: Remove MAX from the uABI; keep it internal if we need it, but don't let userspace make the mistake of using it themselves. v3: s/OTHER/INVALID/ The use of OTHER is ill-defined, so remove it from the uABI as any future new type of engine can define a class to suit it. But keep a reserved value for an invalid class, so that we can always unambiguously express when something doesn't belong to the classification. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v2 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-1-chris@chris-wilson.co.uk	2017-11-10 17:20:24 +00:00
Chris Wilson	30b29406d9	drm/i915: Restore the wait for idle engine after flushing interrupts So it appears that commit `5427f20785` ("drm/i915: Bump wait-times for the final CS interrupt before parking") was a little over optimistic in its belief that it had successfully waited for all residual activity on the engines before parking. Numerous sightings in CI since then of <7>[ 52.542886] [IGT] core_auth: executing <3>[ 52.561013] [drm:intel_engines_park [i915]] ERROR vcs0 is not idle before parking <7>[ 52.561215] intel_engines_park vcs0 <7>[ 52.561229] intel_engines_park current seqno 98, last 98, hangcheck 0 [-247449 ms], inflight 0 <7>[ 52.561238] intel_engines_park Reset count: 0 <7>[ 52.561266] intel_engines_park Requests: <7>[ 52.561363] intel_engines_park RING_START: 0x00000000 [0x00000000] <7>[ 52.561377] intel_engines_park RING_HEAD: 0x00000000 [0x00000000] <7>[ 52.561390] intel_engines_park RING_TAIL: 0x00000000 [0x00000000] <7>[ 52.561406] intel_engines_park RING_CTL: 0x00000000 <7>[ 52.561422] intel_engines_park RING_MODE: 0x00000200 [idle] <7>[ 52.561442] intel_engines_park ACTHD: 0x00000000_00000000 <7>[ 52.561459] intel_engines_park BBADDR: 0x00000000_00000000 <7>[ 52.561474] intel_engines_park Execlist status: 0x00000301 00000000 <7>[ 52.561489] intel_engines_park Execlist CSB read 5 [5 cached], write 5 [5 from hws], interrupt posted? no <7>[ 52.561500] intel_engines_park ELSP[0] idle <7>[ 52.561510] intel_engines_park ELSP[1] idle <7>[ 52.561519] intel_engines_park HW active? 0x0 <7>[ 52.561608] intel_engines_park Idle? yes <7>[ 52.561617] intel_engines_park on Braswell, which indicates that the engine just needs that little bit longer after flushing the tasklet to settle. So give it a few more milliseconds before declaring an err and applying the emergency brake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103479 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110112550.28909-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2017-11-10 12:55:57 +00:00
Rafael Antognolli	0a60797a0e	drm/i915: Implement ReadHitWriteOnlyDisable. The workaround for this is described as: "if RenderSurfaceState.Num_Multisamples > 1, disable RCC clock gating if RenderSurfaceState.Num_Multisamples == 1, set 0x7010[14] = 1" Further documentation in the internal bug referenced by the bspec suggest that any of the above suggestions should suffice to fix the issue. We are going with disabling RCC clock gating. Unfortunately, what we are doing doesn't match the name of the workaround, but at least it matches its description. This change improves CNL stability by avoiding some of the hangs seen in the platform. v2: Only disable RCC clock gating. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171103183027.5051-1-rafael.antognolli@intel.com	2017-11-08 12:43:17 -08:00
Chris Wilson	c400cc2a1a	drm/i915: Include intel_engine_is_idle() status in engine pretty-printer Upon parking, if we discover an active engine we dump its state. Follow that state with an indication of whether the engine was idle. References: https://bugs.freedesktop.org/show_bug.cgi?id=103479 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171107152211.19930-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-11-08 15:40:30 +00:00
Chris Wilson	820c5bbbf4	drm/i915: Flush the irq and tasklets before asserting engine is idle Before we assert that the engine is idle, make sure we flush any residual tasklet. After that point, if the engine is not idle, more work may be queued despite us trying to park the engine and go to sleep. References: https://bugs.freedesktop.org/show_bug.cgi?id=103479 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171101202149.32493-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2017-11-02 11:24:59 +00:00
Chris Wilson	3265124a2d	drm/i915: Give more details for the active-when-parking warning for the engines If the we think the engine is still active when we attempt to park it, we want more details -- so dump the engine state. References: https://bugs.freedesktop.org/show_bug.cgi?id=103479 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171027110617.31745-4-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2017-11-01 15:09:39 +00:00
Chris Wilson	680273879d	drm/i915: Move parking-while-active warning to intel_engines_park() We will want to break this down to give detailed per-engine warnings as to why we still think we are active as we attempt to park the engines. For the first step, just move the warning verbatim from the idle-worker to intel_engines_park(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171027110617.31745-3-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2017-11-01 15:09:12 +00:00
Chris Wilson	3c75de5b98	drm/i915: Include RING_MODE when dumping the engine state Knowing the RING_MODE flags is useful for checking the state of the engine, such as whether the CS is idle after trying to stop the engines before reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171026115048.20144-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2017-10-26 21:35:21 +01:00
Michał Winiarski	a4598d1755	drm/i915: Rename helpers used for unwinding, use macro for can_preempt We would also like to make use of execlist_cancel_port_requests and unwind_incomplete_requests in GuC preemption backend. Let's rename the functions to use the correct prefixes, so that we can simply add the declarations in the following patch. Similar thing for applies for can_preempt, except we're introducing HAS_LOGICAL_RING_PREEMPTION macro instad, converting other users that were previously touching device info directly. v2: s/intel_engine/execlists and pass execlists to unwind (Chris) v3: use locked version for exporting, drop const qual (Chris) Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-11-michal.winiarski@intel.com	2017-10-26 21:35:21 +01:00
Chris Wilson	aba5e27858	drm/i915: Add a hook for making the engines idle (parking) and unparking In the next patch, we will want to install a callback when the engines (GT as a whole) become idle and similarly when they first become busy. To enable that callback, first rename intel_engines_mark_idle() to intel_engines_park() and provide the companion intel_engines_unpark(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171025143943.7661-2-chris@chris-wilson.co.uk	2017-10-25 18:47:30 +01:00
Chris Wilson	20ccd4d3f6	drm/i915: Use same test for eviction and submitting kernel context During evict, we wish to idle the GPU if we see that the GGTT is full. However, our test for idle in i915_gem_evict_something() and in i915_gem_switch_to_kernel_context() do not match leading to disappointment - we never believe that we are idle and keep trying to flush the GGTT ad infinitum. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103438 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171024220855.30155-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-10-25 12:13:03 +01:00
Chris Wilson	4a118ecbe9	drm/i915: Filter out spurious execlists context-switch interrupts Back in commit `a4b2b01523` ("drm/i915: Don't mark an execlists context-switch when idle") we noticed the presence of late context-switch interrupts. We were able to filter those out by looking at whether the ELSP remained active, but in commit `beecec9017` ("drm/i915/execlists: Preemption!") that became problematic as we now anticipate receiving a context-switch event for preemption while ELSP may be empty. To restore the spurious interrupt suppression, add a counter for the expected number of pending context-switches and skip if we do not need to handle this interrupt to make forward progress. v2: Don't forget to switch on for preempt. v3: Reduce the counter to a on/off boolean tracker. Declare the HW as active when we first submit, and idle after the final completion event (with which we confirm the HW says it is idle), and track each source of activity separately. With a finite number of sources, it should aide us in debugging which gets stuck. Fixes: `beecec9017` ("drm/i915/execlists: Preemption!") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171023213237.26536-3-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2017-10-24 15:54:37 +01:00

1 2 3 4

161 Commits