OpenCloudOS-Kernel

Author	SHA1	Message	Date
Tvrtko Ursulin	39dabecd99	drm/i915: Use shorter route to dev_private where possible Where we have a request we can use req->i915 directly instead of going through the engine and device. Coccinelle script: @@ function f; identifier r; @@ f(..., struct drm_i915_gem_request *r, ...) { ... - engine->dev->dev_private + r->i915 ... } @@ struct drm_i915_gem_request *req; @@ ( req-> - engine->dev->dev_private + i915 ) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1458219850-21007-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-03-18 09:50:37 +00:00
Mika Kuoppala	ee4b6faf96	drm/i915: Modify reset func to handle per engine resets In full gpu reset we prime all engines and reset domains corresponding to each engine. Per engine reset is just a special case of this process wherein only a single engine is reset. This change is aimed to modify relevant functions to achieve this. There are some other steps we carry out in case of engine reset which are addressed in later patches. Reset func now accepts a mask of all engines that need to be reset. Where per engine resets are supported, error handler populates the mask accordingly otherwise all engines are specified. v2: ALL_ENGINES mask fixup, better for_each_ring_masked (Chris) v3: Whitespace fixes (Chris) v4: Rebase due to s/ring/engine Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1458143640-20563-1-git-send-email-mika.kuoppala@intel.com	2016-03-17 15:01:15 +02:00
Tvrtko Ursulin	666796da7a	drm/i915: More intel_engine_cs renaming Some trivial ones, first pass done with Coccinelle: @@ @@ ( - I915_NUM_RINGS + I915_NUM_ENGINES | - intel_ring_flag + intel_engine_flag | - for_each_ring + for_each_engine | - i915_gem_request_get_ring + i915_gem_request_get_engine | - intel_ring_idle + intel_engine_idle | - i915_gem_reset_ring_status + i915_gem_reset_engine_status | - i915_gem_reset_ring_cleanup + i915_gem_reset_engine_cleanup | - init_ring_lists + init_engine_lists ) But that didn't fully work so I cleaned it up with: for f in *.[hc]; do sed -i -e s/I915_NUM_RINGS/I915_NUM_ENGINES/ $f; done for f in *.[hc]; do sed -i -e s/i915_gem_request_get_ring/i915_gem_request_get_engine/ $f; done for f in *.[hc]; do sed -i -e s/intel_ring_flag/intel_engine_flag/ $f; done for f in *.[hc]; do sed -i -e s/intel_ring_idle/intel_engine_idle/ $f; done for f in *.[hc]; do sed -i -e s/init_ring_lists/init_engine_lists/ $f; done for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_cleanup/i915_gem_reset_engine_cleanup/ $f; done for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_status/i915_gem_reset_engine_status/ $f; done v2: Rebase. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:24 +00:00
Tvrtko Ursulin	4a570db57c	drm/i915: Rename intel_engine_cs struct members below and a couple manual fixups. @@ identifier I, J; @@ struct I { ... - struct intel_engine_cs J; + struct intel_engine_cs engine; ... } @@ identifier I, J; @@ struct I { ... - struct intel_engine_cs J; + struct intel_engine_cs engine; ... } @@ struct drm_i915_private *d; @@ ( - d->ring + d->engine ) @@ struct i915_execbuffer_params *p; @@ ( - p->ring + p->engine ) @@ struct intel_ringbuffer *r; @@ ( - r->ring + r->engine ) @@ struct drm_i915_gem_request *req; @@ ( - req->ring + req->engine ) v2: Script missed the tracepoint code - fixed up by hand. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:17 +00:00
Tvrtko Ursulin	0bc40be85f	drm/i915: Rename intel_engine_cs function parameters @@ identifier func; @@ func(..., struct intel_engine_cs * - ring + engine , ...) { <... - ring + engine ...> } @@ identifier func; type T; @@ T func(..., struct intel_engine_cs * - ring + engine , ...); Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:10 +00:00
Tvrtko Ursulin	e2f8039147	drm/i915: Rename local struct intel_engine_cs variables Done by the Coccinelle script below plus a manual intervention to GEN8_RING_SEMAPHORE_INIT. @@ expression E; @@ - struct intel_engine_cs *ring = E; + struct intel_engine_cs *engine = E; <+... - ring + engine ...+> @@ @@ - struct intel_engine_cs *ring; + struct intel_engine_cs *engine; <+... - ring + engine ...+> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:00 +00:00
Chris Wilson	1c7f4bca5a	drm/i915: Rename vma->_list to _link for consistency Elsewhere we have adopted the convention of using '_link' to denote elements in the list (and '_list' for the actual list_head itself), and that the name should indicate which list the link belongs to (and preferably not just where the link is being stored). s/vma_link/obj_link/ (we iterate over obj->vma_list) s/mm_list/vm_link/ (we iterate over vm->[in]active_list) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2016-02-26 13:15:39 +00:00
Chris Wilson	b31e51360e	drm/i915: Reject invalid-pad for context-destroy and -create ioctls Unknown parameters, especially structure padding, are expected to invoke rejection with -EINVAL. v2: similar issue exists for context-create Testcase: igt/gem_ctx_create/invalid-pad Testcase: igt/gem_ctx_bad_destroy/invalid-pad Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89602 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93999 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1454690759-31201-1-git-send-email-chris@chris-wilson.co.uk	2016-02-15 18:26:52 +01:00
Tvrtko Ursulin	f4e2deceb6	drm/i915: Fix premature LRC unpin in GuC mode In GuC mode LRC pinning lifetime depends exclusively on the request lifetime. Since that is terminated by the seqno update that opens up a race condition between GPU finishing writing out the context image and the driver unpinning the LRC. To extend the LRC lifetime we will employ a similar approach to what legacy ringbuffer submission does. We will start tracking the last submitted context per engine and keep it pinned until it is replaced by another one. Note that the driver unload path is a bit fragile and could benefit greatly from efforts to unify the legacy and exec list submission code paths. At the moment i915_gem_context_fini has special casing for the two which are potentially not needed, and also depends on i915_gem_cleanup_ringbuffer running before itself. v2: * Move pinning into engine->emit_request and actually fix the reference/unreference logic. (Chris Wilson) * ring->dev can be NULL on driver unload so use a different route towards it. v3: * Rebase. * Handle the reset path. (Chris Wilson) * Exclude default context from the pinning - it is impossible to get it right before default context special casing in general is eliminated. v4: * Rebased & moved context tracking to intel_logical_ring_advance_and_submit. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Issue: VIZ-4277 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Nick Hoath <nicholas.hoath@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1453976997-25424-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-01-28 17:23:15 +00:00
Tvrtko Ursulin	a0b4a6a8db	drm/i915: Extract context unpinning to its own function Will enable cleaner implementation of a following fix and easier code unification in the future. Idea and code by Chris Wilson. v2: Do not return before last_contexts on engines are unpinned. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk>	2016-01-28 17:23:15 +00:00
Dave Gordon	ed54c1a1d1	drm/i915: abolish separate per-ring default_context pointers Now that we've eliminated a lot of uses of ring->default_context, we can eliminate the pointer itself. All the engines share the same default intel_context, so we can just keep a single reference to it in the dev_priv structure rather than one in each of the engine[] elements. This makes refcounting more sensible too, as we now have a refcount of one for the one pointer, rather than a refcount of one but multiple pointers. From an idea by Chris Wilson. v2: transform an extra instance of ring->default_context introduced by `42f1cae8c` drm/i915: Restore inhibiting the load of the default context That patch's commentary includes: v2: Mark the global default context as uninitialized on GPU reset so that the context-local workarounds are reloaded upon re-enabling The code implementing that now also benefits from the replacement of the multiple (per-ring) pointers to the default context with a single pointer to the unique kernel context. v4: Rebased, remove underused local (Nick Hoath) Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1453230175-19330-3-git-send-email-david.s.gordon@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-01-21 09:21:29 +01:00
Chris Wilson	42f1cae8c0	drm/i915: Restore inhibiting the load of the default context Following a GPU reset, we may leave the context in a poorly defined state, and reloading from that context will leave the GPU flummoxed. For secondary contexts, this will lead to that context being banned - but currently it is also causing the default context to become banned, leading to turmoil in the shared state. This is a regression from commit `6702cf16e0` [v4.1] Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Mon Mar 16 16:00:58 2015 +0000 drm/i915: Initialize all contexts which quietly introduced the removal of the MI_RESTORE_INHIBIT on the default context. v2: Mark the global default context as uninitialized on GPU reset so that the context-local workarounds are reloaded upon re-enabling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1448630935-27377-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Cc: stable@vger.kernel.org [danvet: This seems to fix a gpu hang after the first resume, resulting in any future suspend operation failing with -EIO because the gpu seems to be in a funky state. Somehow this patch fixes that.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-01-06 11:04:53 +01:00
Wayne Boyer	4d3e904ceb	drm/i915: Only set gem object L3 cache level for IVB devices Do some further clean up based on the initial review of drm/i915: Separate cherryview from valleyview. In this case, in i915_gem_alloc_context_obj() only call i915_gem_object_set_cache_level() for Ivy Bridge devices since later platforms don't have L3 control bits in the PTE. v2: Expand comment to mention snooping requirement. (Ville, Imre) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Wayne Boyer <wayne.boyer@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1449596332-23470-1-git-send-email-wayne.boyer@intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2015-12-10 11:07:30 +01:00
Wayne Boyer	666a45379e	drm/i915: Separate cherryview from valleyview The cherryview device shares many characteristics with the valleyview device. When support was added to the driver for cherryview, the corresponding device info structure included .is_valleyview = 1. This is not correct and leads to some confusion. This patch changes .is_valleyview to .is_cherryview in the cherryview device info structure and simplifies the IS_CHERRYVIEW macro. Then where appropriate, instances of IS_VALLEYVIEW are replaced with IS_VALLEYVIEW || IS_CHERRYVIEW or equivalent. v2: Use IS_VALLEYVIEW || IS_CHERRYVIEW instead of defining a new macro. Also add followup patches to fix issues discovered during the first review. (Ville) v3: Fix some style issues and one gen check. Remove CRT related changes as CRT is not supported on CHV. (Imre, Ville) v4: Make a few more optimizations. (Ville) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Wayne Boyer <wayne.boyer@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1449692975-14803-1-git-send-email-wayne.boyer@intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com>	2015-12-10 11:07:24 +01:00
Tvrtko Ursulin	408952d43b	drm/i915: Remove incorrect warning in context cleanup Commit `e9f24d5fb7` Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Date: Mon Oct 5 13:26:36 2015 +0100 drm/i915: Clean up associated VMAs on context destruction Added a warning based on an incorrect assumption that all VMAs in a VM will be on the inactive list at the point last reference to a context and VM is dropped. This is not true because i915_gem_object_retire__read will not put VMA on the inactive list until all activities on the object in question (in all VMs) have been retired. As a consequence, whether or not a context/VM will be destroyed with its VMAs still on the active list, can depend on completely unrelated activities using the same object from a different context or engine. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92638 Testcase: igt/gem_request_retire/retire-vma-not-inactive Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1448025816-25584-1-git-send-email-tvrtko.ursulin@linux.intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-11-24 11:58:12 +01:00
Ville Syrjälä	f92a916220	drm/i915: Add functions to emit register offsets to the ring When register type safety happens, we can't just try to emit the register itself to the ring. Instead we'll need to extract the offset from it first. Add some convenience functions that will do that. v2: Convert MOCS setup too Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1446672017-24497-20-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-11-18 14:35:24 +02:00
Chris Wilson	fa8848f278	drm/i915: Report context GTT size Since the beginning we have conflated the size of the global GTT with that of the per-process context sizes. In recent times (gen8+), those are no longer the same where the global GTT is limited to 2/4GiB but the per-process GTT may be anything up to 256TiB. Userspace knows nothing of this discrepancy and outside of one or two hacks, uses the getaperture ioctl to determine the maximum size it can use. Let's leave that as reporting the global GTT and use the context reporting method to describe the per-process value (which naturally falls back to reporting the aliasing or global on older platforms, so userspace can always use this method where available). Testcase: igt/gem_userptr_blits/minor-normal-sync Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90065 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-10-19 12:16:46 +02:00
Tvrtko Ursulin	61fb588151	drm/i915: Remove wrong warning from i915_gem_context_clean commit `e9f24d5fb7` Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Date: Mon Oct 5 13:26:36 2015 +0100 drm/i915: Clean up associated VMAs on context destruction Introduced a wrong assumption that all contexts have a ppgtt instance. This is not true when full PPGTT is not active so remove the WARN_ON_ONCE from the context cleanup code. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-10-09 10:19:48 +02:00
Tvrtko Ursulin	e9f24d5fb7	drm/i915: Clean up associated VMAs on context destruction Prevent leaking VMAs and PPGTT VMs when objects are imported via flink. Scenario is that any VMAs created by the importer will be left dangling after the importer exits, or destroys the PPGTT context with which they are associated. This is caused by object destruction not running when the importer closes the buffer object handle due to the reference held by the exporter. This also leaks the VM since the VMA has a reference on it. In practice these leaks can be observed by stopping and starting the X server on a kernel with fbcon compiled in. Every time X server exits another VMA will be leaked against the fbcon's frame buffer object. Also on systems where flink buffer sharing is used extensively, like Android, this leak has even more serious consequences. This version takes a general approach from the earlier work by Rafael Barbalho (drm/i915: Clean-up PPGTT on context destruction) and tries to incorporate the subsequent discussion between Chris Wilson and Daniel Vetter. v2: Removed immediate cleanup on object retire - it was causing a recursive VMA unbind via i915_gem_object_wait_rendering. And it is in fact not even needed since by definition context cleanup worker runs only after the last context reference has been dropped, hence all VMAs against the VM belonging to the context are already on the inactive list. v3: Previous version could deadlock since VMA unbind waits on any rendering on an object to complete. Objects can be busy in a different VM which would mean that the cleanup loop would do the wait with the struct mutex held. This is an even simpler approach where we just unbind VMAs without waiting since we know all VMAs belonging to this VM are idle, and there is nothing in flight, at the point context destructor runs. v4: Double underscore prefix for __915_vma_unbind_no_wait and a commit message typo fix. (Michel Thierry) Note that this is just a partial/interim fix since we have a bit of a fundamental issue with cleaning up, e.g. https://bugs.freedesktop.org/show_bug.cgi?id=87729 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Testcase: igt/gem_ppgtt.c/flink-and-exit-vma-leak Reviewed-by: Michel Thierry <michel.thierry@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rafael Barbalho <rafael.barbalho@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> [danvet: Add a note that this isn't everything.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-10-06 14:13:26 +02:00
Zhiyuan Lv	a0bd6c3183	drm/i915: Always enable execlists on BDW for vgpu Broadwell hardware supports both ring buffer mode and execlist mode. When i915 runs inside a VM with Intel GVT-g, we allow execlist mode only. The main reason of EXECLIST only is that GVT-g does not support the dynamic mode switch between ring buffer mode and execlist mode when running multiple virtual machines. v2: - Adjust the position of vgpu check in sanitize function (Joonas) - Add vgpu error check in context initialization. (Joonas, Daniel) Signed-off-by: Zhiyuan Lv <zhiyuan.lv@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-09-02 11:45:50 +02:00
Chris Wilson	37876df61f	drm/i915: Remove the failed context from the fpriv->context_idr If we encounter an allocation failure during ppgtt creation (trivial even with 16GiB+ RAM!), we need to remove the dead context from the fpriv->context_idr along with the references. gem_exec_ctx: page allocation failure: order:0, mode:0x8004 CPU: 3 PID: 27272 Comm: gem_exec_ctx Tainted: G W 4.2.0-rc5+ #37 0000000000000000 ffff880086ff7a78 ffffffff816b947a ffff88041ed90038 0000000000008004 ffff880086ff7b08 ffffffff8114b1a5 ffff880086ff7ac8 ffffffff8108d848 0000000000000000 ffffffff81ce84b8 0000000000000000 Call Trace: [<ffffffff816b947a>] dump_stack+0x45/0x57 [<ffffffff8114b1a5>] warn_alloc_failed+0xd5/0x120 [<ffffffff8108d848>] ? __wake_up+0x48/0x60 [<ffffffff8114e0ed>] __alloc_pages_nodemask+0x73d/0x8e0 [<ffffffffc0472238>] ? i915_gem_execbuffer2+0x148/0x240 [i915] [<ffffffffc0474240>] __setup_page_dma+0x30/0x110 [i915] [<ffffffffc0477f61>] gen8_ppgtt_init+0x31/0x2f0 [i915] [<ffffffffc04785e0>] i915_ppgtt_init+0x30/0x80 [i915] [<ffffffffc0478928>] i915_ppgtt_create+0x48/0xc0 [i915] [<ffffffffc046c9c2>] i915_gem_create_context+0x1c2/0x390 [i915] [<ffffffffc046d9cb>] i915_gem_context_create_ioctl+0x5b/0xa0 [i915] leading to an oops in i915_gem_context_close. Also note that this benchmark should not be running out of memory in the first place... Testcase: igt/benchmark/gem_exec_ctx -b create # ppgtt >= 2 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 17:50:41 +02:00
Daniel Vetter	ca6e440577	Merge tag 'drm-intel-fixes-2015-07-15' into drm-intel-next-queued Backmerge fixes since it's getting out of hand again with the massive split due to atomic between -next and 4.2-rc. All the bugfixes in 4.2-rc are addressed already (by converting more towards atomic instead of minimal duct-tape) so just always pick the version in next for the conflicts in modeset code. All the other conflicts are just adjacent lines changed. Conflicts: drivers/gpu/drm/i915/i915_drv.h drivers/gpu/drm/i915/i915_gem_gtt.c drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_drv.h drivers/gpu/drm/i915/intel_ringbuffer.h Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2015-07-15 16:36:50 +02:00
Chris Wilson	9ea4feecc3	drm/i915: Store device pointer in contexts for late tracepoint usage [ 1572.417121] BUG: unable to handle kernel NULL pointer dereference at (null) [ 1572.421010] IP: [<ffffffffa00b2514>] ftrace_raw_event_i915_context+0x5d/0x70 [i915] [ 1572.424970] PGD 1766a3067 PUD 1767a2067 PMD 0 [ 1572.428892] Oops: 0000 [#1] SMP [ 1572.432787] Modules linked in: ipv6 dm_mod iTCO_wdt iTCO_vendor_support snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore serio_raw pcspkr lpc_ich i2c_i801 mfd_core battery ac acpi_cpufreq i915 button video drm_kms_helper drm [ 1572.441720] CPU: 2 PID: 18853 Comm: kworker/u8:0 Not tainted 4.0.0_kcloud_3f0360_20150429+ #588 [ 1572.446298] Workqueue: i915 i915_gem_retire_work_handler [i915] [ 1572.450876] task: ffff880002f428f0 ti: ffff880035724000 task.ti: ffff880035724000 [ 1572.455557] RIP: 0010:[<ffffffffa00b2514>] [<ffffffffa00b2514>] ftrace_raw_event_i915_context+0x5d/0x70 [i915] [ 1572.460423] RSP: 0018:ffff880035727ce8 EFLAGS: 00010286 [ 1572.465262] RAX: ffff880073f1643c RBX: ffff880002da9058 RCX: ffff880073e5db40 [ 1572.470179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880035727ce8 [ 1572.475107] RBP: ffff88007bb11a00 R08: 0000000000000000 R09: 0000000000000000 [ 1572.480034] R10: 0000000000362200 R11: 0000000000000008 R12: 0000000000000000 [ 1572.484952] R13: ffff880035727d78 R14: ffff880002dc1c98 R15: ffff880002dc1dc8 [ 1572.489886] FS: 0000000000000000(0000) GS:ffff88017fd00000(0000) knlGS:0000000000000000 [ 1572.494883] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1572.499859] CR2: 0000000000000000 CR3: 000000017572a000 CR4: 00000000001006e0 [ 1572.504842] Stack: [ 1572.509834] ffff88017b0090c0 ffff880073f16438 ffff880002da9058 ffff880073f1643c [ 1572.514904] 0000000000000246 ffff880100000000 ffff88007bb11a00 ffff880002ddeb10 [ 1572.519985] ffff8801759f79c0 ffffffffa0092ff0 0000000000000000 
ffff88007bb11a00 [ 1572.525049] Call Trace: [ 1572.530093] [<ffffffffa0092ff0>] ? i915_gem_context_free+0xa8/0xc1 [i915] [ 1572.535227] [<ffffffffa009b969>] ? i915_gem_request_free+0x4e/0x50 [i915] [ 1572.540347] [<ffffffffa00b5533>] ? intel_execlists_retire_requests+0x14c/0x159 [i915] [ 1572.545500] [<ffffffffa009d9ea>] ? i915_gem_retire_requests+0x9d/0xeb [i915] [ 1572.550664] [<ffffffffa009dd8c>] ? i915_gem_retire_work_handler+0x4c/0x61 [i915] [ 1572.555825] [<ffffffff8104ca7f>] ? process_one_work+0x1b2/0x31d [ 1572.560951] [<ffffffff8104d278>] ? worker_thread+0x24d/0x339 [ 1572.566033] [<ffffffff8104d02b>] ? cancel_delayed_work_sync+0xa/0xa [ 1572.571140] [<ffffffff81050b25>] ? kthread+0xce/0xd6 [ 1572.576191] [<ffffffff81050a57>] ? kthread_create_on_node+0x162/0x162 [ 1572.581228] [<ffffffff8179b3c8>] ? ret_from_fork+0x58/0x90 [ 1572.586259] [<ffffffff81050a57>] ? kthread_create_on_node+0x162/0x162 [ 1572.591318] Code: de 48 89 e7 e8 09 4d 00 e1 48 85 c0 74 27 48 89 68 10 48 8b 55 38 48 89 e7 48 89 50 18 48 8b 55 10 48 8b 12 48 8b 12 48 8b 52 38 <8b> 12 89 50 08 e8 95 4d 00 e1 48 83 c4 30 5b 5d 41 5c c3 41 55 [ 1572.596981] RIP [<ffffffffa00b2514>] ftrace_raw_event_i915_context+0x5d/0x70 [i915] [ 1572.602464] RSP <ffff880035727ce8> [ 1572.607911] CR2: 0000000000000000 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90112#c23 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-07-13 22:42:38 +02:00
Ville Syrjälä	52613921b3	Revert "drm/i915: Allocate context objects from stolen" Stolen gets trashed during hibernation, so storing contexts there is not a very good idea. On my IVB machines this leads to a totally dead GPU on resume. A reboot is required to resurrect it. So let's not store contexts where they will get trampled. This reverts commit `149c86e74f`. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-07-09 09:40:16 +02:00
Abdiel Janulgue	4c436d55b2	drm/i915: Enable Resource Streamer state save/restore on MI_SET_CONTEXT Also clarify comments on context size that the extra state for Resource Streamer is included. v2: Don't remove the extended save/restore enabled for older platforms. (Ville) Use new MI_SET_CONTEXT defines for HSW RS save/restore state instead of extended save/restore. (Daniel) Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-07-06 10:26:05 +02:00
John Harrison	5fb9de1a2e	drm/i915: Update intel_ring_begin() to take a request structure Now that everything above has been converted to use requests, intel_ring_begin() can be updated to take a request instead of a ring. This also means that it no longer needs to lazily allocate a request if no-one happens to have done it earlier. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:29 +02:00
John Harrison	a84c3ae168	drm/i915: Update ring->flush() to take a requests structure Updated the various ring->flush() functions to take a request instead of a ring. Also updated the tracer to include the request id. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> [danvet: Rebase since I didn't merge the addition of req->uniq.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:21 +02:00
John Harrison	e85b26dc1c	drm/i915: Update switch_mm() to take a request structure Updated the switch_mm() code paths to take a request instead of a ring. This includes the myriad *_mm_switch functions themselves and a bunch of PDP related helper functions. v2: Rebased to newer tree. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:21 +02:00
John Harrison	1d719cda8b	drm/i915: Update mi_set_context() to take a request structure Updated mi_set_context() to take a request structure instead of a ring and context pair. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:18 +02:00
John Harrison	6909a66646	drm/i915: Update l3_remap to take a request structure Converted i915_gem_l3_remap() to take a request structure instead of a ring. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:17 +02:00
John Harrison	b2af037693	drm/i915: Update [vma|object]_move_to_active() to take request structures Now that everything above has been converted to use request structures, it is possible to update the lower level move_to_active() functions to be request based as well. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:16 +02:00
John Harrison	8753181e10	drm/i915: Update init_context() to take a request structure Now that everything above has been converted to use requests, it is possible to update init_context() to take a request pointer instead of a ring/context pair. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:12 +02:00
John Harrison	abd68d9ed3	drm/i915: Update do_switch() to take a request structure Updated do_switch() to take a request pointer instead of a ring/context pair. v2: Removed some overzealous req-> dereferencing. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:10 +02:00
John Harrison	ba01cc9346	drm/i915: Update i915_switch_context() to take a request structure Now that the request is guaranteed to specify the context, it is possible to update the context switch code to use requests rather than ring and context pairs. This patch updates i915_switch_context() accordingly. Also removed the warning that the request's context must match the last context switch's context. As the context switch now gets the context object from the request structure, there is no longer any scope for the two to become out of step. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:09 +02:00
John Harrison	b3dd6b9681	drm/i915: Update ppgtt_init_ring() & context_enable() to take requests The final step in removing the OLR from i915_gem_init_hw() is to pass the newly allocated request structure in to each step rather than passing a ring structure. This patch updates both i915_ppgtt_init_ring() and i915_gem_context_enable() to take request pointers. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:09 +02:00
John Harrison	90638cc1a4	drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable() The start of day context initialisation code in i915_gem_context_enable() loops over each ring and calls the legacy switch context or the execlist init context code as appropriate. This patch moves the ring looping out of that function in to the top level caller i915_gem_init_hw(). This means that a single pass can be made over all rings doing the PPGTT, L3 remap and context initialisation of each ring altogether. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:06 +02:00
David Weinehall	b1b38278e1	drm/i915: add a context parameter to {en, dis}able zero address mapping Export a new context parameter that can be set/queried through the context_{get,set}param ioctls. This parameter is passed as a context flag and decides whether or not a GPU address mapping is allowed to be made at address zero. The default is to allow such mappings. Signed-off-by: David Weinehall <david.weinehall@intel.com> Acked-by: "Zou, Nanhai" <nanhai.zou@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-05-29 10:15:19 +02:00
Chris Wilson	b47161858b	drm/i915: Implement inter-engine read-read optimisations Currently, we only track the last request globally across all engines. This prevents us from issuing concurrent read requests on e.g. the RCS and BCS engines (or more likely the render and media engines). Without semaphores, we incur costly stalls as we synchronise between rings - greatly impacting the current performance of Broadwell versus Haswell in certain workloads (like video decode). With the introduction of reference counted requests, it is much easier to track the last request per ring, as well as the last global write request so that we can optimise inter-engine read read requests (as well as better optimise certain CPU waits). v2: Fix inverted readonly condition for nonblocking waits. v3: Handle non-continguous engine array after waits v4: Rebase, tidy, rewrite ring list debugging v5: Use obj->active as a bitfield, it looks cool v6: Micro-optimise, mostly involving moving code around v7: Fix retire-requests-upto for execlists (and multiple rq->ringbuf) v8: Rebase v9: Refactor i915_gem_object_sync() to allow the compiler to better optimise it. 
Benchmark: igt/gem_read_read_speed hsw:gt3e (with semaphores): Before: Time to read-read 1024k: 275.794µs After: Time to read-read 1024k: 123.260µs hsw:gt3e (w/o semaphores): Before: Time to read-read 1024k: 230.433µs After: Time to read-read 1024k: 124.593µs bdw-u (w/o semaphores): Before After Time to read-read 1x1: 26.274µs 10.350µs Time to read-read 128x128: 40.097µs 21.366µs Time to read-read 256x256: 77.087µs 42.608µs Time to read-read 512x512: 281.999µs 181.155µs Time to read-read 1024x1024: 1196.141µs 1118.223µs Time to read-read 2048x2048: 5639.072µs 5225.837µs Time to read-read 4096x4096: 22401.662µs 21137.067µs Time to read-read 8192x8192: 89617.735µs 85637.681µs Testcase: igt/gem_concurrent_blit (read-read and friends) Cc: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v8] [danvet: s/\<rq\>/req/g] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-05-21 15:11:42 +02:00
Daniel Vetter	9258811c96	drm/i915: Don't use atomics for pg_dirty_rings It's already protected by the bkl^Wdev->struct_mutex. While at it realign some related code. Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2015-04-23 21:06:02 +02:00
Daniel Vetter	71b7e54f71	drm/i915: Don't look at pg_dirty_rings for aliasing ppgtt We load the ppgtt ptes once per gpu reset/driver load/resume and that's all that's needed. Note that this only blows up when we're using the allocate_va_range funcs and not the special-purpose ones used. With this change we can get rid of that duplication. Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2015-04-23 21:06:02 +02:00
Daniel Vetter	070c1d059f	drm/i915: Drop redundant GGTT rebinding Since commit `bf3d149b25` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Fri Feb 14 14:01:12 2014 +0100 drm/i915: split PIN_GLOBAL out from PIN_MAPPABLE i915_gem_obj_ggtt_pin always binds into the ggtt, but I've forgotten to remove the now redundant additional bind call later on. Fix this up. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2015-04-20 09:00:11 -07:00
Chris Wilson	149c86e74f	drm/i915: Allocate context objects from stolen As we never expose context objects directly to userspace, we can forgo allocating a first-class GEM object for them and prefer to use the limited resource of reserved/stolen memory for them. Note this means that their initial contents are undefined. However, a downside of using stolen objects for execlists is that we cannot access the physical address directly (thanks MCH!) which prevents their use. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 10:41:24 +02:00
Ben Widawsky	6702cf16e0	drm/i915: Initialize all contexts The problem is we're going to switch to a new context, which could be the default context. The plan was to use restore inhibit, which would be fine, except if we are using dynamic page tables (which we will). If we use dynamic page tables and we don't load new page tables, the previous page tables might go away, and future operations will fault. CTXA runs. switch to default, restore inhibit CTXA dies and has its address space taken away. Run CTXB, tries to save using the context A's address space - this fails. The general solution is to make sure every context has its own state, and its own address space. For cases when we must restore inhibit, first thing we do is load a valid address space. I thought this would be enough, but apparently there are references within the context itself which will refer to the old address space - therefore, we also must reinitialize. v2: to->ppgtt is only valid in full ppgtt. v3: Rebased. v4: Make post PDP update clearer. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-20 11:48:19 +01:00
Ben Widawsky	563222a745	drm/i915: Track page table reload need This patch was formerly known as, "Force pd restore when PDEs change, gen6-7." I had to change the name because it is needed for GEN8 too. The real issue this is trying to solve is when a new object is mapped into the current address space. The GPU does not snoop the new mapping so we must do the gen specific action to reload the page tables. GEN8 and GEN7 do differ in the way they load page tables for the RCS. GEN8 does so with the context restore, while GEN7 requires the proper load commands in the command streamer. Non-render is similar for both. Caveat for GEN7 The docs say you cannot change the PDEs of a currently running context. We never map new PDEs of a running context, and expect them to be present - so I think this is okay. (We can unmap, but this should also be okay since we only unmap unreferenced objects that the GPU shouldn't be tryingto va->pa xlate.) The MI_SET_CONTEXT command does have a flag to signal that even if the context is the same, force a reload. It's unclear exactly what this does, but I have a hunch it's the right thing to do. The logic assumes that we always emit a context switch after mapping new PDEs, and before we submit a batch. This is the case today, and has been the case since the inception of hardware contexts. A note in the comment let's the user know. It's not just for gen8. If the current context has mappings change, we need a context reload to switch v2: Rebased after ppgtt clean up patches. Split the warning for aliasing and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt is always null. v3: Invalidate PPGTT TLBs inside alloc_va_range. v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when neither ctx->ppgtt and aliasing_ppgtt exist. v5: Removed references to teardown_va_range. v6: Updated needs_pd_load_pre/post. 
v7: Fix pd_dirty_rings check in needs_pd_load_post, and update/move comment about updated PDEs to object_pin/bind (Mika). Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-20 11:48:18 +01:00
Ben Widawsky	317b4e9036	drm/i915: Extract context switch skip and add pd load logic In Gen8, PDPs are saved and restored with legacy contexts (legacy contexts only exist on the render ring). So change the ordering of LRI vs MI_SET_CONTEXT for the initialization of the context. Also the only cases in which we need to manually update the PDPs are when MI_RESTORE_INHIBIT has been set in MI_SET_CONTEXT (i.e. when the context is not yet initialized or it is the default context). Legacy submission is not available post GEN8, so it isn't necessary to add extra checks for newer generations. v2: Use new functions to replace the logic right away (Daniel) v3: Add missing pd load logic. v4: Add warning in case pd_load_pre & pd_load_post are true, and add missing trace_switch_mm. Cleaned up pd_load conditions. Add more information about when is pd_load_post needed. (Mika) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-20 11:48:17 +01:00
Thomas Daniel	3e5b6f05a2	drm/i915: Reset logical ring contexts' head and tail during GPU reset Work was getting left behind in LRC contexts during reset. This causes a hang if the GPU is reset when HEAD==TAIL because the context's ringbuffer head and tail don't get reset and retiring a request doesn't alter them, so the ring still appears full. Added a function intel_lr_context_reset() to reset head and tail on a LRC and its ringbuffer. Call intel_lr_context_reset() for each context in i915_gem_context_reset() when in execlists mode. Testcase: igt/pm_rps --run-subtest reset #bdw Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88096 Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> [danvet: Flatten control flow in the lrc reset code a notch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-02-24 00:19:37 +01:00
Daniel Vetter	0a87a2db48	Merge tag 'topic/i915-hda-componentized-2015-01-12' into drm-intel-next-queued Conflicts: drivers/gpu/drm/i915/intel_runtime_pm.c Separate branch so that Takashi can also pull just this refactoring into sound-next. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2015-01-12 23:07:46 +01:00
Chris Wilson	c9dc0f3598	drm/i915: Add ioctl to set per-context parameters Sometimes we wish to tweak how an individual context behaves. Since we always create a context for every filp, this means that individual processes can fine tune their behaviour even if they do not explicitly create a context. The first example parameter here is to enable multi-process GPU testing, but the interface should be able to cope with passing arbitrarily complex parameters. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Testcase: igt/gem_reset_stats/ban-period-* Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-01-07 18:19:06 +01:00
Chris Wilson	676fa5721c	drm/i915: Move the ban period onto the context This will allow us to set per-file, or even per-context, periods in the future. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-01-07 14:20:20 +01:00
Chris Wilson	2c55018347	drm/i915: Disable PSMI sleep messages on all rings around context switches There exists a current workaround to prevent a hang on context switch should the ring go to sleep in the middle of the restore, WaProgramMiArbOnOffAroundMiSetContext (applicable to all gen7+). In spite of disabling arbitration (which prevents the ring from powering down during the critical section) we were still hitting hangs that had the hallmarks of the known erratum. That is we are still seeing hangs "on the last instruction in the context restore". By comparing -nightly (broken) with requests (working), we were able to deduce that it was the semaphore LRI cross-talk that reproduced the original failure. The key was that requests implemented deferred semaphore signalling, and disabling that, i.e. emitting the semaphore signal to every other ring after every batch restored the frequent hang. Explicitly disabling PSMI sleep on the RCS ring was insufficient, all the rings had to be awake to prevent the hangs. Fortunately, we can reduce the wakelock to the MI_SET_CONTEXT operation itself, and so should be able to limit the extra power implications. Since the MI_ARB_ON_OFF workaround is listed for all gen7 and above products, we should apply this extra hammer for all of the same platforms despite so far that we have only been able to reproduce the hang on certain ivb and hsw models. The last question is whether we want to always use the extra hammer or only when we know semaphores are in operation. At the moment, we only use LRI on non-RCS rings for semaphores, but that may change in the future with the possibility of reintroducing this bug under subtle conditions. v2: Make it explicit that the PSMI LRI are an extension to the original workaround for the other rings. 
v3: Bikeshedding variable names and whitespacing Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80660 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83677 Cc: Simon Farnsworth <simon@farnz.org.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Peter Frühberger <fritsch@xbmc.org> Reviewed-by: Daniel Vetter <daniel@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2014-12-16 15:07:53 +02:00
Tvrtko Ursulin	fe14d5f4e5	drm/i915: Infrastructure for supporting different GGTT views per object Things like reliable GGTT mappings and mirrored 2d-on-3d display will need to map objects into the same address space multiple times. Added a GGTT view concept and linked it with the VMA to distinguish between multiple instances per address space. New objects and GEM functions which do not take this new view as a parameter assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the previous behaviour. This now means that objects can have multiple VMA entries so the code which assumed there will only be one also had to be modified. Alternative GGTT views are supposed to borrow DMA addresses from obj->pages which is DMA mapped on first VMA instantiation and unmapped on the last one going away. v2: * Removed per view special casing in i915_gem_ggtt_prepare / finish_object in favour of creating and destroying DMA mappings on first VMA instantiation and last VMA destruction. (Daniel Vetter) * Simplified i915_vma_unbind which does not need to count the GGTT views. (Daniel Vetter) * Also moved obj->map_and_fenceable reset under the same check. * Checkpatch cleanups. v3: * Only retire objects once the last VMA is unbound. v4: * Keep scatter-gather table for alternative views persistent for the lifetime of the VMA. * Propagate binding errors to callers and handle appropriately. v5: * Explicitly look for normal GGTT view in i915_gem_obj_bound to align usage in i915_gem_object_ggtt_unpin. (Michel Thierry) * Change to single if statement in i915_gem_obj_to_ggtt. (Michel Thierry) * Removed stray semi-colon in i915_gem_object_set_cache_level. For: VIZ-4544 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Michel Thierry <michel.thierry@intel.com> [danvet: Drop hunk from i915_gem_shrink since it's just prettification but upsets a __must_check warning.] 
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-12-15 11:25:04 +01:00
Daniel Vetter	8f0e2b9d95	drm/i915: Move golden context init into ->init_context Similar to a patch from Thomas Daniel for lrc contexts. This keeps both sides somewhat in sync and should make Dave Gordon happy. Note that both the wa and the golden context init code suffer a bit from an insufficient split into driver load and hw init code. Which means we have a bunch of tests all over the place to check whether the one-time initialization has been done already or not. All that one-time code should be moved into the one-time ring setup code, but that's work for later. Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Thomas Daniel <thomas.daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-12-08 15:19:02 +01:00
Thomas Daniel	e7778be1ea	drm/i915: Fix startup failure in LRC mode after recent init changes A previous commit introduced engine init changes: commit 372ee59699d9 ("drm/i915: Only init engines once") This broke execlists as intel_lr_context_render_state_init was trying to emit commands to the RCS for the default context before the ring->init_hw was called. Made a new gen8_init_rcs_context function and assigned it to render ring init_context. Moved call to intel_logical_ring_workarounds_emit into gen8_init_rcs_context to maintain previous functionality. Moved call to render_state_init from lr_context_deferred_create into gen8_init_rcs_context, and modified deferred_create to call ring->init_context for non-default contexts. Modified i915_gem_context_enable to call ring->init_context for the default context. So init_context will now always be called when the hw is ready - in i915_gem_context_enable for the default context and in lr_context_deferred_create for other contexts. Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-12-03 09:35:30 +01:00
John Harrison	41c5241555	drm/i915: Remove the now redundant 'obj->ring' The ring member of the object structure was always updated with the last_read_seqno member. Thus with the conversion to last_read_req, obj->ring is now a direct copy of obj->last_read_req->ring. This makes it somewhat redundant and potentially misleading (especially as there was no comment to explain its purpose). This checkin removes the redundant field. Many uses were simply testing for non-null to see if the object is active on the GPU. Some of these have been converted to check 'obj->active' instead. Others (where the last_read_req is about to be used anyway) have been changed to check obj->last_read_req. The rest simply pull the ring out from the request structure and proceed as before. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-12-03 09:35:23 +01:00
Michel Thierry	771b9a5324	drm/i915: Initialize workarounds in logical ring mode too Following the legacy ring submission example, update the ring->init_context() hook to support the execlist submission mode. v2: update to use the new workaround macros and cleanup unused code. This takes care of both bdw and chv workarounds. v2.1: Add missing call to init_context() during deferred context creation. v3: Split init_context (emit) in legacy/lrc modes. For lrc, get the ringbuf from the context (Mika/Daniel). v4: Merge init_context interfaces back, the legacy mode only needs the ring, but the lrc mode needs the ring and context (Mika). Issue: VIZ-4092 Issue: GMIN-3475 Change-Id: Ie3d093b2542ab0e2a44b90460533e2f979788d6c Cc: Deepak S <deepak.s@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Align function paramater lists properly.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-11-14 10:29:25 +01:00
Daniele Ceraolo Spurio	198c974d7e	drm/i915: Add tracepoints to track a vm during its lifetime - ppgtt init/release: these tracepoints are useful for observing the creation and destruction of Full PPGTTs. - ctx create/free: we can use the ctx_free trace in combination with the ppgtt_release one to be sure that the ppgtt doesn't stay alive for too long after the ctx is destroyed. ctx_create is there for symmetry - switch_mm: important point in the lifetime of the vm v4: add DOC information v5: pull the DOC in drm.tmpl v6: clean ppgtt init/release traces + add ctx create/free and switch_mm tracepoints (Chris) v7: drop execlist_submit_context tracepoint Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-11-14 10:29:13 +01:00
Tvrtko Ursulin	aff437667b	drm/i915: Move flags describing VMA mappings into the VMA If these flags are on the object level it will be more difficult to allow for multiple VMAs per object. v2: Simplification and cleanup after code review comments (Chris Wilson). Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-11-04 14:04:51 +01:00
Arun Siluvery	86d7f23842	drm/i915/bdw: Apply workarounds in render ring init function For BDW workarounds are currently initialized in init_clock_gating() but they are lost during reset, suspend/resume etc; this patch moves the WAs that are part of register state context to render ring init fn otherwise default context ends up with incorrect values as they don't get initialized until init_clock_gating fn. v2: Add workarounds to golden render state This method has its own issues, first of all this is different for each gen and it is generated using a tool so adding new workaround and maintaining them across gens is not a straightforward process. v3: Use LRIs to emit these workarounds (Ville) Instead of modifying the golden render state the same LRIs are emitted from within the driver. v4: Use abstract name when exporting gen specific routines (Chris) For: VIZ-4092 Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 11:04:42 +02:00
Thomas Daniel	ecdb5fd861	drm/i915/bdw: Don't execute context reset and switch with Execlists These two functions make no sense in a Logical Ring Context & Execlists world. v2: We got rid of lrc_enabled and centralized everything in the sanitized i915.enable_execlists instead. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> v3: Rebased. Corrected a typo in comment for i915_switch_context and added a comment that it should not be called in execlist mode. Added WARN_ON if i915_switch_context is called in execlist mode. Moved check for execlist mode out of i915_switch_context and into callers. Added comment in context_reset explaining why nothing is done in execlist mode. Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> [danvet: Simplify the patch subject so I can understand it.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 11:04:17 +02:00
Ben Widawsky	e80f14b6d3	drm/i915: Don't save/restore RS when not used v2: fix conflict on rebase. Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 11:03:53 +02:00
McAulay, Alistair	6689c167ae	drm/i915: Rework GPU reset sequence to match driver load & thaw This patch is to address Daniel's concerns over different code during reset: http://lists.freedesktop.org/archives/intel-gfx/2014-June/047758.html "The reason for aiming as hard as possible to use the exact same code for driver load, gpu reset and runtime pm/system resume is that we've simply seen too many bugs due to slight variations and unintended omissions." Tested using igt drv_hangman. V2: Cleaner way of preventing check_wedge returning -EAGAIN V3: Clean the last_context during reset, to ensure do_switch() does the MI_SET_CONTEXT. As per review. Signed-off-by: McAulay, Alistair <alistair.mcaulay@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Rebase over ctx->ppgtt rework and extend the comment in check_wedge a bit.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 10:54:09 +02:00
Daniel Vetter	d624d86e1e	drm/i915: Drop create_vm argument to i915_gem_create_context Now that all the flow is streamlined the rule is simple: We create a new ppgtt for a new context when we have full ppgtt enabled. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-13 14:23:33 +02:00
Daniel Vetter	ae6c480692	drm/i915: Only track real ppgtt for a context There's a bit of confusion since we track the global gtt, the aliasing and real ppgtt in the ctx->vm pointer. And not all callers really bother to check for the different cases and just presume that it points to a real ppgtt. Now looking closely we don't actually need ->vm to always point at an address space - the only place that cares actually has fixup code already to decide whether to look at the per-process or the global address space. So switch to just tracking the ppgtt directly and ditch all the extraneous code. v2: Fixup the ppgtt debugfs file to not oops on a NULL ctx->ppgtt. Also drop the early exit - without aliasing ppgtt we want to dump all the ppgtts of the contexts if we have full ppgtt. v3: Actually git add the compile fix. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Cc: "Thierry, Michel" <michel.thierry@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> OTC-Jira: VIZ-3724 [danvet: Resolve conflicts with execlist patches while applying.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-13 14:23:33 +02:00
Daniel Vetter	fa76da3499	drm/i915: Initialize the aliasing ppgtt as part of global gtt Stuffing this into the context setup code doesn't make a lot of sense. Also reusing the real ppgtt setup code makes even less sense since the aliasing ppgtt isn't a real address space. Leaving all that stuff uninitialized will make sure that we catch any abusers promptly. This is also a prep work to clean up the context->ppgtt link. v2: Fix up the logic fail, I've fumbled it so badly to completely disable ppgtt on gen6. Spotted by Ville and Michel. Also move around the pde write into the gen6 init function, since otherwise it won't work at all. v3: Only initialize the aliasing ppgtt when we actually enable it. Cc: "Thierry, Michel" <michel.thierry@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> [danvet: Squash in fixup from Fengguang Wu.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-13 14:23:32 +02:00
Daniel Vetter	82460d9724	drm/i915: Rework ppgtt init to no require an aliasing ppgtt Currently we abuse the aliasing ppgtt to set up the ppgtt support in general. Which is a bit backwards since with full ppgtt we don't ever need the aliasing ppgtt. So untangle this and separate the ppgtt init from the aliasing ppgtt. While at it drag it out of the context enabling (which just does a switch to the default context). Note that we still have the differentiation between synchronous and asynchronous ppgtt setup, but that will soon vanish. So also correctly wire up the return value handling to be prepared for when ->switch_mm drops the synchronous parameter and could start to fail. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-13 14:23:31 +02:00
Daniel Vetter	4d884705da	drm/i915: Track file_priv, not ctx in the ppgtt structure Hardware contexts reference a ppgtt, not the other way round. And the only user of this (in debugfs) actually only cares about which file the ppgtt is associated with. So give it what it wants. While at it give the ppgtt create function a proper name&place. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-13 14:23:28 +02:00
Daniel Vetter	ee960be7bb	drm/i915: Some cleanups for the ppgtt lifetime handling So when reviewing Michel's patch I've noticed a few things and cleaned them up: - The early checks in ppgtt_release are now redundant: The inactive list should always be empty now, so we can ditch these checks. Even for the aliasing ppgtt (though that's a different confusion) since we tear that down after all the objects are gone. - The ppgtt handling functions are splattered all over. Consolidate them in i915_gem_gtt.c, give them OCD prefixes and add wrappers for get/put. - There was a bit of confusion in ppgtt_release about whether it cares about the active or inactive list. It should care about them both, so augment the WARNINGs to check for both. There's still create_vm_for_ctx left to do, but that is blocked on the removal of ppgtt->ctx. Once that's done we can rename it to i915_ppgtt_create and move it to its siblings for handling ppgtts. v2: Move the ppgtt checks into the inline get/put functions as suggested by Chris. v3: Inline the now redundant ppgtt local variable. Cc: Michel Thierry <michel.thierry@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-12 15:24:04 +02:00
Michel Thierry	b9d06dd9d1	drm/i915: vma/ppgtt lifetime rules VMAs should take a reference of the address space they use. Now, when the fd is closed, it will release the ref that the context was holding, but it will still be referenced by any vmas that are still active. ppgtt_release() should then only be called when the last thing referencing it releases the ref, and it can just call the base cleanup and free the ppgtt. Note that with this we will extend the lifetime of ppgtts which contain shared objects. But all the non-shared objects will get removed as soon as they drop off the active list and for the shared ones the shrinker can eventually reap them. Since we currently can't evict ppgtt pagetables either I don't think that temporary leak is important. Signed-off-by: Michel Thierry <michel.thierry@intel.com> [danvet: Add note about potential ppgtt leak with this approach.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-12 15:22:26 +02:00
Oscar Mateo	48d823878d	drm/i915/bdw: Generic logical ring init and cleanup Allocate and populate the default LRC for every ring, call gen-specific init/cleanup, init/fini the command parser and set the status page (now inside the LRC object). These are things all engines/rings have in common. Stopping the ring before cleanup and initializing the seqnos is left as a TODO task (we need more infrastructure in place before we can achieve this). v2: Check the ringbuffer backing obj for ring_is_initialized, instead of the context backing obj (similar, but not exactly the same). Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 16:55:17 +02:00
Oscar Mateo	ec3e9963a6	drm/i915/bdw: Deferred creation of user-created LRCs The backing objects and ringbuffers for contexts created via open fd are actually empty until the user starts sending execbuffers to them. At that point, we allocate & populate them. We do this because, at create time, we really don't know which engine is going to be used with the context later on (and we don't want to waste memory on objects that we might never use). v2: As contexts created via ioctl can only be used with the render ring, we have enough information to allocate & populate them right away. v3: Defer the creation always, even with ioctl-created contexts, as requested by Daniel Vetter. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 16:25:58 +02:00
Oscar Mateo	8c8579176a	drm/i915/bdw: A bit more advanced LR context alloc/free Now that we have the ability to allocate our own context backing objects and we have multiplexed one of them per engine inside the context structs, we can finally allocate and free them correctly. Regarding the context size, reading the register to calculate the sizes can work, I think, however the docs are very clear about the actual context sizes on GEN8, so just hardcode that and use it. v2: Rebased on top of the Full PPGTT series. It is important to notice that at this point we have one global default context per engine, all of them using the aliasing PPGTT (as opposed to the single global default context we have with legacy HW contexts). v3: - Go back to one single global default context, this time with multiple backing objects inside. - Use different context sizes for non-render engines, as suggested by Damien (still hardcoded, since the information about the context size registers in the BSpec is, well, lacking). - Render ctx size is 20 (or 19) pages, but not 21 (caught by Damien). - Move default context backing object creation to intel_init_ring (so that we don't waste memory in rings that might not get initialized). v4: - Reuse the HW legacy context init/fini. - Create a separate free function. - Rename the functions with an intel_ prefix. v5: Several rebases to account for the changes in the previous patches. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 16:08:18 +02:00
Oscar Mateo	ede7d42bae	drm/i915/bdw: Initialization for Logical Ring Contexts For the moment this is just a placeholder, but it shows one of the main differences between the good ol' HW contexts and the shiny new Logical Ring Contexts: LR contexts allocate and free their own backing objects. Another difference is that the allocation is deferred (as the create function name suggests), but that does not happen in this patch yet, because for the moment we are only dealing with the default context. Early in the series we had our own gen8_gem_context_init/fini functions, but the truth is they now look almost the same as the legacy hw context init/fini functions. We can always split them later if this ceases to be the case. Also, we do not fall back to legacy ringbuffers when logical ring context initialization fails (not very likely to happen and, even if it does, hw contexts would probably fail as well). v2: Daniel says "explain, do not showcase". Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: s/BUG_ON/WARN_ON/.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 16:04:11 +02:00
Ben Widawsky	2f29579131	drm/i915: Reorder ctx unref on ppgtt cleanup The comment [which was mine] is wrong. The context object can never be bound in a PPGTT because it is only capable of living in the Global GTT. So, remove the comment, and reorder the unref. What's nice about the latter is it keeps the context object alive past the PPGTT. This makes the destroy ordering symmetric with the creation ordering. Create: 1. Create context 2. Create PPGTT Destroy: 1. Destroy PPGTT 2. Destroy context As far as I know, this does not fix a bug. The code previously kept the context data structure, only the object was gone. As the code was, nothing tried to use the object after this point. NOTE: If in the future we have cases where the PPGTT can/should outlive the context (which doesn't occur today, but the code permits it), this ordering does not matter. Even if this occurs, as it stands now, we do not expect that to be the normal case, and having this order makes debugging a bit easier if we're tracking object lifetimes for the context vs ppgtt Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Resolve conflict with Oscar's execlist prep patches.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-23 07:05:39 +02:00
Oscar Mateo	821d66dd7c	drm/i915: Emphasize that ctx->id is merely a user handle This is an Execlists preparatory patch, since they make context ID become an overloaded term: - In the software, it was used to distinguish which context userspace was trying to use. - In the BSpec, the term is used to describe the 20-bits long field the hardware uses to discriminate the contexts that are submitted to the ELSP and inform the driver about their current status (via Context Switch Interrupts and Context Status Buffers). Initially, I tried to make the different meanings converge, but it proved impossible: - The software ctx->id is per-filp, while the hardware one needs to be globally unique. - Also, we multiplex several backing states objects per intel_context, and all of them need unique HW IDs. - I tried adding a per-filp ID and then composing the HW context ID as: ctx->id + file_priv->id + ring->id, but the fact that the hardware only uses 20-bits means we have to artificially limit the number of filps or contexts the userspace can create. The ctx->user_handle renaming bits are done with this Cocci patch (plus manual frobbing of the struct declaration): @@ struct intel_context c; @@ - (c).id + c.user_handle @@ struct intel_context *c; @@ - (c)->id + c->user_handle Also, while we are at it, s/DEFAULT_CONTEXT_ID/DEFAULT_CONTEXT_HANDLE and change the type to unsigned 32 bits. v2: s/handle/user_handle and change the type to uint32_t as suggested by Chris Wilson. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1) Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-08 12:30:41 +02:00
Oscar Mateo	ea0c76f8c3	drm/i915: Emphasize that ctx->obj & ctx->is_initialized refer to the legacy rcs ctx We have already advanced that Logical Ring Contexts have their own kind of backing objects, but everything will be better explained in the Execlists series. For now, suffice it to say that the current backing object is only ever used with the render ring, so we're making this fact more explicit (which is a good reason on its own). As for the is_initialized flag, we only use it to signify that the render state has been initialized (a.k.a. golden context, a.k.a. null context). It doesn't mean anything for the other engines, so make that distinction obvious. Done with the following Coccinelle patch (plus manual frobbing of the struct): @@ struct intel_context c; @@ - (c).obj + c.legacy_hw_ctx.rcs_state @@ struct intel_context *c; @@ - (c)->obj + c->legacy_hw_ctx.rcs_state @@ struct intel_context c; @@ - (c).is_initialized + c.legacy_hw_ctx.initialized @@ struct intel_context *c; @@ - (c)->is_initialized + c->legacy_hw_ctx.initialized This Execlists prep-work patch has been suggested by Chris Wilson and Daniel Vetter separately. Initially, it was two separate patches: drm/i915: Rename ctx->obj to ctx->rcs_state drm/i915: Make it obvious that ctx->id is merely a user handle Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: s/id/is_initialized/ to fix the subject and resolve a conflict in i915_gem_context_reset. Also introduce a new lctx local variable to avoid overtly long lines.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-08 12:30:35 +02:00
Oscar Mateo	aa0c13daad	drm/i915: Extract context backing object allocation This is preparatory work for Execlists: we plan to use it later to allocate our own context objects (since Logical Ring Contexts do not have the same kind of backing objects). No functional changes. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-08 12:30:31 +02:00
Ville Syrjälä	4bfad3ddc8	drm/i915: Unpin last_context at reset We're forgetting to unpin the last_context from the ggtt at GPU reset time. This leads to the vma pin_count leaking at every reset if the last context wasn't the ring default context. Further use of the same context will trigger the pin_count check in i915_gem_object_pin() and userspace will be faced with EBUSY as a result. This plagues kms_flip rather badly since it performs lots of resets, and every fd has its own default context these days. Fix the problem by properly unpinning the last context at reset. This regression seems to go back to commit `acce9ffa48` Author: Ben Widawsky <ben@bwidawsk.net> Date: Fri Dec 6 14:11:03 2013 -0800 drm/i915: Better reset handling for contexts Testcase: igt/gem_ctx_exec/reset-pin-leak Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-07 17:21:56 +02:00
Daniel Vetter	f1615bbe9b	Linux 3.16-rc4 Merge tag 'v3.16-rc4' into drm-intel-next-queued Due to Dave's vacation drm-next hasn't opened yet for 3.17 so I couldn't move my drm-intel-next queue forward yet like I usually do. Just pull in the latest upstream -rc to unblock patch merging - I don't want to needlessly rebase my current patch pile really and void all the testing we've done already. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-07 10:17:56 +02:00
Chris Wilson	967ab6b177	drm/i915: Only mark the ctx as initialised after a SET_CONTEXT operation Fallout from commit `46470fc932` Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Wed May 21 19:01:06 2014 +0300 drm/i915: Add null state batch to active list undid the earlier fix of only marking the ctx as initialised after it is saved by the hardware during a SET_CONTEXT operation: commit `ad1d219974` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Sat Dec 28 13:31:49 2013 -0800 drm/i915: set ctx->initialized only after RCS Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Damien Lespiau <damien.lespiau@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [Jani: add reference to the earlier fix in the commit message.] Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2014-06-24 14:48:41 +03:00
Oscar Mateo	14d8ec544f	drm/i915: Remove ctx->last_ring The original comment that introduced it said: commit `0009e46cd5` Author: Ben Widawsky <ben@bwidawsk.net> Date: Fri Dec 6 14:11:02 2013 -0800 drm/i915: Track which ring a context ran on Previously we dropped the association of a context to a ring. It is however very important to know which ring a context ran on (we could have reused the other member, but I was nitpicky). This is very important when we switch address spaces, which unlike context objects, do change per ring. As an example, if we have: RCS BCS ctx A ctx A ctx B ctx B Without tracking the last ring B ran on, we wouldn't know to switch the address space on BCS in the last row. But this is not really true, because we are already checking to != from (with "from" being = ring->last_context) and that should be enough to make sure we switch to the right address space. We would have a problem if we switched the context object for every ring (since then we would fail to do it in some situations) but we only switch it for the render ring, so we don't care. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-06-18 21:42:42 +02:00
Oscar Mateo	f83d6518a1	drm/i915: Kill private_default_ctx off It's barely alive now anyway, so give it the "coup de grâce". Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 23:44:44 +02:00
Oscar Mateo	273497e5cd	drm/i915: s/i915_hw_context/intel_context Up until now, contexts had one (and only one) backing object that was used by the hardware to save/restore render ring contexts (via the MI_SET_CONTEXT command). Other rings did not have or need this, so our i915_hw_context struct had a 1:1 relationship with a real HW context. With Logical Ring Contexts and Execlists, this is not possible anymore: all rings need a backing object, and it cannot be reused. To prepare for that, rename our contexts to the more generic term intel_context. No functional changes. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 23:41:17 +02:00
Oscar Mateo	a4872ba6d0	drm/i915: s/intel_ring_buffer/intel_engine_cs In the upcoming patches we plan to break the correlation between engine command streamers (a.k.a. rings) and ringbuffers, so it makes sense to refactor the code and make the change obvious. No functional changes. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 23:01:05 +02:00
Mika Kuoppala	46470fc932	drm/i915: Add null state batch to active list for proper refcounting to take place as we use i915_add_request() for it. i915_add_request() also takes the context for the request from ring->last_context so move the null state batch submission after the ring context has been set. v2: we need to check for correct ring now (Ville Syrjälä) v3: no need to expose i915_gem_move_object_to_active (Chris Wilson) v4: cargoculted vma/active/inactive error handling removed (Chris Wilson) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 14:10:37 +02:00
Ville Syrjälä	b3f797ac49	drm/i915/chv: Add some workaround notes We implement the following workarounds: * WaDisableAsyncFlipPerfMode:chv * WaProgramMiArbOnOffAroundMiSetContext:chv v2: Drop WaDisableSemaphoreAndSyncFlipWait note Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-20 15:20:05 +02:00
Chris Wilson	d3b448d991	drm/i915: Only unpin the default ctx object if it exists Since commit `691e6415c8` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Apr 9 09:07:36 2014 +0100 drm/i915: Always use kref tracking for all contexts. we have contexts everywhere, and so we must be careful to distinguish fake contexts, which do not have an associated bo, and real ones, which do. In particular, we now need to be careful not to dereference NULL pointers. This is one such example, as the commit highlighted above failed to move the unpinning of the default ctx object into the real-context-only branch. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78792 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-16 21:41:12 +02:00
Mika Kuoppala	9d0a6fa6c5	drm/i915: add render state initialization HW guys say that it is not a cool idea to let device go into rc6 without proper 3d pipeline state. For each new uninitialized context, generate a valid null render state to be run on context creation. This patch introduces a skeleton with empty states. v2: - No need to vmap (Chris Wilson) - use .c files for state (Daniel Vetter) - no need to flush as i915_add_request does it - remove parameter for batch alloc size - don't wait for the init (Ben Widawsky) v3: - move to cpu/gpu (Chris Wilson) Tested-by: Kristen Carlson Accardi <kristen@linux.intel.com> (v1) Tested-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-14 19:16:13 +02:00
Dave Airlie	885ac04ab3	Merge tag 'drm-intel-next-2014-04-16' of git://anongit.freedesktop.org/drm-intel into drm-next drm-intel-next-2014-04-16: - vlv infoframe fixes from Jesse - dsi/mipi fixes from Shobhit - gen8 pageflip fixes for LRI/SRM from Damien - cmd parser fixes from Brad Volkin - some prep patches for CHV, DRRS, ... - and tons of little things all over drm-intel-next-2014-04-04: - cmd parser for gen7 but only in enforcing and not yet granting mode - the batch copying stuff is still missing. Also performance is a bit ... rough (Brad Volkin + OACONTROL fix from Ken). - deprecate UMS harder (i.e. CONFIG_BROKEN) - interrupt rework from Paulo Zanoni - runtime PM support for bdw and snb, again from Paulo - a pile of refactorings from various people all over the place to prep for new stuff (irq reworks, power domain polish, ...) Conflicts: drivers/gpu/drm/i915/i915_gem_context.c	2014-05-01 09:11:37 +10:00
Chris Wilson	691e6415c8	drm/i915: Always use kref tracking for all contexts. If we always initialize kref for the context, even if we are using fake contexts for hangstats when there is no hw support, we can forgo the dance to dereference the ctx->obj and inspect whether we are permitted to use kref inside i915_gem_context_reference() and _unreference(). My ulterior motive here is to improve the debugging of a use-after-free of ctx->obj. This patch avoids the dereference here and instead forces the assertion checks associated with kref. v2: Refactor the fake contexts to being even more like the real contexts, so that there is much less duplicated and special case code. v3: Tweaks. v4: Tweaks, minor. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76671 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: lu hua <huax.lu@intel.com> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [Jani: tiny change to backport to drm-intel-fixes.] Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2014-04-11 13:29:51 +03:00
Ville Syrjälä	ad2ac08bf3	drm/i915: Make contexts non-snooped on non-LLC platforms We don't do CPU access to GPU contexts so making the GPU access snoop the CPU caches seems silly, and potentially expensive. v2: Use !IS_VALLEYVIEW instead of HAS_LLC as this is really about what the PTEs can represent. Add a comment clarifying the situation. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-09 14:37:10 +02:00
Ben Widawsky	057f6a8ad7	drm/i915: Invariably invalidate before ctx switch We have been setting the bit which was originally BIOS dependent since: commit `f05bb0c7b6` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sun Jan 20 16:33:32 2013 +0000 drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waits Therefore, we do not need to try to figure it out dynamically and we can just always invalidate the TLBs. It's a partial revert of: commit `12b0286f49` Author: Ben Widawsky <ben@bwidawsk.net> Date: Mon Jun 4 14:42:50 2012 -0700 drm/i915: possibly invalidate TLB before context switch The original commit attempted to only invalidate when necessary (very much a relic from the old days). Now, we can just always invalidate. I guess the old TODO still exists. Since we seem to have abandoned ILK contexts however, there isn't much point in even remembering. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-03 11:41:39 +02:00
Ville Syrjälä	64bed78820	drm/i915: Implement WaProgramMiArbOnOffAroundMiSetContext:bdw BSpec seems to tell us we need the MI_ARB_ON_OFF w/a around MI_SET_CONTEXT on gen8. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-02 09:11:22 +02:00
Chris Wilson	6313c20490	drm/i915: Per-process stats work better when evaluated per-process The idea of printing objects used by each process is to judge how each process is using them. This means that we need to evaluate whether the object is bound for that particular process, rather than just whether it is bound into the global GTT. v2: Restore the non-full-ppgtt path for simplicity as we may not even create vma with older hardware. v3: Tweak handling of global entries and default context entries. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-20 15:09:21 +01:00
Mika Kuoppala	a95f6a0070	drm/i915: Switch to fake context on older gens We used to have per file descriptor hang stats for the i915_get_reset_stats_ioctl() and for default context banning. commit `0eea67eb26` Author: Ben Widawsky <ben@bwidawsk.net> Date: Fri Dec 6 14:11:19 2013 -0800 drm/i915: Create a per file_priv default context made having separate hangstats in file_private redundant as i915_hw_context already contained hangstats. So commit `c482972a08` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Fri Dec 6 14:11:20 2013 -0800 drm/i915: Piggy back hangstats off of contexts consolidated the hangstats and enabled further improvements. commit `44e2c0705a` Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Thu Jan 30 16:01:15 2014 +0200 drm/i915: Use i915_hw_context to set reset stats tried to reap full benefits of consolidation but fell short as we never 'switch' to the fake private context on gens that don't have hw_contexts, so request->ctx remained NULL on those. Fix this by 'switching' to fake context so that when request is submitted to ring, proper context gets assigned to it. Testcase: igt/drv_hangman Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76055 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-18 16:31:35 +01:00
Damien Lespiau	96a6f0f1db	drm/i915: Fix i915_switch_context() argument name in kerneldoc While reading some code, out of boredom, stumbled on a tiny tiny fix. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-05 21:30:41 +01:00
Ben Widawsky	b18b6bde30	drm/i915/bdw: Free PPGTT struct GEN8 never freed the PPGTT struct. As GEN8 doesn't use full PPGTT, the leak is small and only found on a module reload. ie. I don't think this needs to go to stable. v2: The very naive, kfree in gen8 ppgtt cleanup, is subject to a double free on PPGTT initialization failure. (Spotted by Imre). Instead this patch pulls the ppgtt struct freeing out of the cleanup and leaves it to the allocators/callers or the one doing the last kref_put as in standard convention Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-04 15:53:58 +01:00
Ben Widawsky	321f2ada91	drm/i915: Move ppgtt_release out of the header At one time it was expected to be called in multiple places by kref_put. At the current time however, it is all contained within i915_gem_context.c. This patch makes an upcoming required addition a bit nicer since it too doesn't need to be defined in a header file. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-04 15:52:15 +01:00
Thierry Reding	d4d5be6192	drm/i915: Remove dead code The i915 driver sets DRIVER_GEM unconditionally, so testing for the feature will always succeed. Signed-off-by: Thierry Reding <treding@nvidia.com> [danvet: Fix up conflicts.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-04 09:56:48 +01:00
Daniel Vetter	1ec9e26dda	drm/i915: Consolidate binding parameters into flags Anything more than just one bool parameter is just a pain to read, symbolic constants are much better. Split out from Chris' vma-binding rework patch. v2: Undo the behaviour change in object_pin that Chris spotted. v3: Split out misplaced hunk to handle set_cache_level errors, spotted by Jani. v4: Keep the current over-zealous binding logic in the execbuffer code working with a quick hack while the overall binding code gets shuffled around. v5: Reorder the PIN_ flags for more natural patch splitup. v6: Pull out the PIN_GLOBAL split-up again. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:16:58 +01:00
Mika Kuoppala	7f76b23aae	drm/i915: check for oom when allocating private_default_ctx Found with smatch Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-04 12:10:26 +01:00
Mika Kuoppala	3fac8978f5	drm/i915: Tune down debug output when context is banned If we have stopped rings then we know that test is running so no need for spam. In addition, only spam when default context gets banned. v2: - make sure default context ban gets shown (Chris) - use helper for checking for default context, everywhere (Chris) v3: - don't be quiet when debug is set (Ben, Daniel) Reference: https://bugs.freedesktop.org/show_bug.cgi?id=73652 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-30 17:25:38 +01:00
Ben Widawsky	c5dc5cecf8	drm/i915: Create a USES_PPGTT macro There are cases where we want to know if there is a full, or aliased PPGTT. Currently, in fact the only distinction we ever need to make is when we're using full PPGTT. This patch is simply to promote readability and clarify for the confusing existing usage where "aliasing" meant aliasing and full. v2: Remove USES_ALIASING_PPGTT since there are currently no cases where we need to check if we're using aliasing, but not full PPGTT. (Daniel) Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-28 09:13:50 +01:00
Ville Syrjälä	2b7e8082b2	drm/i915: We implement WaMiSetContext_Hang WaMiSetContext_Hang tells us that a MI_NOOP must follow MI_SET_CONTEXT. The other thing WaMiSetContext_Hang seems to say is that URB_FENCE isn't allowed to straddle two cachelines. But we don't issue those from the kernel so we don't care. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-27 17:16:45 +01:00
Chris Wilson	42c3b603da	drm/i915: Always pin the default context Through a twisty and circuitous path it is possible to currently trick the code into creating a default context and forgetting to pin it immediately into the GGTT. (This requires a system using contexts without an aliasing ppgtt, which is currently restricted to Baytrail machines manually specifying a module parameter to force enable contexts, or on Sandybridge and later that manually disable the aliasing ppgtt.) The consequence is that during module unload we attempt to unpin the default context twice and encounter a BUG remonstrating that we attempt to unpin an unbound object. [ 161.002869] Kernel BUG at f84861f8 [verbose debug info unavailable] [ 161.002875] invalid opcode: 0000 [#1] SMP [ 161.002882] Modules linked in: coretemp kvm_intel kvm crc32_pclmul aesni_intel aes_i586 xts lrw gf128mul ablk_helper cryptd hid_sensor_accel_3d hid_sensor_gyro_3d hid_sensor_magn_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf industrialio hid_sensor_iio_common snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi snd_seq_midi_event dm_multipath scsi_dh asix ppdev usbnet snd_rawmidi mii hid_sensor_hub microcode snd_seq rfcomm bnep snd_seq_device bluetooth snd_timer snd parport_pc binfmt_misc soundcore dw_dmac_pci dw_dmac_core mac_hid lp parport dm_mirror dm_region_hash dm_log hid_generic usbhid hid i915(O-) drm_kms_helper(O) igb dca ptp pps_core i2c_algo_bit drm(O) ahci libahci video [ 161.002991] CPU: 0 PID: 2114 Comm: rmmod Tainted: G W O 3.13.0-rc8+ #2 [ 161.002997] Hardware name: NEXCOM VTC1010/Aptio CRB, BIOS 5.6.5 09/24/2013 [ 161.003004] task: dbdd6800 ti: dbe0e000 task.ti: dbe0e000 [ 161.003010] EIP: 0060:[<f84861f8>] EFLAGS: 00010246 CPU: 0 [ 161.003044] EIP is at i915_gem_object_ggtt_unpin+0x88/0x90 [i915] [ 161.003050] EAX: dfce3840 EBX: 00000000 ECX: dfafd690 EDX: dfce3874 [ 161.003056] ESI: c0086b40 EDI: df962e00 EBP: dbe0fe1c ESP: dbe0fe0c [ 161.003062] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 161.003068] CR0: 8005003b CR2: b7718000 CR3: 1bec0000 CR4: 001007f0 [ 161.003076] Stack: [ 161.003081] 00afc014 00000004 c0086b40 dfafc000 dbe0fe38 f8487e5a dfaa5400 c0086b40 [ 161.003099] dfafc000 dfaa5400 dfaa5414 dbe0fe58 f84741aa 00000000 f89c34b9 dfaa5414 [ 161.003117] dfaa5400 dfaa5400 f644b000 dbe0fe6c f89a5443 dfaa5400 f8505000 f644b000 [ 161.003134] Call Trace: [ 161.003169] [<f8487e5a>] i915_gem_context_fini+0xba/0x1c0 [i915] [ 161.003202] [<f84741aa>] i915_driver_unload+0x1fa/0x2f0 [i915] [ 161.003232] [<f89a5443>] drm_dev_unregister+0x23/0x90 [drm] [ 161.003259] [<f89a54ed>] drm_put_dev+0x3d/0x70 [drm] [ 161.003294] [<f8470615>] i915_pci_remove+0x15/0x20 [i915] [ 161.003306] [<c1338a6f>] pci_device_remove+0x2f/0xa0 [ 161.003317] [<c140c871>] __device_release_driver+0x61/0xc0 [ 161.003328] [<c140d12f>] driver_detach+0x8f/0xa0 [ 161.003341] [<c140c54f>] bus_remove_driver+0x4f/0xc0 [ 161.003353] [<c140d708>] driver_unregister+0x28/0x60 [ 161.003362] [<c10cee42>] ? stop_cpus+0x32/0x40 [ 161.003372] [<c10bd510>] ? module_refcount+0x90/0x90 [ 161.003383] [<c13378c5>] pci_unregister_driver+0x15/0x60 [ 161.003413] [<f89a739f>] drm_pci_exit+0x9f/0xb0 [drm] [ 161.003458] [<f84e624a>] i915_exit+0x1b/0x1d [i915] [ 161.003468] [<c10bf8a8>] SyS_delete_module+0x158/0x1f0 [ 161.003480] [<c1173d5d>] ? ____fput+0xd/0x10 [ 161.003488] [<c106f0fe>] ? task_work_run+0x7e/0xb0 [ 161.003499] [<c165a68d>] sysenter_do_call+0x12/0x28 [ 161.003505] Code: 0f b6 4d f3 8d 51 0f 83 e1 f0 83 e2 0f 09 d1 84 d2 88 48 54 75 07 80 a7 91 00 00 00 7f 83 c4 04 5b 5e 5f 5d c3 8d b6 00 00 00 00 <0f> 0b 8d b6 00 00 00 00 55 89 e5 57 56 53 83 ec 64 3e 8d 74 26 [ 161.003586] EIP: [<f84861f8>] i915_gem_object_ggtt_unpin+0x88/0x90 [i915] SS:ESP 0068:dbe0fe0c v2: Rename the local variable (is_default_ctx) to avoid confusion with the function is_default_ctx(). And correct Jesse's email address. Reported-by: Jesse Barnes <jbarnes@virtuousgeek.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73985 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Tested-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> [danvet: Fix up the rebase fail from my first attempt, thankfully pointed out by Ville.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-27 17:16:10 +01:00
Ben Widawsky	ad1d219974	drm/i915: set ctx->initialized only after RCS The initialized flag is used to specify a context has been initialized and its context is safe to load, ie. the 3d state is setup properly. With full PPGTT, we emit the address space loads during context switch and this currently marks a context as initialized. With full PPGTT patches, if a client first emits a batch to !RCS, then later, RCS, the code will mistake the context as initialized and try to reload an uninitialized context. 1. context 1 blit // context marked as initialized, but isn't 2. context 1 render // loads random state from step 2 It is really easy to hit this with a planned upcoming patch which makes default context reuse possible. NOTE: This should only affect full PPGTT branches, ie. current drm-intel-nightly. Thanks to Chris for helping me track this down. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Simplify the failure scenario in the commit message according to Chris' review a bit.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-10 08:21:51 +01:00
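The rule described in ad1d219974 can be illustrated with a small user-space sketch. This is not the driver code: the `sketch_` names, the two-ring enum, and the boolean return value are all assumptions made for illustration. It only models the idea that a switch on a non-render ring must not mark the context as initialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring identifiers; in this model only RCS carries 3D state. */
enum sketch_ring { SKETCH_RCS, SKETCH_BCS };

/* Hypothetical context: the flag says whether a saved render image exists. */
struct sketch_ctx {
	bool initialized;
};

/*
 * Model of a context switch. Returns true when the saved image may be
 * restored on this switch. A switch on a non-RCS ring only loads the
 * address space, so it leaves the flag untouched.
 */
static bool sketch_switch_to(struct sketch_ctx *ctx, enum sketch_ring ring)
{
	bool restore;

	assert(ctx != NULL);

	if (ring != SKETCH_RCS)
		return false;

	/* First RCS switch: nothing valid to restore yet. */
	restore = ctx->initialized;
	ctx->initialized = true;
	return restore;
}
```

With this ordering, a blit followed by a render on the same context does not attempt to restore state that was never saved; only the second render switch does.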
Ben Widawsky	c2cf2416ca	drm/i915/bdw: Return -ENONENT on default ctx destroy This was an accidental "ABI" change introduced during PPGTT: commit `0eea67eb26` Author: Ben Widawsky <ben@bwidawsk.net> Date: Fri Dec 6 14:11:19 2013 -0800 drm/i915: Create a per file_priv default context The failure test application actually tests the return type. The other option is to simply change the test. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-10 08:21:50 +01:00
Ben Widawsky	72ad5c45f0	drm/i915/ppgtt: Fix ioctl errno for "no such context" Without this fix the ioctls silently succeeded (but actually did nothing). It makes all the code which calls into this function way too confusing. v2: Fix destroy IOCTL as well v3: Clarify the other two callers of i915_gem_context_get() to never check for NULL. (Mika) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72903 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Testcase: igt/gem_ctx_exec/basic [danvet: Fix up the commit message and actually bother to mention the testcase this fixes.] Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-07 08:50:11 +01:00
Ben Widawsky	7e0d96bc03	drm/i915: Use multiple VMs -- the point of no return As with processes which run on the CPU, the goal of multiple VMs is to provide process isolation. Specific to GEN, there is also the ability to map more objects per process (2GB each instead of 2Gb-2k total). For the most part, all the pipes have been laid, and all we need to do is remove asserts and actually start changing address spaces with the context switch. Since prior to this we've converted the setting of the page tables to a streamed version, this is quite easy. One important thing to point out (since it'd been hotly contested) is that with this patch, every context created will have its own address space (provided the HW can do it). v2: Disable BDW on rebase NOTE: I tried to make this commit as small as possible. I needed one place where I could "turn everything on" and that is here. It could be split into finer commits, but I didn't really see much point. Cc: Eric Anholt <eric@anholt.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:24:52 +01:00
Daniel Vetter	3d7f0f9dcc	Merge commit drm-intel-fixes into topic/ppgtt I need the tricky do_switch fix before I can merge the final piece of the ppgtt enabling puzzle. Otherwise the conflict will be a real pain to resolve since the do_switch hunk from -fixes must be placed at the exact right place within a hunk in the next patch. Conflicts: drivers/gpu/drm/i915/i915_gem_context.c drivers/gpu/drm/i915/i915_gem_execbuffer.c drivers/gpu/drm/i915/intel_display.c Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:23:37 +01:00
Ben Widawsky	41bde5535a	drm/i915: Get context early in execbuf We need to have the address space when reserving space for the objects. Since the address space and context are tied together, and reserve occurs before context switch (for good reason), we must lookup our context earlier in the process. This leaves some room for optimizations where we no longer need to use ctx_id in certain places. This will be addressed in a subsequent patch. Important tricky bit: Because slow relocations during execbuffer drop struct_mutex Perhaps it would be best to acquire the reference when we get the context, but I'll save that for another day (note I have written the patch before, and I found the changes required to be uglier than this). Note that since we currently access everything via context id, and not the data structure this is fine, though not desirable. The next change attempts to get the context only once via the context ID idr lookup, and as such, the following can happen: CTX-A is created, refcount = 1 CTX-A execbuf, mutex dropped close IOCTL called on CTX-A, refcount = 0 CTX-A resumes in execbuf. v2: Rebased on top of commit `b6359918b8` Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Wed Oct 30 15:44:16 2013 +0200 drm/i915: add i915_get_reset_stats_ioctl v3: Rebased on top of commit `25b3dfc87b` Author: Mika Westerberg <mika.westerberg@linux.intel.com> Date: Tue Nov 12 11:57:30 2013 +0200 Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Tue Nov 26 16:14:33 2013 +0200 drm/i915: check context reset stats before relocations Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:52:42 +01:00
Ben Widawsky	c482972a08	drm/i915: Piggy back hangstats off of contexts To simplify the codepaths somewhat, we can simply always create a context. Contexts already keep hangstat information. This prevents us from having to differentiate at other parts in the code. There is allocation overhead, but it should not be measurable. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:51:58 +01:00
Ben Widawsky	0eea67eb26	drm/i915: Create a per file_priv default context Every file will get its own context, and we use this context instead of the default context. The default context still exists for future shrinker usage as well as reset handling. v2: Updated to address Mika's recent context guilty changes Some more changes around this come up in later patches as well. v3: Use a fake context to avoid allocation for the !HAS_HW_CONTEXT case. I've tried the alternatives. This looks the best to me. Removed hangstat stuff from v2 - for a separate patch Demote failed PPGTT set to DRM_DEBUG_DRIVER since it can now be invoked easily from userspace. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:44:29 +01:00
Ben Widawsky	bdf4fd7ea0	drm/i915: Do aliasing PPGTT init with contexts We have a default context which suits the aliasing PPGTT well. Tie them together so it looks like any other context/PPGTT pair. This makes the code cleaner as it won't have to special case aliasing as often. The patch has one slightly tricky part in the default context creation function. In the future (and on aliased setup) we create a new VM for a context (potentially). However, if we have aliasing PPGTT, which occurs at this point in time for all platforms GEN6+, we can simply manage the refcounting to allow things to behave as normal. Now is a good time to recall that the aliasing_ppgtt doesn't have a real VM, it uses the GGTT drm_mm. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:32:14 +01:00
Ben Widawsky	c7c48dfdff	drm/i915: Add VM to context Pretty straightforward so far except for the bit about the refcounting. The PPGTT will potentially be shared amongst multiple contexts. Because contexts themselves have a refcounted lifecycle, the easiest way to manage this will be to refcount the PPGTT. To achieve this, we piggy back off of the existing context refcount, and will increment and decrement the PPGTT refcount with context creation, and destruction. To put it more clearly, if context A, and context B both use PPGTT 0, we can't free the PPGTT until both A, and B are destroyed. Note that because the PPGTT is permanently pinned (for now), it really just matters for the PPGTT destruction, as opposed to making space under memory pressure. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:31:20 +01:00
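The refcounting scheme in c7c48dfdff can be sketched in plain C as follows. This is a user-space illustration under stated assumptions, not the kernel implementation: the `sketch_` types and functions and the live-object counter are hypothetical, and a real driver would use `struct kref` and proper locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical counter so callers can observe when a PPGTT is really freed. */
static int sketch_live_ppgtts;

/* Hypothetical address space, shared by any number of contexts. */
struct sketch_ppgtt {
	unsigned int refcount;
};

/* Hypothetical context holding one reference on its address space. */
struct sketch_ctx {
	struct sketch_ppgtt *vm;
};

/* Allocate a PPGTT with no users yet; the first context takes the first ref. */
static struct sketch_ppgtt *sketch_ppgtt_create(void)
{
	struct sketch_ppgtt *vm = calloc(1, sizeof(*vm));

	if (vm)
		sketch_live_ppgtts++;
	return vm;
}

/* Context creation increments the PPGTT refcount. */
static struct sketch_ctx *sketch_ctx_create(struct sketch_ppgtt *vm)
{
	struct sketch_ctx *ctx;

	assert(vm != NULL);
	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		return NULL;
	ctx->vm = vm;
	vm->refcount++;
	return ctx;
}

/* Context destruction drops the ref; the last user frees the PPGTT. */
static void sketch_ctx_destroy(struct sketch_ctx *ctx)
{
	struct sketch_ppgtt *vm = ctx->vm;

	assert(vm->refcount > 0);
	if (--vm->refcount == 0) {
		free(vm);
		sketch_live_ppgtts--;
	}
	free(ctx);
}
```

If contexts A and B both use the same PPGTT, destroying A leaves the PPGTT alive, and it is released only when B is destroyed as well.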
Ben Widawsky	a45d0f6a7f	drm/i915: Generalize default context setup The plan is to make every file descriptor have a default context. To accommodate this, generalize the default context setup function so it can be used at file open time. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:56 +01:00
Ben Widawsky	2fa48d8d4a	drm/i915: Split context enabling from init We need to do this for exactly 1 reason, because we want to embed a PPGTT into the context, but we don't want to special case the default context. To achieve that, we must be able to initialize contexts after the GTT is setup (so we can allocate and pin the default context's BO), but before the PPGTT and rings are initialized. This is because, currently, context initialization requires ring usage. We don't have rings until after the GTT is setup. If we split the enabling part of context initialization, the part requiring the ringbuffer, we can untangle this, and then later embed the PPGTT. Incidentally, this allows us to also adhere to the original design of context init/fini in future patches: they were only ever meant to be called at driver load and unload. v2: Move hw_contexts_disabled test in i915_gem_context_enable() (Chris) v3: BUG_ON after checking for disabled contexts. Or else it blows up pre gen6 (Ben) v4: Forward port Modified enable for each ring, since that patch is earlier in the series Dropped ring arg from create_default_context so it can be used by others Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:55 +01:00
Ben Widawsky	acce9ffa48	drm/i915: Better reset handling for contexts This patch adds two changes for contexts on reset: Sets last context to default - this will prevent the context switch happening after a reset. That switch is not possible because the rings are hung during reset and context switch requires reset. This behavior will need to be reworked in the future, but this is what we want for now. In the future, we'll also want to reset the guilty context to uninitialized. We should wait for ARB_Robustness related code to land for that. This is somewhat for paranoia. Because we really don't know what the GPU was doing when it hung, or the state it was in (mid context write, for example), later restoring the context is a bad idea. By setting the flag to not initialized, the next load of that context will not restore the state, and thus on the subsequent switch away from the context will overwrite the old data. NOTE: This code needs a fixup when we actually have multiple VMs. The issue that can occur is inactive objects in a VM will need to be destroyed before the last context unref. This can now happen via the fake switch introduced in this patch (and in other ways in the future) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:54 +01:00
Ben Widawsky	0009e46cd5	drm/i915: Track which ring a context ran on Previously we dropped the association of a context to a ring. It is however very important to know which ring a context ran on (we could have reused the other member, but I was nitpicky). This is very important when we switch address spaces, which unlike context objects, do change per ring. As an example, if we have: RCS BCS ctx A ctx A ctx B ctx B Without tracking the last ring B ran on, we wouldn't know to switch the address space on BCS in the last row. As a result, we no longer need to track which ring a context "belongs" to, as it never really made much sense anyway. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:54 +01:00
Ben Widawsky	67e3d2979b	drm/i915: Permit contexts on all rings If we want to use contexts in more abstract terms (specifically with PPGTT in mind), we need to allow them to be specified for any ring. Since the upcoming patches will bring about the use of multiple address spaces, and each ring needs to have an address space programmed (which we intend to do at context switch time), we can no longer only use RCS. With multiple rings having a last context, we must now unreference these contexts. NOTE: This commit requires an update to intel-gpu-tools to make it not fail. v2: Rebased with some logical conflicts. Squashed in the context fini refcount patch Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:53 +01:00
Ben Widawsky	b731d33d05	drm/i915: relax context alignment With the introduction of contexts per fd in the future, one can easily envision more contexts being used. While we do not have an easy remedy to reduce the space requirements of the contexts, we can make things slightly better by using less stringent alignments on later hardware. Ville: Since I can almost predict you'll point this out. I can no longer find the docs which specify the 64k requirement on certain gen6 SKUs. If you'd like to change that too, be my guest. CC: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:52 +01:00
Ben Widawsky	e422b888eb	drm/i915: Add a context open function We'll be doing a bit more stuff with each file, so having our own open function should make things clean. This also allows us to easily add conditionals for stuff we don't want to do when we don't have HW contexts. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:51 +01:00
Ben Widawsky	6f65e29aca	drm/i915: Create bind/unbind abstraction for VMAs To sum up what goes on here, we abstract the vma binding, similarly to the previous object binding. This helps for distinguishing legacy binding, versus modern binding. To keep the code churn as minimal as possible, I am leaving in insert_entries(). It serves as the per platform pte writing basically. bind_vma and insert_entries do share a lot of similarities, and I did have designs to combine the two, but as mentioned already... too much churn in an already massive patchset. What follows are the 3 commits which existed discretely in the original submissions. Upon rebasing on Broadwell support, it became clear that separation was not good, and only made for more error prone code. Below are the 3 commit messages with all their history. drm/i915: Add bind/unbind object functions to VMA drm/i915: Use the new vm [un]bind functions drm/i915: reduce vm->insert_entries() usage drm/i915: Add bind/unbind object functions to VMA As we plumb the code with more VM information, it has become more obvious that the easiest way to deal with bind and unbind is to simply put the function pointers in the vm, and let those choose the correct way to handle the page table updates. This change allows many places in the code to simply be vm->bind, and not have to worry about distinguishing PPGTT vs GGTT. Notice that this patch has no impact on functionality. I've decided to save the actual change until the next patch because I think it's easier to review that way. I'm happy to squash the two, or let Daniel do it on merge. v2: Make ggtt handle the quirky aliasing ppgtt Add flags to bind object to support above Don't ever call bind/unbind directly for PPGTT until we have real, full PPGTT (use NULLs to assert this) Make sure we rebind the ggtt if there already is a ggtt binding. This happens on set cache levels. 
Use VMA for bind/unbind (Daniel, Ben) v3: Reorganize ggtt_vma_bind to be more concise and easier to read (Ville). Change logic in unbind to only unbind ggtt when there is a global mapping, and to remove a redundant check if the aliasing ppgtt exists. v4: Make the bind function a bit smarter about the cache levels to avoid unnecessary multiple remaps. "I accept it is a wart, I think unifying the pin_vma / bind_vma could be unified later" (Chris) Removed the git notes, and put version info here. (Daniel) v5: Update the comment to not suck (Chris) v6: Move bind/unbind to the VMA. It makes more sense in the VMA structure (always has, but I was previously lazy). With this change, it will allow us to keep a distinct insert_entries. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> drm/i915: Use the new vm [un]bind functions Building on the last patch which created the new function pointers in the VM for bind/unbind, here we actually put those new function pointers to use. Split out as a separate patch to aid in review. I'm fine with squashing into the previous patch if people request it. v2: Updated to address the smart ggtt which can do aliasing as needed Make sure we bind to global gtt when mappable and fenceable. I thought we could get away without this initialy, but we cannot. v3: Make the global GTT binding explicitly use the ggtt VM for bind_vma(). While at it, use the new ggtt_vma helper (Chris) At this point the original mailing list thread diverges. ie. v4^: use target_obj instead of obj for gen6 relocate_entry vma->bind_vma() can be called safely during pin. So simply do that instead of the complicated conditionals. 
Don't restore PPGTT bound objects on resume path Bug fix in resume path for globally bound BOs Properly handle secure dispatch Rebased on vma bind/unbind conversion Signed-off-by: Ben Widawsky <ben@bwidawsk.net> drm/i915: reduce vm->insert_entries() usage FKA: drm/i915: eliminate vm->insert_entries() With bind/unbind function pointers in place, we no longer need insert_entries. We could, and want, to remove clear_range, however it's not totally easy at this point. Since it's used in a couple of places still that don't only deal in objects: setup, ppgtt init, and restore gtt mappings. v2: Don't actually remove insert_entries, just limit its usage. It will be useful when we introduce gen8. It will always be called from the vma bind/unbind. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:50 +01:00
Ben Widawsky	d7f46fc4e7	drm/i915: Make pin count per VMA Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:49 +01:00
Daniel Vetter	acc240d41e	drm/i915: Fix use-after-free in do_switch So apparently under ridiculous amounts of memory pressure we can get into trouble in do_switch when we try to move the old hw context backing storage object onto the active lists. With list debugging enabled that usually results in us chasing a poisoned pointer - which means we've hit upon a vma that has been removed from all lrus with list_del (and then deallocated, so it's a real use-after free). Ian Lister has done some great callchain chasing and noticed that we can reenter do_switch: i915_gem_do_execbuffer() i915_switch_context() do_switch() from = ring->last_context; i915_gem_object_pin() i915_gem_object_bind_to_gtt() ret = drm_mm_insert_node_in_range_generic(); // If the above call fails then it will try i915_gem_evict_something() // If that fails it will call i915_gem_evict_everything() ... i915_gem_evict_everything() i915_gpu_idle() i915_switch_context(DEFAULT_CONTEXT) Like with everything else where the shrinker or eviction code can invalidate pointers we need to reload relevant state. Note that there's no need to recheck whether a context switch is still required because: - Doing a switch to the same context is harmless (besides wasting a bit of energy). - This can only happen with the default context. But since that one's pinned we'll never call down into evict_everything under normal circumstances. Note that there's a little driver bringup fun involved namely that we could recourse into do_switch for the initial switch. Atm we're fine since we assign the context pointer only after the call to do_switch at driver load or resume time. And in the gpu reset case we skip the entire setup sequence (which might be a bug on its own, but definitely not this one here). Cc'ing stable since apparently ChromeOS guys are seeing this in the wild (and not just on artificial stress tests), see the reference. 
Note that upstream code doesn't call evict_everything directly from evict_something; that's an extension in this product branch. But we can still hit upon this bug (and apparently we do, see the linked backtraces). I've noticed this while trying to construct a testcase for this bug and utterly failed to provoke it. It looks like we need to drive the system squarely into the lowmem wall and provoke the shrinker to evict the context object by doing the last-ditch evict_everything call. Aside: There's currently no means to get a badly-fragmenting hw context object away from a bad spot in the upstream code. We should fix this by at least adding some code to evict_something to handle hw contexts. References: https://code.google.com/p/chromium/issues/detail?id=248191 Reported-by: Ian Lister <ian.lister@intel.com> Cc: Ian Lister <ian.lister@intel.com> Cc: stable@vger.kernel.org Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Stéphane Marchesin <marcheu@chromium.org> Cc: Bloomfield, Jon <jon.bloomfield@intel.com> Tested-by: Rafael Barbalho <rafael.barbalho@intel.com> Reviewed-by: Ian Lister <ian.lister@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-06 13:09:11 +01:00
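The fix pattern in acc240d41e (reload state after any call that can re-enter the switch path) can be modelled with a toy example. Everything here is an assumption for illustration: contexts are plain integers, `sketch_pin()` stands in for the pin call that may evict under memory pressure, and the re-entrant switch to the default context is simulated by a direct assignment.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_DEFAULT_CTX 0

/* Hypothetical ring state: only the last context id matters in this model. */
struct sketch_ring {
	int last_context;
};

/*
 * Stand-in for pinning the new context. Under memory pressure the eviction
 * path idles the GPU, which switches the ring to the default context behind
 * the caller's back.
 */
static int sketch_pin(struct sketch_ring *ring, int under_pressure)
{
	if (under_pressure)
		ring->last_context = SKETCH_DEFAULT_CTX;
	return 0;
}

/*
 * Switch to context 'to' and return the context switched away from.
 * 'from' is read only after sketch_pin() returns, because a value cached
 * before the call could be stale once the re-entrant switch has run.
 */
static int sketch_do_switch(struct sketch_ring *ring, int to, int under_pressure)
{
	int from;

	assert(ring != NULL);
	if (sketch_pin(ring, under_pressure))
		return -1;

	from = ring->last_context;
	ring->last_context = to;
	return from;
}
```

Reading `from` before the pin would hand back a context that the nested switch has already retired, which corresponds to the stale pointer the commit message describes.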
Chris Wilson	0d1430a3f4	drm/i915: Hold mutex across i915_gem_release In order to serialise the closing of the file descriptor and its subsequent release of client requests with i915_gem_free_request(), we need to hold the struct_mutex in i915_gem_release(). Failing to do so has the potential to trigger an OOPS, later with a use-after-free. Testcase: igt/gem_close_race Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70874 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71029 Reported-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-04 16:57:02 +01:00
Ben Widawsky	2f88542607	drm/i915: Remove defunct ctx switch comments Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-26 10:12:16 +01:00
Daniel Vetter	c09cd6e969	Merge branch 'backlight-rework' into drm-intel-next-queued Pull in Jani's backlight rework branch. This was merged through a separate branch to be able to sort out the Broadwell conflicts properly before pulling it into the main development branch. Conflicts: drivers/gpu/drm/i915/intel_display.c Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-15 10:02:39 +01:00
Ben Widawsky	8897644a6d	drm/i915/bdw: HW context support BDW context sizes vary a bit. v2: Squash in fixup for the hw context size from Ben. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:37 +01:00
Ben Widawsky	8245be3139	drm/i915: Require HW contexts (when possible) v2: Fixed the botched locking on init_hw failure in i915_reset (Ville) Call cleanup_ringbuffer on failed context create in init_hw (Ville) v3: Add dev argument to clean_ringbuffer Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-07 09:35:44 +01:00
Ben Widawsky	71b76d004f	drm/i915: cleanup context fini I had this lying around from the original PPGTT series, and thought we might try to get it in by itself. With the introduction of context refcounting we never explicitly ref/unref the backing object. As such, the previous fix was a bit wonky. Aside from fixing the above, this patch also puts us in good shape for an upcoming patch which allows a failure to occur in between context_init and the first do_switch. CC: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 11:08:30 +02:00
Ben Widawsky	e2d05a8b1e	drm/i915: Convert active API to VMA Even though we track object activity and not VMA, because we have the active_list be based on the VM, it makes the most sense to use VMAs in the APIs. NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for now, leave them be. v2: Remove leftover hunk from the previous patch which didn't keep i915_gem_object_move_to_active. That patch had to rely on the ring to get the dev instead of the obj. (Chris) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:21 +02:00
Ben Widawsky	3ccfd19dea	drm/i915: Do remaps for all contexts On both Ivybridge and Haswell, row remapping information is saved and restored with context. This means, we never actually properly supported the l3 remapping because our sysfs interface is asynchronous (and not tied to any context), and the known faulty HW would be reused by the next context to run. Note that, due to the asynchronous nature of the sysfs entry, there is no point modifying the registers for the existing context. Instead we set a flag for all contexts to load the correct remapping information on the next run. Interested clients can use debugfs to determine whether or not the row has been remapped. One could propose at this point that we just do the remapping in the kernel. I guess since we have to maintain the sysfs interface anyway, I'm not sure how useful it is, and I do like keeping the policy in userspace; (it wasn't my original decision to make the interface the way it is, so I'm not attached). v2: Force a context switch when we have a remap on the next switch. (Ville) Don't let userspace use the interface with disabled contexts. v3: Don't force a context switch, just let it nop Improper context slice remap initialization, 1<<1 instead of 1<<i, but I rewrote it to avoid a second round of confusion. Error print moved to error path (All Ville) Added a comment on why the slice remap initialization happens. CC: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:39:56 +02:00
Ben Widawsky	a33afea5ff	drm/i915: Keep a list of all contexts I have implemented this patch before without creating a separate list (I'm having trouble finding the links, but the message IDs are: <1364942743-6041-2-git-send-email-ben@bwidawsk.net> <1365118914-15753-9-git-send-email-ben@bwidawsk.net>) However, the code is much simpler to just use a list and it makes the code from the next patch a lot more pretty. As you'll see in the next patch, the reason for this is to be able to specify when a context needs to get L3 remapping. More details there. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:39:43 +02:00
Damien Lespiau	508842a036	drm/i915: It's its! Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:54 +02:00
Chris Wilson	c0321e2c5a	drm/i915: Do not add an interrupt for a context switch We use the request to ensure we hold a reference to the context for the duration that it remains in use by the ring. Each request only holds a reference to the current context, hence we emit a request after switching contexts with the final reference to the old context. However, the extra interrupt caused by that request is not useful (no timing critical function will wait for the context object), instead the overhead of servicing the IRQ shows up in some (lightweight) benchmarks. In order to keep the useful property of using the request to manage the context lifetime, we want to add a dummy request that is associated with the interrupt from the subsequent real request following the batch. The extra interrupt was added as a side-effect of using i915_add_request() in commit `112522f678` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu May 2 16:48:07 2013 +0300 drm/i915: put context upon switching v2: Daniel convinced me that the request here was solely for context lifetime tracking and that we have the active ref to keep the object alive whilst the MI_SET_CONTEXT. So the only concern then is which context should get the blame for MI_SET_CONTEXT failing. The old scheme added a request for the old context so that any hang upto and including the switch away would mark the old context as guilty. Now any hang here implicates the new context. However since we have already gone through a complete flush with the last context in its last request, and all that lies in no-man's-land is an invalidate flush and the MI_SET_CONTEXT, we should be safe in not unduly placing blame on the new context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:53 +02:00
Ben Widawsky	ca191b1313	drm/i915: mm_list is per VMA formerly: "drm/i915: Create VMAs (part 5) - move mm_list" The mm_list is used for the active/inactive LRUs. Since those LRUs are per address space, the link should be per VMA. Because we'll only ever have 1 VMA before this point, it's not incorrect to defer this change until this point in the patch series, and doing it here makes the change much easier to understand. Shamelessly manipulated out of Daniel: "active/inactive stuff is used by eviction when we run out of address space, so needs to be per-vma and per-address space. Bound/unbound otoh is used by the shrinker which only cares about the amount of memory used and not one bit about in which address space this memory is all used in. Of course to actually kick out an object we need to unbind it from every address space, but for that we have the per-object list of vmas." v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris) v3: Moved earlier in the series v4: Add dropped message from v3 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Frob patch to apply and use vma->node.size directly as discussed with Ben. Also drop a needless BUG_ON before move_to_inactive, the function itself has the same check.] [danvet 2nd: Rebase on top of the lost "drm/i915: Cleanup more of VMA in destroy", specifically unlink the vma from the mm_list in vma_unbind (to keep it symmetric with bind_to_vm) instead of vma_destroy.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-08 14:06:58 +02:00
Chris Wilson	350ec881d9	drm/i915: Rename I915_CACHE_MLC_LLC to L3_LLC for Ivybridge MLC_LLC was never validated for Sandybridge and was superseded by a new level of cacheing for the GPU in Ivybridge. Update our names to be consistent with usage, and in the process stop setting the unwanted bit on Sandybridge. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> [danvet: s/BUG/WARN_ON(1) bikeshed.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-06 16:35:30 +02:00
Ben Widawsky	c37e220461	drm/i915: Add VM to pin To verbalize it, one can say, "pin an object into the given address space." The semantics of pinning remain the same otherwise. Certain objects will always have to be bound into the global GTT. Therefore, global GTT is a special case, and keep a special interface around for it (i915_gem_obj_ggtt_pin). v2: s/i915_gem_ggtt_pin/i915_gem_obj_ggtt_pin Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:09 +02:00
Chris Wilson	11fa338404	drm/i915: Fix retrieval of hangcheck stats The default context is always supported (as it contains the global hangcheck stats) and the contexts for hangcheck are not limited to any ring. References: https://bugs.freedesktop.org/show_bug.cgi?id=65845 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-16 10:40:25 +02:00
Ben Widawsky	f343c5f647	drm/i915: Getter/setter for object attributes Soon we want to gut a lot of our existing assumptions how many address spaces an object can live in, and in doing so, embed the drm_mm_node in the object (and later the VMA). It's possible in the future we'll want to add more getter/setter methods, but for now this is enough to enable the VMAs. v2: Reworked commit message (Ben) Added comments to the main functions (Ben) sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch] sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch] sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch] sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch] (Daniel) v3: Rebased on new reserve_node patch Changed DRM_DEBUG_KMS to actually work (will need fixing later) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:34 +02:00
Ben Widawsky	a0de80a0e0	drm/i915: Fix context sizes on HSW With updates to the spec, we can actually see the context layout, and how many dwords are allocated. That table suggests we need 70720 bytes per HW context. Rounded up, this is 18 pages. Looking at what lives after the current 4 pages we use, I can't see too much important (mostly it's d3d related), but there are a couple of things which look scary. I am hopeful this can explain some of our odd HSW failures. v2: Make the context only 17 pages. The power context space isn't used ever, and execlists aren't used in our driver, making the actual total 66944 bytes. v3: Add a comment to the code. (Jesse & Paulo) Reported-by: "Azad, Vinit" <vinit.azad@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-01 11:14:54 +02:00
Mika Kuoppala	0025c0772d	drm/i915: change i915_add_request to macro Only execbuffer needed all the parameters on i915_add_request(). By putting __i915_add_request behind macro, all current callsites become cleaner. Following patch will introduce a new parameter for __i915_add_request. With this patch, only the relevant callsite will reflect the change making commit smaller and easier to understand. v2: _i915_add_request as function name (Chris Wilson) v3: change name __i915_add_request and fix ordering of params (Ben Widawsky) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-13 17:42:15 +02:00
Mika Kuoppala	c0bb617a70	drm/i915: add i915_gem_context_get_hang_stats() To get context hang statistics for specified context, add i915_gem_context_get_hang_stats(). For arb-robustness, every context needs to have its own hang statistics tracking. Added function will return the user specified context statistics or in case of default context, statistics from drm_i915_file_private. v2: handle default context inside get_reset_state v3: return struct pointer instead of passing it in as param (Chris Wilson) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-13 17:42:15 +02:00
Ben Widawsky	bb0364130f	drm/i915: context debug messages Add some debug messages to help figure out what goes wrong on context initialization. Later in the PPGTT series, I ended up having a lot of failures after reset. In many cases it was extra difficult to debug because I hadn't even realized that contexts failed to reinitialize after reset (again an artifact of some later patches). This fairly benign patch does help debug some potential issues which arise later. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:53:58 +02:00
Damien Lespiau	8693a82487	drm/i915: Add references to some workaround we implement We did not mention the workaround name when implementing those. This should help us track what we already implement. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-10 21:56:34 +02:00
Ben Widawsky	186507e9e8	drm/i915: Assert mutex_is_locked on context lookup Because our context refcounting doesn't grab a ref at lookup time, it is unsafe to do so without the lock. NOTE: We don't have an easy way to put the assertion in the lookup function which is where this really belongs. Context switching is good enough because it actually asserts even more correctness by protecting the default_context. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: s/BUG/WARN/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-06 11:30:29 +02:00
Chris Wilson	112522f678	drm/i915: put context upon switching In order to be notified of when the context and all of its associated objects is idle (for if the context maps to a ppgtt) we need a callback from the retire handler. We can arrange this by using the kref_get/put of the context for request tracking and by inserting a request to demarque the switch away from the old context. [Ben: fixed minor error to patch compile, AND s/last_context/from/] Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-06 11:20:48 +02:00
Mika Kuoppala	168f836602	drm/i915: unreference default context on module unload Before module unload is called, gpu_idle() will switch to default context. This will increment ref count of base object as the default context is 'running' on module unload time. Unreference the drm object so that when context is freed, base object is freed as well. v2: added comment to explain the refcounts (Ben Widawsky) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-03 18:19:56 +02:00
Mika Kuoppala	dce3271b1e	drm/i915: reference count for i915_hw_contexts Enabling PPGTT and also the need to track which context was guilty of gpu hang (arb robustness enabling) have put pressure for struct i915_hw_context to be more than just a placeholder for hw context state. In order to track object lifetime properly in a multi peer usage, add reference counting for i915_hw_context. v2: track i915_hw_context pointers instead of using ctx_ids (from Chris Wilson) v3 (Ben): Get rid of do_release() and handle refcounting more compactly. (recommended by Chis) v4: kref_* put inside static inlines (Daniel Vetter) remove code duplication on freeing context (Chris Wilson) v5: idr_remove and ctx->file_priv = NULL in destroy ioctl (Chris) This actually will cause a problem if one destroys a context and later refers to the idea of the context (multiple contexts may have the same id, but only 1 will exist in the idr). v6: Strip out the request related stuff. Reworded commit message. Got rid of do_destroy and introduced i915_gem_context_release_handle, suggested by Chris Wilson. v7: idr_remove can't be called inside idr_for_each (Chris Wilson) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v5) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v7) Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Squash sob lines, the patch ping-ponged between Ben and Mika a bit ...] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-30 23:40:13 +02:00
Chris Wilson	4615d4c9e2	drm/i915: Use MLC (l3$) for context objects Enabling context support increases SwapBuffers latency by about 20% (measured on an i7-3720qm). We can offset that loss slightly by enabling faster caching for the contexts. As they are not backed by any particular cache (such as the sampler or render caches) our only option is to select the generic mid-level cache. This reduces the latency of the swap by about 5%. Oddly this effect can be observed running smokin-guns on IVB at 1280x1024: Using BLT copies for swaps: 151.67 fps Using Render copies for swaps (unpatched): 141.70 fps With contexts disabled: 150.23 fps With contexts in L3$: 150.77 fps Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:11 +02:00

284 Commits