OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Chris Wilson	381b943b07	drm/i915: Remove i915_address_space.start Once upon a time, back in the UMS days, we supported userspace initialising the GTT and sharing portions of the GTT with other users. Now, we own the GTT (both global and per-process) and the tables always start at 0 - so we can remove i915_address_space.start and forget about this old complication. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-20-chris@chris-wilson.co.uk	2017-02-15 10:07:32 +00:00
Chris Wilson	998f6c00a1	drm/i915: Remove unused ppgtt->enable() We never assign or use the ppgtt->enable() callback, so remove it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-19-chris@chris-wilson.co.uk	2017-02-15 10:07:31 +00:00
Chris Wilson	c5d092a429	drm/i915: Remove bitmap tracking for used-pml4 We only operate on known extents (both for alloc/clear) and so we can use both the knowledge of the bind/unbind range along with the knowledge of the existing pagetable to avoid having to allocate temporary and auxiliary bitmaps. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-15-chris@chris-wilson.co.uk	2017-02-15 10:07:28 +00:00
Chris Wilson	e2b763caa6	drm/i915: Remove bitmap tracking for used-pdpes We only operate on known extents (both for alloc/clear) and so we can use both the knowledge of the bind/unbind range along with the knowledge of the existing pagetable to avoid having to allocate temporary and auxiliary bitmaps. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-14-chris@chris-wilson.co.uk	2017-02-15 10:07:27 +00:00
Chris Wilson	fe52e37fa8	drm/i915: Remove bitmap tracking for used-pdes We only operate on known extents (both for alloc/clear) and so we can use both the knowledge of the bind/unbind range along with the knowledge of the existing pagetable to avoid having to allocate temporary and auxiliary bitmaps. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-13-chris@chris-wilson.co.uk	2017-02-15 10:07:26 +00:00
Chris Wilson	dd19674bac	drm/i915: Remove bitmap tracking for used-ptes We only operate on known extents (both for alloc/clear) and so we can use both the knowledge of the bind/unbind range along with the knowledge of the existing pagetable to avoid having to allocate temporary and auxiliary bitmaps. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99295 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-12-chris@chris-wilson.co.uk	2017-02-15 10:07:24 +00:00
Chris Wilson	8448661d65	drm/i915: Convert clflushed pagetables over to WC maps We flush the entire page every time we update a few bytes, making the update of a page table many, many times slower than is required. If we create a WC map of the page for our updates, we can avoid the clflush but incur additional cost for creating the pagetable. We amoritize that cost by reusing page vmappings, and only changing the page protection in batches. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-02-15 10:07:18 +00:00
Chris Wilson	6cde9a02e0	drm/i915: Extract aliasing ppgtt setup In order to force testing of the aliasing ppgtt, extract its initialisation function. v2: Also extract the cleanup function for symmetry. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-39-chris@chris-wilson.co.uk	2017-02-13 20:46:45 +00:00
Chris Wilson	949e8ab3a9	drm/i915: Use the size/type of address space to make decisions Once the address space has been created (using 3 or 4 levels of page tables), we should use that to program the appropriate type into the contexts. This gives us the flexibility to handle different types of address spaces at runtime. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170209144036.23664-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-02-09 17:09:27 +00:00
Chris Wilson	47a8e3f6ae	drm/i915: Eliminate superfluous i915_ggtt_view_normal Since commit `058d88c433` ("drm/i915: Track pinned VMA"), there is only one user of i915_ggtt_view_normal rodate. Just treat NULL as no special view in pin_to_display() like everywhere else. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-7-chris@chris-wilson.co.uk	2017-01-14 16:18:36 +00:00
Chris Wilson	7b92c047ba	drm/i915: Eliminate superfluous i915_ggtt_view_rotated It is only being used to clear a struct and set the type, after which it is overwritten. Since we no longer check the unset bits of the union, skipping the clear is permissible. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-6-chris@chris-wilson.co.uk	2017-01-14 16:18:35 +00:00
Chris Wilson	8bab1193c1	drm/i915: Convert i915_ggtt_view to use an anonymous union Reading the ggtt_views is much more pleasant without the extra characters from specifying the union (i.e. ggtt_view.partial rather than ggtt_view.params.partial). To make this work inside i915_vma_compare() with only a single memcmp requires us to ensure that there are no uninitialised bytes within each branch of the union (we make sure the structs are packed) and we need to store the size of each branch. v4: Rewrite changelog and add comments explaining the assert. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-5-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-01-14 16:18:03 +00:00
Chris Wilson	992e418dd9	drm/i915: Compact memcmp in i915_vma_compare() In preparation for the next patch to convert to using an anonymous union and leaving the excess bytes in the union uninitialised, we first need to make sure we do not compare using those uninitialised bytes. We also want to preserve the compactness of the code, avoiding a second call to memcmp or introducing a switch, so we take advantage of using the type as an encoded size (as well as a unique identifier for each type of view). v2: Add the rationale for why we encode size into ggtt_view.type as a comment before the memcmp() v3: Use a switch to also assert that no two i915_ggtt_view_type have the same value. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-3-chris@chris-wilson.co.uk	2017-01-14 16:17:44 +00:00
Chris Wilson	8d9046ad5d	drm/i915: Mark the ggtt_view structs as packed In the next few patches, we will depend upon there being no uninitialised bits inside the ggtt_view. To ensure this we add the __packed attribute and double check with a build bug that gcc hasn't expanded the struct to include some padding bytes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-2-chris@chris-wilson.co.uk	2017-01-14 16:17:44 +00:00
Chris Wilson	7ff19c560f	drm/i915: Name the anonymous structs inside i915_ggtt_view Naming this pair will become useful shortly... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-1-chris@chris-wilson.co.uk	2017-01-14 16:17:43 +00:00
Chris Wilson	0c7eeda1af	drm/i915: Move i915_ppgtt_close() into i915_gem_gtt.c Move it alongside its ppgtt counterparts, in order to make it available for the ppgtt selftests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170111210937.29252-26-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-01-13 16:35:27 +00:00
Chris Wilson	7c3f86b6dc	drm/i915: Invalidate the guc ggtt TLB upon insertion Move the GuC invalidation of its ggtt TLB to where we perform the ggtt modification rather than proliferate it into all the callers of the insert (which may or may not in fact have to do the insertion). v2: Just do the guc invalidate unconditionally, (afaict) it has no impact without the guc loaded on gen8+ v3: Conditionally invalidate the guc - just in case that register has not been validated for other modes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170112110050.25333-1-chris@chris-wilson.co.uk	2017-01-12 16:47:04 +00:00
Chris Wilson	625d988acc	drm/i915: Extract reserving space in the GTT to a helper Extract drm_mm_reserve_node + calling i915_gem_evict_for_node into its own routine so that it can be shared rather than duplicated. v2: Kerneldoc Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: igvt-g-dev@lists.01.org Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-2-chris@chris-wilson.co.uk	2017-01-11 12:28:13 +00:00
Chris Wilson	e007b19d7b	drm/i915: Use the MRU stack search after evicting When we evict from the GTT to make room for an object, the hole we create is put onto the MRU stack inside the drm_mm range manager. On the next search pass, we can speed up a PIN_HIGH allocation by referencing that stack for the new hole. v2: Pull together the 3 identical implements (ahem, a couple were outdated) into a common routine for allocating a node and evicting as necessary. v3: Detect invalid calls to i915_gem_gtt_insert() v4: kerneldoc Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-1-chris@chris-wilson.co.uk	2017-01-11 12:25:15 +00:00
Chris Wilson	f51455d442	drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE Start converting over from the byte count to its semantic macro, either we want to allocate the size of a physical page in main memory or we want the size of a virtual page in the GTT. 4096 could mean either, but PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve code comprehension and future changes. In the future, we may want to use variable GTT page sizes and so have the challenge of knowing which hardcoded values were used to represent a physical page vs the virtual page. v2: Look for a few more 4096s to convert, discover IS_ALIGNED(). v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep bdw stolen w/a as 4096 until we know better. v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page sizes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-01-10 20:54:32 +00:00
Chris Wilson	edd1f2fe11	drm/i915: Use fixed-sized types for stolen Stolen memory is a hardware resource of known size, so use an accurate fixed integer type rather than the ambiguous variable size_t. This was motivated by the next patch spotting inconsistencies in our types. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-3-chris@chris-wilson.co.uk	2017-01-06 16:02:09 +00:00
Paulo Zanoni	3c6b29b2df	drm/i915: fully apply WaSkipStolenMemoryFirstPage Don't even tell the mm allocator to handle the first page of stolen on the affected platforms. This means that we won't inherit the FB in case the BIOS decides to put it at the start of stolen. But the BIOS should not be putting it at the start of stolen since it's going to get corrupted. I suppose the bug here is that some pixels at the very top of the screen will be corrupted, so it's not exactly easy to notice. We have confirmation that the first page of stolen does actually get corrupted, so I really think we should do this in order to avoid any possible future headaches, even if that means losing BIOS framebuffer inheritance. Let's not use the HW in a way it's not supposed to be used. Notice that now ggtt->stolen_usable_size won't reflect the ending address of the stolen usable range anymore, so we have to fix the places that rely on this. To simplify, we'll just use U64_MAX. v2: don't even put the first page on the mm (Chris) v3: drm_mm_init() takes size instead of end as argument (Ville) v4: add a comment explaining the reserved ranges (Chris) use 0 for start and U64_MAX for end when possible (Chris) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94605 Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1481808235-27607-1-git-send-email-paulo.r.zanoni@intel.com	2016-12-20 10:45:33 -02:00
Chris Wilson	49d73912cb	drm/i915: Convert vm->dev backpointer to vm->i915 99% of the time we access i915_address_space->dev we want the i915 device and not the drm device, so let's store the drm_i915_private backpointer instead. The only real complication here are the inlines in i915_vma.h where drm_i915_private is not yet defined and so we have to choose an alternate path for our asserts. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161129095008.32622-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-11-29 11:38:00 +00:00
Tvrtko Ursulin	275a991c03	drm/i915: dev_priv cleanup in i915_gem_gtt.c Started with removing INTEL_INFO(dev) and cascaded into a quite big trickle of function prototype changes. Still, I think it is for the better. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-11-17 13:56:12 +00:00
Tvrtko Ursulin	c6be607abc	drm/i915: dev_priv and a small cascade of cleanups in i915_gem.c Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-11-17 13:55:26 +00:00
Joonas Lahtinen	b42fe9ca0a	drm/i915: Split out i915_vma.c As a side product, had to split two other files; - i915_gem_fence_reg.h - i915_gem_object.h (only parts that needed immediate untanglement) I tried to move code in as big chunks as possible, to make review easier. i915_vma_compare was moved to a header temporarily. v2: - Use i915_gem_fence_reg.{c,h} v3: - Rebased v4: - Fix building when DEBUG_GEM is enabled by reordering a bit. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com	2016-11-11 14:34:54 +02:00
Chris Wilson	db6c2b4151	drm/i915: Store the vma in an rbtree under the object With full-ppgtt one of the main bottlenecks is the lookup of the VMA underneath the object. For execbuf there is merit in having a very fast direct lookup of ctx:handle to the vma using a hashtree, but that still leaves a large number of other lookups. One way to speed up the lookup would be to use a rhashtable, but that requires extra allocations and may exhibit poor worse case behaviour. An alternative is to use an embedded rbtree, i.e. no extra allocations and deterministic behaviour, but at the slight cost of O(lgN) lookups (instead of O(1) for rhashtable). The major of such tree will be very shallow and so not much slower, and still scales much, much better than the current unsorted list. v2: Bump vma_compare() to return a long, as we return the result of comparing two pointers. References: https://bugs.freedesktop.org/show_bug.cgi?id=87726 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161101115400.15647-1-chris@chris-wilson.co.uk	2016-11-01 13:00:40 +00:00
Chris Wilson	80b204bce8	drm/i915: Enable multiple timelines With the infrastructure converted over to tracking multiple timelines in the GEM API whilst preserving the efficiency of using a single execution timeline internally, we can now assign a separate timeline to every context with full-ppgtt. v2: Add a comment to indicate the xfer between timelines upon submission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-35-chris@chris-wilson.co.uk	2016-10-28 20:53:57 +01:00
Chris Wilson	d07f0e59b2	drm/i915: Move GEM activity tracking into a common struct reservation_object In preparation to support many distinct timelines, we need to expand the activity tracking on the GEM object to handle more than just a request per engine. We already use the struct reservation_object on the dma-buf to handle many fence contexts, so integrating that into the GEM object itself is the preferred solution. (For example, we can now share the same reservation_object between every consumer/producer using this buffer and skip the manual import/export via dma-buf.) v2: Reimplement busy-ioctl (by walking the reservation object), postpone the ABI change for another day. Similarly use the reservation object to find the last_write request (if active and from i915) for choosing display CS flips. Caveats: * busy-ioctl: busy-ioctl only reports on the native fences, it will not warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is being rendered to by external fences. It also will not report the same busy state as wait-ioctl (or polling on the dma-buf) in the same circumstances. On the plus side, it does retain reporting of which i915 engines are engaged with this object. * non-blocking atomic modesets take a step backwards as the wait for render completion blocks the ioctl. This is fixed in a subsequent patch to use a fence instead for awaiting on the rendering, see "drm/i915: Restore nonblocking awaits for modesetting" * dynamic array manipulation for shared-fences in reservation is slower than the previous lockless static assignment (e.g. gem_exec_lut_handle runtime on ivb goes from 42s to 66s), mainly due to atomic operations (maintaining the fence refcounts). * loss of object-level retirement callbacks, emulated by VMA retirement tracking. * minor loss of object-level last activity information from debugfs, could be replaced with per-vma information if desired Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk	2016-10-28 20:53:50 +01:00
Chris Wilson	03ac84f183	drm/i915: Pass around sg_table to get_pages/put_pages backend The plan is to move obj->pages out from under the struct_mutex into its own per-object lock. We need to prune any assumption of the struct_mutex from the get_pages/put_pages backends, and to make it easier we pass around the sg_table to operate on rather than indirectly via the obj. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-13-chris@chris-wilson.co.uk	2016-10-28 20:53:47 +01:00
Michał Winiarski	4fb84d991e	drm/i915: Remove unused "valid" parameter from pte_encode We never used any invalid ptes, those were put in place for a possibility of doing gpu faults. However our batchbuffers are not restricted in length, so everything needs to be pointing to something and thus out-of-bounds is pointing to scratch. Remove the valid flag as it is always true. v2: Expand commit msg, patch reorder (Mika) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-1-git-send-email-michal.winiarski@intel.com	2016-10-14 12:40:32 +01:00
Chris Wilson	95374d759a	drm/i915: Always use the GTT for error capture Since the GTT provides universal access to any GPU page, we can use it to reduce our plethora of read methods to just one. It also has the important characteristic of being exactly what the GPU sees - if there are incoherency problems, seeing the batch as executed (rather than as trapped inside the cpu cache) is important. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-4-chris@chris-wilson.co.uk	2016-10-12 12:00:32 +01:00
Chris Wilson	8bcdd0f756	drm/i915: Embed the scratch page struct into each VM As the scratch page is no longer shared between all VM, and each has their own, forgo the small allocation and simply embed the scratch page struct into the i915_address_space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>	2016-08-22 12:19:52 +01:00
Chris Wilson	f7bbe7883c	drm/i915: Embed the io-mapping struct inside drm_i915_private As io_mapping.h now always allocates the struct, we can avoid that allocation and extra pointer dance by embedding the struct inside drm_i915_private Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-5-chris@chris-wilson.co.uk	2016-08-19 17:13:35 +01:00
Chris Wilson	d8923dcfa5	drm/i915: Track display alignment on VMA When using the aliasing ppgtt and pageflipping with the shrinker/eviction active, we note that we often have to rebind the backbuffer before flipping onto the scanout because it has an invalid alignment. If we store the worst-case alignment required for a VMA, we can avoid having to rebind at critical junctures. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-28-chris@chris-wilson.co.uk	2016-08-18 22:36:56 +01:00
Chris Wilson	821188778b	drm/i915: Choose not to evict faultable objects from the GGTT Often times we do not want to evict mapped objects from the GGTT as these are quite expensive to teardown and frequently reused (causing an equally, if not more so, expensive setup). In particular, when faulting in a new object we want to avoid evicting an active object, or else we may trigger a page-fault-of-doom as we ping-pong between evicting two objects. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-26-chris@chris-wilson.co.uk	2016-08-18 22:36:55 +01:00
Chris Wilson	49ef5294cd	drm/i915: Move fence tracking from object to vma In order to handle tiled partial GTT mmappings, we need to associate the fence with an individual vma. v2: A couple of silly drops replaced spotted by Joonas Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-21-chris@chris-wilson.co.uk	2016-08-18 22:36:50 +01:00
Chris Wilson	05a20d098d	drm/i915: Move map-and-fenceable tracking to the VMA By moving map-and-fenceable tracking from the object to the VMA, we gain fine-grained tracking and the ability to track individual fences on the VMA (subsequent patch). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-16-chris@chris-wilson.co.uk	2016-08-18 22:36:48 +01:00
Chris Wilson	bde13ebdab	drm/i915: Introduce i915_ggtt_offset() This little helper only exists to safely discard the upper unused 32bits of the general 64-bit VMA address - as we know that all Global GTT currently are less than 4GiB in size and so that the upper bits must be zero. In many places, we use a u32 for the global GTT offset and we want to document where we are discarding the full VMA offset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-28-git-send-email-chris@chris-wilson.co.uk	2016-08-15 11:01:14 +01:00
Chris Wilson	058d88c433	drm/i915: Track pinned VMA Treat the VMA as the primary struct responsible for tracking bindings into the GPU's VM. That is we want to treat the VMA returned after we pin an object into the VM as the cookie we hold and eventually release when unpinning. Doing so eliminates the ambiguity in pinning the object and then searching for the relevant pin later. v2: Joonas' stylistic nitpicks, a fun rebase. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-27-git-send-email-chris@chris-wilson.co.uk	2016-08-15 11:01:13 +01:00
Chris Wilson	19880c4a3f	drm/i915: Consolidate i915_vma_unpin_and_release() In a few places, we repeat a call to clear a pointer to a vma whilst unpinning and releasing a reference to its owner. Refactor those into a common function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-26-git-send-email-chris@chris-wilson.co.uk	2016-08-15 11:01:12 +01:00
Chris Wilson	8b797af189	drm/i915: Track pinned vma inside guc Since the guc allocates and pins and object into the GGTT for its usage, it is more natural to use that pinned VMA as our resource cookie. v2: Embrace naming tautology v3: Rewrite comments for guc_allocate_vma() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-12-git-send-email-chris@chris-wilson.co.uk	2016-08-15 11:00:58 +01:00
Chris Wilson	81a8aa4a6c	drm/i915: Create a VMA for an object In many places, we wish to store the VMA in preference to the object itself and so being able to create the persistent VMA is useful. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-9-git-send-email-chris@chris-wilson.co.uk	2016-08-15 11:00:55 +01:00
Chris Wilson	247177ddd5	drm/i915: Always set the vma->pages Previously, we would only set the vma->pages pointer for GGTT entries. However, if we always set it, we can use it to prettify some code that may want to access the backing store associated with the VMA (as assigned to the VMA). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-8-git-send-email-chris@chris-wilson.co.uk	2016-08-15 11:00:54 +01:00
Ville Syrjälä	6687c9062c	drm/i915: Rewrite fb rotation GTT handling Redo the fb rotation handling in order to: - eliminate the NV12 special casing - handle fb->offsets[] properly - make the rotation handling easier for the plane code To achieve these goals we reduce intel_rotation_info to only contain (for each plane) the rotated view width,height,stride in tile units, and the page offset into the object where the plane starts. Each plane is handled exactly the same way, no special casing for NV12 or other formats. We then store the computed rotation_info under intel_framebuffer so that we don't have to recompute it again. To handle fb->offsets[] we treat them as a linear offsets and convert them to x/y offsets from the start of the relevant GTT mapping (either normal or rotated). We store the x/y offsets under intel_framebuffer, and for some extra convenience we also store the rotated pitch (ie. tile aligned plane height). So for each plane we have the normal x/y offsets, rotated x/y offsets, and the rotated pitch. The normal pitch is available already in fb->pitches[]. While we're gathering up all that extra information, we can also easily compute the storage requirements for the framebuffer, so that we can check that the object is big enough to hold it. When it comes time to deal with the plane source coordinates, we first rotate the clipped src coordinates to match the relevant GTT view orientation, then add to them the fb x/y offsets. Next we compute the aligned surface page offset, and as a result we're left with some residual x/y offsets. Finally, if required by the hardware, we convert the remaining x/y offsets into a linear offset. For gen2/3 we simply skip computing the final page offset, and just convert the src+fb x/y offsets directly into a linear offset since that's what the hardware wants. After this all platforms, incluing SKL+, compute these things in exactly the same way (excluding alignemnt differences). v2: Use BIT(DRM_ROTATE_270) instead of ROTATE_270 when rotating plane src coordinates Drop some spurious changes that got left behind during development v3: Split out more changes to prep patches (Daniel) s/intel_fb->plane[].foo.bar/intel_fb->foo[].bar/ for brevity Rename intel_surf_gtt_offset to intel_fb_gtt_offset Kill the pointless 'plane' parameter from intel_fb_gtt_offset() v4: Fix alignment vs. alignment-1 when calling _intel_compute_tile_offset() from intel_fill_fb_info() Pass the pitch in tiles in stad of pixels to intel_adjust_tile_offset() from intel_fill_fb_info() Pass the full width/height of the rotated area to drm_rect_rotate() for clarity Use u32 for more offsets v5: Preserve the upper_32_bits()/lower_32_bits() handling for the fb ggtt offset (Sivakumar) v6: Rebase due to drm_plane_state src/dst rects Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-2-git-send-email-ville.syrjala@linux.intel.com Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-08-11 18:32:46 +03:00
Chris Wilson	305bc234a8	drm/i915: Make i915_vma_pin() small and inline Not only is i915_vma_pin() called for every single object on every single execbuf, it is usually a simple increment as the VMA is already bound for execution by the GPU. Rearrange the tests for unbound and pin_count overflow so that we can do the increment and test very cheaply and compact enough to inline the operation into execbuf. The trick used is to note that we can check for an overflow bit (keeping space available for it inside the flags) at the same time as checking the binding bits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-17-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:20:00 +01:00
Chris Wilson	3272db5313	drm/i915: Combine all i915_vma bitfields into a single set of flags In preparation to perform some magic to speed up i915_vma_pin(), which is among the hottest of hot paths in execbuf, refactor all the bitfields accessed by i915_vma_pin() into a single unified set of flags. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-16-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:19:59 +01:00
Chris Wilson	59bfa1248e	drm/i915: Start passing around i915_vma from execbuffer During execbuffer we look up the i915_vma in order to reserve them in the VM. However, we then do a double lookup of the vma in order to then pin them, all because we lack the necessary interfaces to operate on i915_vma - so introduce i915_vma_pin()! v2: Tidy parameter lists to remove one level of redirection in the hot path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-15-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:19:58 +01:00
Chris Wilson	20dfbde463	drm/i915: Wrap vma->pin_count accessors with small inline helpers In the next few patches, the VMA pinning API is overhauled and to reduce the churn we pull out the update to the accessors into a prep patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-14-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:19:58 +01:00
Chris Wilson	de18003328	drm/i915: Record allocated vma size Tracking the size of the VMA as allocated allows us to dramatically reduce the complexity of later functions (like inserting the VMA in to the drm_mm range manager). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-13-git-send-email-chris@chris-wilson.co.uk	2016-08-04 20:19:57 +01:00
Chris Wilson	50e046b6a0	drm/i915: Mark the context and address space as closed When the user closes the context mark it and the dependent address space as closed. As we use an asynchronous destruct method, this has two purposes. First it allows us to flag the closed context and detect internal errors if we to create any new objects for it (as it is removed from the user's namespace, these should be internal bugs only). And secondly, it allows us to immediately reap stale vma. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-27-git-send-email-chris@chris-wilson.co.uk	2016-08-04 08:09:33 +01:00
Chris Wilson	b1f788c6ac	drm/i915: Release vma when the handle is closed In order to prevent a leak of the vma on shared objects, we need to hook into the object_close callback to destroy the vma on the object for this file. However, if we destroyed that vma immediately we may cause unexpected application stalls as we try to unbind a busy vma - hence we defer the unbind to when we retire the vma. v2: Keep vma allocated until closed. This is useful for a later optimisation, but it is required now in order to handle potential recursion of i915_vma_unbind() by retiring itself. v3: Comments are important. Testcase: igt/gem_ppggtt/flink-and-close-vma-leak Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-26-git-send-email-chris@chris-wilson.co.uk	2016-08-04 08:09:33 +01:00
Chris Wilson	b0decaf75b	drm/i915: Track active vma requests Hook the vma itself into the i915_gem_request_retire() so that we can accurately track when a solitary vma is inactive (as opposed to having to wait for the entire object to be idle). This improves the interaction when using multiple contexts (with full-ppgtt) and eliminates some frequent list walking when retiring objects after a completed request. A side-effect is that we get an active vma reference for free. The consequence of this is shown in the next patch... v2: Update inline names to be consistent with i915_gem_object_get_active() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-25-git-send-email-chris@chris-wilson.co.uk	2016-08-04 08:09:32 +01:00
Chris Wilson	2bfa996e03	drm/i915: Store owning file on the i915_address_space For the global GTT (and aliasing GTT), the address space is owned by the device (it is a global resource) and so the per-file owner field is NULL. For per-process GTT (where we create an address space per context), each is owned by the opening file. We can use this ownership information to both distinguish GGTT and ppGTT address spaces, as well as occasionally inspect the owner. v2: Whitespace, tells us who owns i915_address_space Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-6-git-send-email-chris@chris-wilson.co.uk	2016-08-04 08:09:17 +01:00
Chris Wilson	34c998b4eb	drm/i915: Rearrange GGTT probing to avoid needing a vfunc Since we have a static if-else-chain for device probing of the global GTT, we do not need to use a function pointer, let alone store it when we never use it again. So use the if-else-chain to call down into the device specific probe. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-5-git-send-email-chris@chris-wilson.co.uk	2016-08-04 08:09:16 +01:00
Chris Wilson	f6b9d5cabd	drm/i915: Split early global GTT initialisation Initialising the global GTT is tricky as we wish to use the drm_mm range manager during the modesetting initialisation (to capture stolen allocations from the BIOS) before we actually enable GEM. To overcome this, we currently setup the drm_mm first and then carefully rebind them. v2: Fixup after rebasing v3: GGTT initialisation needs to be split around kicking out conflicts v4: Restore an old UMS BUG_ON(mappable > total) as a DRM_ERROR plus fixup of probe results. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-4-git-send-email-chris@chris-wilson.co.uk	2016-08-04 08:09:15 +01:00
Chris Wilson	97d6d7ab68	drm/i915: Update GGTT initialisation functions to take drm_i915_private Since these are internal functions they operate on drm_i915_private and not the drm_device being passed in. So pass in the drm_i915_private instead, and remove one layer of dancing. No space wins here, just conforming to the norm in function parameters. v2: Include all the probe functions Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-3-git-send-email-chris@chris-wilson.co.uk	2016-08-04 08:09:14 +01:00
Chris Wilson	0088e522dd	drm/i915: Split GGTT initialisation between probing and setup In order to handle conflicting drivers (i.e. vgacon) having a different setup of hardware, we have to remove those other drivers before we try to setup our own mappings. This requires us to split GGTT initialisation between probing for the hardware location (part of the PCI BAR) and later establishing the kernel resources for it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-2-git-send-email-chris@chris-wilson.co.uk	2016-08-04 08:09:13 +01:00
Chris Wilson	406ea8d22f	drm/i915: Treat ringbuffer writes as write to normal memory Ringbuffers are now being written to either through LLC or WC paths, so treating them as simply iomem is no longer adequate. However, for the older !llc hardware, the hardware is documentated as treating the TAIL register update as serialising, so we can relax the barriers when filling the rings (but even if it were not, it is still an uncached register write and so serialising anyway.). For simplicity, let's ignore the iomem annotation. v2: Remove iomem from ringbuffer->virtual_address v3: And for good measure add iomem elsewhere to keep sparse happy Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v2 Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-8-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-7-git-send-email-chris@chris-wilson.co.uk	2016-07-20 13:40:14 +01:00
Dave Gordon	731f74c5c6	drm/i915: tweak gen6_for_{each_pde, all_pdes} macros Gen8 versions of these macros were updated a few months ago (`e8ebd8e` drm/i915: eliminate 'temp' in gen8_for_each macros) originally because at least one iterator could generate an out of bounds access, but also because eliminating the 'temp' parameter generated smaller and faster code. Matthew Auld recently noticed the same problem with the gen6 versions and provided a patch https://lists.freedesktop.org/archives/intel-gfx/2016-June/099334.html but while we're changing these, we might as well make them as much like the gen8 versions as possible, including the style of using "&& (..., true)" rather than ": (..., 1) : 0", and of course eliminating the redundant 'temp'. Furthermore, the "all_pdes" version is only used in one place, so we can improve code efficiency by changing both the macro parameters and the calling code to reduce extra dereferences. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1466793466-23500-1-git-send-email-david.s.gordon@intel.com	2016-06-27 13:13:53 +01:00
Chris Wilson	d6473f5664	drm/i915: Add support for mapping an object page by page Introduced a new vm specfic callback insert_page() to program a single pte in ggtt or ppgtt. This allows us to map a single page in to the mappable aperture space. This can be iterated over to access the whole object by using space as meagre as page size. v2: Added low level rpm assertions to insert_page routines (Chris) v3: Added POSTING_READ post register write (Tvrtko) v4: Rebase (Ankit) v5: Removed wmb() and FLUSH_CTL from insert_page, caller to take care of it (Chris) v6: insert_page not working correctly without FLSH_CNTL write, added the write again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2016-06-13 10:03:54 +01:00
Chris Wilson	dc97997a21	drm/i915: Use drm_i915_private as the native pointer for intel_uncore.c Pass drm_i915_private to the uncore init/fini routines and their subservients as it is their native type. text data bss dec hex filename 6309978 3578778 696320 10585076 a183f4 vmlinux 6309530 3578778 696320 10584628 a18234 vmlinux a modest 400 bytes of saving, but 60 lines of code deleted! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462885804-26750-1-git-send-email-chris@chris-wilson.co.uk	2016-05-10 17:20:20 +01:00
Ville Syrjälä	ac840ae535	drm/i915: Re-enable GGTT earlier during resume on pre-gen6 platforms Move the intel_enable_gtt() call to happen before we touch the GTT during resume. Right now it's done way too late. Before commit `ebb7c78d35` ("agp/intel-gtt: Only register fake agp driver for gen1") it was actually done earlier on account of also getting called from the resume hook of the fake agp driver. With the fake agp driver no longer getting registered we must move the call up. The symptoms I've seen on my 830 machine include lowmem corruption, other kinds of memory corruption, and straight up hung machine during or just after resume. Not really sure what causes the memory corruption, but so far I've not seen any with this fix. I think we shouldn't really need to call this during init, but we have been doing that so I've decided to keep the call. However moving that call earlier could be prudent as well. Doing it right after the intel-gtt probe seems appropriate. Also tested this on 946gz,elk,ilk and all seemed quite happy with this change. v2: Reorder init_hw vs. enable_hw functions (Chris) Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: drm-intel-fixes@lists.freedesktop.org Fixes: `ebb7c78d35` ("agp/intel-gtt: Only register fake agp driver for gen1") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462559755-353-1-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-05-10 12:29:33 +03:00
Chris Wilson	cba6dba4e5	drm/i915: Unexport i915_ppgtt_init() As i915_ppgtt_init() is not used outside of i915_gem_gtt.c we can make it static. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1462443767-5194-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2016-05-05 14:16:39 +01:00
Chris Wilson	f9326be5f1	drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use The code to switch_mm() is already handled by i915_switch_context(), the only difference required to setup the aliasing ppgtt is that we need to emit te switch_mm() on the first context, i.e. when transitioning from engine->last_context == NULL. This allows us to defer the initialisation of the GPU from early device initialisation to first use, which should marginally speed up both. The caveat is that we then defer the context initialisation until first use - i.e. we cannot assume that the GPU engines are initialised. For example, this means that power contexts for rc6 (Ironlake) need to explicitly loaded, as they are. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-11-git-send-email-chris@chris-wilson.co.uk	2016-04-28 12:17:32 +01:00
Chris Wilson	8ef8561f2c	drm/i915: Move ioremap_wc tracking onto VMA By tracking the iomapping on the VMA itself, we can share that area between multiple users. Also by only revoking the iomapping upon unbinding from the mappable portion of the GGTT, we can keep that iomap across multiple invocations (e.g. execlists context pinning). Note that by moving the iounnmap tracking to the VMA, we actually end up fixing a leak of the iomapping in intel_fbdev. v1.5: Rebase prompted by Tvrtko v2: Drop dev_priv parameter, we can recover the i915_ggtt from the vma. v3: Move handling of ioremap space exhaustion to vmap_purge and also allow vmallocs to recover old iomaps. Add Tvrtko's kerneldoc. v4: Fix a use-after-free in shrinker and rearrange i915_vma_iomap v5: Back to i915_vm_to_ggtt v6: Use i915_vma_pin_iomap and i915_vma_unpin_iomap to mark critical sections and ensure the VMA cannot be reaped whilst mapped. v7: Move i915_vma_iounmap so that consumers of the API are not tempted, and add iomem annotations Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-5-git-send-email-chris@chris-wilson.co.uk	2016-04-28 12:17:32 +01:00
Joonas Lahtinen	72e96d6450	drm/i915: Refer to GGTT {,VM} consistently Refer to the GGTT VM consistently as "ggtt->base" instead of just "ggtt", "vm" or indirectly through other variables like "dev_priv->ggtt.base" to avoid confusion with the i915_ggtt object itself and PPGTT VMs. Refer to the GGTT as "ggtt" instead of indirectly through chaining. As a bonus gets rid of the long-standing i915_obj_to_ggtt vs. i915_gem_obj_to_ggtt conflict, due to removal of i915_obj_to_ggtt! v2: - Added some more after grepping sources with Chris v3: - Refer to GGTT VM through ggtt->base consistently instead of ggtt_vm (Chris) v4: - Convert all dev_priv->ggtt->foo accesses to ggtt->foo. v5: - Make patch checker happy Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-31 17:55:43 +03:00
Joonas Lahtinen	d85489d314	drm/i915: Rename GGTT init functions Rename and document the GGTT init functions to give a better idea of the context where they are called from. i915_gem_gtt_init => i915_ggtt_init_hw i915_gem_init_global_gtt => i915_gem_init_ggtt i915_global_gtt_cleanup => i915_ggtt_cleanup_hw Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1458830866-12578-1-git-send-email-joonas.lahtinen@linux.intel.com	2016-03-30 13:47:18 +03:00
Joonas Lahtinen	d507d73578	drm/i915/gtt: Clean up GGTT probing code Use less pointers with the probing code, making it much less confusing to read. Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-03-18 15:18:15 +02:00
Joonas Lahtinen	62106b4f6b	drm/i915: Rename dev_priv->gtt to dev_priv->ggtt Refer to Global GTT consistently as GGTT, thus rename dev_priv->gtt to dev_priv->ggtt and struct i915_gtt to struct i915_ggtt. Fix a couple of whitespace problems while at it. v2: - Fix a typo in commit message. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-03-18 15:18:15 +02:00
Ville Syrjälä	1663b9d6a2	drm/i915: Reorganize intel_rotation_info Throw out a bunch of unnecessary stuff from struct intel_rotation_info, and pull most of the remaining stuff to live under an array of per-color plane sub-structures. What still remains outside the sub-structure will be reorgranized later as well, but that requires more work elsewhere so leave it be for now. v2: Split the vma size == luma+chroma size fix to prep patch (Daniel) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1) Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-8-git-send-email-ville.syrjala@linux.intel.com	2016-03-01 12:48:09 +02:00
Chris Wilson	596c592319	drm/i915: Reduce the pointer dance of i915_is_ggtt() The multiple levels of indirect do nothing but hinder the compiler and the pointer chasing turns to be quite painful but painless to fix. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1456484600-11477-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-02-26 13:15:39 +00:00
Chris Wilson	1c7f4bca5a	drm/i915: Rename vma->_list to _link for consistency Elsewhere we have adopted the convention of using '_link' to denote elements in the list (and '_list' for the actual list_head itself), and that the name should indicate which list the link belongs to (and preferrably not just where the link is being stored). s/vma_link/obj_link/ (we iterate over obj->vma_list) s/mm_list/vm_link/ (we iterate over vm->[in]active_list) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2016-02-26 13:15:39 +00:00
Alan	69603dbb31	i915: cast before shifting in i915_pte_count Otherwise a pde_shift big enough to overflow a u32 will be truncated before assignment Note: We never asked for ranges spanning a 4G boundary, so this issue doesn't cause a real problem. Signed-off-by: Alan Cox <alan@linux.intel.com> [danvet: Add note why this isn't a real problem.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20160217142043.4947.60447.stgit@localhost.localdomain	2016-02-17 17:00:38 +01:00
Sagar Arun Kamble	274008e89d	drm/i915/bxt: Check BIOS RC6 setup before enabling RC6 RC6 setup is shared between BIOS and Driver. BIOS sets up subset of RC6 setup registers. If those are not setup Driver should not enable RC6. For implementing this, driver can check RC_CTRL0 and RC_CTRL1 values to know if BIOS has enabled HW/SW RC6. This will also enable user to control RC6 using BIOS settings alone. RC6 related instability can be avoided by disabling via BIOS settings till driver fixes it. v2: Had placed logic in gen8 function by mistake. Fixed it. Ensuring RPM is not enabled in case BIOS disabled RC6. v3: Need to disable RPM if RC6 is disabled due to BIOS settings. (Daniel) Runtime PM enabling happens before gen9_enable_rc6. Moved the updation of enable_rc6 parameter in intel_uncore_sanitize. v4: Added elaborate check for BIOS RC6 setup. Prepared check_pctx for bxt. (Imre) v5: Caching reserved stolen base and size in the driver private data. Reorganized RC6 setup check. Moved from gen9_enable_rc6 to intel_uncore_sanitize. (Imre) v6: Rebasing on the patch submitted by Imre that moves gem_init_stolen earlier in the load. v7: Removed PWRCTX_MAXCNT_VCSUNIT1 check as it applies to SKL. (Imre) v8: Fixed formatting and checkpatch issues. Fixed functional issue where RC6 ctx size check was missing. (Imre) Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1454697809-22113-1-git-send-email-sagar.a.kamble@intel.com	2016-02-05 23:35:43 +02:00
Ville Syrjälä	7723f47dc6	drm/i915: Rename the rotated gtt view member to 'rotated' Also rename 'rotation_info' to 'rotated' to match the view type exactly, this should avoid confusion which union members is valid for each view type. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1453316739-13296-2-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-01-28 20:55:55 +02:00
Tvrtko Ursulin	ca82580c9c	drm/i915: Do not call API requiring struct_mutex where it is not available LRC code was calling GEM API like i915_gem_obj_ggtt_offset from places where the struct_mutex cannot be grabbed (irq handlers). To avoid that this patch caches some interesting bits and values in the engine and context structures. Some usages are also removed where they are not needed like a few asserts which are either impossible or have been checked already during engine initialization. Side benefit is also that interrupt handlers and command submission stop evaluating invariant conditionals, like what Gen we are running on, on every interrupt and every command submitted. This patch deals with logical ring context id and descriptors while subsequent patches will deal with the remaining issues. v2: * Cache the VMA instead of the address. (Chris Wilson) * Incorporate Dave Gordon's good comments and function name. v3: * Extract ctx descriptor template to a function and group functions dealing with ctx descriptor & co together near top of the file. (Dave Gordon) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1452870629-13830-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-01-18 09:58:36 +00:00
Dave Gordon	e8ebd8e2bd	drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros All of these iterator macros require a 'temp' argument, used merely to hold internal partial results. We can instead declare the temporary variable inside the macro, so the caller need not provide it. Some of the old code contained nested iterators that actually reused the same 'temp' variable for both inner and outer instances. It's quite surprising that this didn't introduce bugs! But it does show that the value of 'temp' isn't required to persist during the iterated body. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1449581451-11848-2-git-send-email-david.s.gordon@intel.com	2015-12-10 09:36:42 +01:00
Daniel Vetter	ce7f172856	drm/i915: Fix i915_ggtt_view_equal to handle rotation correctly The rotated view depends upon the rotation paramters, but thus far we didn't bother checking for those. This seems to have been an issue ever since this was introduce in commit `fe14d5f4e5` Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Date: Wed Dec 10 17:27:58 2014 +0000 drm/i915: Infrastructure for supporting different GGTT views per object But userspace is allowed to reuse framebuffer backing storage with different framebuffers with different pixel formats/stride/whatever. And e.g. SNA indeed does this. Hence we must check for all the paramters to match, not just that it's rotated. v2: intel_plane_obj_offset also needs to construct the full view, to avoid fallout since they don't fully match. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1444834266-12689-3-git-send-email-daniel.vetter@ffwll.ch Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-11-19 16:42:06 +01:00
Daniel Vetter	a6d09186fa	drm/i915: Stuff rotation params into view union We don't need 2 separate unions. Note that this was done intentinoally Author: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Date: Wed May 6 14:35:38 2015 +0300 drm/i915: Add a partial GGTT view type on Tvrtko's request, but without a clear justification. Rotated views are also not checking for matching paramters in i915_ggtt_view_equal, which seems like a bug. But this patch here doesn't change that. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1444834266-12689-2-git-send-email-daniel.vetter@ffwll.ch Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-11-19 16:42:01 +01:00
Michel Thierry	24dfd0736c	drm/i915: prevent out of range pt in the PDE macros (take 3) We tried to fix this in commit `fdc454c148` ("drm/i915: Prevent out of range pt in gen6_for_each_pde"). But the static analyzer still complains that, just before we break due to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an iter value that is bigger than I915_PDES. Of course, this isn't really a problem since no one uses pt outside the macro. Still, every single new usage of the macro will create a new issue for us to mark as a false positive. Also, Paulo re-started the discussion a while ago [1], but didn't end up implemented. In order to "solve" this "problem", this patch takes the ideas from Chris and Dave, but that check would change the desired behavior of the code, because the object (for example pdp->page_directory[iter]) can be null during init/alloc, and C would take this as false, breaking the for loop immediately. This has been already verified with "static analysis tools". [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html v2: Make it a single statement, while preventing the common subexpression elimination (Chris) Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-10-06 14:15:29 +02:00
Tvrtko Ursulin	dedf278ce6	drm/i915: Enable querying offset of UV plane with intel_plane_obj_offset v2: Rebase. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-09-23 17:31:29 +02:00
Tvrtko Ursulin	89e3e14276	drm/i915: Support NV12 in rotated GGTT mapping Just adding the rotated UV plane at the end of the rotated Y plane. v2: Rebase. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-09-23 17:28:40 +02:00
Paulo Zanoni	a9da512b3e	drm/i915: avoid the last 8mb of stolen on BDW/SKL The FBC hardware for these platforms doesn't have access to the bios_reserved range, so it always assumes the maximum (8mb) is used. So avoid this range while allocating. This solves a bunch of FIFO underruns that happen if you end up putting the CFB in that memory range. On my machine, with 32mb of stolen, I need a 2560x1440 mode for that. Testcase: igt/kms_frontbuffer_tracking/fbc-* (given the right setup) Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-09-23 14:39:17 +02:00
Michel Thierry	088e0df402	drm/i915/gtt: Allow >= 4GB offsets in X86_32 Similar to commit `c44ef60e43` ("drm/i915/gtt: Allow >= 4GB sizes for vm"), i915_gem_obj_offset and i915_gem_obj_ggtt_offset return an unsigned long, which in only 4-bytes long in 32-bit kernels. Change return type (and other related offset variables) to u64. Since Global GTT is always limited to 4GB, this change would not be required in i915_gem_obj_ggtt_offset, but this is done for consistency. v2: Remove unnecessary offset variable in do_pin, as we already have vma->node.start (Chris). Update GGTT offset too (Tvrtko). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:30 +02:00
Michel Thierry	69ab76fd3d	drm/i915/gen8: Initialize PDPs and PML4 Similar to PDs, while setting up a page directory pointer, make all entries of the pdp point to the scratch pd before mapping (and make all its entries point to the scratch page); this is to be safe in case of out of bound access or proactive prefetch. Also add a scratch pdp, which the PML4 entries point to. v2: Handle scratch_pdp allocation failure correctly, and keep initialize_px functions together (Akash) v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Rely on the added macros to initialize the pdps. v4: Rebase after final merged version of Mika's ppgtt/scratch patches (and removed commit message part related to v3). v5: Update commit message to also mention PML4 table initialization and the new scratch pdp (Akash). Suggested-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:23 +02:00
Michel Thierry	762d99363d	drm/i915/gen8: implement alloc/free for 4lvl PML4 has no special attributes, and there will always be a PML4. So simply initialize it at creation, and destroy it at the end. The code for 4lvl is able to call into the existing 3lvl page table code to handle all of the lower levels. v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the compiler happy. And define ret only in one place. Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl. v3: Use i915_dma_unmap_single instead of pci API. Fix a couple of incorrect checks when unmapping pdp and pd pages (Akash). v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list. v5: Prevent (harmless) out of range access in gen8_for_each_pml4e. v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error paths. (Akash) v7: Rebase, s/gen8_ppgtt_free_/gen8_ppgtt_cleanup_/. v8: Change location of pml4_init/fini. It will make next patches cleaner. v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while trying to reuse as much as possible for pdp alloc. pml4_init/fini replaced by setup/cleanup_px macros. v10: Rebase after Mika's merged ppgtt cleanup patch series. v11: Rebase after final merged version of Mika's ppgtt/scratch patches. v12: Fix pdpe start value in trace (Akash) v13: Define all 4lvl functions in this patch directly, instead of previous patches, add i915_page_directory_pointer_entry_alloc here, use test_bit to detect when pdp is already allocated (Akash). v14: Move pdp allocation into a new gen8_ppgtt_alloc_page_dirpointers funtion, as we do for pds and pts; move pd and pdp setup functions to this patch (Akash). v15: Added kfree(pdp) from previous patch to this (Akash). Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:21 +02:00
Michel Thierry	81ba8aefd0	drm/i915/gen8: Add PML4 structure Introduces the Page Map Level 4 (PML4), ie. the new top level structure of the page tables. To facilitate testing, 48b mode will be available on Broadwell and GEN9+, when i915.enable_ppgtt = 3. v2: Remove unnecessary CONFIG_X86_64 checks, ppgtt code is already 32/64-bit safe (Chris). v3: Add goto free_scratch in temp 48-bit mode init code (Akash). v4: kfree the pdp until the 4lvl alloc/free patch (Akash). v5: Postpone 48-bit code in sanitize_enable_ppgtt (Akash). v6: Keep _insert_pte_entries changes outside this patch (Akash). Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:20 +02:00
Michel Thierry	6ac1850220	drm/i915/gen8: Make pdp allocation more dynamic This transitional patch doesn't do much for the existing code. However, it should make upcoming patches to use the full 48b address space a bit easier. 32-bit ppgtt uses just 4 PDPs, while 48-bit ppgtt will have up-to 512; this patch prepares the existing functions to query the right number of pdps at run-time. This also means that used_pdpes should also be allocated during ppgtt_init, as the bitmap size will depend on the ppgtt address range selected. v2: Renamed pdp_free to be similar to pd/pt (unmap_and_free_pdp). v3: To facilitate testing, 48b mode will be available on Broadwell and GEN9+, when i915.enable_ppgtt = 3. v4: Rebase after s/page_tables/page_table/, added extra information about 4-level page table formats and use IS_ENABLED macro. v5: Check CONFIG_X86_64 instead of CONFIG_64BIT. v6: Rebase after Mika's ppgtt cleanup / scratch merge patch series, and follow his nomenclature in pdp functions (there is no alloc_pdp yet). v7: Rebase after merged version of Mika's ppgtt cleanup patch series. v8: Rebase after final merged version of Mika's ppgtt/scratch patches. v9: Introduce PML4 (and 48-bit checks) until next patch (Akash). v10: Also use test_bit to detect when pd/pt are already allocated (Akash) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> [danvet: Amend commit message as suggested by Michel.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:17 +02:00
Michel Thierry	09120d4e88	drm/i915: Remove unnecessary gen8_clamp_pd gen8_clamp_pd clamps to the next page directory boundary, but the macro gen8_for_each_pde already has a check to stop at the page directory boundary. Furthermore, i915_pte_count also restricts to the next page table boundary. v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Suggested-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: "Akash Goel" <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:16 +02:00
Mika Kuoppala	79ab937054	drm/i915/gtt: Move scratch_pd and scratch_pt into vm struct Scratch page is part of struct i915_address_space. Move other scratch entities into the same struct. This is a preparatory patch for having only one instance of each scratch_pt/pd. v2: make commit msg more readable Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v1) [danvet: Bikeshed summary to avoid confusion with vmas.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-26 11:06:30 +02:00
Mika Kuoppala	c114f76a0a	drm/i915/gtt: Make scratch page i915_page_dma compatible Lay out scratch page structure in similar manner than other paging structures. This allows us to use the same tools for setup and teardown. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-26 10:54:00 +02:00
Mika Kuoppala	567047be2a	drm/i915/gtt: Use macros to access dma mapped pages Make paging structure type agnostic *_px macros to access page dma struct, the backing page and the dma address. This makes the code less cluttered on internals of i915_page_dma. v2: Superfluous const -> nonconst removed v3: Rebased Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-26 10:53:50 +02:00
Mika Kuoppala	44159ddbea	drm/i915/gtt: Introduce struct i915_page_dma All our paging structures have struct page and dma address for that page. Add struct for page/dma address pairs and use it to make the setup and teardown for different paging structures identical. Include the page directory offset also in the struct for legacy gens. Rename it to clearly point out that it is offset into the ggtt. v2: Add comment about ggtt_offset (Michel) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-26 10:51:04 +02:00
Mika Kuoppala	d852c7bf90	drm/i915/gtt: Introduce i915_page_dir_dma_addr The legacy mode mm switch and the execlist context assignment needs dma address for the page directories. Introduce a function that encapsulates the scratch_pd dma fallback if no pd is found. v2: Rebase, s/ring/req Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-26 10:50:52 +02:00
Mika Kuoppala	c44ef60e43	drm/i915/gtt: Allow >= 4GB sizes for vm. We can have exactly 4GB sized ppgtt with 32bit system. size_t is inadequate for this. v2: Convert a lot more places (Daniel) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-26 10:41:13 +02:00
Tvrtko Ursulin	84fe03f7b2	drm/i915: Move rotated geometry calculations into the fill helper This way data is available as soon as the view is passed into the call chain. v2: Store size in bytes instead of pages under the appropriate name. (Chris Wilson) v3: Use uint64_t instead of size_t. (Daniel Vetter) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-24 15:11:05 +02:00
John Harrison	e85b26dc1c	drm/i915: Update switch_mm() to take a request structure Updated the switch_mm() code paths to take a request instead of a ring. This includes the myriad *_mm_switch functions themselves and a bunch of PDP related helper functions. v2: Rebased to newer tree. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:21 +02:00
John Harrison	b3dd6b9681	drm/i915: Update ppgtt_init_ring() & context_enable() to take requests The final step in removing the OLR from i915_gem_init_hw() is to pass the newly allocated request structure in to each step rather than passing a ring structure. This patch updates both i915_ppgtt_init_ring() and i915_gem_context_enable() to take request pointers. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:09 +02:00
John Harrison	4ad2fd888b	drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring The i915_gem_init_hw() function calls a bunch of smaller initialisation functions. Multiple of which have generic sections and per ring sections. This means multiple passes are done over the rings. Each pass writes data to the ring which floats around in that ring's OLR until some random point in the future when an add_request() is done by some random other piece of code. This patch breaks i915_ppgtt_init_hw() in two with the per ring initialisation now being done in i915_ppgtt_init_ring(). The ring looping is now done at the top level in i915_gem_init_hw(). v2: Fix dumb loop variable re-use. For: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tomas Elf <tomas.elf@intel.com> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-23 14:02:06 +02:00
Joonas Lahtinen	8bd7ef1638	drm/i915: Add a partial GGTT view type Partial view type allows manipulating parts of huge BOs through the GGTT, which was not previously possible due to constraint that whole object had to be mapped for any access to it through GGTT. v2: - Retain error value from sg_alloc_table (Tvrtko Ursulin) - Do not zero already zeroed variable (Tvrtko Ursulin) - Use more common variable types for page size/offset (Tvrtko Ursulin) v3: - Only compare additional view parameters when need to (Tvrtko Ursulin) v4: - Do zero out the variable that needs to be (bug introduced in v2). Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-05-08 13:04:18 +02:00
Joonas Lahtinen	91e6711e30	drm/i915: Do not make assumptions on GGTT VMA sizes GGTT VMA sizes might be smaller than the whole object size due to different GGTT views. v2: - Separate GGTT view constraint calculations from normal view constraint calculations (Chris Wilson) v3: - Do not bother with debug wording. (Tvrtko Ursulin) v4: - Clearer logic for calculating map_and_fenceable (Tvrtko Ursulin) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [danvet: Drop BUG_ON, it's redudant.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-05-08 13:04:17 +02:00
Daniel Vetter	70b9f6f832	rm/i915: Move i915_get_ggtt_vma_pages into ggtt_bind_vma We have this neat abstraction between ppgtt and ggtt for (un)bind_vma and didn't end up using it really. What a shame, so fix this and make the ->bind_vma hook a bit more useful. Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2015-04-23 21:07:53 +02:00
Daniel Vetter	f329f5f6eb	drm/i915: Move PTE_READ_ONLY to ->pte_encode vfunc It's only used as a flag there, so unconfuse things a bit. Also separate the bind_vma flag space from the pte_encode flag space in the code. Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-20 08:59:14 -07:00
Daniel Vetter	777dc5bb26	drm/i915: Move vma vfuns to adddress_space They change with the address space and not with each vma, so move them into the right pile of vfuncs. Save 2 pointers per vma and clarifies the code. Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2015-04-20 08:54:29 -07:00
Michel Thierry	33c8819f1b	drm/i915/gen8: begin bitmap tracking Like with gen6/7, we can enable bitmap tracking with all the preallocations to make sure things actually don't blow up. v2: Rebased to match changes from previous patches. v3: Without teardown logic, rely on used_pdpes and used_pdes when freeing page tables. v4: Rebased after s/page_tables/page_table/. v5: Rebased after page table generalizations. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:12 +02:00
Michel Thierry	09942c656b	drm/i915: num_pd_pages/num_pd_entries isn't useful These values are never quite useful for dynamic allocations of the page tables. Getting rid of them will help prevent later confusion. v2: Updated to use unmap_and_free_pd functions. v3: Updated gen8_ppgtt_free after teardown logic was removed. v4: Rebase after s/page_tables/page_table/. v5: Keep allocating all page directories in GEN8+ systems with less than 4GB of memory. Updated gen6_for_all_pdes. v6: Prevent (harmless) out of range access in gen6_for_all_pdes. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:11 +02:00
Michel Thierry	7cb6d7ac63	drm/i915/gen8: Update pdp switch and point unused PDPs to scratch page One important part of this patch is we now write a scratch page directory into any unused PDP descriptors. This matters for 2 reasons, first, we're not allowed to just use 0, or an invalid pointer, and second, we must wipe out any previous contents from the last context. The latter point only matters with full PPGTT. The former point only effect platforms with less than 4GB memory. v2: Updated commit message to point that we must set unused PDPs to the scratch page. v3: Unmap scratch_pd in gen8_ppgtt_free. v4: Initialize scratch_pd. (Mika) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:10 +02:00
Michel Thierry	9271d959dc	drm/i915/gen8: Add dynamic allocation macros and helper functions Similar to gen6, we will use for_each_pde/for_each_pdpe and pte/pde/pdpe_index to iterate over these new structures. v2: Match trace_i915_va_teardown params v3: Multiple rebases. v4: Updated to use unmap_and_free_pt. v5: teardown_va_range logic no longer needed. v6: Rebase after s/page_tables/page_table/. v7: Renamed commit to match what it does now (it was "Use dynamic allocation idioms on free"). v8: Prevent (harmless) out of range access in gen8_for_each_pde and gen8_for_each_pdpe_e. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: s/BUG/WARN/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:09 +02:00
Michel Thierry	ec565b3c15	drm/i915: Remove _entry from PPGTT page structures Lets try to keep this consistent: Page Directory Pointer (PDP). Page Directory (PD), also known as page directory pointer entries. Page Table (PT), also known as page directory entries. s/struct i915_page_table_entry/struct i915_page_table/ s/struct i915_page_directory_entry/struct i915_page_directory/ s/struct i915_page_directory_pointer_entry/struct i915_page_directory_pointer/ Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:07 +02:00
Joonas Lahtinen	9abc464854	drm/i915: Compare GGTT view structs instead of types To allow for views where the view type is not defined by the view type only, like it is in stereo or rotated 90 degree view, change the semantic to require the whole view structure for comparison when we match a GGTT view. This allows including parameters like offset to be included in the view which is useful for eg. partial views. v3: - Rely on ggtt_view type being 0 for non-GGTT vma's, which equals to I915_GGTT_VIEW_NORMAL. (Daniel Vetter) - Do not use potentially slower comparison when we only want to know if something is or is not a normal view. - Rebase on top of rotated view patches. Add rotated view singleton. - If one view is missing in comparison they're equal only if both are missing. v4: - Use comparison helper in obj_to_ggtt_view too. (Tvrtko Ursulin) - Do WARN_ON if one view is NULL. (Tvrtko Ursulin) Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-27 15:05:22 +01:00
Michel Thierry	4933d51955	drm/i915: Finish gen6/7 dynamic page table allocation This patch continues on the idea from "Track GEN6 page table usage". From here on, in the steady state, PDEs are all pointing to the scratch page table (as recommended in the spec). When an object is allocated in the VA range, the code will determine if we need to allocate a page for the page table. Similarly when the object is destroyed, we will remove, and free the page table pointing the PDE back to the scratch page. Following patches will work to unify the code a bit as we bring in GEN8 support. GEN6 and GEN8 are different enough that I had a hard time to get to this point with as much common code as I do. The aliasing PPGTT must pre-allocate all of the page tables. There are a few reasons for this. Two trivial ones: aliasing ppgtt goes through the ggtt paths, so it's hard to maintain, we currently do not restore the default context (assuming the previous force reload is indeed necessary). Most importantly though, the only way (it seems from empirical evidence) to invalidate the CS TLBs on non-render ring is to either use ring sync (which requires actually stopping the rings in order to synchronize when the sync completes vs. where you are in execution), or to reload DCLV. Since without full PPGTT we do not ever reload the DCLV register, there is no good way to achieve this. The simplest solution is just to not support dynamic page table creation/destruction in the aliasing PPGTT. We could always reload DCLV, but this seems like quite a bit of excess overhead only to save at most 2MB-4k of memory for the aliasing PPGTT page tables. v2: Make the page table bitmap declared inside the function (Chris) Simplify the way scratching address space works. Move the alloc/teardown tracepoints up a level in the call stack so that both all implementations get the trace. v3: Updated trace event to spit out a name v4: Aliasing ppgtt is now initialized differently (in setup global gtt) v5: Rebase to latest code. Also removed unnecessary aliasing ppgtt check for trace, as it is no longer possible after the PPGTT cleanup patch series of a couple of months ago (Daniel). v6: Implement changes from code review (Daniel): - allocate/teardown_va_range calls added. - Add a scratch page allocation helper (only need the address). - Move trace events to a new patch. - Use updated mark_tlbs_dirty. - Moved pt preallocation for aliasing ppgtt into gen6_ppgtt_init. v7: teardown_va_range removed (Daniel). In init, gen6_ppgtt_clear_range call is only needed for aliasing ppgtt. v8: Rebase after s/page_tables/page_table/. v9: Remove unnecessary scratch flag in page_table struct, future patches can just compare against ppgtt->scratch_pt, and alloc_pt_scratch becomes redundant. Initialize scratch_pt and pt. (Mika) v10: Clean up aliasing ppgtt init error path and prevent leaking the ppgtt obj when init fails. (Mika) Updated commit author. (Daniel) Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v4+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-27 09:25:26 +01:00
Michel Thierry	fdc454c148	drm/i915: Prevent out of range pt in gen6_for_each_pde Found by static analysis tool, this was harmless as the pt was not used out of scope though. Introduced by commit `678d96fbb3` ("drm/i915: Track GEN6 page table usage"). Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-27 09:24:55 +01:00
Tvrtko Ursulin	50470bb011	drm/i915/skl: Support secondary (rotated) frame buffer mapping 90/270 rotated scanout needs a rotated GTT view of the framebuffer. This is put in a separate VMA with a dedicated ggtt view and wired such that it is created when a framebuffer is pinned to a 90/270 rotated plane. Rotation is only possible with Yb/Yf buffers and error is propagated to user space in case of a mismatch. Special rotated page view is constructed at the VMA creation time by borrowing the DMA addresses from obj->pages. v2: * Do not bother with pages for rotated sg list, just populate the DMA addresses. (Daniel Vetter) * Checkpatch cleanup. v3: * Rebased on top of new plane handling (create rotated mapping when setting the rotation property). * Unpin rotated VMA on unpinning from display plane. * Simplify rotation check using bitwise AND. (Chris Wilson) v4: * Fix unpinning of optional rotated mapping so it is really considered to be optional. v5: * Rebased for fb modifier changes. * Rebased for atomic commit. * Only pin needed view for display. (Ville Syrjälä, Daniel Vetter) v6: * Rebased after preparatory work has been extracted out. (Daniel Vetter) v7: * Slightly simplified tiling geometry calculation. * Moved rotated GGTT view implementation into i915_gem_gtt.c (Daniel Vetter) v8: * Do not use i915_gem_obj_size to get object size since that actually returns the size of an VMA which may not exist. * Rebased for ggtt view changes. v9: * Rebased after code review changes on the preceding patches. * Tidy function definitions. (Joonas Lahtinen) For: VIZ-4726 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v4) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-23 15:06:31 +01:00
Ben Widawsky	563222a745	drm/i915: Track page table reload need This patch was formerly known as, "Force pd restore when PDEs change, gen6-7." I had to change the name because it is needed for GEN8 too. The real issue this is trying to solve is when a new object is mapped into the current address space. The GPU does not snoop the new mapping so we must do the gen specific action to reload the page tables. GEN8 and GEN7 do differ in the way they load page tables for the RCS. GEN8 does so with the context restore, while GEN7 requires the proper load commands in the command streamer. Non-render is similar for both. Caveat for GEN7 The docs say you cannot change the PDEs of a currently running context. We never map new PDEs of a running context, and expect them to be present - so I think this is okay. (We can unmap, but this should also be okay since we only unmap unreferenced objects that the GPU shouldn't be tryingto va->pa xlate.) The MI_SET_CONTEXT command does have a flag to signal that even if the context is the same, force a reload. It's unclear exactly what this does, but I have a hunch it's the right thing to do. The logic assumes that we always emit a context switch after mapping new PDEs, and before we submit a batch. This is the case today, and has been the case since the inception of hardware contexts. A note in the comment let's the user know. It's not just for gen8. If the current context has mappings change, we need a context reload to switch v2: Rebased after ppgtt clean up patches. Split the warning for aliasing and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt is always null. v3: Invalidate PPGTT TLBs inside alloc_va_range. v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when neither ctx->ppgtt and aliasing_ppgtt exist. v5: Removed references to teardown_va_range. v6: Updated needs_pd_load_pre/post. v7: Fix pd_dirty_rings check in needs_pd_load_post, and update/move comment about updated PDEs to object_pin/bind (Mika). Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-20 11:48:18 +01:00
Ben Widawsky	678d96fbb3	drm/i915: Track GEN6 page table usage Instead of implementing the full tracking + dynamic allocation, this patch does a bit less than half of the work, by tracking and warning on unexpected conditions. The tracking itself follows which PTEs within a page table are currently being used for objects. The next patch will modify this to actually allocate the page tables only when necessary. With the current patch there isn't much in the way of making a gen agnostic range allocation function. However, in the next patch we'll add more specificity which makes having separate functions a bit easier to manage. One important change introduced here is that DMA mappings are created/destroyed at the same page directories/tables are allocated/deallocated. Notice that aliasing PPGTT is not managed here. The patch which actually begins dynamic allocation/teardown explains the reasoning for this. v2: s/pdp.page_directory/pdp.page_directories Make a scratch page allocation helper v3: Rebase and expand commit message. v4: Allocate required pagetables only when it is needed, _bind_to_vm instead of bind_vma (Daniel). v5: Rebased to remove the unnecessary noise in the diff, also: - PDE mask is GEN agnostic, renamed GEN6_PDE_MASK to I915_PDE_MASK. - Removed unnecessary checks in gen6_alloc_va_range. - Changed map/unmap_px_single macros to use dma functions directly and be part of a static inline function instead. - Moved drm_device plumbing through page tables operation to its own patch. - Moved allocate/teardown_va_range calls until they are fully implemented (in subsequent patch). - Merged pt and scratch_pt unmap_and_free path. - Moved scratch page allocator helper to the patch that will use it. v6: Reduce complexity by not tearing down pagetables dynamically, the same can be achieved while freeing empty vms. (Daniel) v7: s/i915_dma_map_px_single/i915_dma_map_single s/gen6_write_pdes/gen6_write_pde Prevent a NULL case when only GGTT is available. (Mika) v8: Rebased after s/page_tables/page_table/. v9: Reworked i915_pte_index and i915_pte_count. Also exercise bitmap allocation here (gen6_alloc_va_range) and fix incorrect write_page_range in i915_gem_restore_gtt_mappings (Mika). Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-20 11:48:18 +01:00
Michel Thierry	07749ef32c	drm/i915: page table generalizations No functional changes, but will improve code clarity and removed some duplicated defines. Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-20 11:48:17 +01:00
Ben Widawsky	06fda602db	drm/i915: Create page table allocators As we move toward dynamic page table allocation, it becomes much easier to manage our data structures if break do things less coarsely by breaking up all of our actions into individual tasks. This makes the code easier to write, read, and verify. Aside from the dissection of the allocation functions, the patch statically allocates the page table structures without a page directory. This remains the same for all platforms, The patch itself should not have much functional difference. The primary noticeable difference is the fact that page tables are no longer allocated, but rather statically declared as part of the page directory. This has non-zero overhead, but things gain additional complexity as a result. This patch exists for a few reasons: 1. Splitting out the functions allows easily combining GEN6 and GEN8 code. Page tables have no difference based on GEN8. As we'll see in a future patch when we add the DMA mappings to the allocations, it requires only one small change to make work, and error handling should just fall into place. 2. Unless we always want to allocate all page tables under a given PDE, we'll have to eventually break this up into an array of pointers (or pointer to pointer). 3. Having the discrete functions is easier to review, and understand. All allocations and frees now take place in just a couple of locations. Reviewing, and catching leaks should be easy. 4. Less important: the GFP flags are confined to one location, which makes playing around with such things trivial. v2: Updated commit message to explain why this patch exists v3: For lrc, s/pdp.page_directory[i].daddr/pdp.page_directory[i]->daddr/ v4: Renamed free_pt/pd_single functions to unmap_and_free_pt/pd (Daniel) v5: Added additional safety checks in gen8 clear/free/unmap. v6: Use WARN_ON and return -EINVAL in alloc_pt_range (Mika). v7: Make err_out loop symmetrical to the way we allocate in alloc_pt_range. Also s/page_tables/page_table and correct commit message (Mika) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-02-25 16:53:43 +01:00
Ben Widawsky	7324cc0491	drm/i915: Complete page table structures Move the remaining members over to the new page table structures. This can be squashed with the previous commit if desire. The reasoning is the same as that patch. I simply felt it is easier to review if split. v2: In lrc: s/ppgtt->pd_dma_addr[i]/ppgtt->pdp.page_directory[i].daddr/ v3: Rebase. v4: Rebased after s/page_tables/page_table/. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-02-25 16:53:07 +01:00
Ben Widawsky	d7b3de9121	drm/i915: page table abstractions When we move to dynamic page allocation, keeping page_directory and pagetabs as separate structures will help to break actions into simpler tasks. To help transition the code nicely there is some wasted space in gen6/7. This will be ameliorated shortly. Following the x86 pagetable terminology: PDPE = struct i915_page_directory_pointer_entry. PDE = struct i915_page_directory_entry [page_directory]. PTE = struct i915_page_table_entry [page_tables]. v2: fixed mismatches after clean-up/rebase. v3: Clarify the names of the multiple levels of page tables (Daniel) v4: Addressing Mika's review comments. s/gen8_free_page_directories/gen8_free_page_directory and free the page tables for the directory there. In gen8_ppgtt_allocate_page_directories, do not leak previously allocated pt in case the page_directory alloc fails. Update error return handling in gen8_ppgtt_alloc. v5: Do not leak pt on error in gen6_ppgtt_allocate_page_tables. (Mika) v6: s/page_tables/page_table/. (Mika) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-02-25 16:52:34 +01:00
Ben Widawsky	17d5538d54	drm/i915/gen8: Un-hardcode number of page directories Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-02-13 23:28:12 +01:00
Ben Widawsky	766436004b	drm/i915: Rename to GEN8_LEGACY_PDPES In gen8, 32b PPGTT has always had one "pdp" (it doesn't actually have one, but it resembles having one). The #define was confusing as is, and using "PDPE" is a much better description. sed -i 's/GEN8_LEGACY_PDPS/GEN8_LEGACY_PDPES/' drivers/gpu/drm/i915/*.[ch] It also matches the x86 pagetable terminology: PTE = Page Table Entry - pagetable level 1 page PDE = Page Directory Entry - pagetable level 2 page PDPE = Page Directory Pointer Entry - pagetable level 3 page And in the near future (for 48b addressing): PML4E = Page Map Level 4 Entry v2: Expanded information about Page Directory/Table nomenclature. Cc: Daniel Vetter <daniel@ffwll.ch> CC: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-02-13 23:28:11 +01:00
Tvrtko Ursulin	fe14d5f4e5	drm/i915: Infrastructure for supporting different GGTT views per object Things like reliable GGTT mappings and mirrored 2d-on-3d display will need to map objects into the same address space multiple times. Added a GGTT view concept and linked it with the VMA to distinguish between multiple instances per address space. New objects and GEM functions which do not take this new view as a parameter assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the previous behaviour. This now means that objects can have multiple VMA entries so the code which assumed there will only be one also had to be modified. Alternative GGTT views are supposed to borrow DMA addresses from obj->pages which is DMA mapped on first VMA instantiation and unmapped on the last one going away. v2: * Removed per view special casing in i915_gem_ggtt_prepare / finish_object in favour of creating and destroying DMA mappings on first VMA instantiation and last VMA destruction. (Daniel Vetter) * Simplified i915_vma_unbind which does not need to count the GGTT views. (Daniel Vetter) * Also moved obj->map_and_fenceable reset under the same check. * Checkpatch cleanups. v3: * Only retire objects once the last VMA is unbound. v4: * Keep scatter-gather table for alternative views persistent for the lifetime of the VMA. * Propagate binding errors to callers and handle appropriately. v5: * Explicitly look for normal GGTT view in i915_gem_obj_bound to align usage in i915_gem_object_ggtt_unpin. (Michel Thierry) * Change to single if statement in i915_gem_obj_to_ggtt. (Michel Thierry) * Removed stray semi-colon in i915_gem_object_set_cache_level. For: VIZ-4544 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Michel Thierry <michel.thierry@intel.com> [danvet: Drop hunk from i915_gem_shrink since it's just prettification but upsets a __must_check warning.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-12-15 11:25:04 +01:00
John Harrison	97b2a6a10a	drm/i915: Replace last_[rwf]_seqno with last_[rwf]_req The object structure contains the last read, write and fenced seqno values for use in syncrhonisation operations. These have now been replaced with their request structure counterparts. Note that to ensure that objects do not end up with dangling pointers, the assignments of last_*_req include reference count updates. Thus a request cannot be freed if an object is still hanging on to it for any reason. v2: Corrected 'last_rendering_' to 'last_read_' in a number of comments that did not get updated when 'last_rendering_seqno' became 'last_read\|write_seqno' several millenia ago. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-12-03 09:35:14 +01:00
Daniel Vetter	4feb765943	drm/i915: Remove user pinning code Now unused. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2014-12-03 09:35:11 +01:00
Daniel Vetter	f548c0e9d4	drm/i915: Can i915_gem_init_ioctl Found one more! With this we can clear up the ggtt init code a bit, yay! Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2014-11-20 13:03:31 +01:00
Tvrtko Ursulin	aff437667b	drm/i915: Move flags describing VMA mappings into the VMA If these flags are on the object level it will be more difficult to allow for multiple VMAs per object. v2: Simplification and cleanup after code review comments (Chris Wilson). Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-11-04 14:04:51 +01:00
McAulay, Alistair	6689c167ae	drm/i915: Rework GPU reset sequence to match driver load & thaw This patch is to address Daniels concerns over different code during reset: http://lists.freedesktop.org/archives/intel-gfx/2014-June/047758.html "The reason for aiming as hard as possible to use the exact same code for driver load, gpu reset and runtime pm/system resume is that we've simply seen too many bugs due to slight variations and unintended omissions." Tested using igt drv_hangman. V2: Cleaner way of preventing check_wedge returning -EAGAIN V3: Clean the last_context during reset, to ensure do_switch() does the MI_SET_CONTEXT. As per review. Signed-off-by: McAulay, Alistair <alistair.mcaulay@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Rebase over ctx->ppgtt rework and extend the comment in check_wedge a bit.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 10:54:09 +02:00
Daniel Vetter	90d0a0e8d0	drm/i915: Extract commmon global gtt cleanup code We want to move the aliasing ppgtt cleanup back into the global gtt cleanup code for symmetry, but first we need to create such a place. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-13 14:23:35 +02:00
Daniel Vetter	82460d9724	drm/i915: Rework ppgtt init to no require an aliasing ppgtt Currently we abuse the aliasing ppgtt to set up the ppgtt support in general. Which is a bit backwards since with full ppgtt we don't ever need the aliasing ppgtt. So untangle this and separate the ppgtt init from the aliasing ppgtt. While at it drag it out of the context enabling (which just does a switch to the default context). Note that we still have the differentiation between synchronous and asynchronous ppgtt setup, but that will soon vanish. So also correctly wire up the return value handling to be prepared for when ->switch_mm drops the synchronous parameter and could start to fail. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-13 14:23:31 +02:00
Daniel Vetter	6c5566a82c	drm/i915: Allow i915_gem_setup_global_gtt to fail We already needs this just as a safety check in case the preallocation reservation dance fails. But we definitely need this to be able to move tha aliasing ppgtt setup back out of the context code to this place, where it belongs. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-13 14:23:30 +02:00
Daniel Vetter	4d884705da	drm/i915: Track file_priv, not ctx in the ppgtt structure Hardware contexts reference a ppgtt, not the other way round. And the only user of this (in debugfs) actually only cares about which file the ppgtt is associated with. So give it what it wants. While at it give the ppgtt create function a proper name&place. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-13 14:23:28 +02:00
Daniel Vetter	ee960be7bb	drm/i915: Some cleanups for the ppgtt lifetime handling So when reviewing Michel's patch I've noticed a few things and cleaned them up: - The early checks in ppgtt_release are now redundant: The inactive list should always be empty now, so we can ditch these checks. Even for the aliasing ppgtt (though that's a different confusion) since we tear that down after all the objects are gone. - The ppgtt handling functions are splattered all over. Consolidate them in i915_gem_gtt.c, give them OCD prefixes and add wrappers for get/put. - There was a bit a confusion in ppgtt_release about whether it cares about the active or inactive list. It should care about them both, so augment the WARNINGs to check for both. There's still create_vm_for_ctx left to do, put that is blocked on the removal of ppgtt->ctx. Once that's done we can rename it to i915_ppgtt_create and move it to its siblings for handling ppgtts. v2: Move the ppgtt checks into the inline get/put functions as suggested by Chris. v3: Inline the now redundant ppgtt local variable. Cc: Michel Thierry <michel.thierry@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-12 15:24:04 +02:00
Jesse Barnes	692ef70c01	drm/i915: clean up PPGTT checking logic sanitize_enable_ppgtt is the function that checks all the conditions, honoring a forced ppgtt status or doing auto-detect as necessary. Just make sure it returns the right value in all cases and use that in the macros instead of the confusing intel_enable_ppgtt() function. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [danvet: Don't reenable full ppgtt through the backdoor.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-08 17:43:58 +02:00
Akash Goel	24f3a8cf77	drm/i915: Added write-enable pte bit supportt This adds support for a write-enable bit in the entry of GTT. This is handled via a read-only flag in the GEM buffer object which is then used to see how to set the bit when writing the GTT entries. Currently by default the Batch buffer & Ring buffers are marked as read only. v2: Moved the pte override code for read-only bit to 'byt_pte_encode'. (Chris) Fixed the issue of leaving 'gt_old_ro' as unused. (Chris) v3: Removed the 'gt_old_ro' field, now setting RO bit only for Ring Buffers(Daniel). v4: Added a new 'flags' parameter to all the pte(gen6) encode & insert_entries functions, in lieu of overloading the cache_level enum (Daniel). v5: Removed the superfluous VLV check & changed the definition location of PTE_READ_ONLY flag (Imre) Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-06-17 09:21:47 +02:00
Oscar Mateo	273497e5cd	drm/i915: s/i915_hw_context/intel_context Up until now, contexts had one (and only one) backing object that was used by the hardware to save/restore render ring contexts (via the MI_SET_CONTEXT command). Other rings did not have or need this, so our i915_hw_context struct had a 1:1 relationship with a a real HW context. With Logical Ring Contexts and Execlists, this is not possible anymore: all rings need a backing object, and it cannot be reused. To prepare for that, rename our contexts to the more generic term intel_context. No functional changes. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 23:41:17 +02:00
Oscar Mateo	a4872ba6d0	drm/i915: s/intel_ring_buffer/intel_engine_cs In the upcoming patches we plan to break the correlation between engine command streamers (a.k.a. rings) and ringbuffers, so it makes sense to refactor the code and make the change obvious. No functional changes. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 23:01:05 +02:00
Ville Syrjälä	ee0ce4784a	drm/i915/chv: PPAT setup for Cherryview Ignore the cache bits in PPAT and just set the snoop bit where appropriate. BDW WB is mapped to snooped access, while all other modes are mapped to non-snooped access. The hardware supposedly ignores everything except the snoop bit in the PPAT entries. Additionally the hardware actually enforces snooping for all page table accesses, and thus the snoop bit is ignored for PDEs. v2: Rebased on top of the bdw resume fix to reload the ppat entries. v3: Rebase on top of the i915_gem_gtt.h header extraction. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1) Acked-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-06 18:29:34 +02:00
Ben Widawsky	0260c42003	drm/i915: Split out GTT specific header file This file contains all necessary defines, prototypes and typesdefs for manipulating GEN graphics address translation (this does not include the legacy AGP driver) Reiterating the comment in the header, "Please try to maintain the following order within this file unless it makes sense to do otherwise. From top to bottom: 1. typedefs 2. #defines, and macros 3. structure definitions 4. function prototypes Within each section, please try to order by generation in ascending order, from top to bottom (ie. GEN6 on the top, GEN8 on the bottom)." I've made some minor cleanups, and fixed a couple of typos while here - but there should be no functional changes. The purpose of the patch is to reduce clutter in our main header file, making room for new growth, and make documentation of our interfaces easier by splitting things out. With a little more work, like making i915_gtt a pointer, we could potentially completely isolate this header from i915_drv.h. At the moment however, I don't think it's worth the effort. Personally, I would have liked to put the PTE encoding functions in this file too, but I didn't want to rock the boat too much. A similar patch has been in use on my machine for some time. This exact patch though has only been compile tested. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-01 22:58:07 +02:00

1 2 3 4 5

239 Commits