OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
John Harrison	74328ee510	drm/i915: Convert trace functions from seqno to request All the code above is now using requests not seqnos so it is possible to convert the trace functions across. Note that rather than get into problematic reference counting issues, the trace code only saves the seqno and ring values from the request structure not the structure pointer itself. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-12-03 09:35:21 +01:00
John Harrison	9400ae5c82	drm/i915: Remove obsolete seqno parameter from 'i915_add_request' There is no longer any need to retrieve a seqno value from an i915_add_request() call. The calling code already knows which request structure is being processed (it can only be ring->OLR). And as the request itself is now used in preference to the basic seqno value, the latter is now redundant in this situation. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-12-03 09:35:19 +01:00
John Harrison	6259cead57	drm/i915: Remove 'outstanding_lazy_seqno' The OLS value is now obsolete. Exactly the same value is guarateed to be always available as PLR->seqno. Thus it is safe to remove the OLS completely. And also to rename the PLR to OLR to keep the 'outstanding lazy ...' naming convention valid. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-12-03 09:35:16 +01:00
John Harrison	97b2a6a10a	drm/i915: Replace last_[rwf]_seqno with last_[rwf]_req The object structure contains the last read, write and fenced seqno values for use in syncrhonisation operations. These have now been replaced with their request structure counterparts. Note that to ensure that objects do not end up with dangling pointers, the assignments of last_*_req include reference count updates. Thus a request cannot be freed if an object is still hanging on to it for any reason. v2: Corrected 'last_rendering_' to 'last_read_' in a number of comments that did not get updated when 'last_rendering_seqno' became 'last_read\|write_seqno' several millenia ago. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-12-03 09:35:14 +01:00
Dave Airlie	26045b53c9	Merge tag 'drm-intel-next-2014-11-21-fixed' of git://anongit.freedesktop.org/drm-intel into drm-next drm-intel-next-2014-11-21: - infoframe tracking (for fastboot) from Jesse - start of the dri1/ums support removal - vlv forcewake timeout fixes (Imre) - bunch of patches to polish the rps code (Imre) and improve it on bdw (Tom O'Rourke) - on-demand pinning for execlist contexts - vlv/chv backlight improvements (Ville) - gen8+ render ctx w/a work from various people - skl edp programming (Satheeshakrishna et al.) - psr docbook (Rodrigo) - piles of little fixes and improvements all over, as usual * tag 'drm-intel-next-2014-11-21-fixed' of git://anongit.freedesktop.org/drm-intel: (117 commits) drm/i915: Don't pin LRC in GGTT when dumping in debugfs drm/i915: Update DRIVER_DATE to 20141121 drm/i915/g4x: fix g4x infoframe readout drm/i915: Only call mod_timer() if not already pending drm/i915: Don't rely upon encoder->type for infoframe hw state readout drm/i915: remove the IRQs enabled WARN from intel_disable_gt_powersave drm/i915: Use ggtt error obj capture helper for gen8 semaphores drm/i915: vlv: increase timeout when setting idle GPU freq drm/i915: vlv: fix cdclk setting during modeset while suspended drm/i915: Dump hdmi pipe_config state drm/i915: Gen9 shadowed registers drm/i915/skl: Gen9 multi-engine forcewake drm/i915: Read power well status before other registers for drpc info drm/i915: Pin tiled objects for L-shaped configs drm/i915: Update ring freq for full gpu freq range drm/i915: change initial rps frequency for gen8 drm/i915: Keep min freq above floor on HSW/BDW drm/i915: Use efficient frequency for HSW/BDW drm/i915: Can i915_gem_init_ioctl drm/i915: Sanitize ->lastclose ...	2014-12-03 08:25:59 +10:00
Thomas Hellstrom	355a701838	drm/gem: Warn on illegal use of the dumb buffer interface v2 It happens on occasion that developers of generic user-space applications abuse the dumb buffer API to get hold of drm buffers that they can both mmap() and use for GPU acceleration, using the assumptions that dumb buffers and buffers available for GPU are a) The same type and can be aribtrarily type-casted. b) fully coherent. This patch makes the most widely used drivers warn nicely when that happens, the next step will be to fail. v2: Move drmP.h changes to drm_gem.h. Fix Radeon dumb mmap breakage. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-21 12:12:41 +10:00
Daniel Vetter	8725548307	drm/i915: Ditch dev_priv->ums.mm_suspend Again just complicates gem init functions and makes a general mess out of everything. Good riddance! v2: In my enthusiasm to start removing dri1/ums crud I went overboard a bit and killed parts of hangcheck. Resurrect it. Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2014-11-20 13:02:57 +01:00
Chris Wilson	5c6c600354	drm/i915: Remove DRI1 ring accessors and API With the deprecation of UMS, and by association DRI1, we have a tough choice when updating the ring access routines. We either rewrite the DRI1 routines blindly without testing (so likely to be broken) or take the liberty of declaring them no longer supported and remove them entirely. This takes the latter approach. v2: Also remove the DRI1 sarea updates Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Fix rebase conflicts.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-11-19 21:17:11 +01:00
Chris Wilson	c826c44938	drm/i915: Request PIN_GLOBAL when pinning a vma for GTT relocations Always require PIN_GLOBAL when we want a mappable offset (PIN_MAPPABLE). This causes the pin to fixup the global binding in cases were the vma was already bound (and due to the proceeding bug, we considered it to be already mappable). References: https://bugs.freedesktop.org/show_bug.cgi?id=85671 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Add WARN_ON to check that PIN_MAP implies PIN_GLOBAL as discussed on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-11-07 18:42:00 +01:00
Brad Volkin	42c7156af9	drm/i915: Abort command parsing for chained batches libva uses chained batch buffers in a way that the command parser can't generally handle. Fortunately, libva doesn't need to write registers from batch buffers in the way that mesa does, so this patch causes the driver to fall back to non-secure dispatch if the parser detects a chained batch buffer. Note: The 2nd hunk to munge the error code of the parser looks a bit superflous. At least until we have the batch copy code ready and can run the cmd parser in granting mode. But it isn't since we still need to let existing libva buffers pass (though not with elevated privs ofc!). Testcase: igt/gem_exec_parse/chained-batch Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> [danvet: Add note - this confused me in review and Brad clarified things (after a few mails ...).] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-11-04 14:04:54 +01:00
Tvrtko Ursulin	aff437667b	drm/i915: Move flags describing VMA mappings into the VMA If these flags are on the object level it will be more difficult to allow for multiple VMAs per object. v2: Simplification and cleanup after code review comments (Chris Wilson). Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-11-04 14:04:51 +01:00
Daniel Vetter	ae6c480692	drm/i915: Only track real ppgtt for a context There's a bit a confusion since we track the global gtt, the aliasing and real ppgtt in the ctx->vm pointer. And not all callers really bother to check for the different cases and just presume that it points to a real ppgtt. Now looking closely we don't actually need ->vm to always point at an address space - the only place that cares actually has fixup code already to decide whether to look at the per-proces or the global address space. So switch to just tracking the ppgtt directly and ditch all the extraneous code. v2: Fixup the ppgtt debugfs file to not oops on a NULL ctx->ppgtt. Also drop the early exit - without aliasing ppgtt we want to dump all the ppgtts of the contexts if we have full ppgtt. v3: Actually git add the compile fix. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Cc: "Thierry, Michel" <michel.thierry@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> OTC-Jira: VIZ-3724 [danvet: Resolve conflicts with execlist patches while applying.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-13 14:23:33 +02:00
Oscar Mateo	ba8b7ccb19	drm/i915/bdw: Workload submission mechanism for Execlists This is what i915_gem_do_execbuffer calls when it wants to execute some worload in an Execlists world. v2: Check arguments before doing stuff in intel_execlists_submission. Also, get rel_constants parsing right. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Drop the chipset flush, that's pre-gen6. And appease checkpatch a bit .... again!] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 23:18:38 +02:00
Oscar Mateo	a83014d3f8	drm/i915: Abstract the legacy workload submission mechanism away As suggested by Daniel Vetter. The idea, in subsequent patches, is to provide an alternative to these vfuncs for the Execlists submission mechanism. v2: Splitted into two and reordered to illustrate our intentions, instead of showing it off. Also, remove the add_request vfunc and added the stop_ring one. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: - Make checkpatch happy. - Be grumpy about the excessive vtable. - Ditch gt->is_ring_initialized.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 16:40:32 +02:00
Oscar Mateo	ec3e9963a6	drm/i915/bdw: Deferred creation of user-created LRCs The backing objects and ringbuffers for contexts created via open fd are actually empty until the user starts sending execbuffers to them. At that point, we allocate & populate them. We do this because, at create time, we really don't know which engine is going to be used with the context later on (and we don't want to waste memory on objects that we might never use). v2: As contexts created via ioctl can only be used with the render ring, we have enough information to allocate & populate them right away. v3: Defer the creation always, even with ioctl-created contexts, as requested by Daniel Vetter. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 16:25:58 +02:00
Chris Wilson	906843c3a1	drm/i915: Simplify relocate_entry_gtt() and make 64-bit safe Even though we should not try to use 4+GiB GTTs on 32-bit systems, by using a local variable we can future proof the code whilst making it easier to read. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Appease checkpatch a bit.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 14:16:04 +02:00
Chris Wilson	060e82c6f4	drm/i915: Remove redundant list_empty(eb->vmas) tests in execbuffer Part of the pre-validation for an execbuffer call is that there is at least one object in the execlist. As we bail if we fail to lookup any object, we can be sure that after the eb_lookup_vma() there is at least one object in the vma list and so we do not need to assert. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 14:15:03 +02:00
Chris Wilson	ad19f10bc2	drm/i915: Pre-validate the NEED_GTTS flag for execbuffer We have an implementation requirement that precludes the user from requesting a ggtt entry when the device is operating in ppgtt mode. Move the current check from inside the execbuffer object collation to the prevalidation phase. v2: Roll both invalid flags checks into one Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 14:15:02 +02:00
Daniel Vetter	da51a1e7e3	drm/i915: Fix secure dispatch with full ppgtt Based upon a hunk from a patch from Chris Wilson, but augmented to: - Process the batch in the full ppgtt vm so that self-relocations match again with userspace's expectations.. - Add a comment why plain pin for the global gtt binding is safe at that point. v2: Drop local bind_vm variable (Chris). v3: Explain why this works despite the lack of proper active tracking for the ggtt batch vma. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 13:49:57 +02:00
Chris Wilson	82b6b6d786	drm/i915: Remove fenced_gpu_access and pending_fenced_gpu_access This migrates the fence tracking onto the existing seqno infrastructure so that the later conversion to tracking via requests is simplified. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 12:20:25 +02:00
Chris Wilson	e6a844687c	drm/i915: Force CPU relocations if not GTT mapped Move the decision on whether we need to have a mappable object during execbuffer to the fore and then reuse that decision by propagating the flag through to reservation. As a corollary, before doing the actual relocation through the GTT, we can make sure that we do have a GTT mapping through which to operate. Note that the key to make this work is to ditch the obj->map_and_fenceable unbind optimization - with full ppgtt it doesn't make a lot of sense any more anyway. v2: Revamp and resend to ease future patches. v3: Refresh patch rationale References: https://bugs.freedesktop.org/show_bug.cgi?id=81094 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> [danvet: Explain why obj->map_and_fenceable is key and split out the secure batch fix.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 12:01:29 +02:00
Oscar Mateo	78382593e9	drm/i915: Extract the actual workload submission mechanism from execbuffer So that we isolate the legacy ringbuffer submission mechanism, which becomes a good candidate to be abstracted away. This is prep-work for Execlists (which will its own workload submission mechanism). No functional changes. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-08 12:31:06 +02:00
Oscar Mateo	821d66dd7c	drm/i915: Emphasize that ctx->id is merely a user handle This is an Execlists preparatory patch, since they make context ID become an overloaded term: - In the software, it was used to distinguish which context userspace was trying to use. - In the BSpec, the term is used to describe the 20-bits long field the hardware uses to it to discriminate the contexts that are submitted to the ELSP and inform the driver about their current status (via Context Switch Interrupts and Context Status Buffers). Initially, I tried to make the different meanings converge, but it proved impossible: - The software ctx->id is per-filp, while the hardware one needs to be globally unique. - Also, we multiplex several backing states objects per intel_context, and all of them need unique HW IDs. - I tried adding a per-filp ID and then composing the HW context ID as: ctx->id + file_priv->id + ring->id, but the fact that the hardware only uses 20-bits means we have to artificially limit the number of filps or contexts the userspace can create. The ctx->user_handle renaming bits are done with this Cocci patch (plus manual frobbing of the struct declaration): @@ struct intel_context c; @@ - (c).id + c.user_handle @@ struct intel_context *c; @@ - (c)->id + c->user_handle Also, while we are at it, s/DEFAULT_CONTEXT_ID/DEFAULT_CONTEXT_HANDLE and change the type to unsigned 32 bits. v2: s/handle/user_handle and change the type to uint32_t as suggested by Chris Wilson. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1) Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-08 12:30:41 +02:00
Daniel Vetter	f99d70690e	drm/i915: Track frontbuffer invalidation/flushing So these are the guts of the new beast. This tracks when a frontbuffer gets invalidated (due to frontbuffer rendering) and hence should be constantly scaned out, and when it's flushed again and can be compressed/one-shot-upload. Rules for flushing are simple: The frontbuffer needs one more full upload starting from the next vblank. Which means that the flushing can _only_ be called once the frontbuffer update has been latched. But this poses a problem for pageflips: We can't just delay the flushing until the pageflip is latched, since that would pose the risk that we override frontbuffer rendering that has been scheduled in-between the pageflip ioctl and the actual latching. To handle this track asynchronous invalidations (and also pageflip) state per-ring and delay any in-between flushing until the rendering has completed. And also cancel any delayed flushing if we get a new invalidation request (whether delayed or not). Also call intel_mark_fb_busy in both cases in all cases to make sure that we keep the screen at the highest refresh rate both on flips, synchronous plane updates and for frontbuffer rendering. v2: Lots of improvements Suggestions from Chris: - Move invalidate/flush in flush__domain and set_to__domain. - Drop the flush in busy_ioctl since it's redundant. Was a leftover from an earlier concept to track flips/delayed flushes. - Don't forget about the initial modeset enable/final disable. Suggested by Chris. Track flips accurately, too. Since flips complete independently of rendering we need to track pending flips in a separate mask. Again if an invalidate happens we need to cancel the evenutal flush to avoid races. v3: Provide correct header declarations for flip functions. Currently not needed outside of intel_display.c, but part of the proper interface. v4: Add proper domain management to fbcon so that the fbcon buffer is also tracked correctly. v5: Fixup locking around the fbcon set_to_gtt_domain call. v6: More comments from Chris: - Split out fbcon changes. - Drop superflous checks for potential scanout before calling intel_fb functions - we can micro-optimize this later. - s/intel_fb_/intel_fb_obj_/ to make it clear that this deals in gem object. We already have precedence for fb_obj in the pin_and_fence functions. v7: Clarify the semantics of the flip flush handling by renaming things a bit: - Don't go through a gem object but take the relevant frontbuffer bits directly. These functions center on the plane, the actual object is irrelevant - even a flip to the same object as already active should cause a flush. - Add a new intel_frontbuffer_flip for synchronous plane updates. It currently just calls intel_frontbuffer_flush since the implemenation differs. This way we achieve a clear split between one-shot update events on one side and frontbuffer rendering with potentially a very long delay between the invalidate and flush. Chris and I also had some discussions about mark_busy and whether it is appropriate to call from flush. But mark busy is a state which should be derived from the 3 events (invalidate, flush, flip) we now have by the users, like psr does by tracking relevant information in psr.busy_frontbuffer_bits. DRRS (the only real use of mark_busy for frontbuffer) needs to have similar logic. With that the overall mark_busy in the core could be removed. v8: Only when retiring gpu buffers only flush frontbuffer bits we actually invalidated in a batch. Just for safety since before any additional usage/invalidate we should always retire current rendering. Suggested by Chris Wilson. v9: Actually use intel_frontbuffer_flip in all appropriate places. Spotted by Chris. v10: Address more comments from Chris: - Don't call _flip in set_base when the crtc is inactive, avoids redunancy in the modeset case with the initial enabling of all planes. - Add comments explaining that the initial/final plane enable/disable still has work left to do before it's fully generic. v11: Only invalidate for gtt/cpu access when writing. Spotted by Chris. v12: s/_flush/_flip/ in intel_overlay.c per Chris' comment. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-06-19 18:14:47 +02:00
Ville Syrjälä	d593d992a9	drm/i915: Fix __user sparse warning CHECK linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1529:47: warning: incorrect type in initializer (different address spaces) linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1529:47: expected struct drm_i915_gem_exec_object2 user_exec_list linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1529:47: got void [noderef] <asn:1> linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1533:61: warning: incorrect type in argument 1 (different address spaces) linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1533:61: expected void [noderef] <asn:1>dst linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1533:61: got unsigned long long <noident> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-06-13 17:45:29 +02:00
Dave Airlie	8d4ad9d4bb	Merge commit '9e9a928eed8796a0a1aaed7e0b676db86ba84594' into drm-next Merge drm-fixes into drm-next. Both i915 and radeon need this done for later patches. Conflicts: drivers/gpu/drm/drm_crtc_helper.c drivers/gpu/drm/i915/i915_drv.h drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/i915_gem_execbuffer.c drivers/gpu/drm/i915/i915_gem_gtt.c	2014-06-05 20:28:59 +10:00
Chris Wilson	d23db88c3a	drm/i915: Prevent negative relocation deltas from wrapping This is pure evil. Userspace, I'm looking at you SNA, repacks batch buffers on the fly after generation as they are being passed to the kernel for execution. These batches also contain self-referenced relocations as a single buffer encompasses the state commands, kernels, vertices and sampler. During generation the buffers are placed at known offsets within the full batch, and then the relocation deltas (as passed to the kernel) are tweaked as the batch is repacked into a smaller buffer. This means that userspace is passing negative relocations deltas, which subsequently wrap to large values if the batch is at a low address. The GPU hangs when it then tries to use the large value as a base for its address offsets, rather than wrapping back to the real value (as one would hope). As the GPU uses positive offsets from the base, we can treat the relocation address as the minimum address read by the GPU. For the upper bound, we trust that userspace will not read beyond the end of the buffer. So, how do we fix negative relocations from wrapping? We can either check that every relocation looks valid when we write it, and then position each object such that we prevent the offset wraparound, or we just special-case the self-referential behaviour of SNA and force all batches to be above 256k. Daniel prefers the latter approach. This fixes a GPU hang when it tries to use an address (relocation + offset) greater than the GTT size. The issue would occur quite easily with full-ppgtt as each fd gets its own VM space, so low offsets would often be handed out. However, with the rearrangement of the low GTT due to capturing the BIOS framebuffer, it is already affecting kernels 3.15 onwards. I think only IVB+ is susceptible to this bug, but the workaround should only kick in rarely, so it seems sensible to always apply it. v3: Use a bias for batch buffers to prevent small negative delta relocations from wrapping. v4 from Daniel: - s/BIAS/BATCH_OFFSET_BIAS/ - Extract eb_vma_misplaced/i915_vma_misplaced since the conditions were growing rather cumbersome. - Add a comment to eb_get_batch explaining why we do this. - Apply the batch offset bias everywhere but mention that we've only observed it on gen7 gpus. - Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch. v5: Add static to eb_get_batch, spotted by 0-day tester. Testcase: igt/gem_bad_reloc Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3) Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-27 11:18:40 +03:00
Chris Wilson	9aab8bff7a	drm/i915: Only copy back the modified fields to userspace from execbuffer We only want to modifiy a single field in the userspace view of the execbuffer command buffer, so explicitly change that rather than copy everything back again. This serves two purposes: 1. The single fields are much cheaper to copy (constant size so the copy uses special case code) and much smaller than the whole array. 2. We modify the array for internal use that need to be masked from the user. Note: We need this backported since without it the next bugfix will blow up when userspace recycles batchbuffers and relocations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-27 11:18:39 +03:00
Oscar Mateo	273497e5cd	drm/i915: s/i915_hw_context/intel_context Up until now, contexts had one (and only one) backing object that was used by the hardware to save/restore render ring contexts (via the MI_SET_CONTEXT command). Other rings did not have or need this, so our i915_hw_context struct had a 1:1 relationship with a a real HW context. With Logical Ring Contexts and Execlists, this is not possible anymore: all rings need a backing object, and it cannot be reused. To prepare for that, rename our contexts to the more generic term intel_context. No functional changes. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 23:41:17 +02:00
Oscar Mateo	a4872ba6d0	drm/i915: s/intel_ring_buffer/intel_engine_cs In the upcoming patches we plan to break the correlation between engine command streamers (a.k.a. rings) and ringbuffers, so it makes sense to refactor the code and make the change obvious. No functional changes. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 23:01:05 +02:00
Daniel Vetter	bdf1e7e3db	drm/i915: move bsd dispatch index somewhere better Adding stuff at the bottom is really no how this should be done, since that's the place for ums/dri dungeons. This was added in commit `a8ebba75b3` Author: Zhao Yakui <yakui.zhao@intel.com> Date: Thu Apr 17 10:37:40 2014 +0800 drm/i915: Use the coarse ping-pong mechanism based on drm fd to dispatch the BSD command on BDW GT3 Also add a note to prevent this from happening again - people really should be less lazy and take more time to look for a good home of their new driver-global state. Cc: Imre Deak <imre.deak@intel.com> Cc: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 15:06:31 +02:00
Chris Wilson	227f782e46	drm/i915: Retire requests before creating a new one More fallout from commit `c8725f3dc0` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Mar 17 12:21:55 2014 +0000 drm/i915: Do not call retire_requests from wait_for_rendering is that we can completely fill all of memory using small objects, such that we exhaust the filp space, and spend all of our time evicting objects from the aperture. As such, we never fill the ring, and never trigger the last resort flushing in commit `1cf0ba1474` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon May 5 09:07:33 2014 +0100 drm/i915: Flush request queue when waiting for ring space and so all the requests are left active and the objects keep that last active reference. Eventually the system comes to a halt as it runs out of memory. The impact is mainly limited to test cases as regular userspace will trigger retirement by manually checking whether an object is active. Testcase: igt/gem_lut_handle Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78724 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Guo Jinxian <jinxianx.guo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-19 15:30:56 +02:00
Daniel Vetter	ffd93f2480	drm/i915: Work-around garbage DR4 from UXA Somehow UXA submits a completely bogus DR4 value since essentially forever. It was originally introduced in commit bade7d7d2505a10a8a7d24b084aff9742e2d6d64 Author: Eric Anholt <eric@anholt.net> Date: Fri Jun 6 14:03:25 2008 -0700 Use the DRM for submitting batchbuffers when available. and dutifully copied around ever since. Since we want to keep the general dirt catching around just special case the UXA value. This regression was introduced in commit `9cb346648d` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Thu Apr 24 08:09:11 2014 +0200 drm/i915: Catch dirt in unused execbuffer fields Comment from Chris' review: "To be fair, it is a sensible value if one supposes a Region style API to cliprects. Under that API, DR[14] define the extents of the clip region, and ((0,0), (0,0)) [DR1==DR4==0] would mean all clipped, do not draw anything." v2: Pimp commit message a bit and remove the double space. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78494 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jörg Otte <jrg.otte@gmail.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-13 17:16:13 +02:00
Ben Widawsky	d9ceb957fd	drm/i915: Support 64b relocations All the rest of the code to enable this is in my branch. Without my branch, hitting > 32b offsets is impossible. The code has always "supported" 64b, but it's never actually been run of tested. This change doesn't actually fix anything. [1] I am not sure why X won't work yet. I do not get hangs or obvious errors. There are 3 fixes grouped together here. First is to remove the hardcoded 0 for the upper dword of the relocation. The next fix is to use a 64b value for target_offset. The final fix is to not directly apply target_offset to reloc->delta. reloc->delta is part of ABI, and so we cannot change it. As it stands, 32b is enough to represent everything we're interested in representing anyway. The main problem is, we cannot add greater than 32b values to it directly. [1] Almost all of intel-gpu-tools is not yet ready to test 64b relocations. There are a few places that expect 32b values for offsets and these all won't work. Cc: Rafael Barbalho <rafael.barbalho@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 16:04:23 +02:00
Ben Widawsky	9bcb144c83	drm/i915: Support 64b execbuf Previously, our code only had a 32b offset value for where the batchbuffer starts. With full PPGTT, and 64b canonical GPU address space, that is an insufficient value. The code to expand is pretty straight forward, and only one platform needs to do anything with the extra bits. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 16:01:58 +02:00
Chris Wilson	c8725f3dc0	drm/i915: Do not call retire_requests from wait_for_rendering A common issue we have is that retiring requests causes recursion through GTT manipulation or page table manipulation which we can only handle at very specific points. However, to maintain internal consistency (enforced through our sanity checks on write_domain at various points in the GEM object lifecycle) we do need to retire the object prior to marking it with a new write_domain, and also clear the write_domain for the implicit flush following a batch. Note that this then allows the unbound objects to still be on the active lists, and so care must be taken when removing objects from unbound lists (similar to the caveats we face processing the bound lists). v2: Fix i915_gem_shrink_all() to handle updated object lifetime rules, by refactoring it to call into __i915_gem_shrink(). v3: Missed an object-retire prior to changing cache domains in i915_gem_object_set_cache_leve() v4: Rebase Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:09:15 +02:00
Zhao Yakui	a8ebba75b3	drm/i915: Use the coarse ping-pong mechanism based on drm fd to dispatch the BSD command on BDW GT3 The BDW GT3 has two independent BSD rings, which can be used to process the video commands. To be simpler, it is transparent to user-space driver/middle. Instead the kernel driver will decide which ring is to dispatch the BSD video command. As every BSD ring is powerful, it is enough to dispatch the BSD video command based on the drm fd. In such case it can play back video stream while encoding another video stream. The coarse ping-pong mechanism is used to determine which BSD ring is used to dispatch the BSD video command. V1->V2: Follow Daniel's comment and use the simple ping-pong mechanism. This is only to add the support of dual BSD rings on BDW GT3 machine. The further optimization will be considered in another patch set. V2->V3: Follow Daniel's comment to use the struct_mutext instead of atomic_t during determining which ring can be used to dispatch Video command. Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:49 +02:00
Zhao Yakui	b1a93306ed	drm/i915: Update the restrict check to filter out wrong Ring ID passed by user-space Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:44 +02:00
Daniel Vetter	9cb346648d	drm/i915: Catch dirt in unused execbuffer fields We need to make sure that userspace keeps on following the contract, otherwise we won't be able to use the reserved fields at all. v2: Add DRM_DEBUG (Chris) Testcase: igt/gem_exec_params/*-dirt Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:43 +02:00
Daniel Vetter	c0f5b82cd1	drm/i915: Catch abuse of I915_EXEC_CONSTANTS_* A bit tricky since 0 is also a valid constant ... v2: Add DRM_DEBUG (Chris) Testcase: igt/gem_exec_params/rel-constants-* Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:42 +02:00
Daniel Vetter	9d662da8b6	drm/i915: Catch abuse of I915_EXEC_GEN7_SOL_RESET Currently we catch it, but silently succeed. Our userspace is better than this. v2: Add DRM_DEBUG (Chris) Testcase: igt/gem_exec_params/sol-reset-* Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:41 +02:00
Dave Airlie	885ac04ab3	Merge tag 'drm-intel-next-2014-04-16' of git://anongit.freedesktop.org/drm-intel into drm-next drm-intel-next-2014-04-16: - vlv infoframe fixes from Jesse - dsi/mipi fixes from Shobhit - gen8 pageflip fixes for LRI/SRM from Damien - cmd parser fixes from Brad Volkin - some prep patches for CHV, DRRS, ... - and tons of little things all over drm-intel-next-2014-04-04: - cmd parser for gen7 but only in enforcing and not yet granting mode - the batch copying stuff is still missing. Also performance is a bit ... rough (Brad Volkin + OACONTROL fix from Ken). - deprecate UMS harder (i.e. CONFIG_BROKEN) - interrupt rework from Paulo Zanoni - runtime PM support for bdw and snb, again from Paulo - a pile of refactorings from various people all over the place to prep for new stuff (irq reworks, power domain polish, ...) drm-intel-next-2014-04-04: - cmd parser for gen7 but only in enforcing and not yet granting mode - the batch copying stuff is still missing. Also performance is a bit ... rough (Brad Volkin + OACONTROL fix from Ken). - deprecate UMS harder (i.e. CONFIG_BROKEN) - interrupt rework from Paulo Zanoni - runtime PM support for bdw and snb, again from Paulo - a pile of refactorings from various people all over the place to prep for new stuff (irq reworks, power domain polish, ...) Conflicts: drivers/gpu/drm/i915/i915_gem_context.c	2014-05-01 09:11:37 +10:00
Chris Wilson	691e6415c8	drm/i915: Always use kref tracking for all contexts. If we always initialize kref for the context, even if we are using fake contexts for hangstats when there is no hw support, we can forgo the dance to dereference the ctx->obj and inspect whether we are permitted to use kref inside i915_gem_context_reference() and _unreference(). My ulterior motive here is to improve the debugging of a use-after-free of ctx->obj. This patch avoids the dereference here and instead forces the assertion checks associated with kref. v2: Refactor the fake contexts to being even more like the real contexts, so that there is much less duplicated and special case code. v3: Tweaks. v4: Tweaks, minor. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76671 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: lu hua <huax.lu@intel.com> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [Jani: tiny change to backport to drm-intel-fixes.] Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2014-04-11 13:29:51 +03:00
Ben Widawsky	935f38d694	drm/i915: Unref context on failed eb_create I opted to do this instead of grabbing the context reference after eb_create since eb_create can potentially call the shrinker, and that makes things very complicated. This simple patch balances the ref count without requiring a great deal of review to make sure the shrinker path is safe. Theoretically (by design) the shrinker can end up destroying a context, which enforces the reasoning for doing the fix this way instead of moving the reference to later in the function. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-09 14:37:11 +02:00
Jani Nikula	50227e1cae	drm/i915: prefer struct drm_i915_private to drm_i915_private_t Remove the rest of the references to drm_i915_private_t. No functional changes. Signed-off-by: Jani Nikula <jani.nikula@intel.com> [danvet: Drop hunk in i915_cmd_parser.c] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-31 15:34:21 +02:00
Brad Volkin	351e3db2b3	drm/i915: Implement command buffer parsing logic The command parser scans batch buffers submitted via execbuffer ioctls before the driver submits them to hardware. At a high level, it looks for several things: 1) Commands which are explicitly defined as privileged or which should only be used by the kernel driver. The parser generally rejects such commands, with the provision that it may allow some from the drm master process. 2) Commands which access registers. To support correct/enhanced userspace functionality, particularly certain OpenGL extensions, the parser provides a whitelist of registers which userspace may safely access (for both normal and drm master processes). 3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The parser always rejects such commands. See the overview comment in the source for more details. This patch only implements the logic. Subsequent patches will build the tables that drive the parser. v2: Don't set the secure bit if the parser succeeds Fail harder during init Makefile cleanup Kerneldoc cleanup Clarify module param description Convert ints to bools in a few places Move client/subclient defs to i915_reg.h Remove the bits_count field OTC-Tracker: AXIA-4631 Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30 Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Appease checkpatch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-07 22:37:00 +01:00
Daniel Vetter	8ea99c9287	drm/i915: Only bind each object rather than for every execbuffer One side-effect of the introduction of ppgtt was that we needed to rebind the object into the appropriate vm (and global gtt in some peculiar cases). For simplicity this was done twice for every object on every call to execbuffer. However, that adds a tremendous amount of CPU overhead (rewriting all the PTE for all objects into WC memory) per draw. The fix is to push all the decision about which vm to bind into and when down into the low-level bind routines through hints rather than performing the bind unconditionally in the execbuffer routine. Note that this is a regression introduced in the full ppgtt feature branch, before this we've only done re-bound objects when the relevant has_(aliasing_ppgtt\|global_gtt)_mapping flag was clear. But since that's per-object and not per-vma that optimization broke. v2: Split out prep work and unrelated changes. v3: Bring back functional change around PIN_GLOBAL that I've accidentally split out. v4: Remove the temporary hack for the old binding logic to avoid bisection issues. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72906 Tested-by: jianx.zhou@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:18:38 +01:00
Daniel Vetter	bf3d149b25	drm/i915: split PIN_GLOBAL out from PIN_MAPPABLE With abitrary pin flags it makes sense to split out a "please bind this into global gtt" from the "please allocate in the mappable range". Use this unconditionally in our global gtt pin helper since this is what its callers want. Later patches will drop PIN_MAPPABLE where it's not strictly needed. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:17:27 +01:00
Daniel Vetter	1ec9e26dda	drm/i915: Consolidate binding parameters into flags Anything more than just one bool parameter is just a pain to read, symbolic constants are much better. Split out from Chris' vma-binding rework patch. v2: Undo the behaviour change in object_pin that Chris spotted. v3: Split out misplaced hunk to handle set_cache_level errors, spotted by Jani. v4: Keep the current over-zealous binding logic in the execbuffer code working with a quick hack while the overall binding code gets shuffled around. v5: Reorder the PIN_ flags for more natural patch splitup. v6: Pull out the PIN_GLOBAL split-up again. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:16:58 +01:00
Jani Nikula	d330a9530c	drm/i915: move module parameters into a struct, in a new file With 20+ module parameters, I think referring to them via a struct improves clarity over just having a bunch of globals. While at it, move the parameter initialization and definitions into a new file i915_params.c to reduce clutter in i915_drv.c. Apart from the ill-named i915_enable_rc6, i915_enable_fbc and i915_enable_ppgtt parameters, for which we lose the "i915_" prefix internally, the module parameters now look the same both on the kernel command line and in code. For example, "i915.modeset". The downsides of the change are losing static on a couple of variables and not having the initialization and module_param_named() right next to each other. On the other hand, all module parameters are now defined in one place at i915_params.c. Plus you can do this to find all module parameter references: $ git grep "i915\." -- drivers/gpu/drm/i915 v2: - move the definitions into a new file - s/i915_params/i915/ - make i915_try_reset i915.reset, for consistency Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-27 17:16:45 +01:00
Rodrigo Vivi	a25eebb0af	drm: dp helper: Add DP test sink CRC definition. This address will be used to verify panel CRC for test and validation purposes. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [danvet: Fix whitespace fail.] Acked-by: Dave Airlie <airlied@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-27 09:55:23 +01:00
Daniel Vetter	0e5539b923	Merge branch 'topic/ppgtt' into drm-intel-next-queued Because whatever.* * This should contain a fairly long list of issues and still unresolved resgressions, but I didn't really get a vote. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-25 21:14:57 +01:00
Ben Widawsky	8b78f0e588	drm/i915: Clarify relocation errnos While trying to find a random -EINVAL from a failing test, I noticed we had a few hard to follow return values. The first two hunks in this patch replace completely useless initialization of ret. The last several hunks help to distinguish between altering 'return ret' and 'return <ERROR>' Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-22 09:58:25 +01:00
Daniel Vetter	0d9d349d87	Merge commit origin/master into drm-intel-next Conflicts are getting out of hand, and now we have to shuffle even more in -next which was also shuffled in -fixes (the call for drm_mode_config_reset needs to move yet again). So do a proper backmerge. I wanted to wait with this for the 3.13 relaese, but alas let's just do this now. Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_ddi.c drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_pm.c Besides the conflict around the forcewake get/put (where we chaged the called function in -fixes and added a new parameter in -next) code all the current conflicts are of the adjacent lines changed type. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-16 22:06:30 +01:00
Ben Widawsky	72ad5c45f0	drm/i915/ppgtt: Fix ioctl errno for "no such context" Without this fix the ioctls silently succeeded (but actually did nothing). It makes all the code which calls into this function way too confusing. v2: Fix destroy IOCTL as well v3: Clarify the other two callers of i915_gem_context_get() to never check for NULL. (Mika) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72903 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Testcase: igt/gem_ctx_exec/basic [danvet: Fix up the commit message and actually bother to mention the testcase this fixes.] Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-07 08:50:11 +01:00
Daniel Vetter	a7c1d426ef	drm/i915: Don't check for NEEDS_GTT when deciding the address space This means something different and is only relevant for gen6 and the reason why we cant use anything else than aliasing ppgtt there. Note that the currently implemented logic for secure batches is broken: Userspace wants the buffer both in ppgtt (for self-referencing relocations) and in ggtt (for priveledge operations). This is the same issue the command parser is also facing. Unfortunately our coverage for corner-cases of self-referencing batches is spotty. Note that this will break vsync'ed Xv and DRI2 copies. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 17:50:40 +01:00
Daniel Vetter	2c9f8d56a1	drm/i915: Reject NEEDS_GTT relocations with full ppgtt Doesn't make sense. Spotted while fixing an issue Chris noticed in the same area. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 17:50:39 +01:00
Daniel Vetter	7c9c4b8f5d	drm/i915: Reject non-default contexts on non-render again This reverts the abi-change from commit `67e3d2979b` Author: Ben Widawsky <ben@bwidawsk.net> Date: Fri Dec 6 14:11:01 2013 -0800 drm/i915: Permit contexts on all rings We don't actually need this, only the internal changes to allow contexts on all rings for the purpose of ppgtt switching are required. And I'm not sure whether this is the right thing to do given some of the hw features in the pipeline. Also, new abi needs userspace patches as a proof-of-need, which is completely lacking here. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 17:50:38 +01:00
Ben Widawsky	7e0d96bc03	drm/i915: Use multiple VMs -- the point of no return As with processes which run on the CPU, the goal of multiple VMs is to provide process isolation. Specific to GEN, there is also the ability to map more objects per process (2GB each instead of 2Gb-2k total). For the most part, all the pipes have been laid, and all we need to do is remove asserts and actually start changing address spaces with the context switch. Since prior to this we've converted the setting of the page tables to a streamed version, this is quite easy. One important thing to point out (since it'd been hotly contested) is that with this patch, every context created will have it's own address space (provided the HW can do it). v2: Disable BDW on rebase NOTE: I tried to make this commit as small as possible. I needed one place where I could "turn everything on" and that is here. It could be split into finer commits, but I didn't really see much point. Cc: Eric Anholt <eric@anholt.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:24:52 +01:00
Daniel Vetter	3d7f0f9dcc	Merge commit drm-intel-fixes into topic/ppgtt I need the tricky do_switch fix before I can merge the final piece of the ppgtt enabling puzzle. Otherwise the conflict will be a real pain to resolve since the do_switch hunk from -fixes must be placed at the exact right place within a hunk in the next patch. Conflicts: drivers/gpu/drm/i915/i915_gem_context.c drivers/gpu/drm/i915/i915_gem_execbuffer.c drivers/gpu/drm/i915/intel_display.c Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:23:37 +01:00
Ben Widawsky	41bde5535a	drm/i915: Get context early in execbuf We need to have the address space when reserving space for the objects. Since the address space and context are tied together, and reserve occurs before context switch (for good reason), we must lookup our context earlier in the process. This leaves some room for optimizations where we no longer need to use ctx_id in certain places. This will be addressed in a subsequent patch. Important tricky bit: Because slow relocations during execbuffer drop struct_mutex Perhaps it would be best to acquire the reference when we get the context, but I'll save that for another day (note I have written the patch before, and I found the changes required to be uglier than this). Note that since we currently access everything via context id, and not the data structure this is fine, though not desirable. The next change attempts to get the context only once via the context ID idr lookup, and as such, the following can happen: CTX-A is created, refcount = 1 CTX-A execbuf, mutex dropped close IOCTL called on CTX-A, refcount = 0 CTX-A resumes in execbuf. v2: Rebased on top of commit `b6359918b8` Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Wed Oct 30 15:44:16 2013 +0200 drm/i915: add i915_get_reset_stats_ioctl v3: Rebased on top of commit `25b3dfc87b` Author: Mika Westerberg <mika.westerberg@linux.intel.com> Date: Tue Nov 12 11:57:30 2013 +0200 Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Tue Nov 26 16:14:33 2013 +0200 drm/i915: check context reset stats before relocations Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:52:42 +01:00
Ben Widawsky	67e3d2979b	drm/i915: Permit contexts on all rings If we want to use contexts in more abstract terms (specifically with PPGTT in mind), we need to allow them to be specified for any ring. Since the upcoming patches will bring about the use of multiple address spaces, and each ring needs to have an address space programmed (which we intend to do at context switch time), we can no longer only use RCS. With multiple rings having a last context, we must now unreference these contexts. NOTE: This commit requires an update to intel-gpu-tools to make it not fail. v2: Rebased with some logical conflicts. Squashed in the context fini refcount patch Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:53 +01:00
Ben Widawsky	ca01b12b40	drm/i915: Simplify ring handling in execbuf Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:52 +01:00
Ben Widawsky	3e7a032295	drm/i915: Remove vm arg from relocate entry The only place we were using it was for GEN6, which won't have PPGTT support anyway (ie. the VM is always the same). To clear things up, (it only added confusion for me since it doesn't allow us to assert vma->vm is what we always want, when just looking at the code). Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:50 +01:00
Ben Widawsky	6f65e29aca	drm/i915: Create bind/unbind abstraction for VMAs To sum up what goes on here, we abstract the vma binding, similarly to the previous object binding. This helps for distinguishing legacy binding, versus modern binding. To keep the code churn as minimal as possible, I am leaving in insert_entries(). It serves as the per platform pte writing basically. bind_vma and insert_entries do share a lot of similarities, and I did have designs to combine the two, but as mentioned already... too much churn in an already massive patchset. What follows are the 3 commits which existed discretely in the original submissions. Upon rebasing on Broadwell support, it became clear that separation was not good, and only made for more error prone code. Below are the 3 commit messages with all their history. drm/i915: Add bind/unbind object functions to VMA drm/i915: Use the new vm [un]bind functions drm/i915: reduce vm->insert_entries() usage drm/i915: Add bind/unbind object functions to VMA As we plumb the code with more VM information, it has become more obvious that the easiest way to deal with bind and unbind is to simply put the function pointers in the vm, and let those choose the correct way to handle the page table updates. This change allows many places in the code to simply be vm->bind, and not have to worry about distinguishing PPGTT vs GGTT. Notice that this patch has no impact on functionality. I've decided to save the actual change until the next patch because I think it's easier to review that way. I'm happy to squash the two, or let Daniel do it on merge. v2: Make ggtt handle the quirky aliasing ppgtt Add flags to bind object to support above Don't ever call bind/unbind directly for PPGTT until we have real, full PPGTT (use NULLs to assert this) Make sure we rebind the ggtt if there already is a ggtt binding. This happens on set cache levels. Use VMA for bind/unbind (Daniel, Ben) v3: Reorganize ggtt_vma_bind to be more concise and easier to read (Ville). Change logic in unbind to only unbind ggtt when there is a global mapping, and to remove a redundant check if the aliasing ppgtt exists. v4: Make the bind function a bit smarter about the cache levels to avoid unnecessary multiple remaps. "I accept it is a wart, I think unifying the pin_vma / bind_vma could be unified later" (Chris) Removed the git notes, and put version info here. (Daniel) v5: Update the comment to not suck (Chris) v6: Move bind/unbind to the VMA. It makes more sense in the VMA structure (always has, but I was previously lazy). With this change, it will allow us to keep a distinct insert_entries. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> drm/i915: Use the new vm [un]bind functions Building on the last patch which created the new function pointers in the VM for bind/unbind, here we actually put those new function pointers to use. Split out as a separate patch to aid in review. I'm fine with squashing into the previous patch if people request it. v2: Updated to address the smart ggtt which can do aliasing as needed Make sure we bind to global gtt when mappable and fenceable. I thought we could get away without this initialy, but we cannot. v3: Make the global GTT binding explicitly use the ggtt VM for bind_vma(). While at it, use the new ggtt_vma helper (Chris) At this point the original mailing list thread diverges. ie. v4^: use target_obj instead of obj for gen6 relocate_entry vma->bind_vma() can be called safely during pin. So simply do that instead of the complicated conditionals. Don't restore PPGTT bound objects on resume path Bug fix in resume path for globally bound Bos Properly handle secure dispatch Rebased on vma bind/unbind conversion Signed-off-by: Ben Widawsky <ben@bwidawsk.net> drm/i915: reduce vm->insert_entries() usage FKA: drm/i915: eliminate vm->insert_entries() With bind/unbind function pointers in place, we no longer need insert_entries. We could, and want, to remove clear_range, however it's not totally easy at this point. Since it's used in a couple of place still that don't only deal in objects: setup, ppgtt init, and restore gtt mappings. v2: Don't actually remove insert_entries, just limit its usage. It will be useful when we introduce gen8. It will always be called from the vma bind/unbind. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:50 +01:00
Ben Widawsky	d7f46fc4e7	drm/i915: Make pin count per VMA Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:49 +01:00
Chris Wilson	9ae9ab5220	drm/i915: Prevent double unref following alloc failure during execbuffer Whilst looking up the objects required for an execbuffer, an untimely allocation failure in creating the vma results in the object being unreferenced from two lists. The ownership during the lookup is meant to be moved from the list of objects being looked to the vma, and this double unreference upon error results in a use-after-free. Fixes regression from commit `27173f1f95` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Aug 14 11:38:36 2013 +0200 drm/i915: Convert execbuf code to use vmas Based on the fix by Ben Widawsky. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: stable@vger.kernel.org [danvet: Bikeshed the crucial comment above the ownership transfer as discussed on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-12 10:44:57 +01:00
Paulo Zanoni	f65c916898	drm/i915: add runtime put/get calls at the basic places If I add code to enable runtime PM on my Haswell machine, start a desktop environment, then enable runtime PM, these functions will complain that they're trying to read/write registers while the graphics card is suspended. v2: - Simplify i915_gem_fault changes. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [danvet: Drop the hunk in i915_hangcheck_elapsed, it's the wrong thing to do.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-10 22:47:33 +01:00
Mika Kuoppala	d299cce76e	drm/i915: check context reset stats before relocations Doing it early prevents moving and relocating objects in vain for contexts that won't get any GPU time. Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-04 13:20:31 +01:00
Chris Wilson	a415d35564	drm/i915: Pin relocations for the duration of constructing the execbuffer As the execbuffer dispatch grows ever more complex and involves multiple stages of moving objects into the aperture, we need to take greater care that we do not evict our execbuffer objects prior to dispatch. This is relatively simple as we can just keep the objects pinned for not just the relocation but until we are finished. One such example is the possibility of the context switch causing an eviction or hitting the shrinker in order to fit its object into the aperture. Link: http://lists.freedesktop.org/archives/intel-gfx/2013-November/036166.html Reported-by: "Siluvery, Arun" <arun.siluvery@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: stable@vger.kernel.org Acked-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Add the additional explanations from Chris to the commit message.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-27 09:04:36 +01:00
Ben Widawsky	5ce097254e	drm/i915: Missed dropped VMA conversion This belonged in commit `07fe0b1280` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Jul 31 17:00:10 2013 -0700 drm/i915: plumb VM into bind/unbind code But it was somehow missed along the way. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-26 10:12:32 +01:00
Ben Widawsky	17601cbc93	drm/i915: Removed unused vm args i915_gem_execbuffer_relocate became defunct in: commit `27173f1f95` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Aug 14 11:38:36 2013 +0200 drm/i915: Convert execbuf code to use vmas eb_create: never used? Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: The lingering vm parameter to eb_create might have been back from the days where we didn't yet keep both vmas and obj lists in the eb struct. But I didn't check really.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-26 10:11:21 +01:00
Ben Widawsky	28cf541543	drm/i915/bdw: unleash PPGTT v2: Squash in fix from Ben: Set PPGTT batches as necessary This fixes the regression in the last couple of days when we enabled PPGTT. v3: Squash in fixup to still use GTT for secure batches from Ville: BDW doesn't have a separate secure vs. non-secure bit in MI_BATCH_BUFFER_START. So for secure batches we have to simply leave the PPGTT bit unset. Fortunately older generations (except HSW) had similar limitations so execbuffer already creates a GTT mapping for all secure batches. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:48 +01:00
Ben Widawsky	3c94ceeee2	drm/i915/bdw: Support 64b relocations We don't actually return any to userspace yet, however we can pretend like we do now so userspace will support it when it happens. This is just to please Chris as the code itself isn't ready for > 64b relocations. v2: Rebase on top of the refactored relocate_entry_gtt\|cpu functions. v3: Squash in fixup from Rafal Barbalho for 64 byte relocs using cpu relocs and those crossing a page boundary. v4: Squash in a fixup for the fixup from Rafael. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Barbalho, Rafael <rafael.barbalho@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:41 +01:00
Ben Widawsky	e2d05a8b1e	drm/i915: Convert active API to VMA Even though we track object activity and not VMA, because we have the active_list be based on the VM, it makes the most sense to use VMAs in the APIs. NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for now, leave them be. v2: Remove leftover hunk from the previous patch which didn't keep i915_gem_object_move_to_active. That patch had to rely on the ring to get the dev instead of the obj. (Chris) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:21 +02:00
Daniel Vetter	b205ca5721	drm/i915: Use unsigned for overflow checks in execbuf There's actually no real risk since we already check for stricter constraints earlier (using UINT_MAX / sizeof (struct drm_i915_gem_exec_object2) as the limit). But in eb_create we use signed integers, which steals a factor of 2. Luckily struct drm_i915_gem_exec_object2 for this to not matter. Still, be consistent and use unsigned integers. Similar use unsinged integers when checking for overflows in the relocation entry processing. I've also added a new subtests to igt/gem_reloc_overflow to also test for overflowing args->buffer_count values. v2: Give the variables again tighter scope to make it clear that the computation is purely local and doesn't leak out to the 2nd block. Requested by Chris Wilson. Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:03 +02:00
Daniel Vetter	a1e2265332	drm/i915: Use kcalloc more No buffer overflows here, but better safe than sorry. v2: - Fixup the sizeof conversion, I've missed the pointer deref (Jani). - Drop the redundant GFP_ZERO, kcalloc alreads memsets (Jani). - Use kmalloc_array for the execbuf fastpath to avoid the memset (Chris). I've opted to leave all other conversions as-is since they aren't in a fastpath and dealing with cleared memory instead of random garbage is just generally nicer. Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Drop the contentious kmalloc_array hunk in execbuf.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:01 +02:00
Ben Widawsky	68c8c17f52	drm/i915: evict VM instead of everything When reserving objects during execbuf, it is possible to come across an object which will not fit given the current fragmentation of the address space. We do not have any defragment in drm_mm, so the strategy is to instead evict everything, and reallocate objects. With the upcoming addition of multiple VMs, there is no point to evict everything since doing so is overkill for the specific case mentioned above. Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: One additional s/evict_everything/evict_vm/ to update a comment in the code.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-12 21:58:22 +02:00
Mika Kuoppala	be62acb4cc	drm/i915: ban badly behaving contexts Now when we have mechanism in place to track which context was guilty of hanging the gpu, it is possible to punish for bad behaviour. If context has recently submitted a faulty batchbuffers guilty of gpu hang and submits another batch which hangs gpu in quick succession, ban it permanently. If ctx is banned, no more batchbuffers will be queued for execution. There is no need for global wedge machinery anymore and it would be unwise to wedge the whole gpu if we have multiple hanging batches queued for execution. Instead just ban the guilty ones and carry on. v2: Store guilty ban status bool in gpu_error instead of pointers that might become danling before hang is declared. v3: Use return value for banned status instead of stashing state into gpu_error (Chris Wilson) v4: - rebase on top of fixed hang stats api - add define for ban period - rename commit and improve commit msg v5: - rely context banning instead of wedging the gpu - beautification and fix for ban calculation (Chris) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-06 17:55:50 +02:00
Chris Wilson	2cc86b8260	drm/i915: Always prefer CPU relocations with LLC A follow-on to the update of the LLC coherency logic is that we can rely on the LLC being coherent with the CS for rewriting batchbuffers irrespective of their cache domain. (This should have no effect currently as all the batch buffers are expected to be I915_CACHE_LLC and so using the cpu relocation path anyway.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:43 +02:00
Daniel Vetter	e656a6cba0	drm/i915: inline vma_create into lookup_or_create_vma In the execbuf code we don't clean up any vmas which ended up not getting bound for code simplicity. To make sure that we don't end up creating multiple vma for the same vm kill the somewhat dangerous vma_create function and inline it into lookup_or_create. This is just a safety measure to prevent surprises in the future. Also update the somewhat confused comment in the execbuf code and clarify what kind of magic is going on with a new one. v2: Keep the function separate as requested by Chris. But give it a __ prefix for paranoia and move it tighter together with the other vma stuff. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:41 +02:00
Ben Widawsky	27173f1f95	drm/i915: Convert execbuf code to use vmas In order to transition more of our code over to using a VMA instead of an <OBJ, VM> pair - we must have the vma accessible at execbuf time. Up until now, we've only had a VMA when actually binding an object. The previous patch helped handle the distinction on bound vs. unbound. This patch will help us catch leaks, and other issues before we actually shuffle a bunch of stuff around. This attempts to convert all the execbuf code to speak in vmas. Since the execbuf code is very self contained it was a nice isolated conversion. The meat of the code is about turning eb_objects into eb_vma, and then wiring up the rest of the code to use vmas instead of obj, vm pairs. Unfortunately, to do this, we must move the exec_list link from the obj structure. This list is reused in the eviction code, so we must also modify the eviction code to make this work. WARNING: This patch makes an already hotly profiled path slower. The cost is unavoidable. In reply to this mail, I will attach the extra data. v2: Release table lock early, and two a 2 phase vma lookup to avoid having to use a GFP_ATOMIC. (Chris) v3: s/obj_exec_list/obj_exec_link/ Updates to address commit `6d2b888569` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Aug 7 18:30:54 2013 +0100 drm/i915: List objects allocated from stolen memory in debugfs v4: Use obj = vma->obj for neatness in some places (Chris) need_reloc_mappable() should return false if ppgtt (Chris) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Split out prep patches. Also remove a FIXME comment which is now taken care of.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:41 +02:00
Daniel Vetter	d4d36014ca	drm/i915: fix up the relocate_entry refactoring Somehow we've lost the error handling in the patch split-up between the internal and external patch. This regression has been introduced in commit `5032d871f7` Author: Rafael Barbalho <rafael.barbalho@intel.com> Date: Wed Aug 21 17:10:51 2013 +0100 drm/i915: Cleaning up the relocate entry function This bug is exercised by igt/gem_reloc_vs_gpu/interruptible. Cc: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-03 19:18:01 +02:00
Rafael Barbalho	5032d871f7	drm/i915: Cleaning up the relocate entry function As the relocate entry function was getting a bit too big I've moved the code that used to use either the cpu or the gtt to for the relocation into two separate functions. Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-23 14:52:31 +02:00
Chris Wilson	000433b67e	drm/i915: Only do a chipset flush after a clflush Now that we skip clflushes more often, return a boolean indicating whether the clflush was actually performed, and only if it was do the chipset flush. (Though on most of the architectures where the clflush will be skipped, the chipset flush is a no-op!) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:34 +02:00
Chris Wilson	2c22569bba	drm/i915: Update rules for writing through the LLC with the cpu As mentioned in the previous commit, reads and writes from both the CPU and GPU go through the LLC. This gives us coherency between the CPU and GPU irrespective of the attribute settings either device sets. We can use to avoid having to clflush even uncached memory. Except for the scanout. The scanout resides within another functional block that does not use the LLC but reads directly from main memory. So in order to maintain coherency with the scanout, writes to uncached memory must be flushed. In order to optimize writes elsewhere, we start tracking whether an framebuffer is attached to an object. v2: Use pin_display tracking rather than fb_count (to ensure we flush cursors as well etc) and only force the clflush along explicit writes to the scanout paths (i.e. pin_to_display_plane and pwrite into scanout). v3: Force the flush after hitting the slowpath in pwrite, as after dropping the lock the object's cache domain may be invalidated. (Ville) Based on a patch by Ville Syrjälä. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-10 11:20:49 +02:00
Ben Widawsky	ca191b1313	drm/i915: mm_list is per VMA formerly: "drm/i915: Create VMAs (part 5) - move mm_list" The mm_list is used for the active/inactive LRUs. Since those LRUs are per address space, the link should be per VMx . Because we'll only ever have 1 VMA before this point, it's not incorrect to defer this change until this point in the patch series, and doing it here makes the change much easier to understand. Shamelessly manipulated out of Daniel: "active/inactive stuff is used by eviction when we run out of address space, so needs to be per-vma and per-address space. Bound/unbound otoh is used by the shrinker which only cares about the amount of memory used and not one bit about in which address space this memory is all used in. Of course to actual kick out an object we need to unbind it from every address space, but for that we have the per-object list of vmas." v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris) v3: Moved earlier in the series v4: Add dropped message from v3 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Frob patch to apply and use vma->node.size directly as discused with Ben. Also drop a needles BUG_ON before move_to_inactive, the function itself has the same check.] [danvet 2nd: Rebase on top of the lost "drm/i915: Cleanup more of VMA in destroy", specifically unlink the vma from the mm_list in vma_unbind (to keep it symmetric with bind_to_vm) instead of vma_destroy.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-08 14:06:58 +02:00
Ben Widawsky	9843877d10	drm/i915: turn bound_ggtt checks to bound_any In some places, we want to know if an object is bound in any address space, and not just the global GTT. This often applies when there is a single global resource (object, pages, etc.) function \| reason -------------------------------------------------- i915_gem_object_is_inactive \| global object i915_gem_object_put_pages \| object's pages 915_gem_object_unpin \| global object i915_gem_execbuffer_unreserve_object \| temporary until we plumb vma pread/pwrite \| see the note below Note: set_to_gtt_domain in pwrite/pread is abused as a wait_rendering call - but that once only worked if the object is bound. We really should replace this with a plain wait_rendering call, which would have the upside that in pread it would be clearer that we actually only wait for oustanding gpu writes. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Explain the set_to_gtt_domain in pwrite/pread and volunteer Ben to replace those with wait_rendering calls.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-08 14:04:43 +02:00
Ben Widawsky	07fe0b1280	drm/i915: plumb VM into bind/unbind code As alluded to in several patches, and it will be reiterated later... A VMA is an abstraction for a GEM BO bound into an address space. Therefore it stands to reason, that the existing bind, and unbind are the ones which will be the most impacted. This patch implements this, and updates all callers which weren't already updated in the series (because it was too messy). This patch represents the bulk of an earlier, larger patch. I've pulled out a bunch of things by the request of Daniel. The history is preserved for posterity with the email convention of ">" One big change from the original patch aside from a bunch of cropping is I've created an i915_vma_unbind() function. That is because we always have the VMA anyway, and doing an extra lookup is useful. There is a caveat, we retain an i915_gem_object_ggtt_unbind, for the global cases which might not talk in VMAs. > drm/i915: plumb VM into object operations > > This patch was formerly known as: > "drm/i915: Create VMAs (part 3) - plumbing" > > This patch adds a VM argument, bind/unbind, and the object > offset/size/color getters/setters. It preserves the old ggtt helper > functions because things still need, and will continue to need them. > > Some code will still need to be ported over after this. > > v2: Fix purge to pick an object and unbind all vmas > This was doable because of the global bound list change. > > v3: With the commit to actually pin/unpin pages in place, there is no > longer a need to check if unbind succeeded before calling put_pages(). > Make put_pages only BUG() after checking pin count. > > v4: Rebased on top of the new hangcheck work by Mika > plumbed eb_destroy also > Many checkpatch related fixes > > v5: Very large rebase > > v6: > Change BUG_ON to WARN_ON (Daniel) > Rename vm to ggtt in preallocate stolen, since it is always ggtt when > dealing with stolen memory. (Daniel) > list_for_each will short-circuit already (Daniel) > remove superflous space (Daniel) > Use per object list of vmas (Daniel) > Make obj_bound_any() use obj_bound for each vm (Ben) > s/bind_to_gtt/bind_to_vm/ (Ben) > > Fixed up the inactive shrinker. As Daniel noticed the code could > potentially count the same object multiple times. While it's not > possible in the current case, since 1 object can only ever be bound into > 1 address space thus far - we may as well try to get something more > future proof in place now. With a prep patch before this to switch over > to using the bound list + inactive check, we're now able to carry that > forward for every address space an object is bound into. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Rebase on top of the loss of "drm/i915: Cleanup more of VMA in destroy".] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-08 14:04:20 +02:00
Ben Widawsky	28d6a7bfa2	drm/i915: thread address space through execbuf This represents the first half of hooking up VMs to execbuf. Here we basically pass an address space all around to the different internal functions. It should be much more readable, and have less risk than the second half, which begins switching over to using VMAs instead of an obj,vm. The overall series echoes this style of, "add a VM, then make it smart later" Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Switch a BUG_ON to WARN_ON.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:11 +02:00
Ben Widawsky	c37e220461	drm/i915: Add VM to pin To verbalize it, one can say, "pin an object into the given address space." The semantics of pinning remain the same otherwise. Certain objects will always have to be bound into the global GTT. Therefore, global GTT is a special case, and keep a special interface around for it (i915_gem_obj_ggtt_pin). v2: s/i915_gem_ggtt_pin/i915_gem_obj_ggtt_pin Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:09 +02:00
Chris Wilson	de51f04f06	drm/i915: Replace open-coded offset_in_page() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-26 19:45:11 +02:00
Xiong Zhang	0b74b508f7	drm/i915: add prefault_disable module option prefault is stll enabled by default which prevent most of pwrite/pread/reloc from running slow path, in order to verify these slow pathes, prefault need to be disabled. Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> [danvet: Make checkpatch happy and bikeshed the module option help text a bit.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-19 09:29:26 +02:00
Chris Wilson	e852096986	drm/i915: Replace open-coding of DEFAULT_CONTEXT_ID The intent of the check is made more clear if we use the proper name for 0 here. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-16 10:40:40 +02:00
Daniel Vetter	db1b76ca6a	drm/i915: don't frob mm.suspended when not using ums In kernel modeset driver mode we're in full control of the chip, always. So there's no need at all to set mm.suspended in i915_gem_idle. Hence move that out into the leavevt ioctl. Since i915_gem_idle doesn't suspend gem any more we can also drop the re-enabling for KMS in the thaw function. Also clean up the handling of mm.suspend at driver load by coalescing all the assignments. Stumbled over while reading through our resume code for unrelated reasons. v2: Shovel mm.suspended into the (newly created) ums dungeon as suggested by Chris Wilson. The plan is that once we've completely stopped relying on the register save/restore code we could shovel even that in there. v3: Improve the locking for the entervt/leavevt ioctls a bit by moving the dev->struct_mutex locking outside of i915_gem_idle. Also don't clear dev_priv->ums.mm_suspended for the kms case, we allocate it with kzalloc. Both suggested by Chris Wilson. Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-10 14:30:25 +02:00
Ben Widawsky	f343c5f647	drm/i915: Getter/setter for object attributes Soon we want to gut a lot of our existing assumptions how many address spaces an object can live in, and in doing so, embed the drm_mm_node in the object (and later the VMA). It's possible in the future we'll want to add more getter/setter methods, but for now this is enough to enable the VMAs. v2: Reworked commit message (Ben) Added comments to the main functions (Ben) sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/.[ch] (Daniel) v3: Rebased on new reserve_node patch Changed DRM_DEBUG_KMS to actually work (will need fixing later) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:34 +02:00
Mika Kuoppala	7d736f4f0b	drm/i915: add batch bo to i915_add_request() In order to track down a batch buffer and context which caused the ring to hang, store reference to bo into the request struct. Request can also cause gpu to hang after the batch in the flush section in the ring. To detect this add start of the flush portion offset into the request. v2: Included comment about request vs batch_obj lifetimes (Chris Wilson) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-13 17:42:16 +02:00
Mika Kuoppala	0025c0772d	drm/i915: change i915_add_request to macro Only execbuffer needed all the parameters on i915_add_request(). By putting __i915_add_request behind macro, all current callsites become cleaner. Following patch will introduce a new parameter for __i915_add_request. With this patch, only the relevant callsite will reflect the change making commit smaller and easier to understand. v2: _i915_add_request as function name (Chris Wilson) v3: change name __i915_add_request and fix ordering of params (Ben Widawsky) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-13 17:42:15 +02:00
Chris Wilson	c65355bbef	drm/i915: Track when we dirty the scanout with render commands This is required for tracking render damage for use with FBC and will be used in subsequent patches. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-07 17:56:45 +02:00
Xiang, Haihao	82f91b6e93	drm/i915: add I915_EXEC_VEBOX to i915_gem_do_execbuffer() A user can run batchbuffer via VEBOX ring. Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:21 +02:00

1 2 3 4 5 ...

262 Commits