OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Daniel Vetter	d8ffa60b52	drm/i915: WARN_ON fence pin leaks The fence pin count should always be <= the bo pin count. If that's not the case then we have a funny problem and are leaking references somewhere. Which means we can catch fence pin leaks by checking for the same upper limit as we do for the bo pin count. Inspired by a discussion with Ville about a fence leak igt testcase. v2: Also check for fence->pin_count <= ggtt_vma->pin_count, since that might catch a leak even quicker. Also de-inline them, they're getting too big. v3: Don't separately check for MAX_PIN_COUNT since the > vma->pin_count check will catch that already (Chris). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-13 17:16:12 +02:00
Chris Wilson	1cf0ba1474	drm/i915: Flush request queue when waiting for ring space During the review of commit `1f70999f90` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Jan 27 22:43:07 2014 +0000 drm/i915: Prevent recursion by retiring requests when the ring is full Ville raised the point that our interaction with request->tail was likely to foul up other uses elsewhere (such as hang check comparing ACTHD against requests). However, we also need to restore the implicit retire requests that certain test cases depend upon (e.g. igt/gem_exec_lut_handle), this raises the spectre that the ppgtt will randomly call i915_gpu_idle() and recurse back into intel_ring_begin(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78023 Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> [danvet: Remove now unused 'tail' variable as spotted by Brad.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-08 01:23:34 +02:00
Ben Widawsky	6e7186af3b	drm/i915: Make aliasing a 2nd class VM There is a good debate to be had about how best to fit the aliasing PPGTT into the code. However, as it stands right now, getting aliasing PPGTT bindings is a hack, and done through implicit arguments. To make this absolutely clear, WARN and return an error if a driver writer tries to do something they shouldn't. I have no issue with an eventual revert of this patch. It makes sense for what we have today. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-07 10:01:41 +02:00
Ben Widawsky	ebc348b2ad	drm/i915: Move semaphore specific ring members to struct This will be helpful in abstracting some of the code in preparation for gen8 semaphores. v2: Move mbox stuff to a separate struct v3: Rebased over VCS2 work Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 10:56:52 +02:00
Chris Wilson	c8725f3dc0	drm/i915: Do not call retire_requests from wait_for_rendering A common issue we have is that retiring requests causes recursion through GTT manipulation or page table manipulation which we can only handle at very specific points. However, to maintain internal consistency (enforced through our sanity checks on write_domain at various points in the GEM object lifecycle) we do need to retire the object prior to marking it with a new write_domain, and also clear the write_domain for the implicit flush following a batch. Note that this then allows the unbound objects to still be on the active lists, and so care must be taken when removing objects from unbound lists (similar to the caveats we face processing the bound lists). v2: Fix i915_gem_shrink_all() to handle updated object lifetime rules, by refactoring it to call into __i915_gem_shrink(). v3: Missed an object-retire prior to changing cache domains in i915_gem_object_set_cache_leve() v4: Rebase Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:09:15 +02:00
Imre Deak	981a5aead1	drm/i915: vlv: clean up GTLC wake control/status register macros These will be needed by the upcoming VLV RPM helpers. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:50 +02:00
Zhao Yakui	845f74a701	drm/i915:Initialize the second BSD ring on BDW GT3 machine Based on the hardware spec, the BDW GT3 machine has two independent BSD ring that can be used to dispatch the video commands. So just initialize it. V3->V4: Follow Imre's comment to do some minor updates. For example: more comments are added to describe the semaphore between ring. Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> [danvet: Fix up checkpatch error.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:46 +02:00
Chris Wilson	6099032045	drm/i915: Allow the module to load even if we fail to setup rings Even without enabling the ringbuffers to allow command execution, we can still control the display engines to enable modesetting. So make the ringbuffer initialization failure soft, and mark the GPU as wedged instead. v2: Only treat an EIO from ring initialisation as a soft failure, and abort module load for any other failure, such as allocation failures. v3: Add an ERROR prior to declaring the GPU wedged so that it stands out like a sore thumb in the logs Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:38 +02:00
Chris Wilson	e3efda49e7	drm/i915: Preserve ring buffers objects across resume Tearing down the ring buffers across resume is overkill, risks unnecessary failure and increases fragmentation. After failure, since the device is still active we may end up trying to write into the dangling iomapping and trigger an oops. v2: stop_ringbuffers() was meant to call stop(ring) not cleanup(ring) during resume! Reported-by: Jae-hyeon Park <jhyeon@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=72351 References: https://bugs.freedesktop.org/show_bug.cgi?id=76554 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> [danvet: s/ring->obj == NULL/!intel_ring_initialized(ring)/ as suggested by Oscar.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:37 +02:00
Dave Airlie	444c9a08bf	Merge branch 'drm-init-cleanup' of git://people.freedesktop.org/~danvet/drm into drm-next Next pull request, this time more of the drm de-midlayering work. The big thing is that his patch series here removes everything from drm_bus except the set_busid callback. Thierry has a few more patches on top of this to make that one optional to. With that we can ditch all the non-pci drm_bus implementations, which Thierry has already done for the fake tegra host1x drm_bus. Reviewed by Thierry, Laurent and David and now also survived some testing on my intel boxes to make sure the irq fumble is fixed correctly ;-) The last minute rebase was just to add the r-b tags from Thierry for the 2 patches I've redone. * 'drm-init-cleanup' of git://people.freedesktop.org/~danvet/drm: drm/<drivers>: don't set driver->dev_priv_size to 0 drm: Remove dev->kdriver drm: remove drm_bus->get_name drm: rip out dev->devname drm: inline drm_pci_set_unique drm: remove bus->get_irq implementations drm: pass the irq explicitly to drm_irq_install drm/irq: Look up the pci irq directly in the drm_control ioctl drm/irq: track the irq installed in drm_irq_install in dev->irq drm: rename dev->count_lock to dev->buf_lock drm: Rip out totally bogus vga_switcheroo->can_switch locking drm: kill drm_bus->bus_type drm: remove drm_dev_to_irq from drivers drm/irq: remove cargo-culted locking from irq_install/uninstall drm/irq: drm_control is a legacy ioctl, so pci devices only drm/pci: fold in irq_by_busid support drm/irq: simplify irq checks in drm_wait_vblank	2014-05-01 09:32:21 +10:00
Dave Airlie	885ac04ab3	Merge tag 'drm-intel-next-2014-04-16' of git://anongit.freedesktop.org/drm-intel into drm-next drm-intel-next-2014-04-16: - vlv infoframe fixes from Jesse - dsi/mipi fixes from Shobhit - gen8 pageflip fixes for LRI/SRM from Damien - cmd parser fixes from Brad Volkin - some prep patches for CHV, DRRS, ... - and tons of little things all over drm-intel-next-2014-04-04: - cmd parser for gen7 but only in enforcing and not yet granting mode - the batch copying stuff is still missing. Also performance is a bit ... rough (Brad Volkin + OACONTROL fix from Ken). - deprecate UMS harder (i.e. CONFIG_BROKEN) - interrupt rework from Paulo Zanoni - runtime PM support for bdw and snb, again from Paulo - a pile of refactorings from various people all over the place to prep for new stuff (irq reworks, power domain polish, ...) drm-intel-next-2014-04-04: - cmd parser for gen7 but only in enforcing and not yet granting mode - the batch copying stuff is still missing. Also performance is a bit ... rough (Brad Volkin + OACONTROL fix from Ken). - deprecate UMS harder (i.e. CONFIG_BROKEN) - interrupt rework from Paulo Zanoni - runtime PM support for bdw and snb, again from Paulo - a pile of refactorings from various people all over the place to prep for new stuff (irq reworks, power domain polish, ...) Conflicts: drivers/gpu/drm/i915/i915_gem_context.c	2014-05-01 09:11:37 +10:00
Daniel Vetter	bb0f1b5c16	drm: pass the irq explicitly to drm_irq_install Unfortunately this requires a drm-wide change, and I didn't see a sane way around that. Luckily it's fairly simple, we just need to inline the respective get_irq implementation from either drm_pci.c or drm_platform.c. With that we can now also remove drm_dev_to_irq from drm_irq.c. Reviewed-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-23 10:32:50 +02:00
Daniel Vetter	e090c53b21	drm/irq: remove cargo-culted locking from irq_install/uninstall The dev->struct_mutex locking in drm_irq.c only protects dev->irq_enabled. Which isn't really much at all and only prevents especially nasty ums userspace from concurrently installing the interrupt handling a few times. Or at least trying. There are tons of unlocked readers of dev->irqs_enabled in the vblank wait code (and by extension also in the pageflip code since that uses the same vblank timestamp engine). Real modesetting drivers should ensure that nothing can go haywire with a sane setup teardown sequence. So we only really need this for the drm_control ioctl, everywhere else this will just paper over nastiness. Note that drm/i915 is a bit specially due to the gem+ums combination. So there we also need to properly protect the entervt and leavevt ioctls. But it's definitely saner to do everything in one go than to drop the lock in-between. Finally there's the gpu reset code in drm/i915. That one's just race (concurrent userspace calls to for vblank waits of pageflips could spuriously fail). So wrap it up in with a nice comment since fixing this is more involved. v2: Rebase and fix commit message (Thierry) Reviewed-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-22 11:41:12 +02:00
Chris Wilson	691e6415c8	drm/i915: Always use kref tracking for all contexts. If we always initialize kref for the context, even if we are using fake contexts for hangstats when there is no hw support, we can forgo the dance to dereference the ctx->obj and inspect whether we are permitted to use kref inside i915_gem_context_reference() and _unreference(). My ulterior motive here is to improve the debugging of a use-after-free of ctx->obj. This patch avoids the dereference here and instead forces the assertion checks associated with kref. v2: Refactor the fake contexts to being even more like the real contexts, so that there is much less duplicated and special case code. v3: Tweaks. v4: Tweaks, minor. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76671 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: lu hua <huax.lu@intel.com> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [Jani: tiny change to backport to drm-intel-fixes.] Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2014-04-11 13:29:51 +03:00
Mika Kuoppala	88b4aa8770	drm/i915: add flags to i915_ring_stop Piglit runner and QA are both looking at the dmesg for DRM_ERRORs with test cases. Add a flag to control those when we they are expected from related test cases. Also add flag to control if contexts should be banned that introduced the hang. Hangcheck is timer based and preventing bans by adding sleeps to testcases makes testing slower. v2: intel_ring_stopped(), readable comment (Chris) v3: keep compatibility (Daniel) References: https://bugs.freedesktop.org/show_bug.cgi?id=75876 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-09 15:07:42 +02:00
Dave Airlie	9f97ba806a	Merge tag 'drm-intel-fixes-2014-04-04' of git://anongit.freedesktop.org/drm-intel into drm-next Merge window -fixes pull request as usual. Well, I did sneak in Jani's drm_i915_private_t typedef removal, need to have fun with a big sed job too ;-) Otherwise: - hdmi interlaced fixes (Jesse&Ville) - pipe error/underrun/crc tracking fixes, regression in late 3.14-rc (but not cc: stable since only really relevant for igt runs) - large cursor wm fixes (Chris) - fix gpu turbo boost/throttle again, was getting stuck due to vlv rps patches (Chris+Imre) - fix runtime pm fallout (Paulo) - bios framebuffer inherit fix (Chris) - a few smaller things * tag 'drm-intel-fixes-2014-04-04' of git://anongit.freedesktop.org/drm-intel: (196 commits) Skip intel_crt_init for Dell XPS 8700 drm/i915: vlv: fix RPS interrupt mask setting Revert "drm/i915/vlv: fixup DDR freq detection per Punit spec" drm/i915: move power domain init earlier during system resume drm/i915: Fix the computation of required fb size for pipe drm/i915: don't get/put runtime PM at the debugfs forcewake file drm/i915: fix WARNs when reading DDI state while suspended drm/i915: don't read cursor registers on powered down pipes drm/i915: get runtime PM at i915_display_info drm/i915: don't read pp_ctrl_reg if we're suspended drm/i915: get runtime PM at i915_reg_read_ioctl drm/i915: don't schedule force_wake_timer at gen6_read drm/i915: vlv: reserve the GT power context only once during driver init drm/i915: prefer struct drm_i915_private to drm_i915_private_t drm/i915/overlay: prefer struct drm_i915_private to drm_i915_private_t drm/i915/ringbuffer: prefer struct drm_i915_private to drm_i915_private_t drm/i915/display: prefer struct drm_i915_private to drm_i915_private_t drm/i915/irq: prefer struct drm_i915_private to drm_i915_private_t drm/i915/gem: prefer struct drm_i915_private to drm_i915_private_t drm/i915/dma: prefer struct drm_i915_private to drm_i915_private_t ...	2014-04-05 16:14:21 +10:00
Lauri Kasanen	62347f9e0f	drm: Add support for two-ended allocation, v3 Clients like i915 need to segregate cache domains within the GTT which can lead to small amounts of fragmentation. By allocating the uncached buffers from the bottom and the cacheable buffers from the top, we can reduce the amount of wasted space and also optimize allocation of the mappable portion of the GTT to only those buffers that require CPU access through the GTT. For other drivers, allocating small bos from one end and large ones from the other helps improve the quality of fragmentation. Based on drm_mm work by Chris Wilson. v3: Changed to use a TTM placement flag v2: Updated kerneldoc Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Christian König <deathsimple@vodafone.de> Signed-off-by: Lauri Kasanen <cand@gmx.com> Signed-off-by: David Airlie <airlied@redhat.com>	2014-04-04 09:28:14 +10:00
Jani Nikula	3e31c6c017	drm/i915/gem: prefer struct drm_i915_private to drm_i915_private_t No functional changes. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-31 15:32:38 +02:00
Chris Wilson	df6f783a4e	drm/i915: Fix unsafe loop iteration over vma whilst unbinding them On non-LLC platforms, when changing the cache level of an object, we may need to unbind it so that prefetching across page boundaries does not cross into a different memory domain. This requires us to unbind conflicting vma, but we did so iterating over the objects vma in an unsafe manner (as the list was being modified as we iterated). The regression was introduced in commit `3089c6f239` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Jul 31 17:00:03 2013 -0700 drm/i915: make caching operate on all address spaces apparently as far back as v3.12-rc1, but it has only just begun to trigger real world bug reports. Reported-and-tested-by: Nikolay Martynov <mar.kolya@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76384 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-21 16:13:08 +01:00
Paulo Zanoni	5d584b2eca	drm/i915: move pc8.irqs_disabled to pm.irqs_disabled When other platforms add runtime PM support they will also need to disable interrupts, so move the variable to the runtime PM struct. Also notice that the longer-term goal is to completely kill the regsave struct, and I even have patches for that. v2: - Rebase. v3: - Rebase. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-19 16:39:46 +01:00
Daniel Vetter	b80d6c781e	Merge branch 'topic/dp-aux-rework' into drm-intel-next-queued Conflicts: drivers/gpu/drm/i915/intel_dp.c A bit a mess with reverts which differe in details between -fixes and -next and some other unrelated shuffling. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-19 15:54:37 +01:00
Dave Airlie	d1583c9997	Merge branch 'drm-next' of git://people.freedesktop.org/~dvdhrm/linux into drm-next This is the 3rd respin of the drm-anon patches. They allow module unloading, use the pin_fs_* helpers recommended by Al and are rebased on top of drm-next. Note that there are minor conflicts with the "drm-minor" branch. * 'drm-next' of git://people.freedesktop.org/~dvdhrm/linux: drm: init TTM dev_mapping in ttm_bo_device_init() drm: use anon-inode instead of relying on cdevs drm: add pseudo filesystem for shared inodes	2014-03-18 19:17:02 +10:00
David Herrmann	6796cb16c0	drm: use anon-inode instead of relying on cdevs DRM drivers share a common address_space across all character-devices of a single DRM device. This allows simple buffer eviction and mapping-control. However, DRM core currently waits for the first ->open() on any char-dev to mark the underlying inode as backing inode of the device. This delayed initialization causes ugly conditions all over the place: if (dev->dev_mapping) do_sth(); To avoid delayed initialization and to stop reusing the inode of the char-dev, we allocate an anonymous inode for each DRM device and reset filp->f_mapping to it on ->open(). Signed-off-by: David Herrmann <dh.herrmann@gmail.com>	2014-03-16 12:23:33 +01:00
Ville Syrjälä	3ddffb7b8a	drm/i915: Unbind all vmas whose new cache_level doesn't agree with the neighbours When we change the cache_level for an object we need to make sure we don't put differing types of snoopable memory too close to each other on non-LLC machines. Currently i915_gem_object_set_cache_level() will stop looking when it finds just one vma that has such a conflict. Drop the bogus break statement to make sure it will unbind all vmas which need to be moved around to avoid the conflict. I suppose this is a theoretical issue as currently we don't enable ppgtt on non-LLC machines, so each object can only have one vma. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-13 12:22:44 +01:00
Chris Wilson	c2831a94b5	drm/i915: Do not force non-caching copies for pwrite along shmem path We don't always want to write into main memory with pwrite. The shmem fast path in particular is used for memory that is cacheable - under such circumstances forcing the cache eviction is undesirable. As we will always flush the cache when targeting incoherent buffers, we can rely on that second pass to apply the cache coherency rules and so benefit from in-cache copies otherwise. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-08 00:03:26 +01:00
Chris Wilson	17793c9a46	drm/i915: Process page flags once rather than per pwrite/pread We used to lock individual pages inside the buffer object and so needed to update the page flags every time. However, we now pin the pages into the object for the duration of the pwrite/pread (and hopefully much longer) and so we can forgo the flag updates until we release all the pages. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-08 00:03:01 +01:00
Brad Volkin	4c914c0c7c	drm/i915: Refactor shmem pread setup The command parser is going to need the same synchronization and setup logic, so factor it out for reuse. v2: Add a check that the object is backed by shmem Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-07 22:36:59 +01:00
Damien Lespiau	cb216aa844	drm/i915: Make i915_gem_retire_requests_ring() static Its last usage outside of i915_gem.c was removed in: commit `1f70999f90` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Jan 27 22:43:07 2014 +0000 drm/i915: Prevent recursion by retiring requests when the ring is full Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-05 21:30:39 +01:00
Chris Wilson	ab0e7ff9f2	drm/i915: Record pid/comm of hanging task After finding the guilty batch and request, we can use it to find the process that submitted the batch and then add the culprit into the error state. This is a slightly different approach from Ben's in that instead of adding the extra information into the struct i915_hw_context, we use the information already captured in struct drm_file which is then referenced from the request. v2: Also capture the workaround buffer for gen2, so that we can compare its contents against the intended batch for the active request. v3: Rebase (Mika) v4: Check for null context (Chris) checkpatch warnings fixed Link: http://lists.freedesktop.org/archives/intel-gfx/2013-August/032280.html Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v2) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v4) Acked-by: Ben Widawsky <ben@bwidawsk.net> Cc: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-05 21:30:24 +01:00
Chris Wilson	8d9fc7fd2d	drm/i915: Rely on accurate request tracking for finding hung batches In the past, it was possible to have multiple batches per request due to a stray signal or ENOMEM. As a result we had to scan each active object (filtered by those having the COMMAND domain) for the one that contained the ACTHD pointer. This was then made more complicated by the introduction of ppgtt, whereby ACTHD then pointed into the address space of the context and so also needed to be taken into account. This is a fairly robust approach (though the implementation is a little fragile and depends upon the per-generation setup, registers and parameters). However, due to the requirements for hangstats, we needed a robust method for associating batches with a particular request and having that we can rely upon it for finding the associated batch object for error capture. If the batch buffer tracking is not robust enough, that should become apparent quite quickly through an erroneous error capture. That should also help to make sure that the runtime reporting to userspace is robust. It also means that we then report the oldest incomplete batch on each ring, which can be useful for determining the state of userspace at the time of a hang. v2: Use i915_gem_find_active_request (Mika) v3: remove check for ring->get_seqno, split long lines (Ben) v4: check that context is available (Chris) checkpatch warnings fixed Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v3) Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v3) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-05 21:30:24 +01:00
Chris Wilson	64bf930379	drm/i915: Reset vma->mm_list after unbinding In place of true activity counting, we walk the list of vma associated with an object managing each on the vm's active/inactive list everytime we call move-to-inactive. This depends upon the vma->mm_list being cleared after unbinding, or else we run into difficulty when tracking the object in multiple vm's - we see a use-after free and corruption of the mm_list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-05 21:30:23 +01:00
Ville Syrjälä	ccc7bed05e	drm/i915: Don't ban default context when stop_rings!=0 If we've explicitly stopped the rings for testing purposes, don't ban the default context. Fixes kms_flip hang tests. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-05 21:30:14 +01:00
Chris Wilson	f62a007603	drm/i915: Accurately track when we mark the hardware as idle/busy We currently call intel_mark_idle() too often, as we do so as a side-effect of processing the request queue. However, we the calls to intel_mark_idle() are expected to be paired with a call to intel_mark_busy() (or else we try to idle the hardware by accessing registers that are already disabled). Make the idle/busy tracking explicit to prevent the multiple calls. v2: We can drop some of the complexity in __i915_add_request() as queue_delayed_work() already behaves as we want (not requeuing the item if it is already in the queue) and mark_busy/mark_idle imply that the idle task is inactive. v3: We do still need to cancel the pending idle task so that it is sent again after the current busy load completes (not in the middle of it). Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-05 21:30:10 +01:00
Daniel Vetter	8ea99c9287	drm/i915: Only bind each object rather than for every execbuffer One side-effect of the introduction of ppgtt was that we needed to rebind the object into the appropriate vm (and global gtt in some peculiar cases). For simplicity this was done twice for every object on every call to execbuffer. However, that adds a tremendous amount of CPU overhead (rewriting all the PTE for all objects into WC memory) per draw. The fix is to push all the decision about which vm to bind into and when down into the low-level bind routines through hints rather than performing the bind unconditionally in the execbuffer routine. Note that this is a regression introduced in the full ppgtt feature branch, before this we've only done re-bound objects when the relevant has_(aliasing_ppgtt\|global_gtt)_mapping flag was clear. But since that's per-object and not per-vma that optimization broke. v2: Split out prep work and unrelated changes. v3: Bring back functional change around PIN_GLOBAL that I've accidentally split out. v4: Remove the temporary hack for the old binding logic to avoid bisection issues. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72906 Tested-by: jianx.zhou@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:18:38 +01:00
Daniel Vetter	262de14531	drm/i915: Directly return the vma from bind_to_vm This is prep work for reworking the object_pin logic. Atm it still does a (now redundant) lookup of the vma. The next patch will fix this. Split out from Chris vma-bind rework. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:18:30 +01:00
Daniel Vetter	b287110e89	drm/i915: Simplify i915_gem_object_ggtt_unpin Split out from Chris vma-bind rework. Jani wondered why this is save, and the reason is that i915_vma_unbind does all these checks, too. So they're redundant. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:18:21 +01:00
Daniel Vetter	bf3d149b25	drm/i915: split PIN_GLOBAL out from PIN_MAPPABLE With abitrary pin flags it makes sense to split out a "please bind this into global gtt" from the "please allocate in the mappable range". Use this unconditionally in our global gtt pin helper since this is what its callers want. Later patches will drop PIN_MAPPABLE where it's not strictly needed. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:17:27 +01:00
Daniel Vetter	1ec9e26dda	drm/i915: Consolidate binding parameters into flags Anything more than just one bool parameter is just a pain to read, symbolic constants are much better. Split out from Chris' vma-binding rework patch. v2: Undo the behaviour change in object_pin that Chris spotted. v3: Split out misplaced hunk to handle set_cache_level errors, spotted by Jani. v4: Keep the current over-zealous binding logic in the execbuffer code working with a quick hack while the overall binding code gets shuffled around. v5: Reorder the PIN_ flags for more natural patch splitup. v6: Pull out the PIN_GLOBAL split-up again. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:16:58 +01:00
Chris Wilson	6e4930f6ee	drm/i915: Flush GPU rendering with a lockless wait during a pagefault Arjan van de Ven reported that on his test machine that he was seeing stalls of greater than 1 frame greatly impacting the user experience. He tracked this down to being the locked flush during a pagefault as being the culprit hogging the struct_mutex and so blocking any other user from proceeding. Stalling on a pagefault is bad behaviour on userspace's part, for one it means that they are ignoring the coherency rules on pointer access through the GTT, but fortunately we can apply the same trick as the set-to-domain ioctl to do a lightweight, nonblocking flush of outstanding rendering first. "Prior to the patch it looks like this (this one testrun does not show the 20ms+ I've seen occasionally) 4.99 ms 2.36 ms 31360 __wait_seqno i915_wait_seqno i915_gem_object_wait_rendering i915_gem_object_set_to_gtt_domain i915_gem_fault __do_fault handle_ +pte_fault handle_mm_fault __do_page_fault do_page_fault page_fault 4.99 ms 2.75 ms 107751 __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 4.99 ms 1.63 ms 1666 i915_mutex_lock_interruptible i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_fault do_page_fault page_fa +ult 4.93 ms 2.45 ms 980 i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_ +sysret 4.89 ms 2.20 ms 3283 i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 4.34 ms 1.66 ms 1715 i915_mutex_lock_interruptible i915_gem_pwrite_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 3.73 ms 3.73 ms 49 i915_mutex_lock_interruptible i915_gem_set_domain_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 3.17 ms 0.33 ms 931 i915_mutex_lock_interruptible i915_gem_madvise_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 2.97 ms 0.43 ms 1029 i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 2.55 ms 0.51 ms 735 i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret After the patch it looks like this: 4.99 ms 2.14 ms 22212 __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 4.86 ms 0.99 ms 14170 __wait_seqno i915_gem_object_wait_rendering__nonblocking i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_ +fault do_page_fault page_fault 3.59 ms 1.31 ms 325 i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 3.37 ms 3.37 ms 65 i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 2.58 ms 2.58 ms 65 i915_mutex_lock_interruptible i915_gem_do_execbuffer.isra.23 i915_gem_execbuffer2 drm_ioctl i915_compat_ioctl compat_sys_ioctl +ia32_sysret 2.19 ms 2.19 ms 65 i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_ +sysret 2.18 ms 2.18 ms 65 i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 1.66 ms 1.66 ms 65 i915_gem_set_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret It may not look like it, but this is quite a large difference, and I've been unable to reproduce > 5 msec delays at all, while before they do happen (just not in the trace above)." gem_gtt_hog on an old Pineview (GMA3150), before: 4969.119ms after: 4122.749ms Reported-by: Arjan van de Ven <arjan.van.de.ven@intel.com> Testcase: igt/gem_gtt_hog Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-12 18:52:59 +01:00
Chris Wilson	bd9b6a4ec5	drm/i915: Downgrade ERROR message for invalid user input When we detect that the user passed along an invalid handle or object, we emit a warning as an aide for debugging. Since these are indeed only for debugging user triggerable errors (and the errors are reported back to userspace by the errno), the messages should only be at the debug level and not claiming that there is a catastrophic error in the driver/hardware. References: https://bugs.freedesktop.org/show_bug.cgi?id=74704 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-12 18:52:54 +01:00
Damien Lespiau	3d13ef2e2d	drm/i915: Always use INTEL_INFO() to access the device_info structure If we make sure that all the dev_priv->info usages are wrapped by INTEL_INFO(), we can easily modify the ->info field to be structure and not a pointer while keeping the const protection in the INTEL_INFO() macro. v2: Rebased onto latest drm-nightly Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-12 18:52:50 +01:00
Chris Wilson	8c99e57d39	drm/i915: Treat using a purged buffer as a source of EFAULT Since a purged buffer is one without any associated pages, attempting to use it should generate EFAULT rather than EINVAL, as it is not strictly an invalid parameter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-04 17:03:32 +01:00
Chris Wilson	45d678173a	drm/i915: Convert EFAULT into a silent SIGBUS EFAULT will be a possible return code where backing storage is transient, such after it is purged by madvise. As such it is to be expected and so should not trigger a WARN inside i915_gem_fault() but be converted silently to SIGBUS. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-04 17:03:27 +01:00
Mika Kuoppala	e38486943e	drm/i915: release mutex in i915_gem_init()'s error path Found with smatch. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-04 12:10:45 +01:00
Mika Kuoppala	939fd76208	drm/i915: Get rid of acthd based guilty batch search As we seek the guilty batch using request and hangcheck score, this code is not needed anymore. v2: Rebase. Passing dev_priv instead of getting it from last_ring Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-04 11:57:29 +01:00
Mika Kuoppala	b6b0fac04d	drm/i915: Use hangcheck score to find guilty context With full ppgtt using acthd is not enough to find guilty batch buffer. We get multiple false positives as acthd is per vm. Instead of scanning which vm was running on a ring, to find corressponding context, use a different, simpler, strategy of finding batches that caused gpu hang: If hangcheck has declared ring to be hung, find first non complete request on that ring and claim it was guilty. v2: Rebase Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73652 Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-04 11:57:24 +01:00
Mika Kuoppala	3fac8978f5	drm/i915: Tune down debug output when context is banned If we have stopped rings then we know that test is running so no need for spam. In addition, only spam when default context gets banned. v2: - make sure default context ban gets shown (Chris) - use helper for checking for default context, everywhere (Chris) v3: - dont be quiet when debug is set (Ben, Daniel) Reference: https://bugs.freedesktop.org/show_bug.cgi?id=73652 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-30 17:25:38 +01:00
Mika Kuoppala	44e2c0705a	drm/i915: Use i915_hw_context to set reset stats With full ppgtt support drm_i915_file_private gained knowledge about the default context. Also reset stats are now inside i915_hw_context so we can use proper abstraction. v2: Move BUG_ON and WARN_ON to more proper locations (Ben) v3: Pass dev directly to i915_context_is_banned to avoid the need to dereference ctx->last_ring. Spotted by Mika when checking my s/BUG/WARN/ change, I've missed this ->last_ring dereference. Suggested-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v2) Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v2) [danvet: s/BUG/WARN/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-30 17:24:36 +01:00
Daniel Vetter	6ba844b090	drm/i915: GEN7_MSG_CONTROL is ivb-only At least I couldn't find it in the Haswell Bspec any more and we've tried to test-boot a Haswell machine with num_pipes forced to 0 (i.e. hit the PCH_NOP path) and the unclaimed register logic complained. So restrict this dance to just ivb platforms. v2: Art pointed out that the bits simply moved on hsw+ v3: Buy code terseneness with a notch of sublety as suggested by Chris. v4: Frob the right bit, spotted by Art. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Arthur Ranyan <arthur.j.runyan@intel.com> Cc: Dave Airlie <airlied@gmail.com> Reviewed-by: Art Runyan <arthur.j.runyan@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-27 17:16:47 +01:00
Jani Nikula	d330a9530c	drm/i915: move module parameters into a struct, in a new file With 20+ module parameters, I think referring to them via a struct improves clarity over just having a bunch of globals. While at it, move the parameter initialization and definitions into a new file i915_params.c to reduce clutter in i915_drv.c. Apart from the ill-named i915_enable_rc6, i915_enable_fbc and i915_enable_ppgtt parameters, for which we lose the "i915_" prefix internally, the module parameters now look the same both on the kernel command line and in code. For example, "i915.modeset". The downsides of the change are losing static on a couple of variables and not having the initialization and module_param_named() right next to each other. On the other hand, all module parameters are now defined in one place at i915_params.c. Plus you can do this to find all module parameter references: $ git grep "i915\." -- drivers/gpu/drm/i915 v2: - move the definitions into a new file - s/i915_params/i915/ - make i915_try_reset i915.reset, for consistency Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-27 17:16:45 +01:00
Daniel Vetter	0e5539b923	Merge branch 'topic/ppgtt' into drm-intel-next-queued Because whatever.* * This should contain a fairly long list of issues and still unresolved resgressions, but I didn't really get a vote. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-25 21:14:57 +01:00
Chris Wilson	f72d21eddf	drm/i915: Place the Global GTT VM first in the list of VM This is useful for debugging as we then know that the first entry is always the global GTT, and all later entries the per-process GTT VM. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-25 20:07:15 +01:00
Chris Wilson	5dce5b9387	drm/i915: Wait for completion of pending flips when starved of fences On older generations (gen2, gen3) the GPU requires fences for many operations, such as blits. The display hardware also requires fences for scanouts and this leads to a situation where an arbitrary number of fences may be pinned by old scanouts following a pageflip but before we have executed the unpin workqueue. This is unpredictable by userspace and leads to random EDEADLK when submitting an otherwise benign execbuffer. However, we can detect when we have an outstanding flip and so cause userspace to wait upon their completion before finally declaring that the system is starved of fences. This is really no worse than forcing the GPU to stall waiting for older execbuffer to retire and release their fences before we can reallocate them for the next execbuffer. v2: move the test for a pending fb unpin to a common routine for later reuse during eviction Reported-and-tested-by: dimon@gmx.net Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73696 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-22 10:34:40 +01:00
Ben Widawsky	1d62beeaeb	drm/i915/ppgtt: Defer request freeing on reset We need to defer the free request until the object/vma is capable of being freed - or else we have a problem when we try to destroy the context. The exact same issue is described and fixed here: commit `e20780439b` Author: Ben Widawsky <ben@bwidawsk.net> Date: Fri Dec 6 14:11:22 2013 -0800 drm/i915: Defer request freeing I had this fix previously, but decided not to keep it for some reason I can no longer remember. gem_reset_stats is a really good test at hitting the problem. For the inquisitive: [ 170.516392] ------------[ cut here ]------------ [ 170.517227] WARNING: CPU: 1 PID: 105 at drivers/gpu/drm/drm_mm.c:578 drm_mm_takedown+0x2e/0x30 [drm]() [ 170.518064] Memory manager not clean during takedown. [ 170.518941] CPU: 1 PID: 105 Comm: kworker/1:1 Not tainted 3.13.0-rc4-BEN+ #28 [ 170.519787] Hardware name: Hewlett-Packard HP EliteBook 8470p/179B, BIOS 68ICF Ver. F.02 04/27/2012 [ 170.520662] Call Trace: [ 170.521517] [<ffffffff814f0589>] dump_stack+0x4e/0x7a [ 170.522373] [<ffffffff81049e6d>] warn_slowpath_common+0x7d/0xa0 [ 170.523227] [<ffffffff81049edc>] warn_slowpath_fmt+0x4c/0x50 [ 170.524079] [<ffffffffa06c414e>] drm_mm_takedown+0x2e/0x30 [drm] [ 170.524934] [<ffffffffa07213f3>] gen6_ppgtt_cleanup+0x23/0x110 [i915] [ 170.525777] [<ffffffffa07837ed>] ppgtt_release.part.5+0x24/0x29 [i915] [ 170.526603] [<ffffffffa071aaa5>] i915_gem_context_free+0x195/0x1a0 [i915] [ 170.527423] [<ffffffffa071189d>] i915_gem_free_request+0x9d/0xb0 [i915] [ 170.528247] [<ffffffffa0718af9>] i915_gem_reset+0x1f9/0x3f0 [i915] [ 170.529065] [<ffffffffa0700cce>] i915_reset+0x4e/0x180 [i915] [ 170.529870] [<ffffffffa070829d>] i915_error_work_func+0xcd/0x120 [i915] [ 170.530666] [<ffffffff8106c13a>] process_one_work+0x1fa/0x6d0 [ 170.531453] [<ffffffff8106c0d8>] ? process_one_work+0x198/0x6d0 [ 170.532230] [<ffffffff8106c72b>] worker_thread+0x11b/0x3a0 [ 170.532996] [<ffffffff8106c610>] ? process_one_work+0x6d0/0x6d0 [ 170.533771] [<ffffffff810743ef>] kthread+0xff/0x120 [ 170.534548] [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80 [ 170.535322] [<ffffffff814f97ac>] ret_from_fork+0x7c/0xb0 [ 170.536089] [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80 [ 170.536847] ---[ end trace 3d4c12892e42d58f ]--- v2: Whitespace fix. (Chris) Note: This is a bug that only hits the ppgtt topic branch but I've figured that doing the request cleanup in this order is generally the right thing to do. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Add a code comment to clarify what's actually going on since the lifetime rules aroung ppgtt cleanup are ... fuzzy a best atm. Also add a note about why we need this.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-22 10:34:37 +01:00
Daniel Vetter	8664850087	drm/i915: Tune down reset_stat output from ERROR to debug This is user-triggerable and hence we should not allow it to spam dmesg. Also, it upsets the nice dmesg tracking piglit does. Note that this is just extra debugging information, mostly unwanted, in case of a hang and that there is a separate message to the user giving instructions on how to report a bug for a GPU hang. v2: Add note as suggests in Chris' reply. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72740 Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-22 10:34:35 +01:00
Daniel Vetter	0d9d349d87	Merge commit origin/master into drm-intel-next Conflicts are getting out of hand, and now we have to shuffle even more in -next which was also shuffled in -fixes (the call for drm_mode_config_reset needs to move yet again). So do a proper backmerge. I wanted to wait with this for the 3.13 relaese, but alas let's just do this now. Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_ddi.c drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_pm.c Besides the conflict around the forcewake get/put (where we chaged the called function in -fixes and added a new parameter in -next) code all the current conflicts are of the adjacent lines changed type. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-16 22:06:30 +01:00
Chris Wilson	e910303802	drm/i915: Free requests after object release when retiring requests Freeing a request triggers the destruction of the context. This needs to occur after all objects are themselves unbound from the context, and so the free request needs to occur after the object release during retire. This tidies up commit `e20780439b` Author: Ben Widawsky <ben@bwidawsk.net> Date: Fri Dec 6 14:11:22 2013 -0800 drm/i915: Defer request freeing by simply swapping the order of operations rather than introducing further complexity - as noted during review. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-10 08:21:52 +01:00
Daniel Vetter	bfca05275a	Revert "drm/i915: Do not allow buffers at offset 0" This reverts commit `4fe9adbc36`. The patch completely lacks a detailed explanation of what exactly blows up and how, so is insufficiently justified as a band-aid. Otoh the justification as a safety measure against userspace botching up relocations is also fairly weak: If we want real project we need to at least make the gab big enough that the gpu doesn't scribble over more important stuff. With 4k screens that would be 32MB. Also I think this would be much better in conjunction with a (debug) switch to disable our use of the scratch page. Hence revert this. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 17:50:39 +01:00
Daniel Vetter	02f6bcccf7	drm/i915: Reject the pin ioctl on gen6+ Especially with ppgtt this kinda stopped making sense. And if we indeed need this to hack around an issue, we need something that also works for non-root. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:30:22 +01:00
Ben Widawsky	7e0d96bc03	drm/i915: Use multiple VMs -- the point of no return As with processes which run on the CPU, the goal of multiple VMs is to provide process isolation. Specific to GEN, there is also the ability to map more objects per process (2GB each instead of 2Gb-2k total). For the most part, all the pipes have been laid, and all we need to do is remove asserts and actually start changing address spaces with the context switch. Since prior to this we've converted the setting of the page tables to a streamed version, this is quite easy. One important thing to point out (since it'd been hotly contested) is that with this patch, every context created will have it's own address space (provided the HW can do it). v2: Disable BDW on rebase NOTE: I tried to make this commit as small as possible. I needed one place where I could "turn everything on" and that is here. It could be split into finer commits, but I didn't really see much point. Cc: Eric Anholt <eric@anholt.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:24:52 +01:00
Daniel Vetter	3d7f0f9dcc	Merge commit drm-intel-fixes into topic/ppgtt I need the tricky do_switch fix before I can merge the final piece of the ppgtt enabling puzzle. Otherwise the conflict will be a real pain to resolve since the do_switch hunk from -fixes must be placed at the exact right place within a hunk in the next patch. Conflicts: drivers/gpu/drm/i915/i915_gem_context.c drivers/gpu/drm/i915/i915_gem_execbuffer.c drivers/gpu/drm/i915/intel_display.c Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:23:37 +01:00
Ben Widawsky	4fe9adbc36	drm/i915: Do not allow buffers at offset 0 This is primarily a band aid for an unexplainable error in gem_reloc_vs_gpu/forked-faulting-reloc-thrashing. Essentially as soon as a relocated buffer (which had a non-zero presumed offset) moved to offset 0, something goes bad. Since I have been unable to solve this, and potentially this is a good thing to do anyway, since many things can accidentally write to offset 0, why not? Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:15:40 +01:00
Ben Widawsky	e20780439b	drm/i915: Defer request freeing With context destruction, we always want to be able to tear down the underlying address space. This is invoked on the last unreference to the context which could happen before we've moved all objects to the inactive list. To enable a clean tear down the address space, make sure to process the request free lastly. Without this change, we cannot guarantee to we don't still have active objects in the VM. As an example of a failing case: CTX-A is created, count=1 CTX-A is used during execbuf does a context switch count = 2 and add_request count = 3 CTX B runs, switches, CTX-A count = 2 CTX-A is destroyed, count = 1 retire requests is called free_request from CTX-A, count = 0 <--- free context with active object As mentioned above, by doing the free request after processing the active list, we can avoid this case. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:52:51 +01:00
Ben Widawsky	41bde5535a	drm/i915: Get context early in execbuf We need to have the address space when reserving space for the objects. Since the address space and context are tied together, and reserve occurs before context switch (for good reason), we must lookup our context earlier in the process. This leaves some room for optimizations where we no longer need to use ctx_id in certain places. This will be addressed in a subsequent patch. Important tricky bit: Because slow relocations during execbuffer drop struct_mutex Perhaps it would be best to acquire the reference when we get the context, but I'll save that for another day (note I have written the patch before, and I found the changes required to be uglier than this). Note that since we currently access everything via context id, and not the data structure this is fine, though not desirable. The next change attempts to get the context only once via the context ID idr lookup, and as such, the following can happen: CTX-A is created, refcount = 1 CTX-A execbuf, mutex dropped close IOCTL called on CTX-A, refcount = 0 CTX-A resumes in execbuf. v2: Rebased on top of commit `b6359918b8` Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Wed Oct 30 15:44:16 2013 +0200 drm/i915: add i915_get_reset_stats_ioctl v3: Rebased on top of commit `25b3dfc87b` Author: Mika Westerberg <mika.westerberg@linux.intel.com> Date: Tue Nov 12 11:57:30 2013 +0200 Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Tue Nov 26 16:14:33 2013 +0200 drm/i915: check context reset stats before relocations Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:52:42 +01:00
Ben Widawsky	c482972a08	drm/i915: Piggy back hangstats off of contexts To simplify the codepaths somewhat, we can simply always create a context. Contexts already keep hangstat information. This prevents us from having to differentiate at other parts in the code. There is allocation overhead, but it should not be measurable. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:51:58 +01:00
Ben Widawsky	bdf4fd7ea0	drm/i915: Do aliasing PPGTT init with contexts We have a default context which suits the aliasing PPGTT well. Tie them together so it looks like any other context/PPGTT pair. This makes the code cleaner as it won't have to special case aliasing as often. The patch has one slightly tricky part in the default context creation function. In the future (and on aliased setup) we create a new VM for a context (potentially). However, if we have aliasing PPGTT, which occurs at this point in time for all platforms GEN6+, we can simply manage the refcounting to allow things to behave as normal. Now is a good time to recall that the aliasing_ppgtt doesn't have a real VM, it uses the GGTT drm_mm. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:32:14 +01:00
Ben Widawsky	a3d67d2396	drm/i915: PPGTT vfuncs should take a ppgtt argument Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:56 +01:00
Ben Widawsky	2fa48d8d4a	drm/i915: Split context enabling from init We need to do this for exactly 1 reason, because we want to embed a PPGTT into the context, but we don't want to special case the default context. To achieve that, we must be able to initialize contexts after the GTT is setup (so we can allocate and pin the default context's BO), but before the PPGTT and rings are initialized. This is because, currently, context initialization requires ring usage. We don't have rings until after the GTT is setup. If we split the enabling part of context initialization, the part requiring the ringbuffer, we can untangle this, and then later embed the PPGTT Incidentally this allows us to also adhere to the original design of context init/fini in future patches: they were only ever meant to be called at driver load and unload. v2: Move hw_contexts_disabled test in i915_gem_context_enable() (Chris) v3: BUG_ON after checking for disabled contexts. Or else it blows up pre gen6 (Ben) v4: Forward port Modified enable for each ring, since that patch is earlier in the series Dropped ring arg from create_default_context so it can be used by others Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:55 +01:00
Ben Widawsky	acce9ffa48	drm/i915: Better reset handling for contexts This patch adds to changes for contexts on reset: Sets last context to default - this will prevent the context switch happening after a reset. That switch is not possible because the rings are hung during reset and context switch requires reset. This behavior will need to be reworked in the future, but this is what we want for now. In the future, we'll also want to reset the guilty context to uninitialized. We should wait for ARB_Robustness related code to land for that. This is somewhat for paranoia. Because we really don't know what the GPU was doing when it hung, or the state it was in (mid context write, for example), later restoring the context is a bad idea. By setting the flag to not initialized, the next load of that context will not restore the state, and thus on the subsequent switch away from the context will overwrite the old data. NOTE: This code needs a fixup when we actually have multiple VMs. The issue that can occur is inactive objects in a VM will need to be destroyed before the last context unref. This can now happen via the fake switch introduced in this patch (and it other ways in the future) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:54 +01:00
Ben Widawsky	e422b888eb	drm/i915: Add a context open function We'll be doing a bit more stuff with each file, so having our own open function should make things clean. This also allows us to easily add conditionals for stuff we don't want to do when we don't have HW contexts. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:51 +01:00
Ben Widawsky	6f65e29aca	drm/i915: Create bind/unbind abstraction for VMAs To sum up what goes on here, we abstract the vma binding, similarly to the previous object binding. This helps for distinguishing legacy binding, versus modern binding. To keep the code churn as minimal as possible, I am leaving in insert_entries(). It serves as the per platform pte writing basically. bind_vma and insert_entries do share a lot of similarities, and I did have designs to combine the two, but as mentioned already... too much churn in an already massive patchset. What follows are the 3 commits which existed discretely in the original submissions. Upon rebasing on Broadwell support, it became clear that separation was not good, and only made for more error prone code. Below are the 3 commit messages with all their history. drm/i915: Add bind/unbind object functions to VMA drm/i915: Use the new vm [un]bind functions drm/i915: reduce vm->insert_entries() usage drm/i915: Add bind/unbind object functions to VMA As we plumb the code with more VM information, it has become more obvious that the easiest way to deal with bind and unbind is to simply put the function pointers in the vm, and let those choose the correct way to handle the page table updates. This change allows many places in the code to simply be vm->bind, and not have to worry about distinguishing PPGTT vs GGTT. Notice that this patch has no impact on functionality. I've decided to save the actual change until the next patch because I think it's easier to review that way. I'm happy to squash the two, or let Daniel do it on merge. v2: Make ggtt handle the quirky aliasing ppgtt Add flags to bind object to support above Don't ever call bind/unbind directly for PPGTT until we have real, full PPGTT (use NULLs to assert this) Make sure we rebind the ggtt if there already is a ggtt binding. This happens on set cache levels. Use VMA for bind/unbind (Daniel, Ben) v3: Reorganize ggtt_vma_bind to be more concise and easier to read (Ville). Change logic in unbind to only unbind ggtt when there is a global mapping, and to remove a redundant check if the aliasing ppgtt exists. v4: Make the bind function a bit smarter about the cache levels to avoid unnecessary multiple remaps. "I accept it is a wart, I think unifying the pin_vma / bind_vma could be unified later" (Chris) Removed the git notes, and put version info here. (Daniel) v5: Update the comment to not suck (Chris) v6: Move bind/unbind to the VMA. It makes more sense in the VMA structure (always has, but I was previously lazy). With this change, it will allow us to keep a distinct insert_entries. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> drm/i915: Use the new vm [un]bind functions Building on the last patch which created the new function pointers in the VM for bind/unbind, here we actually put those new function pointers to use. Split out as a separate patch to aid in review. I'm fine with squashing into the previous patch if people request it. v2: Updated to address the smart ggtt which can do aliasing as needed Make sure we bind to global gtt when mappable and fenceable. I thought we could get away without this initialy, but we cannot. v3: Make the global GTT binding explicitly use the ggtt VM for bind_vma(). While at it, use the new ggtt_vma helper (Chris) At this point the original mailing list thread diverges. ie. v4^: use target_obj instead of obj for gen6 relocate_entry vma->bind_vma() can be called safely during pin. So simply do that instead of the complicated conditionals. Don't restore PPGTT bound objects on resume path Bug fix in resume path for globally bound Bos Properly handle secure dispatch Rebased on vma bind/unbind conversion Signed-off-by: Ben Widawsky <ben@bwidawsk.net> drm/i915: reduce vm->insert_entries() usage FKA: drm/i915: eliminate vm->insert_entries() With bind/unbind function pointers in place, we no longer need insert_entries. We could, and want, to remove clear_range, however it's not totally easy at this point. Since it's used in a couple of place still that don't only deal in objects: setup, ppgtt init, and restore gtt mappings. v2: Don't actually remove insert_entries, just limit its usage. It will be useful when we introduce gen8. It will always be called from the vma bind/unbind. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:50 +01:00
Ben Widawsky	d7f46fc4e7	drm/i915: Make pin count per VMA Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:49 +01:00
Ben Widawsky	feb822cfc2	drm/i915: Handle inactivating objects for all VMAs This came from a patch called, "drm/i915: Move active to vma" When moving an object to the inactive list, we do it for all VMs for which the object is bound. The primary difference from that patch is this time around we don't not track 'active' per vma, but rather by object. Therefore, we only need one unref. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:46 +01:00
Ben Widawsky	c39538a88d	drm/i915: Takedown drm_mm on failed gtt setup This was found by code inspection. If the GTT setup fails then we are left without properly tearing down the drm_mm. Hopefully this never happens. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:45 +01:00
Ben Widawsky	6e164c3382	drm/i915: Allow ggtt lookups to not WARN To be able to effectively use the GGTT object lookup function, we don't want to warn when there is no GGTT mapping. Let the caller deal with it instead. Originally, I had intended to have this behavior, and has not introduced the WARN. It was introduced during review with the addition of the follow commit commit `5c2abbeab7` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Tue Sep 24 09:57:57 2013 -0700 drm/i915: Provide a cheap ggtt vma lookup Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:45 +01:00
Ben Widawsky	6f425321e0	drm/i915: Don't unconditionally try to deref aliasing ppgtt Since the beginning, the functions which try to properly reference the aliasing PPGTT have deferences a potentially null aliasing_ppgtt member. Since the accessors are meant to be global, this will not do. Introduced originally in: commit `a70a3148b0` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Jul 31 16:59:56 2013 -0700 drm/i915: Make proper functions for VMs Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:44 +01:00
Paulo Zanoni	48018a57a8	drm/i915: release the GTT mmaps when going into D3 So we'll get a fault when someone tries to access the mmap, then we'll wake up from D3. v2: - Rebase v3: - Use gtt active/inactive Testcase: igt/pm_pc8/gem-mmap-gtt Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [danvet: Add comment + WARN as discussed with Paulo on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-14 15:35:52 +01:00
Mika Kuoppala	168c3f2151	drm/i915: dont call irq_put when irq test is on If test is running, irq_get was not called so we should gain balance by not doing irq_put "So the rule is: if you access unlocked values, you use ACCESS_ONCE(). You don't say "but it can't matter". Because you simply don't know." -- Linus v2: use local variable so it can't change during test (Chris) v3: update commit msg and use ACCESS_ONCE (Ville) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-12 22:58:44 +01:00
Mika Kuoppala	47e9766df0	drm/i915: Fix timeout with missed interrupts in __wait_seqno Commit `094f9a54e3` ("drm/i915: Fix __wait_seqno to use true infinite timeouts") added support for __wait_seqno to detect missing interrupts and go around them by polling. As there is also timeout detection in __wait_seqno, the polling and timeout detection were done with the same timer. When there has been missed interrupts and polling is needed, the timer is set to trigger in (now + 1) jiffies in future, instead of the caller specified timeout. Now when io_schedule() returns, we calculate the jiffies left to timeout using the timer expiration value. As the current jiffies is now bound to be always equal or greater than the expiration value, the timeout_jiffies will become zero or negative and we return -ETIME to caller even tho the timeout was never reached. Fix this by decoupling timeout calculation from timer expiration. v2: Commit message with some sense in it (Chris Wilson) v3: add parenthesis on timeout_expire calculation v4: don't read jiffies without timeout (Chris Wilson) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-12 15:27:21 +01:00
Chris Wilson	4db080f9e9	drm/i915: Fix erroneous dereference of batch_obj inside reset_status As the rings may be processed and their requests deallocated in a different order to the natural retirement during a reset, /* Whilst this request exists, batch_obj will be on the * active_list, and so will hold the active reference. Only when this * request is retired will the the batch_obj be moved onto the * inactive_list and lose its active reference. Hence we do not need * to explicitly hold another reference here. */ is violated, and the batch_obj may be dereferenced after it had been freed on another ring. This can be simply avoided by processing the status update prior to deallocating any requests. Fixes regression (a possible OOPS following a GPU hang) from commit `aa60c664e6` Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Wed Jun 12 15:13:20 2013 +0300 drm/i915: find guilty batch buffer on ring resets Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Add the code comment Chris supplied.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-12 10:49:05 +01:00
Paulo Zanoni	f65c916898	drm/i915: add runtime put/get calls at the basic places If I add code to enable runtime PM on my Haswell machine, start a desktop environment, then enable runtime PM, these functions will complain that they're trying to read/write registers while the graphics card is suspended. v2: - Simplify i915_gem_fault changes. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [danvet: Drop the hunk in i915_hangcheck_elapsed, it's the wrong thing to do.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-10 22:47:33 +01:00
Chris Wilson	70903c3ba8	drm/i915: Fix ordering of unbind vs unpin pages It is useful to assert that if the object is bound, then it must have its pages pinned to prevent the shrinker from reaping its backing store. This is even more useful with the introduction of real-ppgtt whereupon we may have the object bound into several vma, with each instance pinning the backing store. This assertion breaks down during unbind where we unpinned the backing store before decoupling the vma binding. This can be fixed with a trivial reording of the unbind sequence, which reinforces the pin pages bind to vma ... unbind from vma unpin pages concept. v2: Bonus comment Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-04 12:10:50 +01:00
Ville Syrjälä	0bf2134780	drm/i915: MI_PREDICATE_RESULT_2 is HSW only The MI_PREDICATE_RESULT_2 register exits only on HSW. On other platforms the same offset is either reserved, or contains some other register. So write the register only on HSW. This regression has been introduced in commit `9435373ef8` Author: Rodrigo Vivi <rodrigo.vivi@gmail.com> Date: Wed Aug 28 16:45:46 2013 -0300 drm/i915: Report enabled slices on Haswell GT3 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> [danvet: Add regression notice.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-29 15:00:03 +01:00
Daniel Vetter	c09cd6e969	Merge branch 'backlight-rework' into drm-intel-next-queued Pull in Jani's backlight rework branch. This was merged through a separate branch to be able to sort out the Broadwell conflicts properly before pulling it into the main development branch. Conflicts: drivers/gpu/drm/i915/intel_display.c Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-15 10:02:39 +01:00
Ben Widawsky	31a5336e1c	drm/i915/bdw: Swizzling support Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:37 +01:00
Ben Widawsky	5ab31333ac	drm/i915/bdw: Fences on gen8 look just like gen7 Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:36 +01:00
Ben Widawsky	8245be3139	drm/i915: Require HW contexts (when possible) v2: Fixed the botched locking on init_hw failure in i915_reset (Ville) Call cleanup_ringbuffer on failed context create in init_hw (Ville) v3: Add dev argument ti clean_ringbuffer Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-07 09:35:44 +01:00
Paulo Zanoni	de45eaf7b9	drm/i915: fix open-coded DIV_ROUND_UP Use the nice Kernel macro, it makes the code much more readable. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-21 10:04:03 +02:00
Daniel Vetter	aa5f802181	drm/i915: Use unsigned long for obj->user_pin_count At least on linux sizeof(long) == sizeof(void*) and the thinking is that you can grab about as many references as there's memory. Doesn't really matter, just a bit of OCD since the fixed size data type in a pure in-kernel datastructure look off. v2: Ville asked for an overflow check since no one prevents userspace from incrementing the pin count forever. v3: s/INT/LONG/, noticed by Chris. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 22:06:39 +02:00
Chris Wilson	45c5f2022c	drm/i915: Disable all GEM timers and work on unload We have two once very similar functions, i915_gpu_idle() and i915_gem_idle(). The former is used as the lower level operation to flush work on the GPU, whereas the latter is the high level interface to flush the GEM bookkeeping in addition to flushing the GPU. As such i915_gem_idle() also clears out the request and activity lists and cancels the delayed work. This is what we need for unloading the driver, unfortunately we called i915_gpu_idle() instead. In the process, make sure that when cancelling the delayed work and timer, which is synchronous, that we do not hold any locks to prevent a deadlock if the work item is already waiting upon the mutex. This requires us to push the mutex down from the caller to i915_gem_idle(). v2: s/i915_gem_idle/i915_gem_suspend/ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70334 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: xunx.fang@intel.com [danvet: Only set ums.suspended for !kms as discussed earlier. Chris noticed that this slipped through.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 19:42:14 +02:00
Ben Widawsky	3d57e5bd12	drm/i915: Do a fuller init after reset I had this lying around from he original PPGTT series, and thought we might try to get it in by itself. It's convenient to just call i915_gem_init_hw at reset because we'll be adding new things to that function, and having just one function to call instead of reimplementing it in two places is nice. In order to accommodate we cleanup ringbuffers in order to bring them back up cleanly. Optionally, we could also teardown/re initialize the default context but this was causing some problems on reset which I wasn't able to fully debug, and is unnecessary with the previous context init/enable split. This essentially reverts: commit `8e88a2bd59` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Jun 19 18:40:00 2012 +0200 drm/i915: don't call modeset_init_hw in i915_reset It seems to work for me on ILK now. Perhaps it's due to: commit `8a5c2ae753` Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Thu Mar 28 13:57:19 2013 -0700 drm/i915: fix ILK GPU reset for render Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 11:08:08 +02:00
Daniel Vetter	3bbbe706e8	drm/i915: check that the i965g/gm 4G limit is really obeyed In truly crazy circumstances shmem might give us the wrong type of page. So be a bit paranoid and double check this. Reviewer: Damien Lespiau <damien.lespiau@intel.com> Cc: Rob Clark <robdclark@gmail.com> References: http://lkml.org/lkml/2011/7/11/238 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:47:05 +02:00
Chris Wilson	d9973b4356	drm/i915: Fix type mismatch and accounting in i915_gem_shrink The interface uses an unsigned long, and we can use the unsigned counter throughout our code, so do so. In the process, we notice one instance where the shrink count is based on a heuristic rather than the result, and another where we ask for too many pages to be purged. v2: nr_to_scan needs to be promoted to a long as well, so just use sc->nr_to_scan directly. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:46:48 +02:00
Chris Wilson	5035c275af	drm/i915: Call io_schedule() whilst whilsting for the GPU Since we are waiting upon IO completion, inform the kernel through use of the io_schedule() call rather than the regular schedule(). This should allow the kernel to make better decisions regarding scheduling and power management. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:46:47 +02:00
Daniel Vetter	967ad7f148	Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next The conflict in intel_drv.h tripped me up a bit since a patch in dinq moves all the functions around, but another one in drm-next removes a single function. So I'ev figured backing this into a backmerge would be good. i915_dma.c is just adjacent lines changed, nothing nefarious there. Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/intel_drv.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:44:43 +02:00
David Herrmann	16eb5f4379	drm: kill ->gem_init_object() and friends All drivers embed gem-objects into their own buffer objects. There is no reason to keep drm_gem_object_alloc(), gem->driver_private and ->gem_init_object() anymore. New drivers are highly encouraged to do the same. There is no benefit in allocating gem-objects separately. Cc: Dave Airlie <airlied@gmail.com> Cc: Alex Deucher <alexdeucher@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Inki Dae <inki.dae@samsung.com> Cc: Ben Skeggs <skeggsb@gmail.com> Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-10-09 14:38:02 +10:00
Chris Wilson	b29c19b645	drm/i915: Boost RPS frequency for CPU stalls If we encounter a situation where the CPU blocks waiting for results from the GPU, give the GPU a kick to boost its the frequency. This should work to reduce user interface stalls and to quickly promote mesa to high frequencies - but the cost is that our requested frequency stalls high (as we do not idle for long enough before rc6 to start reducing frequencies, nor are we aggressive at down clocking an underused GPU). However, this should be mitigated by rc6 itself powering off the GPU when idle, and that energy use is dependent upon the workload of the GPU in addition to its frequency (e.g. the math or sampler functions only consume power when used). Still, this is likely to adversely affect light workloads. In particular, this nearly eliminates the highly noticeable wake-up lag in animations from idle. For example, expose or workspace transitions. (However, given the situation where we fail to downclock, our requested frequency is almost always the maximum, except for Baytrail where we manually downclock upon idling. This often masks the latency of upclocking after being idle, so animations are typically smooth - at the cost of increased power consumption.) Stéphane raised the concern that this will punish good applications and reward bad applications - but due to the nature of how mesa performs its client throttling, I believe all mesa applications will be roughly equally affected. To address this concern, and to prevent applications like compositors from permanently boosting the RPS state, we ratelimit the frequency of the wait-boosts each client recieves. Unfortunately, this techinique is ineffective with Ironlake - which also has dynamic render power states and suffers just as dramatically. For Ironlake, the thermal/power headroom is shared with the CPU through Intelligent Power Sharing and the intel-ips module. This leaves us with no GPU boost frequencies available when coming out of idle, and due to hardware limitations we cannot change the arbitration between the CPU and GPU quickly enough to be effective. v2: Limit each client to receiving a single boost for each active period. Tested by QA to only marginally increase power, and to demonstrably increase throughput in games. No latency measurements yet. v3: Cater for front-buffer rendering with manual throttling. v4: Tidy up. v5: Sadly the compositor needs frequent boosts as it may never idle, but due to its picking mechanism (using ReadPixels) may require frequent waits. Those waits, along with the waits for the vrefresh swap, conspire to keep the GPU at low frequencies despite the interactive latency. To overcome this we ditch the one-boost-per-active-period and just ratelimit the number of wait-boosts each client can receive. Reported-and-tested-by: Paul Neumann <paul104x@yahoo.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68716 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Stéphane Marchesin <stephane.marchesin@gmail.com> Cc: Owen Taylor <otaylor@redhat.com> Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com> Cc: "Zhuang, Lena" <lena.zhuang@intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: No extern for function prototypes in headers.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-03 20:01:31 +02:00
Chris Wilson	094f9a54e3	drm/i915: Fix __wait_seqno to use true infinite timeouts When we switched to always using a timeout in conjunction with wait_seqno, we lost the ability to detect missed interrupts. Since, we have had issues with interrupts on a number of generations, and they are required to be delivered in a timely fashion for a smooth UX, it is important that we do log errors found in the wild and prevent the display stalling for upwards of 1s every time the seqno interrupt is missed. Rather than continue to fix up the timeouts to work around the interface impedence in wait_event_*(), open code the combination of wait_event[_interruptible][_timeout], and use the exposed timer to poll for seqno should we detect a lost interrupt. v2: In order to satisfy the debug requirement of logging missed interrupts with the real world requirments of making machines work even if interrupts are hosed, we revert to polling after detecting a missed interrupt. v3: Throw in a debugfs interface to simulate broken hw not reporting interrupts. v4: s/EGAIN/EAGAIN/ (Imre) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> [danvet: Don't use the struct typedef in new code.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-03 20:01:30 +02:00
Chris Wilson	b52b89da09	drm/i915: Add a tracepoint for using a semaphore So that we can find the callers who introduce a ring stall. A single ring stall is not too unwelcome, the right issue becomes when they start to interlock and prevent any concurrent work. That, however, is a little tricker to detect with a mere tracepoint! v2: Rebrand it as a ring event, rather than an object event. v3: Include the seqno in the tracepoint for posterity or something. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:24 +02:00
Ben Widawsky	e2d05a8b1e	drm/i915: Convert active API to VMA Even though we track object activity and not VMA, because we have the active_list be based on the VM, it makes the most sense to use VMAs in the APIs. NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for now, leave them be. v2: Remove leftover hunk from the previous patch which didn't keep i915_gem_object_move_to_active. That patch had to rely on the ring to get the dev instead of the obj. (Chris) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:21 +02:00

1 2 3 4 5 ...

994 Commits