The initialized flag is used to specify a context has been initialized
and it's context is safe to load, ie. the 3d state is setup properly.
With full PPGTT, we emit the address space loads during context switch
and this currently marks a context as initialized. With full PPGTT
patches, if a client first emits a batch to !RCS, then later, RCS, the
code will mistake the context as initialized and try to reload an
uninitialized context.
1. context 1 blit // context marked as initialized, but isn't
2. context 1 render // loads random state from step 2
It is really easy to hit this with a planned upcoming patch which makes
default context reuse possible.
NOTE: This should only effect full PPGTT branches, ie. current
drm-intel-nightly.
Thanks to Chris for helping me track this down.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Simplify the failure scenario in the commit message according
to Chris' review a bit.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was an accidental "ABI" change introduced during PPGTT:
commit 0eea67eb26
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Dec 6 14:11:19 2013 -0800
drm/i915: Create a per file_priv default context
The failure test application actually tests the return type. The other
option is to simply change the test.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
A regression in the topic/ppgtt branch introduce in
commit 87d60b63e0
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Dec 6 14:11:29 2013 -0800
drm/i915: Add PPGTT dumper
The issue is that we're missing the definitions for the seq_file
functions and hence compilation fails.
v2: Just include the right header instead of splattering #ifdefs all
over the place (Chris).
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Reported-by: Antti Koskipaa <antti.koskipaa@linux.intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
LPT does have PCH refclk, but it's different form the IBX/CPT/PPT one
and doesn't use the same structs. It is wrong to have a message saying
that "LPT does not has PCH refclk" (sic). While at it, signal that we
only want this function on IBX/CPT/PPT by renaming it and adding a
WARN.
On HSW we also print "0 shared PLLs initialized", but we *do* have
shared PLLs on HSW (LCPLL, WRPLL, SPLL) and we *do* initialize them.
We just don't use "struct intel_shared_dpll". So remove the debug
message.
In the future we may want to rename all that "intel shared pll" code
to "ibx shared pll", but I'll leave this to another patch.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Properly zero the refcounts and crtc->ddi_pll_set so the previous HW
state doesn't affect the result of reading the current HW state.
This fixes WARNs about WRPLL refcount if we have an HDMI monitor on
HSW and then suspend/resume.
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64379
Tested-by: Qingshuai Tian <qingshuai.tian@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
According to Art, we don't have a way to read back the state reliably at
runtime, through the control reg or the mailbox, at least not without risking
disabling it again. So drop the readout and checking on BDW.
v2: drop TODO comment (Paulo)
move POSTING_READ of control reg under HSW branch in disable (Paulo)
always report IPS as enabled on BDW (Paulo)
References: https://bugs.freedesktop.org/show_bug.cgi?id=71906
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was introduced in:
commit 7c4a395ff8
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Wed Oct 9 19:17:56 2013 +0300
drm/i915: Don't re-compute pipe watermarks except for the affected pipe
and I missed fixing it in:
commit fec8cba306
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Wed Nov 27 11:10:26 2013 -0800
drm/i915: use crtc_htotal in watermark calculations to match fastboot v2
It's needed for ILK+ platforms to fastboot without crashing on a divide
by 0 after a DPMS on action.
Note: Ville mentioned in his review that this confusion seems to go
down to the original introduction of this code in
commit 801bcfffbb
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Fri May 31 10:08:35 2013 -0300
drm/i915: properly set HSW WM_PIPE registers
So it seems to have been missed both in the fastboot patch and in the
3d mode suppport (where only crtc_htotal reflects the real pipe
width).
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Add note based on Ville's review.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When fastbooting, we read out the pipe timings early on, and then in a
panel fitted config, disable the fitter later. But we weren't updating
the pipe src h/w, which meant the mouse cursor was clipped to the
pfitted size rather than the native size set later. Fix that up so the
cursor is visible in the new mode.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Otherwise we won't check the state until the next DPMS transition, which
may never happen.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Avoid duplicating the same piece of code several times by separating
the watemark vfunc setup from the init_clock_gating vfunc setup on PCH
platforms.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We forgot to intialize the watermark vfuncs for BDW, and hence the
watermarks were never updated.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Looks like I forgot to update the ILK/SNB/IVB watermark patches to deal
with BDW. Add the relevant BDW checks to make sure we take the HSW
codepaths on BDW as well.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We're iterating over the CPU transcoders, so check for the correct
power domain.
This fixes many "unclaimed register" error messages.
This can be reproduced by the IGT test mentioned below, but we still
get a FAIL when we run it.
Testcase: igt/kms_lip/flip-vs-panning-vs-hang
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In very rare cases (such as a memory failure stress test) it is possible
to fill the entire ring without emitting a request. Under this
circumstance, the outstanding request is flushed and waited upon. After
space on the ring is cleared, we return to emitting the new command -
except that we just cleared the seqno allocated for this operation and
trigger the sanity check that a request is only ever emitted with a
valid seqno. The fix is to rearrange the code to make sure the
allocation of the seqno for this operation is after any required flushes
of outstanding operations.
The bug exists since the preallocation was introduced in
commit 9d7730914f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Nov 27 16:22:52 2012 +0000
drm/i915: Preallocate next seqno before touching the ring
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Multiple definitions show up multiple times in modinfo output.
There's already an identical one in i915_drv.c along with other MODULE_*
definitions, so drop the lone one in intel_fbdev.c.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Without this fix the ioctls silently succeeded (but actually did
nothing).
It makes all the code which calls into this function way too confusing.
v2: Fix destroy IOCTL as well
v3: Clarify the other two callers of i915_gem_context_get() to never
check for NULL. (Mika)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72903
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Testcase: igt/gem_ctx_exec/basic
[danvet: Fix up the commit message and actually bother to mention the
testcase this fixes.]
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The bug from gen6_ppgtt_insert_entries() was replicated into
gen8_ppgtt_insert_entries(). This applies the fix for the OOPS from the
previous patch to the gen8 routine.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The iommu and gfx on Ironlake do not like each other and require a
big hammer to prevent hard machine hangs. In
commit 5c0422878f
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Mon Oct 17 15:51:55 2011 -0700
drm/i915: ILK + VT-d workaround
we added the workaround, but never emitted any debug message that it was
active. Doing so should help identify known performance regressions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
My Acer 8510TZ stops displaying anything when X starts with Linus' current
tree. I bisected it down to commit ee1452d745.
This patch reverts commit ee1452d745.
After the revert, everything works as before.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Reported-by: Dylan Borg <borgdylan@hotmail.com> (for a Acer Extensa 5635Z)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
gem_gtt_cpu_tlb seems to indicate that it is needed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72869
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since get_pid_task() grabs a reference on the task_struct, we have to drop the
refcount after reading that task's comm name. Use pid_task() with RCU instead.
Also, avoid directly reading like pid_task()->comm because
pid_task() will return NULL if the task have already exit()ed.
This patch fixes both problems.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
- fbc1 improvements from Ville (pre-gm45).
- vlv forcewake improvements from Deepak S.
- Some corner-cases fixes from Mika for the context hang stat code.
- pc8 improvements and prep work for runtime D3 from Paulo, almost ready for
primetime.
- gen2 dpll fixes from Ville.
- DSI improvements from Shobhit Kumar.
- A few smaller fixes and improvements all over.
[airlied: intel_ddi.c conflict fixed up]
* tag 'drm-intel-next-2013-12-13' of git://people.freedesktop.org/~danvet/drm-intel: (61 commits)
drm/i915/bdw: Implement ff workarounds
drm/i915/bdw: Force all Data Cache Data Port access to be Non-Coherent
drm/i915/bdw: Don't use forcewake needlessly
drm/i915: Clear out old GT FIFO errors in intel_uncore_early_sanitize()
drm/i915: dont call irq_put when irq test is on
drm/i915: Rework the FBC interval/stall stuff a bit
drm/i915: Enable FBC for all mobile gen2 and gen3 platforms
drm/i915: FBC_CONTROL2 is gen4 only
drm/i915: Gen2 FBC1 CFB pitch wants 32B units
drm/i915: split intel_ddi_pll_mode_set in 2 pieces
drm/i915: Fix timeout with missed interrupts in __wait_seqno
drm/i915: touch VGA MSR after we enable the power well
drm/i915: extract hsw_power_well_post_{enable, disable}
drm/i915: remove i915_disable_vga_mem declaration
drm/i915: Parametrize the dphy and other spec specific parameters
drm/i915: Remove redundant DSI PLL enabling
drm/i915: Reorganize the DSI enable/disable sequence
drm/i915: Try harder to get best m, n, p values with minimal error
drm/i915: Compute dsi_clk from pixel clock
drm/i915: Use FLISDSI interface for band gap reset
...
Conflicts:
drivers/gpu/drm/i915/intel_ddi.c
This means something different and is only relevant for gen6 and the
reason why we cant use anything else than aliasing ppgtt there.
Note that the currently implemented logic for secure batches is
broken: Userspace wants the buffer both in ppgtt (for self-referencing
relocations) and in ggtt (for priveledge operations).
This is the same issue the command parser is also facing.
Unfortunately our coverage for corner-cases of self-referencing
batches is spotty.
Note that this will break vsync'ed Xv and DRI2 copies.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This reverts commit 4fe9adbc36.
The patch completely lacks a detailed explanation of what exactly
blows up and how, so is insufficiently justified as a band-aid.
Otoh the justification as a safety measure against userspace botching
up relocations is also fairly weak: If we want real project we need to
at least make the gab big enough that the gpu doesn't scribble over
more important stuff. With 4k screens that would be 32MB.
Also I think this would be much better in conjunction with a (debug)
switch to disable our use of the scratch page.
Hence revert this.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This reverts the abi-change from
commit 67e3d2979b
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Dec 6 14:11:01 2013 -0800
drm/i915: Permit contexts on all rings
We don't actually need this, only the internal changes to allow
contexts on all rings for the purpose of ppgtt switching are required.
And I'm not sure whether this is the right thing to do given some of
the hw features in the pipeline.
Also, new abi needs userspace patches as a proof-of-need, which is
completely lacking here.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
At least for now userspace has no business at all to know that we
switch address spaces around. For any need it has to know whether hw
ppgtt is enabled (e.g. to set bits in MI commands correctly) it can
inquire the existing ppgtt param.
v2: Avoid ternary operator precedence fail (Chris).
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Especially with ppgtt this kinda stopped making sense. And if we
indeed need this to hack around an issue, we need something that also
works for non-root.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Dump the aliasing PPGTT with it. The aliasing PPGTT should actually
always be empty.
TODO: Broadwell. Since we don't yet use full PPGTT on Broadwell, not
having the dumper is okay.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Originally this commit message said:
Now that do_switch does the mm switch, and we always enable the aliasing
PPGTT, and contexts at the same time, there is no need to continue doing
this during PPGTT enabling.
Since originally writing the patch however, I introduced the concept of
synchronous mm switching (using MMIO). Since this is generally not
recommended in the spec (for reasons unknown), I've isolated its usage
as much as possible. As such the "extraneous" switch only ever will
occur when we have full PPGTT.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As with processes which run on the CPU, the goal of multiple VMs is to
provide process isolation. Specific to GEN, there is also the ability to
map more objects per process (2GB each instead of 2Gb-2k total).
For the most part, all the pipes have been laid, and all we need to do
is remove asserts and actually start changing address spaces with the
context switch. Since prior to this we've converted the setting of the
page tables to a streamed version, this is quite easy.
One important thing to point out (since it'd been hotly contested) is
that with this patch, every context created will have it's own address
space (provided the HW can do it).
v2: Disable BDW on rebase
NOTE: I tried to make this commit as small as possible. I needed one
place where I could "turn everything on" and that is here. It could be
split into finer commits, but I didn't really see much point.
Cc: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I need the tricky do_switch fix before I can merge the final piece of
the ppgtt enabling puzzle. Otherwise the conflict will be a real pain
to resolve since the do_switch hunk from -fixes must be placed at the
exact right place within a hunk in the next patch.
Conflicts:
drivers/gpu/drm/i915/i915_gem_context.c
drivers/gpu/drm/i915/i915_gem_execbuffer.c
drivers/gpu/drm/i915/intel_display.c
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is primarily a band aid for an unexplainable error in
gem_reloc_vs_gpu/forked-faulting-reloc-thrashing. Essentially as soon as
a relocated buffer (which had a non-zero presumed offset) moved to
offset 0, something goes bad. Since I have been unable to solve this,
and potentially this is a good thing to do anyway, since many things can
accidentally write to offset 0, why not?
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's quite common for an object to simply be on the inactive list (and
not unbound) when we want to free the context. This of course happens
with lazy unbinding. Simply, this is needed when an object isn't fully
unbound but we want to free one VMA of the object, for whatever reason.
NOTE: The aliasing PPGTT is not a proper VM, so it needs special casing.
This addresses the fixup requirement mentioned in:
drm/915: Better reset handling for contexts
In the flink, and dmabuf case, we can't assert that the object isn't
still active. To keep it more generic, just check the vma's link in the
object vma list. If we wanted to do a better job, we could track last
seqno (and active) per VMA. It was decided not to do this in the last
iteration. Unfortunately this means the assertion can miss real bugs
when using flink/dmabuf.
v2: Use the newer introduced i915_gem_evict_vm(). Note that handling the
aliasing PPGTT is special.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With context destruction, we always want to be able to tear down the
underlying address space. This is invoked on the last unreference to the
context which could happen before we've moved all objects to the
inactive list. To enable a clean tear down the address space, make sure
to process the request free lastly.
Without this change, we cannot guarantee to we don't still have active
objects in the VM.
As an example of a failing case:
CTX-A is created, count=1
CTX-A is used during execbuf
does a context switch count = 2
and add_request count = 3
CTX B runs, switches, CTX-A count = 2
CTX-A is destroyed, count = 1
retire requests is called
free_request from CTX-A, count = 0 <--- free context with active object
As mentioned above, by doing the free request after processing the
active list, we can avoid this case.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We need to have the address space when reserving space for the objects.
Since the address space and context are tied together, and reserve
occurs before context switch (for good reason), we must lookup our
context earlier in the process.
This leaves some room for optimizations where we no longer need to use
ctx_id in certain places. This will be addressed in a subsequent patch.
Important tricky bit:
Because slow relocations during execbuffer drop struct_mutex
Perhaps it would be best to acquire the reference when we get the
context, but I'll save that for another day (note I have written the
patch before, and I found the changes required to be uglier than this).
Note that since we currently access everything via context id, and not
the data structure this is fine, though not desirable. The next change
attempts to get the context only once via the context ID idr lookup, and
as such, the following can happen:
CTX-A is created, refcount = 1
CTX-A execbuf, mutex dropped
close IOCTL called on CTX-A, refcount = 0
CTX-A resumes in execbuf.
v2: Rebased on top of
commit b6359918b8
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Wed Oct 30 15:44:16 2013 +0200
drm/i915: add i915_get_reset_stats_ioctl
v3: Rebased on top of
commit 25b3dfc87b
Author: Mika Westerberg <mika.westerberg@linux.intel.com>
Date: Tue Nov 12 11:57:30 2013 +0200
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Tue Nov 26 16:14:33 2013 +0200
drm/i915: check context reset stats before relocations
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To simplify the codepaths somewhat, we can simply always create a
context. Contexts already keep hangstat information. This prevents us
from having to differentiate at other parts in the code.
There is allocation overhead, but it should not be measurable.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Every file will get it's own context, and we use this context instead of
the default context. The default context still exists for future
shrinker usage as well as reset handling.
v2: Updated to address Mika's recent context guilty changes
Some more changes around this come up in later patches as well.
v3: Use a fake context to avoid allocation for the !HAS_HW_CONTEXT case.
I've tried the alternatives. This looks the best to me.
Removed hangstat stuff from v2 - for a separate patch
Demote failed PPGTT set to DRM_DEBUG_DRIVER since it can now be invoked
easily from userspace.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have a default context which suits the aliasing PPGTT well. Tie them
together so it looks like any other context/PPGTT pair. This makes the
code cleaner as it won't have to special case aliasing as often.
The patch has one slightly tricky part in the default context creation
function. In the future (and on aliased setup) we create a new VM for a
context (potentially). However, if we have aliasing PPGTT, which occurs
at this point in time for all platforms GEN6+, we can simply manage the
refcounting to allow things to behave as normal. Now is a good time to
recall that the aliasing_ppgtt doesn't have a real VM, it uses the GGTT
drm_mm.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In following with the old restore code, we must now restore ever PPGTT's
PDEs, since they aren't proper GEM ojbects.
v2: Rebased on BDW. Only do restore pdes for gen6 & 7
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We won't be calling enable() for all PPGTTs. We do need to write PDEs
for all PPGTTs however. By moving the writing to init (which is called
for all PPGTTs) we should accomplish this.
ADD NOTE ABOUT PDE restore
TODO: Eventually, we should allocate the page tables on demand.
v2: Rebased on BDW. Only do PDEs for pre-gen8
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pretty straightforward so far except for the bit about the refcounting.
The PPGTT will potentially be shared amongst multiple contexts. Because
contexts themselves have a refcounted lifecycle, the easiest way to
manage this will be to refcount the PPGTT. To acheive this, we piggy
back off of the existing context refcount, and will increment and
decrement the PPGTT refcount with context creation, and destruction.
To put it more clearly, if context A, and context B both use PPGTT 0, we
can't free the PPGTT until both A, and B are destroyed.
Note that because the PPGTT is permanently pinned (for now), it really
just matters for the PPGTT destruction, as opposed to making space under
memory pressure.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch consolidates the way in which we handle the various supported
PPGTT by module parameter in addition to what the hardware supports. It
strives to make doing the right thing in the code as simple as possible,
with the USES_ macros.
I've opted to add the full PPGTT argument simply so one can see how I
intend to use this function. It will not/cannot be used until later.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rearrange the initialization code to try to special case the aliasing
PPGTT less, and provide usable interfaces for the general case later.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I've found this by accident. The docs don't really come out and say you
need to do this. What the docs do tell you is you need to flush the TLBs
before you set the PP_DIR_BASE, and that the RCS will invalidate its
TLBs upon setting the new PP_DIR_BASE. It makes no such comment about
any of the other rings.
Empirically, this indeed fixes a really obvious bug whereby the batches
being sent to the blitter were not executing (we were executing the
HSWP somehow instead).
NOTE: This should make no difference with the current code. It only
applies when we start using multiple VMs.
NOTE2: HSW appears to be immune to this.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The docs seem to suggest this is the appropriate method (though it
doesn't say so outright). In other words, we probably should have done
this before. We certainly must do this for switching VMs on the fly,
since synchronizing the rings to MMIO updates isn't acceptable.
v2:
Make the reset code actually work for all rings. Note that this was
fixed in subsequent commits, but was indeed broken for this commit.
Add a posting read to the reset case. It probably should have existed
before hand, but since we have no failures; there is no reason to make
it a separate commit.
Make IS_GEN6 not use the ring because I am seeing crashes when using it.
It is a bit of a hack in this patch, it will get fixed up in a couple of
patches.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In order to do the full context switch with address space, it's
convenient to have a way to switch the address space. We already have
this in our code - just pull it out to be called by the context switch
code later.
v2: Rebased on BDW support. Required adding BDW.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The patch before this changed the way in which we allocate space for the
PPGTT PDEs. It began carving out the PPGTT PDEs (which live in the
Global GTT) from the GGTT's drm_mm. Prior to that patch, the PDEs were
hidden from the drm_mm, and therefore could never fail to be allocated.
In unfortunate cases, the drm_mm may be full when we want to allocate
the space. This can technically occur whenever we try to allocate, which
happens in two places currently. Practically, it can only really ever
happen at GPU reset.
Later, when we allocate more PDEs for multiple PPGTTs this will
potentially even more useful.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When PPGTT support was originally enabled, it was only designed to
support 1 PPGTT. It therefore made sense to simply hide the GGTT space
required to enable this from the drm_mm allocator.
Since we intend to support full PPGTT, which means more than 1, and they
can be created and destroyed ad hoc it will be required to use the
proper allocation techniques we already have.
The first step here is to make the existing single PPGTT use the
allocator.
The astute observer will notice that we are reserving space in the GGTT
for the PDEs for the lifetime of the address space, and would be right
to question whether or not this is a good idea. It does not make a
difference with this current patch only the aliasing PPGTT (indeed the
PDEs should still be hidden from the shrinker). For the future, we are
allocating from top to bottom to avoid using the precious "gtt
space" The GGTT space at that point should only be used for scanout, HW
contexts, ringbuffers, HWSP, PDEs, and a couple of other small buffers
(potentially) used by the kernel. Everything else should be mapped into
a PPGTT. To put the consumption in more tangible terms, it takes
approximately 4 sets of PDEs to equal one 19x10 framebuffer (with no
fancy stride or alignment constraints). 3/4 of the total [average] GGTT
can be used for PDEs, and hopefully never touch the 1/4 that the
framebuffer needs.
The astute, and persistent observer might ask about the page tables
which are also pinned for the address space. This waste is unfortunate.
We use 2MB of memory per address space. We leave wrapping the PDEs as a
real GEM object as a TODO.
v2: Align PDEs to 64b in GTT
Allocate the node dynamically so we can use drm_mm_put_block
Now tested on IGT
Allocate node at the top to avoid fragmentation (Chris)
v3: Use Chris' top down allocator
v4: Embed drm_mm_node into ppgtt struct (Jesse)
Remove hunks which didn't belong (Jesse)
v5: Don't subtract guard page since we now killed the guard page prior
to this patch. (Ben)
v6: Rebased and removed guard page stuff.
Added a chunk to the commit message
Allow adding a context to mappable region
v7: Undo v3, so we can make the drm patch last in the series
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v4)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
squash: drm/i915: allow PPGTT to use mappable
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The plan to to make every file descriptor have a default context. To
accommodate this, generalize out default context setup function so it
can be used at file open time.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We **need** to do this for exactly 1 reason, because we want to embed a
PPGTT into the context, but we don't want to special case the default
context.
To achieve that, we must be able to initialize contexts after the GTT is
setup (so we can allocate and pin the default context's BO), but before
the PPGTT and rings are initialized. This is because, currently, context
initialization requires ring usage. We don't have rings until after the
GTT is setup. If we split the enabling part of context initialization,
the part requiring the ringbuffer, we can untangle this, and then later
embed the PPGTT
Incidentally this allows us to also adhere to the original design of
context init/fini in future patches: they were only ever meant to be
called at driver load and unload.
v2: Move hw_contexts_disabled test in i915_gem_context_enable() (Chris)
v3: BUG_ON after checking for disabled contexts. Or else it blows up pre
gen6 (Ben)
v4: Forward port
Modified enable for each ring, since that patch is earlier in the series
Dropped ring arg from create_default_context so it can be used by others
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch adds to changes for contexts on reset:
Sets last context to default - this will prevent the context switch
happening after a reset. That switch is not possible because the
rings are hung during reset and context switch requires reset. This
behavior will need to be reworked in the future, but this is what we
want for now.
In the future, we'll also want to reset the guilty context to
uninitialized. We should wait for ARB_Robustness related code to land
for that.
This is somewhat for paranoia. Because we really don't know what the
GPU was doing when it hung, or the state it was in (mid context write,
for example), later restoring the context is a bad idea. By setting the
flag to not initialized, the next load of that context will not restore
the state, and thus on the subsequent switch away from the context will
overwrite the old data.
NOTE: This code needs a fixup when we actually have multiple VMs. The
issue that can occur is inactive objects in a VM will need to be
destroyed before the last context unref. This can now happen via the
fake switch introduced in this patch (and it other ways in the future)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Previously we dropped the association of a context to a ring. It is
however very important to know which ring a context ran on (we could
have reused the other member, but I was nitpicky).
This is very important when we switch address spaces, which unlike
context objects, do change per ring.
As an example, if we have:
RCS BCS
ctx A
ctx A
ctx B
ctx B
Without tracking the last ring B ran on, we wouldn't know to switch the
address space on BCS in the last row.
As a result, we no longer need to track which ring a context "belongs"
to, as it never really made much sense anyway.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we want to use contexts in more abstract terms (specifically with
PPGTT in mind), we need to allow them to be specified for any ring.
Since the upcoming patches will bring about the use of multiple address
spaces, and each ring needs to have an address space programmed (which
we intend to do at context switch time), we can no longer only use RCS.
With multiple rings having a last context, we must now unreference these
contexts.
NOTE: This commit requires an update to intel-gpu-tools to make it not
fail.
v2: Rebased with some logical conflicts.
Squashed in the context fini refcount patch
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With the introduction of contexts per fd in the future, one can easily
envision more contexts being used. We do not have an easy remedy to
reduce the space requirements of the contexts, we can make things
slightly better by using less stringent alignments on later hardware.
Ville: Since I can almost predict you'll point this out. I can no longer
find the docs which specify the 64k requirement on certain gen6 SKUs. If
you'd like to change that too, be my guest.
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We'll be doing a bit more stuff with each file, so having our own open
function should make things clean.
This also allows us to easily add conditionals for stuff we don't want
to do when we don't have HW contexts.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The only place we were using it was for GEN6, which won't have PPGTT
support anyway (ie. the VM is always the same). To clear things up,
(it only added confusion for me since it doesn't allow us to assert
vma->vm is what we always want, when just looking at the code).
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.
What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.
drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage
drm/i915: Add bind/unbind object functions to VMA
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)
v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.
v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)
v5: Update the comment to not suck (Chris)
v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: Use the new vm [un]bind functions
Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.
v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initialy, but we cannot.
v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)
At this point the original mailing list thread diverges. ie.
v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound Bos
Properly handle secure dispatch
Rebased on vma bind/unbind conversion
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: reduce vm->insert_entries() usage
FKA: drm/i915: eliminate vm->insert_entries()
With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want, to remove clear_range, however it's
not totally easy at this point. Since it's used in a couple of place
still that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.
v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Using the current state of the page directory registers, we can
determine which of our address spaces was active when the hang occurred.
This allows us to scan through all the address spaces to identify the
"active" one during error capture.
v2: Rebased for BDW error detection. BDW error detection is similar
except instead of PP_DIR_BASE, we can use the PDP registers.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add FIXME about global gtt misuse.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The existing check was insufficient to determine whether we can use the
GTT mapping to read out the object during error capture.
The previous condition was, if the object has a GGTT mapping, and the
reloc is in the GTT range... the can happen with opjects mapped into
multiple vms (one of which being the GTT).
There are two solutions to this problem:
1. This patch, which avoid reading the io mapping
2. Use the GGTT offset with the io mapping.
Since error capture is about recording the most accurate possible error
state, and the error was caused by the object not in the GGTT - I opted
for the former.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This came from a patch called, "drm/i915: Move active to vma"
When moving an object to the inactive list, we do it for all VMs for
which the object is bound.
The primary difference from that patch is this time around we don't not
track 'active' per vma, but rather by object. Therefore, we only need
one unref.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was found by code inspection. If the GTT setup fails then we are
left without properly tearing down the drm_mm.
Hopefully this never happens.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To be able to effectively use the GGTT object lookup function, we don't
want to warn when there is no GGTT mapping. Let the caller deal with it
instead.
Originally, I had intended to have this behavior, and has not
introduced the WARN. It was introduced during review with the addition
of the follow commit
commit 5c2abbeab7
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Tue Sep 24 09:57:57 2013 -0700
drm/i915: Provide a cheap ggtt vma lookup
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since the beginning, the functions which try to properly reference the
aliasing PPGTT have deferences a potentially null aliasing_ppgtt member.
Since the accessors are meant to be global, this will not do.
Introduced originally in:
commit a70a3148b0
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Wed Jul 31 16:59:56 2013 -0700
drm/i915: Make proper functions for VMs
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The initial implementation of this function used MMIO to write the PDPs.
Upon review it was determined (correctly) that the docs say to use LRI.
The issue is there are times where we want to do a synchronous write
(GPU reset).
I've tested this, and it works. I've verified with as many people as
possible that it should work.
This should fix the failing reset problems.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
But only when we indeed set up a gtt mapping. We need this since the
vma also holds a pages_pin_count, on top of the unconditional
pages_pin_count we grab for all stolen objects (to avoid swap-out).
This should avoid a pages_pin_count underrun when cleaning up
framebuffers objects taken over from the BIOS.
Chris mentioned in his review that this bug even predates the vma
conversion.
Reported-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We don't have any userspace interfaces that use HZ as a time unit, so
having our own DRM define is useless.
Remove this remnant from the shared drm core days.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Only the two intel drivers need this and they can easily check for
working agp support in their driver ->load callbacks.
This is the only reason why agp initialization could fail, so allows
us to rip out a bit of error handling code in the next patch.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The current values seem to be defined in a format that's specific to the
i915, gma500 and radeon drivers. To make this more generally useful, use
the values as defined in the specification.
While at it, prefix the constants with DP_ for improved namespacing.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- some more ppgtt prep patches from Ben
- a few fbc fixes from Ville
- power well rework from Imre
- vlv forcewake improvements from Deepak S, Ville and Jesse
- a few smaller things all over
[airlied: fixup forwcewake conflict]
* tag 'drm-intel-next-2013-11-29' of git://people.freedesktop.org/~danvet/drm-intel: (97 commits)
drm/i915: Fix port name in vlv_wait_port_ready() timeout warning
drm/i915: Return a drm_mode_status enum in the mode_valid vfuncs
drm/i915: add intel_display_power_enabled_sw() for use in atomic ctx
drm/i915: drop DRM_ERROR in intel_fbdev init
drm/i915/vlv: use parallel context restore when coming out of RC6
drm/i915/vlv: use a lower RC6 timeout on VLV
drm/i915/sdvo: Fix up debug output to not split lines
drm/i915: make sparse happy for the new vlv mmio read function
drm/i915: drop the right force-wake engine in the vlv mmio funcs
drm/i915: Fix GT wake FIFO free entries for VLV
drm/i915: Report all GTFIFODBG errors
drm/i915: Enabling DebugFS for valleyview forcewake counts
drm/i915/vlv: Valleyview support for forcewake Individual power wells.
drm/i915: Add power well arguments to force wake routines.
drm/i915: Do not attempt to re-enable an unconnected primary plane
drm/i915: add a debugfs entry for power domain info
drm/i915: add a default always-on power well
drm/i915: don't do BDW/HSW specific powerdomains init on other platforms
drm/i915: protect HSW power well check with IS_HASWELL in redisable_vga
drm/i915: use IS_HASWELL/BROADWELL instead of HAS_POWER_WELL
...
Conflicts:
drivers/gpu/drm/i915/intel_display.c
The GMCH_CTRL register (or MGCC in the spec) is at a different address
on Sandybridge, and the address to which we currently write to is
undefined. These stray writes appear to upset (hard hang) my Ivybridge
machine whilst it is in UEFI mode.
Note that the register is still marked as locked RO on Sandybridge, so
vgaarb is still dysfunctional.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We use this hook starting from ILK onwards, so change the prefix
accordingly. Also rename functions/struct names used from
haswell_update_wm that are relevant to ILK already.
No functional change.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
No functional change.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Apparently always enabling the sprite scaler magically made
sprites work on ILK in the past.
I think the real reason for the failure was missing sprite
watermark programming, and enabling the scaler effectively
disabled LP1+ watermarks, which was enough to keep things going.
Or it might be that the hardware more or less ignores watermarks
for scaled sprites since things seem to work even if I leave
sprite watermarks at 0 and disable all other planes except the
sprite.
In any case, we left the scaler always on but then failed to
check whether we might be exceeding the scaler's source size
limits. That caused the sprite to fail when a sufficiently
large unscaled image was being displayed.
Now that we're getting proper watermark programming for ILK, we
can keep the scaler disabled unless we need to do actual scaling.
This reverts commit 8aaa81a166.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As the watermark registers aren't double bufferd, clearing the
watermarks immediately after writing the sprite registers can be
hazardous.
Until we have something better, add a wait for vblank between the
two steps to make sure the sprite no longer needs the watermark
levels before we clear them.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When color keying is used, the primary may not be invisible even though
the sprite fully covers it. So check for color keying before deciding to
disable the primary plane.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We now have a very clear method of disabling LP1+ wartermarks,
and we can actually detect if we actually did disable them, or
if they were already disabled. Use that to clean up the
WaCxSRDisabledForSpriteScaling:ivb handling.
I was hoping to apply the workaround in a way that wouldn't
require a blocking wait, but sadly IVB really does appear to
require LP1+ watermarks to be off for an entire frame before
enabling sprite scaling. Simply disabling LP1+ watermarks
during the previous frame is not enough, no matter how early
in the frame we do it :(
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The new HSW watermark code can now handle ILK/SNB/IVB as well, so
switch them over. Kill the old code.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
ILK doesn't like if we just write the LP1+ watermarks registers with 0.
We need to just disable the watermarks by clearing the enable bit. Use
that method also when disabling LP1+ watermarks in init_clock_gating.
It looks like disabling the sprite LP1 watermarks can cause underruns
even if we just toggle the WM1S_LP_EN bit. So treat that bit like the
actual watermark numbers and avoid setting it to 0 immediately.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Linetime watermarks don't exist on ILK/SNB/IVB, so don't compute them
except on HSW.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
ILK has a bunch of issues with FBC. First of all, BSpec tells us that
FBC WM should never be enabled. Secondly when FBC is enabled
with FBC WM disabled, LP2+ watermarks must be disabled.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Multi-pipe LP1+ watermarks are a HSW+ feature, so let's not do it on
earlier generations.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On ILK disabling LP1+ watermarks must be done carefully to avoid
underruns. If we just write 0 to the register in the middle of the scan
cycle we often get an underrun. So instead we have to leave the actual
watermark levels in the register intact, and just toggle the enable bit.
Presumably the hardware takes a while to get out of low power mode, and
so the watermark level need to stay valid until that time.
We also have to be careful with the WM1S_LP_EN bit. It seems the
hardware more or less treats it like the actual watermarks numbers, and
so we must not toggle it too soon. Just leave it alone when disabling
the LP1+ watermarks.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
ILK/SNB don't have LP2+ watermarks for sprites. Also the LP1 sprite
watermark register has its own enable bit. Take these differences
into account when programming the LP1+ registers.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On ILK/SNB only LP0/1 watermarks can be enabled when sprites are
enabled, and on ILK/SNB/IVB sprite scaling is limited to LP0 only.
So we can avoid computing the extra levels we're never going to use.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add a new function ilk_wm_lp_latency() which will tell us what to write
into the WM_LPx register latency field. HSW is different from erlier
gens in this regard.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On IVB the display data buffer partitioning control lives in the
DISP_ARB_CTL2 register. Add the relevant defines/code for it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Otherwise we don't kick out firmware framebuffers like vesafb and
efifb when CONFIG_DRM_I915_FBDEV=n but CONFIG_FB=y.
There's still the pesky issue with vgacon which we should somehow
replace with the dummy console at least. We have a similar issue at
module un/reload, since vgacon state is terminally botched after
i915.ko has loaded in modeset mode. But this gets us a step further at
least.
v2: Use IS_ENABLED - I always get this wrong for tristates. Spotted by
Jani.
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Noticed while reviewing a patch and couldn't resist the OCD.
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Aside from the fact that it leaves confusing dumps on error capture, it
is entirely unnecessary, and potentially harmful in cases like BDW,
where the instruction has changed.
In reality (seemingly), this will have no behavioral impact.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
A few command were out of numerical order and had different spacing. Put
them back in numerical order, with proper spacing.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We only need to init the reg offset for DPIO once, but we need to reset
DPIO at resume time and at init time.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just add an early init since we may need to access DPIO regs early on.
The init call in modeset_init_hw is also needed for the resume case,
when we need to reset DPIO to keep things happy.
v2: split reset and reg init
v3: split patches (Daniel)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The whole file is wrapped around in #if defined(CONFIG_DEBUG_FS) anyway,
so skip the file at the build level already.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In case of error, the function debugfs_create_file() returns NULL
pointer not ERR_PTR() if debugfs is enabled. The IS_ERR() test in
the return value check should be replaced with NULL test.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We don't actually do anything with the information yet, but parse and
log what's in the VBT.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that all the infrastructure is in place and all the tests from
pm_pc8 pass, we can finally enable the feature.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So we'll get a fault when someone tries to access the mmap, then we'll
wake up from D3.
v2: - Rebase
v3: - Use gtt active/inactive
Testcase: igt/pm_pc8/gem-mmap-gtt
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[danvet: Add comment + WARN as discussed with Paulo on irc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The hangcheck function requires the hardware to be working, and if
we're suspending we're going to put the HW in D3 state.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In the current code, at haswell_modeset_global_resources, first we
decide if we want to enable/disable the power well, then we decide if
we want to enable/disable PC8. On the case where we're enabling PC8
this works fine, but on the case where we disable PC8 due to a non-eDP
monitor being enabled, we first enable the power well and then disable
PC8. Although wrong, this doesn't seem to be causing any problems now,
and we don't even see anything in dmesg. But the patches for runtime
D3 turn this problem into a real bug, so we need to fix it.
This fixes the "modeset-non-lpsp" subtest from the "pm_pc8" test from
intel-gpu-tools.
v2: - Rebase (i915_disable_power_well).
v3: - More reabase.
v4: - Rebase on top of -fixes instead of -nightly.
This is commit d62292c8f7 in -next, but
we need it in -fixes to address Dave's report.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reported-by: Dave Jones <davej@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently, PC8 is enabled at modeset_global_resources, which is called
after intel_modeset_update_state. Due to this, there's a small race
condition on the case where we start enabling PC8, then do a modeset
while PC8 is still being enabled. The racing condition triggers a WARN
because intel_modeset_update_state will mark the CRTC as enabled, then
the thread that's still enabling PC8 might look at the data structure
and think that PC8 is being enabled while a pipe is enabled. Despite
the WARN, this is not really a bug since we'll wait for the
PC8-enabling thread to finish when we call modeset_global_resources.
The spec says the CRTC cannot be enabled when we disable LCPLL, so we
had a check for crtc->base.enabled. If we change to crtc->active we
will still prevent disabling LCPLL while the CRTC is enabled, and we
will also prevent the WARN above.
This is a replacement for the previous patch named
"drm/i915: get/put PC8 when we get/put a CRTC"
Testcase: igt/pm_pc8/modeset-lpsp-stress-no-wait
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit 798183c547
from -next due to Dave's report.)
Reported-by: Dave Jones <davej@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
WaVSRefCountFullforceMissDisable and
WaDSRefCountFullforceMissDisable
VS is a carry-over from HSW, and DS is likely not used by anyone yet.
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Line of 106 chars is too long. Really.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I stumbled on to some unimplemented errata. To be honest, I am not
really sure of the impact, just that the docs say to do.
No w/a name for this one.
v2: v1 was a stale thing which should have never seen the light of day.
(Haihao)
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Not all registers need forcewake even if they're not shadowed.
Add the missing check to gen8_writeX() to avoid needless forcewake
usage when writing eg. display registers.
v2: Use straight up <0x40000 check instead of NEEDS_FORCE_WAKE()
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The BIOS or someone else might have done something bad and there
might be old GT FIFO erros reported in GTFIFODBG. Clear those out
in intel_uncore_early_sanitize() to make sure we don't mistake them
for our problems.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If test is running, irq_get was not called so we should gain
balance by not doing irq_put
"So the rule is: if you access unlocked values, you use ACCESS_ONCE().
You don't say "but it can't matter". Because you simply don't know."
-- Linus
v2: use local variable so it can't change during test (Chris)
v3: update commit msg and use ACCESS_ONCE (Ville)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Don't touch DPFC_RECOMP_CTL on FBC2, use RMW to update
the FBC_CONTROL on FBC1 to make it easier for people to
experiment with different numbers. Also fix the interval
mask for FBC1.
v2: Rebased
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
All mobile gen2 and gen3 chipsets should have FBC1, and the code
should now handle them all. So just set has_fbc=true for all such
chipsets.
Note that fbc is still disabled by default for now.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Gen2 and gen3 don't have the FBC_CONTROL2 register, so don't
touch it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On gen2 the compressed frame buffer pitch is specified in 32B units
rather than the 64B units used on gen3+.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The first piece, intel_ddi_pll_select, finds a PLL and assigns it to
the CRTC, but doesn't write any register. It can also fail in case it
doesn't find a PLL.
The second piece, intel_ddi_pll_enable, uses the information stored by
intel_ddi_pll_select to actually enable the PLL by writing to its
register. This function can't fail. We also have some refcount sanity
checks here.
The idea is that one day we'll remove all the functions that touch
registers from haswell_crtc_mode_set to haswell_crtc_enable, so we'll
call intel_ddi_pll_select at haswell_crtc_mode_set and then call
intel_ddi_pll_enable at haswell_crtc_enable. Since I'm already
touching this code, let's take care of this particular split today.
v2: - Clock on the debug message is in KHz
- Add missing POSTING_READ
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Bikeshed comments.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Commit 094f9a54e3 ("drm/i915: Fix __wait_seqno to use true infinite
timeouts") added support for __wait_seqno to detect missing interrupts and
go around them by polling. As there is also timeout detection in
__wait_seqno, the polling and timeout detection were done with the same
timer.
When there has been missed interrupts and polling is needed, the timer is
set to trigger in (now + 1) jiffies in future, instead of the caller
specified timeout.
Now when io_schedule() returns, we calculate the jiffies left to timeout
using the timer expiration value. As the current jiffies is now bound to be
always equal or greater than the expiration value, the timeout_jiffies will
become zero or negative and we return -ETIME to caller even tho the
timeout was never reached.
Fix this by decoupling timeout calculation from timer expiration.
v2: Commit message with some sense in it (Chris Wilson)
v3: add parenthesis on timeout_expire calculation
v4: don't read jiffies without timeout (Chris Wilson)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes regression introduced by:
commit bf51d5e2cd
Author: Paulo Zanoni <paulo.r.zanoni at intel.com>
Date: Wed Jul 3 17:12:13 2013 -0300
drm/i915: switch disable_power_well default value to 1
The bug I'm seeing can be reproduced with:
- Have vgacon configured/enabled
- Make sure the power well gets disabled, then enabled. You can
check this by seeing the messages print by hsw_set_power_well
- Stop your display manager
- echo 0 > /sys/class/vtconsole/vtcon1/bind
I can easily reproduce this by blacklising snd_hda_intel and booting
with eDP+HDMI.
If you do this and then look at dmesg, you'll see we're printing
infinite "Unclaimed register" messages. This is happening because
we're stuck on an infinite loop inside console_unlock(), which is
calling many functions from vgacon.c. And the code that's triggering
the error messages is from vgacon_set_cursor_size().
After we re-enable the power well, every time we read/write the VGA
address 0x3d5 we get an "unclaimed register" interrupt (ERR_INT) and
print error messages. If we write anything to the VGA MSR register (it
doesn't really matter which value you write to bit 0), any
reads/writes to 0x3d5 _don't_ trigger the "unclaimed register" errors
anymore (even if MSR bit 0 is zero). So what happens with the current
code is that when we unbind i915 and bind vgacon, we call
console_unlock(). Function console_unlock() is responsible for
printing any messages that were supposed to be print when the console
was locked, so it calls the TTY layer, which calls the console layer,
which calls vgacon to print the messages. At this point, vgacon
eventually calls vgacon_set_cursor_size(), which touches 0x3d5, which
triggers unclaimed register interrupts. The problem is that when we
get these interrupts, we print the error messages, so we add more work
to console_unlock(), which will try to print it again, and then call
vgacon again, trigger a new interrupt, which will put more stuff to
the buffer, and then we'll be stuck at console_unlock() forever.
If you patch intel_uncore.c to not print anything when we detect
unclaimed registers, we won't get into the console_unlock() infinite
loop and the driver unbind will work just fine. We will still be
getting interrupts every time vgacon touches those registers, but we
will survive. This is a valid experiment, but IMHO it's not the real
fix: if we don't print any error messages we will still keep getting
the interrupts, and if we disable ERR_INT we won't get the interrupt
anymore, but we will also stop getting all the other error interrupts.
I talked about this problem with the HW engineer and his
recommendation is "So don't do any VGA I/O or memory access while the
power well is disabled, and make to re-program MSR after enabling the
power well and before using VGA I/O or memory accesses.".
Notice that this is just a partial fix to fd.o #67813. This fixes the
case where the power well is already enabled when we unbind, not when
it's disabled when we unbind.
V2: - Rebase (first version was sent in September).
V3: - Complete rewrite of the same fix: smaller implementation,
improved commit message.
Testcase: igt/drv_module_reload
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67813
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I want to add more code to the post_enable function.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It was supposed to have been killed on the same commit that killed the
function, e1264ebe9f, but I guess the
intel_drv.h reorganization accidentally brought it back.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As the rings may be processed and their requests deallocated in a
different order to the natural retirement during a reset,
/* Whilst this request exists, batch_obj will be on the
* active_list, and so will hold the active reference. Only when this
* request is retired will the the batch_obj be moved onto the
* inactive_list and lose its active reference. Hence we do not need
* to explicitly hold another reference here.
*/
is violated, and the batch_obj may be dereferenced after it had been
freed on another ring. This can be simply avoided by processing the
status update prior to deallocating any requests.
Fixes regression (a possible OOPS following a GPU hang) from
commit aa60c664e6
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Wed Jun 12 15:13:20 2013 +0300
drm/i915: find guilty batch buffer on ring resets
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Add the code comment Chris supplied.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Whilst looking up the objects required for an execbuffer, an untimely
allocation failure in creating the vma results in the object being
unreferenced from two lists. The ownership during the lookup is meant to
be moved from the list of objects being looked to the vma, and this
double unreference upon error results in a use-after-free.
Fixes regression from
commit 27173f1f95
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Wed Aug 14 11:38:36 2013 +0200
drm/i915: Convert execbuf code to use vmas
Based on the fix by Ben Widawsky.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: stable@vger.kernel.org
[danvet: Bikeshed the crucial comment above the ownership transfer as
discussed on irc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just a bunch of regression fixes plus a few patches for long-standing
issues in gem corner-cases that we've hunted down in the past weeks. Since
apparently people hit those in the wild (and we also have nice igts for
them) I've opted for -fixes and cc: stable.
There's 1-2 things oustanding on top of this where I'm still waiting on
confirmation from testing, but nothing really scary.
* tag 'drm-intel-fixes-2013-12-11' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: don't update the dri1 breadcrumb with modesetting
drm/i915: Repeat eviction search after idling the GPU
drm/i915: Fix use-after-free in do_switch
drm/i915: fix pm init ordering
drm/i915: Hold mutex across i915_gem_release
drm/i915: Skip clock checks on BDW
drm/i915: Do not clobber config status after a forced restore of hw state
drm/i915: Take modeset locks around intel_modeset_setup_hw_state()
As promised bdw fixes come separate for now. Just a few minior things.
* 'bdw-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915/bdw: PIPE_[BC] I[ME]R moved to powerwell
drm/i915/bdw: Limit GTT to 2GB
drm/i915/bdw: Add comment about gen8 HWS PGA
drm/i915/bdw: Free correct number of ppgtt pages
drm/i915/bdw: Do gen6 style reset for gen8
drm/i915/bdw: GEN8 backlight support
drm/i915/bdw: Add BDW to ULT macro
The values of these parameters will be different for differnet panel
based on dsi rate, lane count, etc. Remove the hardcodings and make
these as parameters whch will be initialized in panel specific
sub-encoder implementaion.
This will also form groundwork for planned generic panel sub-encoder
implemntation based on VBT design enhancments to support multiple panels
v2: Mask away the port_bits before use
Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
DSI PLL will get configured during crtc_enable using ->pre_pll_enable
and no need to do in ->mode_set
Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Basically ULPS handling during enable/disable has been moved to
pre_enable and post_disable phases. PLL and panel power disable
also has been moved to post_disable phase. The ULPS entry/exit
sequneces as suggested by HW team is as follows -
During enable time -
set DEVICE_READY --> Clear DEVICE_READY --> set DEVICE_READY
And during disable time to flush all FIFOs -
set ENTER_SLEEP --> EXIT_SLEEP --> ENTER_SLEEP
Also during disbale sequnece sub-encoder disable is moved to the end
after port is disabled.
v2: Based on comments from Ville
- Detailed epxlaination in the commit messgae
- Moved parameter changes out into another patch
- Backlight enabling will be a new patch
v3: Updated as per Jani's comments
- Removed the I915_WRITE_BITS as it is not needed
- Moved panel_reset and send_otp_cmds hooks to dsi_pre_enable
- Moved disable_panel_power hook to dsi_post_disable
- Replace hardcoding with AFE_LATCHOUT
v4: Make intel_dsi_device_ready and intel_dsi_clear_device_ready static
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohan.marimuthu@intel.com>
Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Basically check for both +ive and -ive deviation from target clock and
pick the one with minimal error. If we get a direct match, break from
loop to acheive some optimization.
v2: Use signed variable for target and calculated dsi clock values
Signed-off-by: Vijayakumar Balakrishnan <vijayakumar.balakrishnan@intel.com>
Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pixel clock based calculation is recommended in the MIPI host controller
documentation
v2: Based on review comments from Jani and Ville
- Use dsi_clk in KHz rather than converting in Hz and back to MHz
- RR formula is retained though not used but return dsi_clk in KHz now
- Moved the m-n-p changes into a separate patch
- Removed the parameter check for intel_dsi->dsi_clock_freq. This will be
bought back in if needed when appropriate panel drivers are done
v3: Removed the unused mnp calculation from static table
Signed-off-by: Vijayakumar Balakrishnan <vijayakumar.balakrishnan@intel.com>
Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some panels require one time programming if they do not contain their
own eeprom for basic register initialization. The sequence is
Panel Reset --> Send OTP --> Enable Pixel Stream --> Enable the panel
v2: Based on review comments from Jani and Ville
- Updated the commit message with more details
- Move the new parameters out of this patch
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohan.marimuthu@intel.com>
Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On my 855 machine the BIOS uses the following DPLL settings:
DPLL 0x90016000
FP0 = 0x61207
FP1 = 0x21207
With the 66MHz SSC refclock, that puts the BIOS generated VCO
frequency at ~908 MHz, which is lower than the 930 MHz limit
we have currently. This also results in the pixel clock coming
out significantly higher than the requested 65 MHz when we try
to recompute it.
Reduce the the VCO limit to 908 MHz. Combined with the earlier
SSC reference clock accuracy fix, this results in the pixel clock
coming out as 65.08 MHz which is quite close to the target. For
some reason the BIOS uses 64.881 MHz, which isn't quite as close.
This makes kms_flip wf_vblank-ts-check pass for the first time
on this machine \o/
Cc: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Store the SSC refclock frequency in kHz to get more accuracy. Currently
we're pretending that 66 MHz is ~66000 kHz, when in fact it is actually
~66667 kHz. By storing the less rounded kHz value we get a much better
accuracy for out pixel clock calculations.
Cc: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Bruno Prémont has a 855 machine with a 1400x1050 LVDS screen.
The VBT mode is as follows:
0:"1400x1050" 0 108000 1400 1416 1528 1688 1050 1051 1054 1066 0x8 0xa
The BIOS uses the following DPLL settings:
DPLL = 0x90020000
FP0 = 0x2140e
FP1 = 0x21207
That puts the BIOS generated VCO frequency at 1512 MHz, which is
higher than the 1400 MHz limit we have currently.
Let's bump the VCO limit to 1512 MHz and see what happens.
Cc: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Bruno Prémont has a 855 machine with a 1400x1050 LVDS screen.
The VBT mode is as follows:
0:"1400x1050" 0 108000 1400 1416 1528 1688 1050 1051 1054 1066 0x8 0xa
The BIOS uses the following DPLL settings:
DPLL = 0x90020000
FP0 = 0x2140e
FP1 = 0x21207
We can't generate that pixel clock currently as we're limiting the N
divider to at least 3, whereas the BIOS uses a value of 2.
Let's reduce the N minimum to 2 and see what happens.
Cc: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In order to determine the correct p2 divider for LVDS on gen2,
we need to check the CLKB mode from the LVDS port register to
determine if we're dealing with single or dual channel LVDS.
Cc: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Every ring seems to have a BB_ADDR registers, so include them all in the
error state.
v2: Also include the _UDW on BDW
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The BB_ADDR register is documented to be 32bits at least since SNB.
Prior to that the high 32bits were listed as MBZ, so using a 64bit read
doesn't seem worth anything. Also the simulator doesn't like the 64bit
read. So just switch to using a 32bit read instead.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we're disabling the VDD override bit and the panel is enabled, we
don't need to wait for anything. If the panel is disabled, then we
need to actually wait for panel_power_cycle_delay, not
panel_power_down_delay, because the power down delay was already
respected when we disabled the panel.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I don't see a reason to touch VDD when we're disabling the panel:
since the panel is enabled, we don't need VDD. This saves a few sleep
calls from the vdd_on and vdd_off functions at every modeset.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69693
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: Fix the patch mangle wiggle has done ... Spotted by Paulo.
Also drop the runtime_pm_put call which now has to go due to different
patch ordering. Also from Paul.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The update is horribly racy since it doesn't protect at all against
concurrent closing of the master fd. And it can't really since that
requires us to grab a mutex.
Instead of jumping through hoops and offloading this to a worker
thread just block this bit of code for the modesetting driver.
Note that the race is fairly easy to hit since we call the breadcrumb
function for any interrupt. So the vblank interrupt (which usually
keeps going for a bit) is enough. But even if we'd block this and only
update the breadcrumb for user interrupts from the CS we could hit
this race with kms/gem userspace: If a non-master is waiting somewhere
(and hence has interrupts enabled) and the master closes its fd
(probably due to crashing).
v2: Add a code comment to explain why fixing this for real isn't
really worth it. Also improve the commit message a bit.
v3: Fix the spelling in the comment.
Reported-by: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
Cc: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
Cc: stable@vger.kernel.org
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We just don't need this. This saves 250ms from every modeset on my
machine.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The code to enable/disable PC8 already takes care of saving and
restoring all the registers we need to save/restore, so do a put()
call when we enable PC8 and a get() call when we disable it.
Ideally, in order to make it easier to add runtime PM support to other
platforms, we should move some things from the PC8 code to the runtime
PM code, but let's do this later, since we can make Haswell work right
now.
V2: - Rebase
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[danvet: Don't actually enable runtime pm since I didn't merge all
patches.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The plan is to merge PC8 and D3 into a single feature, and when we're
in D3 we won't get any hotplug interrupt anyway, so leaving them
enable doesn't make sense, and it also brings us a problem. The
problem is that we get a hotplug interrupt right when we we wake up
from D3, when we're still waking up everything. If we fully disable
interrupts we won't get this hotplug interrupt, so we won't have
problems.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The current code was checking if all bits of "val" were enabled and
DE_PCH_EVENT_IVB was disabled. The new code doesn't care about the
state of DE_PCH_EVENT_IVB: it just checks if everything else is 1.
The goal is that future patches may completely disable interrupts, and
the LCPLL-disabling code shouldn't care about the state of
DE_PCH_EVENT_IVB.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[danvet: I think the commit message is actually wrong in it's
description of what the old test checked, but the new one seems sane.
So meh.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
And put it when it's off. Otherwise, when you run pm_pc8 from
intel-gpu-tools, and the delayed function that disables VDD runs,
we'll get some messages saying we're touching registers while the HW
is suspended.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
These are needed when we cat the debugfs and sysfs files.
V2: - Rebase
V3: - Rebase
V4: - Rebase
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If I add code to enable runtime PM on my Haswell machine, start a
desktop environment, then enable runtime PM, these functions will
complain that they're trying to read/write registers while the
graphics card is suspended.
v2: - Simplify i915_gem_fault changes.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[danvet: Drop the hunk in i915_hangcheck_elapsed, it's the wrong thing
to do.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we are actually setting the device to the D3 state, we should
issue the notification.
The opregion spec says we should send the message before the adapter
is about to be placed in a lower power state, and after the adapter is
placed in a higher power state.
Jani originally wrote a similar patch for PC8, but then we discovered
that we were not really changing the PCI D states when
enabling/disabling PC8, so we had to postpone his patch.
v2: - Improve commit message, explaining the expected state.
v3: - Rebase.
Cc: Jani Nikula <jani.nikula@intel.com>
Credits-to: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch adds the initial infrastructure to allow a Runtime PM
implementation that sets the device to its D3 state. The patch just
adds the necessary callbacks and the initial infrastructure.
We still don't have any platform that actually uses this
infrastructure, we still don't call get/put in all the places we need
to, and we don't have any function to save/restore the state of the
registers. This is not a problem since no platform uses the code added
by this patch. We have a few people simultaneously working on runtime
PM, so this initial code could help everybody make their plans.
V2: - Move some functions to intel_pm.c
- Remove useless pm_runtime_allow() call at init
- Remove useless pm_runtime_mark_last_busy() call at get
- Use pm_runtime_get_sync() instead of 2 calls
- Add a WARN to check if we're really awake
V3: - Rebase.
V4: - Don't need to call pci_{save,restore}_state and
pci_set_power_sate, since they're already called by the PCI
layer
- Remove wrong pm_runtime_enable() call at init_runtime_pm
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In the current code, at haswell_modeset_global_resources, first we
decide if we want to enable/disable the power well, then we decide if
we want to enable/disable PC8. On the case where we're enabling PC8
this works fine, but on the case where we disable PC8 due to a non-eDP
monitor being enabled, we first enable the power well and then disable
PC8. Although wrong, this doesn't seem to be causing any problems now,
and we don't even see anything in dmesg. But the patches for runtime
D3 turn this problem into a real bug, so we need to fix it.
This fixes the "modeset-non-lpsp" subtest from the "pm_pc8" test from
intel-gpu-tools.
v2: - Rebase (i915_disable_power_well).
v3: - More reabase.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We already have some checks and shouldn't be reaching these places on
!HAS_PC8 platforms, but add a WARN, just in case.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The CRI clock is related to the display PHY, so the setup belongs
in intel_init_dpio().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We don't modify the packed infoframe data, so we should keep the
const qualifier in place. Just pass the buffer as 'const void *'
instead of 'const uint8_t *' and we can drop the cast entirely.
v2: Do intel_sdvo_write_infoframe() as well
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If one mode of a internal panel has more than one refresh rate, then a reduced
clock is found for the LFP (LVDS/eDP). This enables switching between low
and high frequency dynamically. Moving downclock calculation to intel_panel
so that it is common for LVDS and eDP.
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Signed-off-by: Pradeep Bhat <pradeep.bhat@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With the advent of hw context support, we gained some objects that are
pinned for the duration of their request. That is we can make aperture
space available by idling the GPU and in the process performing a
context switch back to the always-pinned default context. As such, we
should not conclude that there is no space in the aperture for the
current object until we have unpinned any such context objects.
Note that we also have the problem of outstanding pageflips preventing
eviction of their framebuffer objects to resolve.
Testcase: igt/gem_ctx_exec/eviction
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72507
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: lu hua <huax.lu@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since early sanitize and uncore sanitize are called one after the other,
I think, we can remove second forcewake reset which was are calling
twice in both the functions.
Note that this is merge fallout between
commit ef46e0d247
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Sat Nov 16 16:00:09 2013 +0100
drm/i915: restore the early forcewake cleanup
and
commit 521198a2e7
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Fri Aug 23 16:52:30 2013 +0300
drm/i915: sanitize forcewake registers on reset
Signed-off-by: Deepak S <deepak.s@intel.com>
[danvet: Explain how this came to be.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQEcBAABAgAGBQJSogqUAAoJEHm+PkMAQRiGM2MIAJrr5KEXEWuuAR4+JkkWBK7A
+dVT4n1MM4wP/aCIyriSlq7kgT03Wxk4Q4wKsj2wZvDQkNgEQjrctgIihc75jqi5
126nmT3YXJLwgDpFA3RHZUWve3j3vfUG53rRuk7K9Xx1sGWU3Ls7BuInvQZ//+QS
6UB4UuEAalmose5U8ToXQfMqZhjwreZKeb64TEZwFvu2klv4cnka1L/zHbmQGgRg
2Pfv+aUrjsYE8s9lkEKX8MIQsDn28Q5Lsv7XIEQwo2at4rYbJaxX6usuC1OI0MQ5
BLUn1GgtvOidq6FzSg6kXiA/MJYH3J0S+p4uULWAprxA+KeJRbWNRroM94W1qAk=
=1Wcq
-----END PGP SIGNATURE-----
Merge tag 'v3.13-rc3' into drm-intel-next-queued
Linux 3.13-rc3
I need a backmerge for two reasons:
- For merging the ppgtt patches from Ben I need to pull in the bdw
support.
- We now have duplicated calls to intel_uncore_forcewake_reset in the
setup code to due 2 different patches merged into -next and 3.13.
The conflict is silen so I need the merge to be able to apply
Deepak's fixup patch.
Conflicts:
drivers/gpu/drm/i915/intel_display.c
Trivial conflict, it doesn't even show up in the merge diff.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently, PC8 is enabled at modeset_global_resources, which is called
after intel_modeset_update_state. Due to this, there's a small race
condition on the case where we start enabling PC8, then do a modeset
while PC8 is still being enabled. The racing condition triggers a WARN
because intel_modeset_update_state will mark the CRTC as enabled, then
the thread that's still enabling PC8 might look at the data structure
and think that PC8 is being enabled while a pipe is enabled. Despite
the WARN, this is not really a bug since we'll wait for the
PC8-enabling thread to finish when we call modeset_global_resources.
The spec says the CRTC cannot be enabled when we disable LCPLL, so we
had a check for crtc->base.enabled. If we change to crtc->active we
will still prevent disabling LCPLL while the CRTC is enabled, and we
will also prevent the WARN above.
This is a replacement for the previous patch named
"drm/i915: get/put PC8 when we get/put a CRTC"
Testcase: igt/pm_pc8/modeset-lpsp-stress-no-wait
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So apparently under ridiculous amounts of memory pressure we can get
into trouble in do_switch when we try to move the old hw context
backing storage object onto the active lists.
With list debugging enabled that usually results in us chasing a
poisoned pointer - which means we've hit upon a vma that has been
removed from all lrus with list_del (and then deallocated, so it's a
real use-after free).
Ian Lister has done some great callchain chasing and noticed that we
can reenter do_switch:
i915_gem_do_execbuffer()
i915_switch_context()
do_switch()
from = ring->last_context;
i915_gem_object_pin()
i915_gem_object_bind_to_gtt()
ret = drm_mm_insert_node_in_range_generic();
// If the above call fails then it will try i915_gem_evict_something()
// If that fails it will call i915_gem_evict_everything() ...
i915_gem_evict_everything()
i915_gpu_idle()
i915_switch_context(DEFAULT_CONTEXT)
Like with everything else where the shrinker or eviction code can
invalidate pointers we need to reload relevant state.
Note that there's no need to recheck whether a context switch is still
required because:
- Doing a switch to the same context is harmless (besides wasting a
bit of energy).
- This can only happen with the default context. But since that one's
pinned we'll never call down into evict_everything under normal
circumstances. Note that there's a little driver bringup fun
involved namely that we could recourse into do_switch for the
initial switch. Atm we're fine since we assign the context pointer
only after the call to do_switch at driver load or resume time. And
in the gpu reset case we skip the entire setup sequence (which might
be a bug on its own, but definitely not this one here).
Cc'ing stable since apparently ChromeOS guys are seeing this in the
wild (and not just on artificial stress tests), see the reference.
Note that in upstream code doesn't calle evict_everything directly
from evict_something, that's an extension in this product branch. But
we can still hit upon this bug (and apparently we do, see the linked
backtraces). I've noticed this while trying to construct a testcase
for this bug and utterly failed to provoke it. It looks like we need
to driver the system squarly into the lowmem wall and provoke the
shrinker to evict the context object by doing the last-ditch
evict_everything call.
Aside: There's currently no means to get a badly-fragmenting hw
context object away from a bad spot in the upstream code. We should
fix this by at least adding some code to evict_something to handle hw
contexts.
References: https://code.google.com/p/chromium/issues/detail?id=248191
Reported-by: Ian Lister <ian.lister@intel.com>
Cc: Ian Lister <ian.lister@intel.com>
Cc: stable@vger.kernel.org
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Cc: Bloomfield, Jon <jon.bloomfield@intel.com>
Tested-by: Rafael Barbalho <rafael.barbalho@intel.com>
Reviewed-by: Ian Lister <ian.lister@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Shovel a bit more of the the code into the setup function, and call
it earlier. Otherwise lockdep is unhappy since we cancel the delayed
resume work before it's initialized.
While at it also shovel the pc8 setup code into the same functions.
I wanted to also ditch the header declaration of the hws pc8 functions,
but for unfathomable reasons that stuff is in intel_display.c instead
of intel_pm.c.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71980
Tested-by: Guo Jinxian <jinxianx.guo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we force the hw to idle as our first step during unload, we can abort
the unload upon failure. Later we can probe whether the hardware remain
active even after we try to shut it down.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just flushing out my pile of bugfixes, most of them for regressions/cc:
stable. Nothing really serious going on.
For outstanding issues we still have the S4 fun due to the hsw S4
duct-tape pending (seems like I need to switch into angry maintainer mode
on that one). And there's the mode merging revert to make my g33 work
again still pending for drm core. For that one I don't have any more clue
(and it looks like no one else has a good idea either). And apparently the
locking WARN fix in here also needs to be replicated for boot, still
confirming that one though.
* tag 'drm-intel-fixes-2013-12-02' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: Pin pages whilst allocating for dma-buf vmap()
drm/i915: MI_PREDICATE_RESULT_2 is HSW only
drm/i915: Make the DERRMR SRM target global GTT
drm/i915: use the correct force_wake function at the PC8 code
drm/i915: Fix pipe CSC post offset calculation
drm/i915: Simplify DP vs. eDP detection
drm/i915: Check VBT for eDP ports on VLV
drm/i915: use crtc_htotal in watermark calculations to match fastboot v2
drm/i915: Pin relocations for the duration of constructing the execbuffer
drm/i915: take mode config lock around crtc disable at suspend
drm/i915: Prefer setting PTE cache age to 3
drm/i915/ddi: set sink to power down mode on dp disable
Inorder to serialise the closing of the file descriptor and its
subsequent release of client requests with i915_gem_free_request(), we
need to hold the struct_mutex in i915_gem_release(). Failing to do so
has the potential to trigger an OOPS, later with a use-after-free.
Testcase: igt/gem_close_race
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70874
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71029
Reported-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Doing it early prevents moving and relocating objects in vain
for contexts that won't get any GPU time.
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It is useful to assert that if the object is bound, then it must have
its pages pinned to prevent the shrinker from reaping its backing store.
This is even more useful with the introduction of real-ppgtt whereupon
we may have the object bound into several vma, with each instance
pinning the backing store. This assertion breaks down during unbind
where we unpinned the backing store before decoupling the vma binding.
This can be fixed with a trivial reording of the unbind sequence, which
reinforces the
pin pages
bind to vma
...
unbind from vma
unpin pages
concept.
v2: Bonus comment
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On VLV, FIFO will be shared by both SW and HW. So, we read the
free entries through register and update dev_priv variable
and wait for only 20 entries to be free
From Deepak's follow-up mail explaining why vlv is special:
"On SB, Out of 64 FIFO Entries, 20 Entries will be used by HW and
remaining 44 will be used by the SW,. I think due to this reason, we
have a threshold of 20 Entries."
"On VLV, HW and SW can access all 64 fifo entries, I don't think
having a threshold of 20 Entries is mandatory on VLV. Also, since both
SW and HW can access all 64 Entries. I think on VLV, we need to update
the fifo_count before waiting for the FIFO."
v2: Apply mask when we read the number of free FIFO entries (Ville).
v3: Mask applied after reading the register (Deepak).
Signed-off-by: Deepak S <deepak.s@intel.com>
[danvet: Add further explanation from Deepak to commit message.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Only plane A is FBC capable on gen2 (like gen3), but the panel fitter
is hooked up to pipe B, so we want to prefer pipe B + plane A.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add the code comment Chris requested in his review.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Initialize the FBC vfuncs on gen2 and gen3 chipsets. Also make
a clean split for gen7+ vs. gen5+ vfunc initialization.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On gen2 and gen3 chipsets FBC is supported only on plane A. Fix (and
simplify) the plane checks in intel_update_fbc() accordingly.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilons <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add a REG_WRITE_FOOTER macro as a counterpart to the REG_WRITE_HEADER.
The current code has the spin_lock() in the HEADER, but the
spin_unlock() is open coded, which looks rather confusing on the first
glance. A bit of additional symmetry might help.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We don't have clock state readout support for DDI, so skip the pipe
config clock checks on all DDI platforms.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We call intel_modeset_setup_hw_state() along two paths, driver
load/resume and after a lid event notification. During initialisation of
the driver, it is imperative that we reset the config state. This
correctly sets up the initial connector statuses and prepares the
hardware for a thorough probing. However, during a lid event, we only
want to undo the damage caused by the bios by resetting our last known
mode. In this cirumstance, we do not want to clobber our desired state.
In order to try and keep sanity between the config state and our own
tracking, do the drm_mode_config_reset() first along the load/resume
paths before reading out the hw state and apply any definite known
corrections.
v2: "As discussed on irc I don't think we should force the connector
state to anything here: Imo connector->status should reflect what we
believe to be the true output connection state, whereas connector->encoder
reflects whether this connector is wired up to a pipe. And since we no
longer reject modeset on disconnected connectors and never nuked the pipe
if the connector gets disconnected there's no reason for that - such policy
is userspace's job.
This regression has been introduced in
commit 2e9388923e
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Oct 11 20:08:24 2012 +0200
drm/i915/crt: explicitly set up HOTPLUG_BITS on resume"
so sayeth Daniel.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org (v3.8 and later)
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some lower level things get angry if we don't have modeset locks
during intel_modeset_setup_hw_state(). Actually the resume and
lid_notify codepaths alreday hold the locks, but the init codepath
doesn't, so fix that.
Note: This slipped through since we only disable pipes if the
plane/pipe linking doesn't match. Which is only relevant on older
gen3 mobile machines, if the BIOS fails to set up our preferred
linking.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Tested-and-reported-by: Paul Bolle <pebolle@tiscali.nl>
[danvet: Add note now that I could confirm my theory with the log
files Paul Bolle provided.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When inspecting reports that boot/suspend/resume times are unusual it
would be useful to clearly identify the time we must spend waiting for
the hardware to complete its task. In this case we have a notification
before we start waiting for the panel to change state, but none
afterwards - which would be useful.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Checkpatch tells me
WARNING: __packed is preferred over __attribute__((packed))
so switch over to __packed across the driver before adding new packed
structs.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Check that the N and P dividers don't cause a divide by zero.
This shouldn't happen under normal circumstances, but can
happen eg. under simulation.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's a pain for two reasons:
- The vga plane redisablign requires actual legacy vgao i/o to pull
of. The hw engineers really botched this one here :(
- There seem to be some BIOS out there which send out lid events when
unplugging. Together with our broken DP code, which disables the
port when the cable is lost, this results in an immediate modeset
call, which can hang on the wait for outstanding flips.
- Also we don't want to force a modeset on machines where it's not
really needed, see the referenced bug.
We might want to extend this in general to also all machines that
support opregion, since there the BIOS supposedly should manage the
gfx hardware more cooperatively.
v2: Pimp commit message a bit.
Cc: Roland Dreier <roland@kernel.org>
References: https://bugs.freedesktop.org/show_bug.cgi?id=65486
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
During the vmap() routine for the dma-buf, we first grab the pages and
then try to allocate a temporary array to pass to the vmap(). However,
the shrinker can and will reap any object that is unbound if the
allocation for the array first fails. This includes the object which we
are attempting to vmap(). The solution is to mark the object's pages as
pinned whilst we try the allocation to prevent the use-after-free
introduced by the potential shrinkage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We're currently misprinting the port name when vlv_wait_port_ready()
times out. Fix it by using port_name().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The MI_PREDICATE_RESULT_2 register exits only on HSW. On other
platforms the same offset is either reserved, or contains some
other register. So write the register only on HSW.
This regression has been introduced in
commit 9435373ef8
Author: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Date: Wed Aug 28 16:45:46 2013 -0300
drm/i915: Report enabled slices on Haswell GT3
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Add regression notice.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The ring scratch pages don't have a PPGTT mapping, so the DERRM SRM
should target the global GTT instead.
v2: Add MI_SRM_LRM_GLOBAL_GTT define for -fixes
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When I submitted the first patch adding these force wake functions,
Chris Wilson observed that I was using the wrong functions, so I sent
a second version of the patch to correct this problem. The problem is
that v1 was merged instead of v2.
I was able to notice the problem when running the
debugfs-forcewake-user subtest of pm_pc8 from intel-gpu-tools.
Cc: stable@vger.kernel.org
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We were miscalculating the pipe CSC post offset for the full->limited
range conversion. The resulting post offset was double what it was
supposed to be, which caused blacks to come out grey when using
limited range output on HSW+.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71769
Cc: stable@vger.kernel.org
Tested-by: Lauri Mylläri <lauri.myllari@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We had some mode_valid() vfuncs returning an int, others the enum. Let's
use the latter everywhere.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Atm we call intel_display_power_enabled() from
i915_capture_error_state() in IRQ context and then take a mutex. To fix
this add a new intel_display_power_enabled_sw() which returns the domain
state based on software tracking as opposed to reading the actual HW
state.
Since we use domain_use_count for this without locking on the reader
side make sure we increase the counter only after enabling all required
power wells and decrease it before disabling any of these power wells.
Regression introduced in
commit 1b02383464b4a915627ef3b8fd0ad7f07168c54c
Author: Imre Deak <imre.deak@intel.com>
Date: Tue Sep 24 16:17:09 2013 +0300
drm/i915: support for multiple power wells
Note that atm we depend on the value returned by
intel_display_power_enabled_sw() in i915_capture_error_state() to avoid
unclaimed register access reports. This was never guaranteed though,
since another thread can disable the power concurrently. If this is a
problem we need another explicit way to disable the reporting during
error captures.
v2:
- remove barriers as the caller can't depend on the value
returned from i915_capture_error_state_sw() anyway (Ville)
- dump the state of pipe/transcoder power domain state (Daniel)
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This should just be a debug. Add another debug msg to the inherit path
while we're at it.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72098
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reduce the eDP detection to just checking if it's port A, or if
the VBT tells us that the port is eDP for the other ports.
Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
VLV can have eDP on either port B or C, or even both. Based on the
VBT spec, intel_dpd_is_edp() should work on VLV too, assuming we
check the correct ports.
So instead of hardcoding port D, rename the function to
intel_dp_is_edp() and pass the port as a parameter, and use it
on VLV ports B and C.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71051
Tested-by: Robert Hooker <robert.hooker@canonical.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Wrestle the patch to apply and compile properly.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Setting this bit restores all ring contexts in parallel rather than
serially. Matches current BWG recommendations.
Tested-by: "Meng, Mengmeng" <mengmeng.meng@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Deepak S <deepak.s@inel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We use timeout mode, and we need to lower the timeout to get good RC6
residency when loads are running. This gets me from 0% residency during
glxgears to 77%, which is a pretty good improvement. This value also
matches the current BWG recommentations.
Tested-by: "Meng, Mengmeng" <mengmeng.meng@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Deepak S <deepak.s@inel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It leads to a big mess when stuff interleaves. Especially with the new
patch I've submitted for the drm core to no longer artificially split
up debug messages.
v2: The size parameter to snprintf includes the terminating 0, but the
return value does not. Adjust the logic accordingly. Spotted by Mika.
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>