Pull in drm-next with Dave's DP MST support so that I can merge some
conflicting patches which also touch the driver load sequencing around
interrupt handling.
Conflicts:
drivers/gpu/drm/i915/intel_display.c
drivers/gpu/drm/i915/intel_dp.c
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The current code only runs when we do an I915_WRITE operation. It
checks if the unclaimed register flag is set before we do the
operation, and then it checks it again after we do the operation. This
double check allows us to find out if the I915_WRITE operation in
question is the bad one, or if some previous code is the bad one. When
it finds a problem, our code uses DRM_ERROR to signal it.
The good thing about the current code is that it detects the problem,
so at least we can know we did something wrong. The problem is that
even though we find the problem, we don't really have much information
to actually debug it. So whenever I see one of these DRM_ERROR
messages on my systems, the first thing I do is apply a patch to
change the DRM_ERROR to a WARN and also check for unclaimed registers
on I915_READ operations. This local patch makes things even slower,
but it usually helps a lot in finding the bad code.
The first point here is that since the current code is only useful to
detect whether we have a problem or not, but it is not really good to
find the cause of the problem, I don't think we should be checking
both before and after every I915_WRITE operation: just doing the check
once should be enough for us to quickly detect problems. With this
change, the code that runs by default for every single user will only
do 1 read operation for every single I915_WRITE, instead of 2. This
patch does this change.
The second point is that the local patch I have should be upstream,
but since it makes things slower it should be disabled by default. So
I added the i915.mmio_debug option to enable it.
So after this patch, this is what will happen:
- By default, we will try to detect unclaimed registers once after
every I915_WRITE operation. Previously we tried twice for every
I915_WRITE.
- When we find an unclaimed register we will still print a DRM_ERROR
message, but we will now tell the user to try again with
i915.mmio_debug=1.
- When we use i915.mmio_debug=1 we will try to find unclaimed
registers both before and after every I915_READ and I915_WRITE
operation, and we will print stack traces in case we find them.
This should really help locating the exact point of the bad code
(or at least finding out that i915.ko is not the problem).
This commit also opens space for really-slow register debugging
operations on other platforms. In theory we can now add lots and lots
of debug code behind i915.mmio_debug, enable this option on our tests,
and catch more problems.
v2: - Remove not-so-useful comments (Daniel)
- Fix the param definition macros (Rodrigo)
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we use the runtime IRQ enable/disable functions in our suspend
path, we can simply check the pm._irqs_disabled flag everywhere. So
rename it to catch the users, and add an inline for it to make the
checks clear everywhere.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In the future, we'll need the height of the fb to fetch from memory for
WM computation.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I've tried to split this up, but all the changes are so tightly
related that I didn't find a good way to do this without breaking
bisecting. Essentially this completely changes how psr is glued into
the overall driver, and there's not much you can do to soften such a
paradigm change.
- Use frontbuffer tracking bits stuff to separate disable and
re-enable.
- Don't re-check everything in the psr work. We have now accurate
tracking for everything, so no need to check for sprites or tiling
really. Allows us to ditch tons of locks.
- That in turn allows us to properly cancel the work in the disable
function - no more deadlocks.
- Add a check for HSW sprites and force a flush. Apparently the
hardware doesn't forward the flushing when updating the sprite base
address. We can do the same trick everywhere else we have such
issues, e.g. on baytrail with ... everything.
- Don't re-enable psr with a delay in psr_exit. It really must be
turned off forever if we detect a gtt write. At least with the
current frontbuffer render tracking. Userspace can do a busy ioctl
call or no-op pageflip to re-enable psr.
- Drop redundant checks for crtc and crtc->active - now that they're
only called from enable this is guaranteed.
- Fix up the hsw port check. eDP can also happen on port D, but the
issue is exactly that it doesn't work there. So an || check is
wrong.
- We still schedule the psr work with a delay. The frontbuffer
flushing interface mandates that we upload the next full frame, so
need to wait a bit. Once we have single-shot frame uploads we can do
better here.
v2: Don't enable psr initially, rely upon the fb flush of the initial
plane setup for that. Gives us more unified code flow and makes the
crtc enable sequence less a special case.
v3: s/psr_exit/psr_invalidate/ for consistency
v4: Fixup whitespace.
v5: Correctly bail out of psr_invalidate/flush when
dev_priv->psr.enabled is NULL. Spotted by Rodrigo.
v6:
- Only schedule work when there's work to do. Fixes WARNINGs reported
by Rodrigo.
- Comments Chris requested to clarify the code.
v7: Fix conflict on rebase (Rodrigo)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's not really optional to have locking ...
The ugly part is how much locking the psr work needs since it has to
recheck everything. Which is way too much. But we need to ditch the
psr work in it's current form anyway and implement proper frontbuffer
tracking.
The other nasty bit that had to go was the delayed work cancle in
psr_exit. Which means a bunch of races just became a bit more likely,
but mea culpa.
v2: Fixup HAS_PSR checks, resulting in uninitialized mutex issues.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Trying to fish that one out through looping is a bit a locking
nightmare. So just set it and use it in the work struct.
v2:
- Don't Oops in psr_work, spotted by Rodrigo.
- Fix compile warning.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Due to runtime pm and system s/r we need to restore hw state every
time we enable a pipe again. Hence trying to avoid that is just
pointless book-keeping which Rodrigo then tried to work around by
manually adding psr_setup calls to our resume code.
Much simpler to just remove code instead.
v2: Properly bail out of psr exit if psr isn't enabled. Spotted by
Rodrigo.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On VLV, after i915_pm_suspend display power wells are staying
power ungated. So, after initiating mem sleep "echo mem > /sys/power/state"
Display is staing D0 State. There might be better way/place to power gate
these wells. Also, we need to make sure that if wells are power gated due to
DPMS OFF sequence, they need not be turned off by i915_pm_suspend again.
v2: Extracted helper for intel_crtc_disable and power gating CRTC power wells.
[Daniel]
Cc: Imre Deak <imre.deak@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Change-Id: I34c80da66aa24c423a5576c68aa1f3a8d0f43848
Signed-off-by: Borun Fu <borun.fu@intel.com>
Signed-off-by: Sagar Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This adds DP 1.2 MST support on Haswell systems.
Notes:
a) this reworks irq handling for DP MST ports, so that we can
avoid the mode config locking in the current hpd handlers, as
we need to process up/down msgs at a better time.
Changes since v0.1:
use PORT_PCH_HOTPLUG to detect short vs long pulses
add a workqueue to deal with digital events as they can get blocked on the
main workqueue beyong mode_config mutex
fix a bunch of modeset checker warnings
acks irqs in the driver
cleanup the MST encoders
Changes since v0.2:
check irq status again in work handler
move around bring up and tear down to fix DPMS on/off
use path properties.
Changes since v0.3:
updates for mst apis
more state checker fixes
irq handling improvements
fbcon handling support
improved reference counting of link - fixes redocking.
Changes since v0.4:
handle gpu reset hpd reinit without oopsing
check link status on HPD irqs
fix suspend/resume
Changes since v0.5:
use proper functions to get max link/lane counts
fix another checker backtrace - due to connectors disappearing.
set output type in more places fro, unknown->displayport
don't talk to devices if no HPD asserted
check mst on short irqs only
check link status properly
rebase onto prepping irq changes.
drop unsued force_act
Changes since v0.6:
cleanup unused struct entry.
[airlied: fix some sparse warnings].
Reviewed-by: Todd Previte <tprevite@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- fbc improvements when stolen memory is tight (Ben)
- cdclk handling improvements for vlv/chv (Ville)
- proper fix for stuck primary planes on gmch platforms with cxsr (Imre&Ebgert
Eich)
- gen8 hw semaphore support (Ben)
- more execlist prep work from Oscar Mateo
- locking fixes for primary planes (Matt Roper)
- code rework to support runtime pm for dpms on hsw/bdw (Paulo, Imre & me), but
not yet enabled because some fixes from Paulo haven't made the cut
- more gpu boost tuning from Chris
- as usual piles of little things all over
* tag 'drm-intel-next-2014-07-11' of git://anongit.freedesktop.org/drm-intel: (93 commits)
drm/i915: Make the RPS interrupt generation mask handle the vlv wa
drm/i915: Move RPS evaluation interval counters to i915->rps
drm/i915: Don't cast a pointer to void* unnecessarily
drm/i915: don't read LVDS regs at compute_config time
drm/i915: check the power domains in intel_lvds_get_hw_state()
drm/i915: check the power domains in ironlake_get_pipe_config()
drm/i915: don't skip shared DPLL assertion on LPT
drm/i915: Only touch WRPLL hw state in enable/disable hooks
drm/i915: Switch to common shared dpll framework for WRPLLs
drm/i915: ->enable hook for WRPLLs
drm/i915: ->disable hook for WRPLLs
drm/i915: State readout support for WRPLLs
drm/i915: add POWER_DOMAIN_PLLS
drm/i915: Document that the pll->mode_set hook is optional
drm/i915: Basic shared dpll support for WRPLLs
drm/i915: Precompute static ddi_pll_sel values in encoders
drm/i915: BDW also has special-purpose DP DDI clocks
drm/i915: State readout and cross-checking for ddi_pll_sel
drm/i915: Move ddi_pll_sel into the pipe config
drm/i915: Add a debugfs file for the shared dpll state
...
We need mem_freq or cz clock for freq/opcode conversion
Signed-off-by: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
No need to re-read the hardware rps fuses when we already have all the
values tucked away in dev_priv->rps.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Place the RPS counters inside the RPS struct.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Mostly this patch is one big excersize in deleting code and asserts
which are no longer needed. Note that we still abuse the shared dpll
framework a bit since we call the enable/disable functions from the
crtc mode_set and off hooks. But changing the actual hardware sequence
will be done in the next step.
Note that besides the massive amount of changes in this patch the
places and order in which the low-level WRPLL code is called is
absolutely unchanged.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[imre: rebased on patchset version w/o pch/crt/fdi refactoring]
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Still tacked onto the side, but slowly getting there.
v2: Don't forget the debugfs file.
v3 (from Paulo): Don't forget to check the power domains.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
And get/put it when needed. The special thing about this commit is
that it will now return false in ibx_pch_dpll_get_hw_state() in case
the power domain is not enabled. This will fix some WARNs we have when
we run pm_rpm on SNB.
Testcase: igt/pm_rpm
Bugzilla:https://bugs.freedesktop.org/show_bug.cgi?id=80463
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just filing in names and ids, but not yet officially registering them
so that the hw state cross checker doesn't completely freak out about
them. Still since we do already read out and cross check
config->shared_dpll the basics are now there to flesh out the wrpll
shared dpll implementation.
The idea is now to roll out all the callbacks step-by-step and then at
the end switch to the shared dpll framework. This way hw and sw
changes are clearly separated.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[imre: added const to hsw_ddi_pll_names (Damien)]
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
SPLL would be a reference clock we could potentially share,
especially if we want to use the SSC mode. But currently we
don't, so let's rip out this complexity for a simpler conversion
to the new display pll framework.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
commit c675949ec5
Author: Jani Nikula <jani.nikula@intel.com>
Date: Wed Apr 9 11:31:37 2014 +0300
drm/i915: do not setup backlight if not available according to VBT
caused a regression on machines with a misconfigured VBT. Add a quirk to
assert the presence of a controllable backlight. Use it to ignore the VBT
backlight presence check during backlight setup.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79813
Tested-by: James Duley <jagduley@gmail.com>
Tested-by: Michael Mullin <masmullin@gmail.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org # 3.15 only
[danvet: Add cc: stable because the regressing commit is in 3.15.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
- Accurate frontbuffer tracking and frontbuffer rendering invalidate, flush and
flip events. This is prep work for proper PSR support and should also be
useful for DRRS&fbc.
- Runtime suspend hardware on system suspend to support the new SOix sleep
states, from Jesse.
- PSR updates for broadwell (Rodrigo)
- Universal plane support for cursors (Matt Roper), including core drm patches.
- Prefault gtt mappings (Chris)
- baytrail write-enable pte bit support (Akash Goel)
- mmio based flips (Sourab Gupta) instead of blitter ring flips
- interrupt handling race fixes (Oscar Mateo)
And old, not yet merged features from the previous round:
- rps/turbo support for chv (Deepak)
- some other straggling chv patches (Ville)
- proper universal plane conversion for the primary plane (Matt Roper)
- ppgtt on vlv from Jesse
- pile of cleanups, little fixes for insane corner cases and improved debug
support all over
* tag 'drm-intel-next-2014-06-20' of git://anongit.freedesktop.org/drm-intel: (99 commits)
drm/i915: Update DRIVER_DATE to 20140620
drivers/i915: Fix unnoticed failure of init_ring_common()
drm/i915: Track frontbuffer invalidation/flushing
drm/i915: Use new frontbuffer bits to increase pll clock
drm/i915: don't take runtime PM reference around freeze/thaw
drm/i915: use runtime irq suspend/resume in freeze/thaw
drm/i915: Properly track domain of the fbcon fb
drm/i915: Print obj->frontbuffer_bits in debugfs output
drm/i915: Introduce accurate frontbuffer tracking
drm/i915: Drop schedule_back from psr_exit
drm/i915: Ditch intel_edp_psr_update
drm/i915: Drop unecessary complexity from psr_inactivate
drm/i915: Remove ctx->last_ring
drm/i915/chv: Ack interrupts before handling them (CHV)
drm/i915/bdw: Ack interrupts before handling them (GEN8)
drm/i915/vlv: Ack interrupts before handling them (VLV)
drm/i915: Ack interrupts before handling them (GEN5 - GEN7)
drm/i915: Don't BUG_ON in i915_gem_obj_offset
drm/i915: Grab dev->struct_mutex in i915_gem_pageflip_info
drm/i915: Add some L3 registers to the parser whitelist
...
Conflicts:
drivers/gpu/drm/i915/i915_drv.c
With RC6 enabled, BYT has an HW issue in determining the right
Gfx busyness.
WA for Turbo + RC6: Use SW based Gfx busy-ness detection to decide
on increasing/decreasing the freq. This logic will monitor C0
counters of render/media power-wells over EI period and takes
necessary action based on these values
v2: Refactor duplicate code. (Ville)
v3: Reformat the comments. (Ville)
v4: Enable required counters and remove unwanted code (Ville)
v5: Added frequency change acceleration support and remove kernel-doc
style comments. (Ville)
v6: Updated comment section and Fix w/a comment. (Ville)
Signed-off-by: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
A bit of background on the context elements.
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Appease checkpatch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is an Execlists preparatory patch, since they make context ID become an
overloaded term:
- In the software, it was used to distinguish which context userspace was
trying to use.
- In the BSpec, the term is used to describe the 20-bits long field the
hardware uses to it to discriminate the contexts that are submitted to
the ELSP and inform the driver about their current status (via Context
Switch Interrupts and Context Status Buffers).
Initially, I tried to make the different meanings converge, but it proved
impossible:
- The software ctx->id is per-filp, while the hardware one needs to be
globally unique.
- Also, we multiplex several backing states objects per intel_context,
and all of them need unique HW IDs.
- I tried adding a per-filp ID and then composing the HW context ID as:
ctx->id + file_priv->id + ring->id, but the fact that the hardware only
uses 20-bits means we have to artificially limit the number of filps or
contexts the userspace can create.
The ctx->user_handle renaming bits are done with this Cocci patch (plus
manual frobbing of the struct declaration):
@@
struct intel_context c;
@@
- (c).id
+ c.user_handle
@@
struct intel_context *c;
@@
- (c)->id
+ c->user_handle
Also, while we are at it, s/DEFAULT_CONTEXT_ID/DEFAULT_CONTEXT_HANDLE and
change the type to unsigned 32 bits.
v2: s/handle/user_handle and change the type to uint32_t as suggested by
Chris Wilson.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1)
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have already advanced that Logical Ring Contexts have their own kind
of backing objects, but everything will be better explained in the Execlists
series. For now, suffice it to say that the current backing object is only
ever used with the render ring, so we're making this fact more explicit
(which is a good reason on its own).
As for the is_initialized flag, we only use to signify that the render state
has been initialized (a.k.a. golden context, a.k.a. null context). It doesn't
mean anything for the other engines, so make that distinction obvious.
Done with the following Coccinelle patch (plus manual frobbing of the struct):
@@
struct intel_context c;
@@
- (c).obj
+ c.legacy_hw_ctx.rcs_state
@@
struct intel_context *c;
@@
- (c)->obj
+ c->legacy_hw_ctx.rcs_state
@@
struct intel_context c;
@@
- (c).is_initialized
+ c.legacy_hw_ctx.initialized
@@
struct intel_context *c;
@@
- (c)->is_initialized
+ c->legacy_hw_ctx.initialized
This Execlists prep-work patch has been suggested by Chris Wilson and Daniel
Vetter separately.
Initially, it was two separate patches:
drm/i915: Rename ctx->obj to ctx->rcs_state
drm/i915: Make it obvious that ctx->id is merely a user handle
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: s/id/is_initialized/ to fix the subject and resolve a
conflict in i915_gem_context_reset. Also introduce a new lctx local
variable to avoid overtly long lines.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some drivers need to be able to have a perfect race-free fbcon setup.
Current drivers only enable hotplug processing after the call to
drm_fb_helper_initial_config which leaves a tiny but important race.
This race is especially noticable on embedded platforms where the
driver itself enables the voltage for the hdmi output, since only then
will monitors (after a bit of delay, as usual) respond by asserting
the hpd pin.
Most of the infrastructure is already there with the split-out
drm_fb_helper_init. And drm_fb_helper_initial_config already has all
the required locking to handle concurrent hpd events since
commit 53f1904bce
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Mar 20 14:26:35 2014 +0100
drm/fb-helper: improve drm_fb_helper_initial_config locking
The only missing bit is making drm_fb_helper_hotplug_event save
against concurrent calls of drm_fb_helper_initial_config. The only
unprotected bit is the check for fb_helper->fb.
With that drivers can first initialize the fb helper, then enabel
hotplug processing and then set up the initial config all in a
completely race-free manner. Update kerneldoc and convert i915 as a
proof of concept.
Feature requested by Thierry since his tegra driver atm reliably boots
slowly enough to misses the hotplug event for an external hdmi screen,
but also reliably boots to quickly for the hpd pin to be asserted when
the fb helper calls into the hdmi ->detect function.
Cc: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Since the semaphore information is in an object, just dump it, and let
the user parse it later.
NOTE: The page being used for the semaphores are incoherent with the
CPU. No matter what I do, I cannot figure out a way to read anything but
0s. Note that the semaphore waits are indeed working.
v2: Don't print signal, and wait (they should be the same). Instead,
print sync_seqno (Chris)
v3: Free the semaphore error object (Chris)
v4: Fix semaphore offset calculation during error state collection
(Ville)
v5: VCS2 rebase
Make semaphore object error capture coding style consistent (Ville)
Do the proper math for the signal offset (Ville)
v6: Fix small conflicts on rebase and s/ring_buffer/engine_cs (Rodrigo)
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Semaphore signalling works similarly to previous GENs with the exception
that the per ring mailboxes no longer exist. Instead you must define
your own space, somewhere in the GTT.
The comments in the code define the layout I've opted for, which should
be fairly future proof. Ie. I tried to define offsets in abstract terms
(NUM_RINGS, seqno size, etc).
NOTE: If one wanted to move this to the HWSP they could. I've decided
one 4k object would be easier to deal with, and provide potential wins
with cache locality, but that's all speculative.
v2: Update the macro to not need the other ring's ring->id (Chris)
Update the comment to use the correct formula (Chris)
v3: Move the macros the ringbuffer.h to prevent churn in next patch
(Ville)
v4: Fixed compilation rebase conflict
commit 1ec9e26dda
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Fri Feb 14 14:01:11 2014 +0100
drm/i915: Consolidate binding parameters into flags
v5: VCS2 rebase
Replace hweight_long with hweight32
v6 (Rodrigo): * Add missed VC2 gen8 ring signal init
* fixing conflicst on rebase
* minor fixes on address table
* remove WARN_ON
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[danvet: s/BUG_ON/WARN_ON/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The 'i915_driver_preclose()' function has a parameter called 'file_priv'.
However, this is misleading as the structure it points to is a 'drm_file' not a
'drm_i915_file_private'. It should be named just 'file' to avoid confusion.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The digital ports from Ironlake and up have the ability to distinguish
between long and short HPD pulses. Displayport 1.1 only uses the short
form to request link retraining usually, so we haven't really needed
support for it until now.
However with DP 1.2 MST we need to handle the short irqs on their
own outside the modesetting locking the long hpd's involve. This
patch adds the framework to distinguish between short/long to the
current code base, to lay the basis for future DP 1.2 MST work.
This should mean we get better bisectability in case of regression
due to the new irq handling.
v2: add GM45 support (untested, due to lack of hw)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Todd Previte <tprevite@gmail.com>
[danvet: Fix conflicts in i915_irq.c with Oscar Mateo's irq handling
race fixes and a trivial one in intel_drv.h with the psr code.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This functionality will be also needed by an upcoming patch, so factor
it out. As a bonus this also makes things a bit more uniform across
platforms. Note that this also changes the register read-modify-write
to a simple write during disabling. This is what we do during enabling
anyway and according to the spec all the relevant bits are reserved-MBZ
or reserved with a 0 default value.
v2:
- unchanged
v3:
- fix missing cxsr disabling on pineview (Deepak)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJTuaWZAAoJEHm+PkMAQRiGfkIH/2Hhwrg51GWazUYIXVxz5zLU
kPMlaws3vankbhka9HCg02eS3tkzr6shO3F/qlBba+5GUkUDKCcCisIsvk4hgZZg
7YqepTvcaupNxIp4TmTGm1FYVK1GpaWFdJVgg2PDdGFahw3HSlfZoTkBzirNCwga
p/jfeRzathbUixpz9OAC1AEn2gP1AxNRpSt1wShL5rexBb1YRXCPuCEt9B0UsVoR
mzKf5xEsuaZnpCuvWK4S60fjfVhTe8UJ/xGPPfdLyIXU0rvhaKzfeVQO6F5nIQBy
Xvrar1f7oOPZaJRdlmPvAimS7iS8lq/YctuHu7ia1NdJSihtA5sRPf7cWAw2d7s=
=4PrL
-----END PGP SIGNATURE-----
Merge tag 'v3.16-rc4' into drm-intel-next-queued
Due to Dave's vacation drm-next hasn't opened yet for 3.17 so I
couldn't move my drm-intel-next queue forward yet like I usually do.
Just pull in the latest upstream -rc to unblock patch merging - I
don't want to needlessly rebase my current patch pile really and void
all the testing we've done already.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The GEN FBC unit provides the ability to set a low pass on frames it
attempts to compress. If a frame is less than a certain amount
compressibility (2:1, 4:1) it will not bother. This allows the driver to
reduce the size it requests out of stolen memory.
Unluckily, a few months ago, Ville actually began using this feature for
framebuffers that are 16bpp (not sure why not 8bpp). In those cases, we
are already using this mechanism for a different purpose, and so we can
only achieve one further level of compression (2:1 -> 4:1)
FBC GEN1, ie. pre-G45 is ignored.
The cleverness of the patch is Art's. The bugs are mine.
v2: Update message and including missing threshold case 3 (Spotted by Arthur).
Cc: Art Runyan <arthur.j.runyan@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
We are already using the size to determine whether or not to free the
object, so there is no functional change there. Almost everything else
has changed to static allocations of the drm_mm_node too.
Aside from bringing this inline with much of our other code, this makes
error paths slightly simpler, which benefits the look of an upcoming
patch.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jesse noticed that the punit communication needed to query the VLV power
well status can cause substantial delays. Since we can query the state
frequently, for example during I2C transfers, maintain a cached version
of the HW state to get rid of this delay.
This fixes at least one reported regression where boot time increased by
~4 seconds due to frequent power well state queries on VLV during eDP
EDID read.
This regression has been introduced in
commit bb4932c4f1
Author: Imre Deak <imre.deak@intel.com>
Date: Mon Apr 14 20:24:33 2014 +0300
drm/i915: vlv: check port power domain instead of only D0 for eDP VDD on
Reported-by: Jesse Barnes <jesse.barnes@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So these are the guts of the new beast. This tracks when a frontbuffer
gets invalidated (due to frontbuffer rendering) and hence should be
constantly scaned out, and when it's flushed again and can be
compressed/one-shot-upload.
Rules for flushing are simple: The frontbuffer needs one more full
upload starting from the next vblank. Which means that the flushing
can _only_ be called once the frontbuffer update has been latched.
But this poses a problem for pageflips: We can't just delay the
flushing until the pageflip is latched, since that would pose the risk
that we override frontbuffer rendering that has been scheduled
in-between the pageflip ioctl and the actual latching.
To handle this track asynchronous invalidations (and also pageflip)
state per-ring and delay any in-between flushing until the rendering
has completed. And also cancel any delayed flushing if we get a new
invalidation request (whether delayed or not).
Also call intel_mark_fb_busy in both cases in all cases to make sure
that we keep the screen at the highest refresh rate both on flips,
synchronous plane updates and for frontbuffer rendering.
v2: Lots of improvements
Suggestions from Chris:
- Move invalidate/flush in flush_*_domain and set_to_*_domain.
- Drop the flush in busy_ioctl since it's redundant. Was a leftover
from an earlier concept to track flips/delayed flushes.
- Don't forget about the initial modeset enable/final disable.
Suggested by Chris.
Track flips accurately, too. Since flips complete independently of
rendering we need to track pending flips in a separate mask. Again if
an invalidate happens we need to cancel the evenutal flush to avoid
races.
v3:
Provide correct header declarations for flip functions. Currently not
needed outside of intel_display.c, but part of the proper interface.
v4: Add proper domain management to fbcon so that the fbcon buffer is
also tracked correctly.
v5: Fixup locking around the fbcon set_to_gtt_domain call.
v6: More comments from Chris:
- Split out fbcon changes.
- Drop superflous checks for potential scanout before calling intel_fb
functions - we can micro-optimize this later.
- s/intel_fb_/intel_fb_obj_/ to make it clear that this deals in gem
object. We already have precedence for fb_obj in the pin_and_fence
functions.
v7: Clarify the semantics of the flip flush handling by renaming
things a bit:
- Don't go through a gem object but take the relevant frontbuffer bits
directly. These functions center on the plane, the actual object is
irrelevant - even a flip to the same object as already active should
cause a flush.
- Add a new intel_frontbuffer_flip for synchronous plane updates. It
currently just calls intel_frontbuffer_flush since the implemenation
differs.
This way we achieve a clear split between one-shot update events on
one side and frontbuffer rendering with potentially a very long delay
between the invalidate and flush.
Chris and I also had some discussions about mark_busy and whether it
is appropriate to call from flush. But mark busy is a state which
should be derived from the 3 events (invalidate, flush, flip) we now
have by the users, like psr does by tracking relevant information in
psr.busy_frontbuffer_bits. DRRS (the only real use of mark_busy for
frontbuffer) needs to have similar logic. With that the overall
mark_busy in the core could be removed.
v8: Only when retiring gpu buffers only flush frontbuffer bits we
actually invalidated in a batch. Just for safety since before any
additional usage/invalidate we should always retire current rendering.
Suggested by Chris Wilson.
v9: Actually use intel_frontbuffer_flip in all appropriate places.
Spotted by Chris.
v10: Address more comments from Chris:
- Don't call _flip in set_base when the crtc is inactive, avoids redunancy
in the modeset case with the initial enabling of all planes.
- Add comments explaining that the initial/final plane enable/disable
still has work left to do before it's fully generic.
v11: Only invalidate for gtt/cpu access when writing. Spotted by Chris.
v12: s/_flush/_flip/ in intel_overlay.c per Chris' comment.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The downclocking checks a few more things, so not that simple to
convert. Also, this should get unified with the drrs handling and also
use the locking of that. Otoh the drrs locking is about as hapzardous
as no locking, at least on first sight.
For easier conversion ditch the upclocking on unload - we'll turn off
everything anyway.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So from just a quick look we seem to have enough information to
accurately figure out whether a given gem bo is used as a frontbuffer
and where exactly: We have obj->pin_count as a first check with no
false negatives and only negligible false positives. And then we can
just walk the modeset objects and figure out where exactly a buffer is
used as scanout.
Except that we can't due to locking order: If we already hold
dev->struct_mutex we can't acquire any modeset locks, so could
potential chase freed pointers and other evil stuff.
So we need something else. For that introduce a new set of bits
obj->frontbuffer_bits to track where a buffer object is used. That we
can then chase without grabbing any modeset locks.
Of course the consumers of this (DRRS, PSR, FBC, ...) still need to be
able to do their magic both when called from modeset and from gem
code. But that can be easily achieved by adding locks for these
specific subsystems which always nest within either kms or gem
locking.
This patch just adds the relevant update code to all places.
Note that if we ever support multi-planar scanout targets then we need
one frontbuffer tracking bit per attachment point that we expose to
userspace.
v2:
- Fix more oopsen. Oops.
- WARN if we leak obj->frontbuffer_bits when freeing a gem buffer. Fix
the bugs this brought to light.
- s/update_frontbuffer_bits/update_fb_bits/. More consistent with the
fb tracking functions (fb for gem object, frontbuffer for raw bits).
And the function name was way too long.
v3: Size obj->frontbuffer_bits correctly so that all pipes fit in.
v4: Don't update fb bits in set_base on failure. Noticed by Chris.
v5: s/i915_gem_update_fb_bits/i915_gem_track_fb/ Also remove a few
local enum pipe variables which are now no longer needed to make the
function arguments no drop over the 80 char limit.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The original comment that introduced it said:
commit 0009e46cd5
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Dec 6 14:11:02 2013 -0800
drm/i915: Track which ring a context ran on
Previously we dropped the association of a context to a ring. It is
however very important to know which ring a context ran on (we could
have reused the other member, but I was nitpicky).
This is very important when we switch address spaces, which unlike
context objects, do change per ring.
As an example, if we have:
RCS BCS
ctx A
ctx A
ctx B
ctx B
Without tracking the last ring B ran on, we wouldn't know to switch the
address space on BCS in the last row.
But this is not really true, because we are already checking to != from (with
"from" being = ring->last_context) and that should be enough to make sure we
switch to the right address space.
We would have a problem if we switched the context object for every ring (since
then we would fail to do it in some situations) but we only switch it for the
render ring, so we don't care.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jesse's SOix work required some patches from acpi-next, so pull it in
through a topic barnch.
Conflicts:
drivers/gpu/drm/i915/i915_drv.c
drivers/gpu/drm/i915/intel_pm.c
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch enables the framework for using MMIO based flip calls,
in contrast with the CS based flip calls which are being used currently.
MMIO based flip calls can be enabled on architectures where
Render and Blitter engines reside in different power wells. The
decision to use MMIO flips can be made based on workloads to give
100% residency for Media power well.
v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)
v3: Rebasing on latest code. Code restructuring after incorporating
Damien's comments
v4: Addressing Ville's review comments
-general cleanup
-updating only base addr instead of calling update_primary_plane
-extending patch for gen5+ platforms
v5: Addressed Ville's review comments
-Making mmio flip vs cs flip selection based on module parameter
-Adding check for DRIVER_MODESET feature in notify_ring before calling
notify mmio flip.
-Other changes mostly in function arguments
v6: -Having a seperate function to check condition for using mmio flips (Ville)
-propogating error code from i915_gem_check_olr (Ville)
v7: -Adding __must_check with i915_gem_check_olr (Chris)
-Renaming mmio_flip_data to mmio_flip (Chris)
-Rebasing on latest nightly
v8: -Rebasing on latest code
-squash 3rd patch in series(mmio setbase vs page flip race) with this patch
-Added new tiling mode update in intel_do_mmio_flip (Chris)
v9: -check for obj->last_write_seqno being 0 instead of obj->ring being NULL in
intel_postpone_flip, as this is a more restrictive condition (Chris)
v10: -Applied Chris's suggestions for squashing patches 2,3 into this patch.
These patches make the selection of CS vs MMIO flip at the page flip time, and
make the module parameter for using mmio flips as tristate, the states being
'force CS flips', 'force mmio flips', 'driver discretion'.
Changed the logic for driver discretion (Chris)
v11: Minor code cleanup(better readability, fixing whitespace errors, using
lockdep to check mutex locked status in postpone_flip, removal of __must_check
in function definition) (Chris)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # snb, ivb
[danvet: Fix up parameter alignement checkpatch spotted.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This adds support for a write-enable bit in the entry of GTT.
This is handled via a read-only flag in the GEM buffer object which
is then used to see how to set the bit when writing the GTT entries.
Currently by default the Batch buffer & Ring buffers are marked as read only.
v2: Moved the pte override code for read-only bit to 'byt_pte_encode'. (Chris)
Fixed the issue of leaving 'gt_old_ro' as unused. (Chris)
v3: Removed the 'gt_old_ro' field, now setting RO bit only for Ring Buffers(Daniel).
v4: Added a new 'flags' parameter to all the pte(gen6) encode & insert_entries functions,
in lieu of overloading the cache_level enum (Daniel).
v5: Removed the superfluous VLV check & changed the definition location of PTE_READ_ONLY flag (Imre)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The perfect solution for psr_exit is the hardware tracking the changes and
doing the psr exit by itself. This scenario works for HSW and BDW with some
environments like Gnome and Wayland.
However there are many other scenarios that this isn't true. Mainly one right
now is KDE users on HSW and BDW with PSR on. User would miss many screen
updates. For instances any key typed could be seen only when mouse cursor is
moved. So this patch introduces the ability of trigger PSR exit on kernel side
on some common cases that.
Most of the cases are coverred by psr_exit at set_domain. The remaining cases
are coverred by triggering it at set_domain, busy_ioctl, sw_finish and
mark_busy.
The downside here might be reducing the residency time on the cases this
already work very wall like Gnome environment. But so far let's get focused
on fixinge issues sio PSR couild be used for everybody and we could even
get it enabled by default. Later we can add some alternatives to choose the
level of PSR efficiency over boot flag of even over crtc property.
v2: remove exit from connector_dpms. Daniel pointed this is the wrong way and
also this isn't needed for BDW and HSW anyway.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Vijay Purushothaman <vijay.a.purushothaman@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>