This is an Execlists preparatory patch, since they make context ID become an
overloaded term:
- In the software, it was used to distinguish which context userspace was
trying to use.
- In the BSpec, the term is used to describe the 20-bits long field the
hardware uses to it to discriminate the contexts that are submitted to
the ELSP and inform the driver about their current status (via Context
Switch Interrupts and Context Status Buffers).
Initially, I tried to make the different meanings converge, but it proved
impossible:
- The software ctx->id is per-filp, while the hardware one needs to be
globally unique.
- Also, we multiplex several backing states objects per intel_context,
and all of them need unique HW IDs.
- I tried adding a per-filp ID and then composing the HW context ID as:
ctx->id + file_priv->id + ring->id, but the fact that the hardware only
uses 20-bits means we have to artificially limit the number of filps or
contexts the userspace can create.
The ctx->user_handle renaming bits are done with this Cocci patch (plus
manual frobbing of the struct declaration):
@@
struct intel_context c;
@@
- (c).id
+ c.user_handle
@@
struct intel_context *c;
@@
- (c)->id
+ c->user_handle
Also, while we are at it, s/DEFAULT_CONTEXT_ID/DEFAULT_CONTEXT_HANDLE and
change the type to unsigned 32 bits.
v2: s/handle/user_handle and change the type to uint32_t as suggested by
Chris Wilson.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1)
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have already advanced that Logical Ring Contexts have their own kind
of backing objects, but everything will be better explained in the Execlists
series. For now, suffice it to say that the current backing object is only
ever used with the render ring, so we're making this fact more explicit
(which is a good reason on its own).
As for the is_initialized flag, we only use to signify that the render state
has been initialized (a.k.a. golden context, a.k.a. null context). It doesn't
mean anything for the other engines, so make that distinction obvious.
Done with the following Coccinelle patch (plus manual frobbing of the struct):
@@
struct intel_context c;
@@
- (c).obj
+ c.legacy_hw_ctx.rcs_state
@@
struct intel_context *c;
@@
- (c)->obj
+ c->legacy_hw_ctx.rcs_state
@@
struct intel_context c;
@@
- (c).is_initialized
+ c.legacy_hw_ctx.initialized
@@
struct intel_context *c;
@@
- (c)->is_initialized
+ c->legacy_hw_ctx.initialized
This Execlists prep-work patch has been suggested by Chris Wilson and Daniel
Vetter separately.
Initially, it was two separate patches:
drm/i915: Rename ctx->obj to ctx->rcs_state
drm/i915: Make it obvious that ctx->id is merely a user handle
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: s/id/is_initialized/ to fix the subject and resolve a
conflict in i915_gem_context_reset. Also introduce a new lctx local
variable to avoid overtly long lines.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some drivers need to be able to have a perfect race-free fbcon setup.
Current drivers only enable hotplug processing after the call to
drm_fb_helper_initial_config which leaves a tiny but important race.
This race is especially noticable on embedded platforms where the
driver itself enables the voltage for the hdmi output, since only then
will monitors (after a bit of delay, as usual) respond by asserting
the hpd pin.
Most of the infrastructure is already there with the split-out
drm_fb_helper_init. And drm_fb_helper_initial_config already has all
the required locking to handle concurrent hpd events since
commit 53f1904bce
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Mar 20 14:26:35 2014 +0100
drm/fb-helper: improve drm_fb_helper_initial_config locking
The only missing bit is making drm_fb_helper_hotplug_event save
against concurrent calls of drm_fb_helper_initial_config. The only
unprotected bit is the check for fb_helper->fb.
With that drivers can first initialize the fb helper, then enabel
hotplug processing and then set up the initial config all in a
completely race-free manner. Update kerneldoc and convert i915 as a
proof of concept.
Feature requested by Thierry since his tegra driver atm reliably boots
slowly enough to misses the hotplug event for an external hdmi screen,
but also reliably boots to quickly for the hpd pin to be asserted when
the fb helper calls into the hdmi ->detect function.
Cc: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Since the semaphore information is in an object, just dump it, and let
the user parse it later.
NOTE: The page being used for the semaphores are incoherent with the
CPU. No matter what I do, I cannot figure out a way to read anything but
0s. Note that the semaphore waits are indeed working.
v2: Don't print signal, and wait (they should be the same). Instead,
print sync_seqno (Chris)
v3: Free the semaphore error object (Chris)
v4: Fix semaphore offset calculation during error state collection
(Ville)
v5: VCS2 rebase
Make semaphore object error capture coding style consistent (Ville)
Do the proper math for the signal offset (Ville)
v6: Fix small conflicts on rebase and s/ring_buffer/engine_cs (Rodrigo)
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Semaphore signalling works similarly to previous GENs with the exception
that the per ring mailboxes no longer exist. Instead you must define
your own space, somewhere in the GTT.
The comments in the code define the layout I've opted for, which should
be fairly future proof. Ie. I tried to define offsets in abstract terms
(NUM_RINGS, seqno size, etc).
NOTE: If one wanted to move this to the HWSP they could. I've decided
one 4k object would be easier to deal with, and provide potential wins
with cache locality, but that's all speculative.
v2: Update the macro to not need the other ring's ring->id (Chris)
Update the comment to use the correct formula (Chris)
v3: Move the macros the ringbuffer.h to prevent churn in next patch
(Ville)
v4: Fixed compilation rebase conflict
commit 1ec9e26dda
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Fri Feb 14 14:01:11 2014 +0100
drm/i915: Consolidate binding parameters into flags
v5: VCS2 rebase
Replace hweight_long with hweight32
v6 (Rodrigo): * Add missed VC2 gen8 ring signal init
* fixing conflicst on rebase
* minor fixes on address table
* remove WARN_ON
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[danvet: s/BUG_ON/WARN_ON/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The 'i915_driver_preclose()' function has a parameter called 'file_priv'.
However, this is misleading as the structure it points to is a 'drm_file' not a
'drm_i915_file_private'. It should be named just 'file' to avoid confusion.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The digital ports from Ironlake and up have the ability to distinguish
between long and short HPD pulses. Displayport 1.1 only uses the short
form to request link retraining usually, so we haven't really needed
support for it until now.
However with DP 1.2 MST we need to handle the short irqs on their
own outside the modesetting locking the long hpd's involve. This
patch adds the framework to distinguish between short/long to the
current code base, to lay the basis for future DP 1.2 MST work.
This should mean we get better bisectability in case of regression
due to the new irq handling.
v2: add GM45 support (untested, due to lack of hw)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Todd Previte <tprevite@gmail.com>
[danvet: Fix conflicts in i915_irq.c with Oscar Mateo's irq handling
race fixes and a trivial one in intel_drv.h with the psr code.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This functionality will be also needed by an upcoming patch, so factor
it out. As a bonus this also makes things a bit more uniform across
platforms. Note that this also changes the register read-modify-write
to a simple write during disabling. This is what we do during enabling
anyway and according to the spec all the relevant bits are reserved-MBZ
or reserved with a 0 default value.
v2:
- unchanged
v3:
- fix missing cxsr disabling on pineview (Deepak)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJTuaWZAAoJEHm+PkMAQRiGfkIH/2Hhwrg51GWazUYIXVxz5zLU
kPMlaws3vankbhka9HCg02eS3tkzr6shO3F/qlBba+5GUkUDKCcCisIsvk4hgZZg
7YqepTvcaupNxIp4TmTGm1FYVK1GpaWFdJVgg2PDdGFahw3HSlfZoTkBzirNCwga
p/jfeRzathbUixpz9OAC1AEn2gP1AxNRpSt1wShL5rexBb1YRXCPuCEt9B0UsVoR
mzKf5xEsuaZnpCuvWK4S60fjfVhTe8UJ/xGPPfdLyIXU0rvhaKzfeVQO6F5nIQBy
Xvrar1f7oOPZaJRdlmPvAimS7iS8lq/YctuHu7ia1NdJSihtA5sRPf7cWAw2d7s=
=4PrL
-----END PGP SIGNATURE-----
Merge tag 'v3.16-rc4' into drm-intel-next-queued
Due to Dave's vacation drm-next hasn't opened yet for 3.17 so I
couldn't move my drm-intel-next queue forward yet like I usually do.
Just pull in the latest upstream -rc to unblock patch merging - I
don't want to needlessly rebase my current patch pile really and void
all the testing we've done already.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The GEN FBC unit provides the ability to set a low pass on frames it
attempts to compress. If a frame is less than a certain amount
compressibility (2:1, 4:1) it will not bother. This allows the driver to
reduce the size it requests out of stolen memory.
Unluckily, a few months ago, Ville actually began using this feature for
framebuffers that are 16bpp (not sure why not 8bpp). In those cases, we
are already using this mechanism for a different purpose, and so we can
only achieve one further level of compression (2:1 -> 4:1)
FBC GEN1, ie. pre-G45 is ignored.
The cleverness of the patch is Art's. The bugs are mine.
v2: Update message and including missing threshold case 3 (Spotted by Arthur).
Cc: Art Runyan <arthur.j.runyan@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
We are already using the size to determine whether or not to free the
object, so there is no functional change there. Almost everything else
has changed to static allocations of the drm_mm_node too.
Aside from bringing this inline with much of our other code, this makes
error paths slightly simpler, which benefits the look of an upcoming
patch.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jesse noticed that the punit communication needed to query the VLV power
well status can cause substantial delays. Since we can query the state
frequently, for example during I2C transfers, maintain a cached version
of the HW state to get rid of this delay.
This fixes at least one reported regression where boot time increased by
~4 seconds due to frequent power well state queries on VLV during eDP
EDID read.
This regression has been introduced in
commit bb4932c4f1
Author: Imre Deak <imre.deak@intel.com>
Date: Mon Apr 14 20:24:33 2014 +0300
drm/i915: vlv: check port power domain instead of only D0 for eDP VDD on
Reported-by: Jesse Barnes <jesse.barnes@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So these are the guts of the new beast. This tracks when a frontbuffer
gets invalidated (due to frontbuffer rendering) and hence should be
constantly scaned out, and when it's flushed again and can be
compressed/one-shot-upload.
Rules for flushing are simple: The frontbuffer needs one more full
upload starting from the next vblank. Which means that the flushing
can _only_ be called once the frontbuffer update has been latched.
But this poses a problem for pageflips: We can't just delay the
flushing until the pageflip is latched, since that would pose the risk
that we override frontbuffer rendering that has been scheduled
in-between the pageflip ioctl and the actual latching.
To handle this track asynchronous invalidations (and also pageflip)
state per-ring and delay any in-between flushing until the rendering
has completed. And also cancel any delayed flushing if we get a new
invalidation request (whether delayed or not).
Also call intel_mark_fb_busy in both cases in all cases to make sure
that we keep the screen at the highest refresh rate both on flips,
synchronous plane updates and for frontbuffer rendering.
v2: Lots of improvements
Suggestions from Chris:
- Move invalidate/flush in flush_*_domain and set_to_*_domain.
- Drop the flush in busy_ioctl since it's redundant. Was a leftover
from an earlier concept to track flips/delayed flushes.
- Don't forget about the initial modeset enable/final disable.
Suggested by Chris.
Track flips accurately, too. Since flips complete independently of
rendering we need to track pending flips in a separate mask. Again if
an invalidate happens we need to cancel the evenutal flush to avoid
races.
v3:
Provide correct header declarations for flip functions. Currently not
needed outside of intel_display.c, but part of the proper interface.
v4: Add proper domain management to fbcon so that the fbcon buffer is
also tracked correctly.
v5: Fixup locking around the fbcon set_to_gtt_domain call.
v6: More comments from Chris:
- Split out fbcon changes.
- Drop superflous checks for potential scanout before calling intel_fb
functions - we can micro-optimize this later.
- s/intel_fb_/intel_fb_obj_/ to make it clear that this deals in gem
object. We already have precedence for fb_obj in the pin_and_fence
functions.
v7: Clarify the semantics of the flip flush handling by renaming
things a bit:
- Don't go through a gem object but take the relevant frontbuffer bits
directly. These functions center on the plane, the actual object is
irrelevant - even a flip to the same object as already active should
cause a flush.
- Add a new intel_frontbuffer_flip for synchronous plane updates. It
currently just calls intel_frontbuffer_flush since the implemenation
differs.
This way we achieve a clear split between one-shot update events on
one side and frontbuffer rendering with potentially a very long delay
between the invalidate and flush.
Chris and I also had some discussions about mark_busy and whether it
is appropriate to call from flush. But mark busy is a state which
should be derived from the 3 events (invalidate, flush, flip) we now
have by the users, like psr does by tracking relevant information in
psr.busy_frontbuffer_bits. DRRS (the only real use of mark_busy for
frontbuffer) needs to have similar logic. With that the overall
mark_busy in the core could be removed.
v8: Only when retiring gpu buffers only flush frontbuffer bits we
actually invalidated in a batch. Just for safety since before any
additional usage/invalidate we should always retire current rendering.
Suggested by Chris Wilson.
v9: Actually use intel_frontbuffer_flip in all appropriate places.
Spotted by Chris.
v10: Address more comments from Chris:
- Don't call _flip in set_base when the crtc is inactive, avoids redunancy
in the modeset case with the initial enabling of all planes.
- Add comments explaining that the initial/final plane enable/disable
still has work left to do before it's fully generic.
v11: Only invalidate for gtt/cpu access when writing. Spotted by Chris.
v12: s/_flush/_flip/ in intel_overlay.c per Chris' comment.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The downclocking checks a few more things, so not that simple to
convert. Also, this should get unified with the drrs handling and also
use the locking of that. Otoh the drrs locking is about as hapzardous
as no locking, at least on first sight.
For easier conversion ditch the upclocking on unload - we'll turn off
everything anyway.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So from just a quick look we seem to have enough information to
accurately figure out whether a given gem bo is used as a frontbuffer
and where exactly: We have obj->pin_count as a first check with no
false negatives and only negligible false positives. And then we can
just walk the modeset objects and figure out where exactly a buffer is
used as scanout.
Except that we can't due to locking order: If we already hold
dev->struct_mutex we can't acquire any modeset locks, so could
potential chase freed pointers and other evil stuff.
So we need something else. For that introduce a new set of bits
obj->frontbuffer_bits to track where a buffer object is used. That we
can then chase without grabbing any modeset locks.
Of course the consumers of this (DRRS, PSR, FBC, ...) still need to be
able to do their magic both when called from modeset and from gem
code. But that can be easily achieved by adding locks for these
specific subsystems which always nest within either kms or gem
locking.
This patch just adds the relevant update code to all places.
Note that if we ever support multi-planar scanout targets then we need
one frontbuffer tracking bit per attachment point that we expose to
userspace.
v2:
- Fix more oopsen. Oops.
- WARN if we leak obj->frontbuffer_bits when freeing a gem buffer. Fix
the bugs this brought to light.
- s/update_frontbuffer_bits/update_fb_bits/. More consistent with the
fb tracking functions (fb for gem object, frontbuffer for raw bits).
And the function name was way too long.
v3: Size obj->frontbuffer_bits correctly so that all pipes fit in.
v4: Don't update fb bits in set_base on failure. Noticed by Chris.
v5: s/i915_gem_update_fb_bits/i915_gem_track_fb/ Also remove a few
local enum pipe variables which are now no longer needed to make the
function arguments no drop over the 80 char limit.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The original comment that introduced it said:
commit 0009e46cd5
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Dec 6 14:11:02 2013 -0800
drm/i915: Track which ring a context ran on
Previously we dropped the association of a context to a ring. It is
however very important to know which ring a context ran on (we could
have reused the other member, but I was nitpicky).
This is very important when we switch address spaces, which unlike
context objects, do change per ring.
As an example, if we have:
RCS BCS
ctx A
ctx A
ctx B
ctx B
Without tracking the last ring B ran on, we wouldn't know to switch the
address space on BCS in the last row.
But this is not really true, because we are already checking to != from (with
"from" being = ring->last_context) and that should be enough to make sure we
switch to the right address space.
We would have a problem if we switched the context object for every ring (since
then we would fail to do it in some situations) but we only switch it for the
render ring, so we don't care.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jesse's SOix work required some patches from acpi-next, so pull it in
through a topic barnch.
Conflicts:
drivers/gpu/drm/i915/i915_drv.c
drivers/gpu/drm/i915/intel_pm.c
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch enables the framework for using MMIO based flip calls,
in contrast with the CS based flip calls which are being used currently.
MMIO based flip calls can be enabled on architectures where
Render and Blitter engines reside in different power wells. The
decision to use MMIO flips can be made based on workloads to give
100% residency for Media power well.
v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)
v3: Rebasing on latest code. Code restructuring after incorporating
Damien's comments
v4: Addressing Ville's review comments
-general cleanup
-updating only base addr instead of calling update_primary_plane
-extending patch for gen5+ platforms
v5: Addressed Ville's review comments
-Making mmio flip vs cs flip selection based on module parameter
-Adding check for DRIVER_MODESET feature in notify_ring before calling
notify mmio flip.
-Other changes mostly in function arguments
v6: -Having a seperate function to check condition for using mmio flips (Ville)
-propogating error code from i915_gem_check_olr (Ville)
v7: -Adding __must_check with i915_gem_check_olr (Chris)
-Renaming mmio_flip_data to mmio_flip (Chris)
-Rebasing on latest nightly
v8: -Rebasing on latest code
-squash 3rd patch in series(mmio setbase vs page flip race) with this patch
-Added new tiling mode update in intel_do_mmio_flip (Chris)
v9: -check for obj->last_write_seqno being 0 instead of obj->ring being NULL in
intel_postpone_flip, as this is a more restrictive condition (Chris)
v10: -Applied Chris's suggestions for squashing patches 2,3 into this patch.
These patches make the selection of CS vs MMIO flip at the page flip time, and
make the module parameter for using mmio flips as tristate, the states being
'force CS flips', 'force mmio flips', 'driver discretion'.
Changed the logic for driver discretion (Chris)
v11: Minor code cleanup(better readability, fixing whitespace errors, using
lockdep to check mutex locked status in postpone_flip, removal of __must_check
in function definition) (Chris)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # snb, ivb
[danvet: Fix up parameter alignement checkpatch spotted.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This adds support for a write-enable bit in the entry of GTT.
This is handled via a read-only flag in the GEM buffer object which
is then used to see how to set the bit when writing the GTT entries.
Currently by default the Batch buffer & Ring buffers are marked as read only.
v2: Moved the pte override code for read-only bit to 'byt_pte_encode'. (Chris)
Fixed the issue of leaving 'gt_old_ro' as unused. (Chris)
v3: Removed the 'gt_old_ro' field, now setting RO bit only for Ring Buffers(Daniel).
v4: Added a new 'flags' parameter to all the pte(gen6) encode & insert_entries functions,
in lieu of overloading the cache_level enum (Daniel).
v5: Removed the superfluous VLV check & changed the definition location of PTE_READ_ONLY flag (Imre)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The perfect solution for psr_exit is the hardware tracking the changes and
doing the psr exit by itself. This scenario works for HSW and BDW with some
environments like Gnome and Wayland.
However there are many other scenarios that this isn't true. Mainly one right
now is KDE users on HSW and BDW with PSR on. User would miss many screen
updates. For instances any key typed could be seen only when mouse cursor is
moved. So this patch introduces the ability of trigger PSR exit on kernel side
on some common cases that.
Most of the cases are coverred by psr_exit at set_domain. The remaining cases
are coverred by triggering it at set_domain, busy_ioctl, sw_finish and
mark_busy.
The downside here might be reducing the residency time on the cases this
already work very wall like Gnome environment. But so far let's get focused
on fixinge issues sio PSR couild be used for everybody and we could even
get it enabled by default. Later we can add some alternatives to choose the
level of PSR efficiency over boot flag of even over crtc property.
v2: remove exit from connector_dpms. Daniel pointed this is the wrong way and
also this isn't needed for BDW and HSW anyway.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Vijay Purushothaman <vijay.a.purushothaman@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Atm, the forcewake refcount will be incorrectly set to zero during
system suspend if there is any reference held via the
i915_forcewake_user debugfs entry.
Fix this by simply not zeroing the sw counters during suspend and
restoring the original state using them. Note that the only other
places where we zeroed the counters were driver load and unload time,
where it was redundant anyway.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78059
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we have a release hook into i915_gem_object_free, we can move
the explicit call to the internal stolen function and hook it up
throught the callback instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This allows the system to enter the lowest power mode during system freeze.
v2: delete force wake timer at suspend (Imre)
v3: add GT work suspend function (Imre)
v4: use uncore forcewake reset (Daniel)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Working for real this time. i915_ppgtt_info has all sorts of good stuff
in it and X is running nicely on top.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
These are just single registers so wasting space for the pipe offsets
seems a bit pointless. So just use the _PIPE3() macro instead.
Also rewrite the _PIPE3() macro to be more obvious, and protect the
arguments properly.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Frob conflict.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
"Because our driver assumes only one panel is PSR capable, and we
already have other PSR information on dev_priv instead of intel_dp. If
we ever support multiple PSR panels, we'll have to move struct
i915_psr to intel_dp anyway." (by Paulo)
v2: Avoid more than one setup. Removing initialization
and trusting allocation. (By Paulo Zanoni).
v3: rebase.
v4: Adding comment.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Bunch of stuff for 3.16 still:
> - Mipi dsi panel support for byt. Finally! From Shobhit&others. I've
> squeezed this in since it's a regression compared to vbios and we've
> been ridiculed about it a bit too often ...
> - connection_mutex deadlock fix in get_connector (only affects i915).
> - Core patches from Matt's primary plane from Matt Roper, I've pushed the
> i915 stuff to 3.17.
> - vlv power well sequencing fixes from Jesse.
> - Fix for cursor size changes from Chris.
> - agpbusy fixes from Ville.
> - A few smaller things.
>
* tag 'drm-intel-fixes-2014-06-06' of git://anongit.freedesktop.org/drm-intel: (32 commits)
drm/i915: BDW: Adding missing cursor offsets.
drm: Fix getconnector connection_mutex locking
drm/i915/bdw: Only use 2g GGTT for 32b platforms
drm/i915: Nuke pipe A quirk on i830M
drm/i915: fix display power sw state reporting
drm/i915: Always apply cursor width changes
drm/i915: tell the user if both KMS and UMS are disabled
drm/plane-helper: Add drm_plane_helper_check_update() (v3)
drm: Check CRTC compatibility in setplane
drm/i915: use VBT to determine whether to enumerate the VGA port
drm/i915: Don't WARN about ring idle bit on gen2
drm/i915: Silence the WARN if the user tries to GTT mmap an incoherent object
drm/i915: Move the C3 LP write bit setup to gen3_init_clock_gating() for KMS
drm/i915: Enable interrupt-based AGPBUSY# enable on 85x
drm/i915: Flip the sense of AGPBUSY_DIS bit
drm/i915: Set AGPBUSY# bit in init_clock_gating
drm/i915/vlv: add pll assertion when disabling DPIO common well
drm/i915/vlv: move DPIO common reset de-assert into __vlv_set_power_well
drm/i915/vlv: re-order power wells so DPIO common comes after TX
drm/i915/vlv: move CRI refclk enable into __vlv_set_power_well
...
Merge drm-fixes into drm-next.
Both i915 and radeon need this done for later patches.
Conflicts:
drivers/gpu/drm/drm_crtc_helper.c
drivers/gpu/drm/i915/i915_drv.h
drivers/gpu/drm/i915/i915_gem.c
drivers/gpu/drm/i915/i915_gem_execbuffer.c
drivers/gpu/drm/i915/i915_gem_gtt.c
It seems by default the VBT has MIPI configuration block as well. The
Generic driver will assume always MIPI if MIPI configuration block is found.
This is causing probelm when actually there is eDP. Fix this by looking
into general definition block which will have device configurations. From here
we can figure out what is the LFP type and initialize MIPI only if MIPI
is found.
v2: Addressed review comments by Damien
- Moved PORT definitions to intel_bios.h and renamed as DVO_PORT_MIPIA
- renamed is_mipi to has_mipi and moved definition as suggested
- Check has_mipi inside parse_mipi and intel_dsi_init insted of outside
v3: Make has_mipi as a bitfield as suggested
Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: fold in conditions to pack everything neatly below 80 chars.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is pure evil. Userspace, I'm looking at you SNA, repacks batch
buffers on the fly after generation as they are being passed to the
kernel for execution. These batches also contain self-referenced
relocations as a single buffer encompasses the state commands, kernels,
vertices and sampler. During generation the buffers are placed at known
offsets within the full batch, and then the relocation deltas (as passed
to the kernel) are tweaked as the batch is repacked into a smaller buffer.
This means that userspace is passing negative relocations deltas, which
subsequently wrap to large values if the batch is at a low address. The
GPU hangs when it then tries to use the large value as a base for its
address offsets, rather than wrapping back to the real value (as one
would hope). As the GPU uses positive offsets from the base, we can
treat the relocation address as the minimum address read by the GPU.
For the upper bound, we trust that userspace will not read beyond the
end of the buffer.
So, how do we fix negative relocations from wrapping? We can either
check that every relocation looks valid when we write it, and then
position each object such that we prevent the offset wraparound, or we
just special-case the self-referential behaviour of SNA and force all
batches to be above 256k. Daniel prefers the latter approach.
This fixes a GPU hang when it tries to use an address (relocation +
offset) greater than the GTT size. The issue would occur quite easily
with full-ppgtt as each fd gets its own VM space, so low offsets would
often be handed out. However, with the rearrangement of the low GTT due
to capturing the BIOS framebuffer, it is already affecting kernels 3.15
onwards. I think only IVB+ is susceptible to this bug, but the workaround
should only kick in rarely, so it seems sensible to always apply it.
v3: Use a bias for batch buffers to prevent small negative delta relocations
from wrapping.
v4 from Daniel:
- s/BIAS/BATCH_OFFSET_BIAS/
- Extract eb_vma_misplaced/i915_vma_misplaced since the conditions
were growing rather cumbersome.
- Add a comment to eb_get_batch explaining why we do this.
- Apply the batch offset bias everywhere but mention that we've only
observed it on gen7 gpus.
- Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch.
v5: Add static to eb_get_batch, spotted by 0-day tester.
Testcase: igt/gem_bad_reloc
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
A single object may be referenced by multiple registers fundamentally
breaking the static allotment of ids in the current design. When the
object is used the second time, the physical address of the first
assignment is relinquished and a second one granted. However, the
hardware is still reading (and possibly writing) to the old physical
address now returned to the system. Eventually hilarity will ensue, but
in the short term, it just means that cursors are broken when using more
than one pipe.
v2: Fix up leak of pci handle when handling an error during attachment,
and avoid a double kmap/kunmap. (Ville)
Rebase against -fixes.
v3: And fix the error handling added in v2 (Ville)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77351
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's barely alive now anyway, so give it the "coup de grâce".
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Up until now, contexts had one (and only one) backing object that was
used by the hardware to save/restore render ring contexts (via the
MI_SET_CONTEXT command). Other rings did not have or need this, so
our i915_hw_context struct had a 1:1 relationship with a a real HW
context.
With Logical Ring Contexts and Execlists, this is not possible anymore:
all rings need a backing object, and it cannot be reused. To prepare
for that, rename our contexts to the more generic term intel_context.
No functional changes.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In the upcoming patches we plan to break the correlation between
engine command streamers (a.k.a. rings) and ringbuffers, so it
makes sense to refactor the code and make the change obvious.
No functional changes.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Adding stuff at the bottom is really no how this should be done, since
that's the place for ums/dri dungeons.
This was added in
commit a8ebba75b3
Author: Zhao Yakui <yakui.zhao@intel.com>
Date: Thu Apr 17 10:37:40 2014 +0800
drm/i915: Use the coarse ping-pong mechanism based on drm fd to dispatch the BSD command on BDW GT3
Also add a note to prevent this from happening again - people really
should be less lazy and take more time to look for a good home of
their new driver-global state.
Cc: Imre Deak <imre.deak@intel.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
All of the .queue_flip() callbacks duplicate the same code to pin the
buffers and calculate the gtt_offset. Move that code to
intel_crtc_page_flip(). In order to do that we must also move the ring
selection logic there.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Unsurprisingly the cursor C regiters are also at a weird offset on CHV.
Add more pipe offsets to handle them.
This also gets rid of most of the differences between the i9xx vs. ivb
cursor code. We can unify the remaining code as well, but I'll leave
that for another patch.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Before the process killer is invoked, oom-notifiers are executed for one
last try at recovering pages. We can hook into this callback to be sure
that everything that can be is purged from our page lists, and to give a
summary of how much memory is still pinned by the GPU in the case of an
oom. This should be really valuable for debugging OOM issues.
Note that the last-ditch effort call to shrink_all we've previously
called from our normal shrinker when we could free as much as the vm
demaned is moved into the oom notifier. Since the shrinker accounting
races against bind/unbind operations we might have called shrink_all
prematurely, which this approach with an oom notifier avoids.
References: https://bugs.freedesktop.org/show_bug.cgi?id=72742
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: lu hua <huax.lu@intel.com>
[danvet: Bikeshed logical | into || and pimp commit message.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When the machine is under a lot of memory pressure and being stressed by
multiple GPU threads, we quite often report fewer than shrinker->batch
(i.e. SHRINK_BATCH) pages to be freed. This causes the shrink_control to
skip calling into i915.ko to release pages, despite the GPU holding onto
most of the physical pages in its active lists.
References: https://bugs.freedesktop.org/show_bug.cgi?id=72742
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Robert Beckett <robert.beckett@intel.com>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch adds a mmio base address variable for DSI display,
to make the DSI code generic, so that, if required, the same code
can be re-used for future platforms with different mmio base.
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Appease checkpatch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Way back we've used this to reject framebuffers with unsupported
pixel formats. But since the modesetting reorg with the compute
config stage we reject those much earlier and just BUG() in this
callback. So switch to a void return type.
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
By exporting the ability to map user address and inserting PTEs
representing their backing pages into the GTT, we can exploit UMA in order
to utilize normal application data as a texture source or even as a
render target (depending upon the capabilities of the chipset). This has
a number of uses, with zero-copy downloads to the GPU and efficient
readback making the intermixed streaming of CPU and GPU operations
fairly efficient. This ability has many widespread implications from
faster rendering of client-side software rasterisers (chromium),
mitigation of stalls due to read back (firefox) and to faster pipelining
of texture data (such as pixel buffer objects in GL or data blobs in CL).
v2: Compile with CONFIG_MMU_NOTIFIER
v3: We can sleep while performing invalidate-range, which we can utilise
to drop our page references prior to the kernel manipulating the vma
(for either discard or cloning) and so protect normal users.
v4: Only run the invalidate notifier if the range intercepts the bo.
v5: Prevent userspace from attempting to GTT mmap non-page aligned buffers
v6: Recheck after reacquire mutex for lost mmu.
v7: Fix implicit padding of ioctl struct by rounding to next 64bit boundary.
v8: Fix rebasing error after forwarding porting the back port.
v9: Limit the userptr to page aligned entries. We now expect userspace
to handle all the offset-in-page adjustments itself.
v10: Prevent vma from being copied across fork to avoid issues with cow.
v11: Drop vma behaviour changes -- locking is nigh on impossible.
Use a worker to load user pages to avoid lock inversions.
v12: Use get_task_mm()/mmput() for correct refcounting of mm.
v13: Use a worker to release the mmu_notifier to avoid lock inversion
v14: Decouple mmu_notifier from struct_mutex using a custom mmu_notifer
with its own locking and tree of objects for each mm/mmu_notifier.
v15: Prevent overlapping userptr objects, and invalidate all objects
within the mmu_notifier range
v16: Fix a typo for iterating over multiple objects in the range and
rearrange error path to destroy the mmu_notifier locklessly.
Also close a race between invalidate_range and the get_pages_worker.
v17: Close a race between get_pages_worker/invalidate_range and fresh
allocations of the same userptr range - and notice that
struct_mutex was presumed to be held when during creation it wasn't.
v18: Sigh. Fix the refactor of st_set_pages() to allocate enough memory
for the struct sg_table and to clear it before reporting an error.
v19: Always error out on read-only userptr requests as we don't have the
hardware infrastructure to support them at the moment.
v20: Refuse to implement read-only support until we have the required
infrastructure - but reserve the bit in flags for future use.
v21: use_mm() is not required for get_user_pages(). It is only meant to
be used to fix up the kernel thread's current->mm for use with
copy_user().
v22: Use sg_alloc_table_from_pages for that chunky feeling
v23: Export a function for sanity checking dma-buf rather than encode
userptr details elsewhere, and clean up comments based on
suggestions by Bradley.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
Cc: Akash Goel <akash.goel@intel.com>
Cc: "Volkin, Bradley D" <bradley.d.volkin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
[danvet: Frob ioctl allocation to pick the next one - will cause a bit
of fuss with create2 apparently, but such are the rules.]
[danvet2: oops, forgot to git add after manual patch application]
[danvet3: Appease sparse.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
HW guys say that it is not a cool idea to let device
go into rc6 without proper 3d pipeline state.
For each new uninitialized context, generate a
valid null render state to be run on context
creation.
This patch introduces a skeleton with empty states.
v2: - No need to vmap (Chris Wilson)
- use .c files for state (Daniel Vetter)
- no need to flush as i915_add_request does it
- remove parameter for batch alloc size
- don't wait for the init (Ben Widawsky)
v3: - move to cpu/gpu (Chris Wilson)
Tested-by: Kristen Carlson Accardi <kristen@linux.intel.com> (v1)
Tested-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fed up with having that long list_for_each_entry() invocation?
Use for_each_intel_crtc()!
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The fence pin count should always be <= the bo pin count. If that's
not the case then we have a funny problem and are leaking references
somewhere.
Which means we can catch fence pin leaks by checking for the same
upper limit as we do for the bo pin count. Inspired by a discussion
with Ville about a fence leak igt testcase.
v2: Also check for fence->pin_count <= ggtt_vma->pin_count, since that
might catch a leak even quicker. Also de-inline them, they're getting
too big.
v3: Don't separately check for MAX_PIN_COUNT since the > vma->pin_count
check will catch that already (Chris).
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
CHV has 2 display phys. First phy (IOSF offset 0x1A) has two channels,
and second phy (IOSF offset 0x12) has single channel. The first phy is
used for port B and port C, while second phy is only for port D.
v2: Move the pipe to determine which phy to select for
vlv_dpio_read/vlv_dpio_write to another patch. (Daniel)
v3: Rebase the code based on rework on how to calculate DPIO offset.
Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For clients that submit large batch buffers the command parser has
a substantial impact on performance. On my HSW ULT system performance
drops as much as ~20% on some tests. Most of the time is spent in the
command lookup code. Converting that from the current naive search to
a hash table lookup reduces the performance drop to ~10%.
The choice of value for I915_CMD_HASH_ORDER allows all commands
currently used in the parser tables to hash to their own bucket (except
for one collision on the render ring). The tradeoff is that it wastes
memory. Because the opcodes for the commands in the tables are not
particularly well distributed, reducing the order still leaves many
buckets empty. The increased collisions don't seem to have a huge
impact on the performance gain, but for now anyhow, the parser trades
memory for performance.
NB: Ville noticed that the error paths through the ring init code
will leak memory. I've not addressed that here. We can do a follow
up pass to handle all of the leaks.
v2: improved comment describing selection of hash key mask (Damien)
replace a BUG_ON() with an error return (Tvrtko, Ville)
commit message improvements
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
During the review of
commit 1f70999f90
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Jan 27 22:43:07 2014 +0000
drm/i915: Prevent recursion by retiring requests when the ring is full
Ville raised the point that our interaction with request->tail was
likely to foul up other uses elsewhere (such as hang check comparing
ACTHD against requests).
However, we also need to restore the implicit retire requests that certain
test cases depend upon (e.g. igt/gem_exec_lut_handle), this raises the
spectre that the ppgtt will randomly call i915_gpu_idle() and recurse
back into intel_ring_begin().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78023
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
[danvet: Remove now unused 'tail' variable as spotted by Brad.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add runtime PM support for VLV, but leave it disabled. The next patch
enables it.
The suspend/resume sequence used is based on [1] and [2]. In practice we
depend on the GT RC6 mechanism to save the HW context depending on the
render and media power wells. By the time we run the runtime suspend
callback the display side is also off and the HW context for that is
managed by the display power domain framework.
Besides the above there are Gunit registers that depend on a system-wide
power well. This power well goes off once the device enters any of the
S0i[R123] states. To handle this scenario, save/restore these Gunit
registers. Note that this is not the complete register set dictated by
[2], to remove some overhead, registers that are known not to be used are
ignored. Also some registers are fully setup by initialization functions
called during resume, these are not saved either. The list of registers
can be further reduced, see the TODO note in the code.
[1] VLV_gfx_clocking_PM_reset_y12w21d3 / "Driver D3 entry/exit"
[2] VLV2_S0IXRegs
v2:
- unchanged
v3:
- fix s/GEN6_PMIIR/GEN6_PMIMR/ typo when saving/restoring registers
(Ville)
v4:
- rebased on the previous patch fixing GEN register prefixes
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[ rebased (according to v4) ]
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Enable aliasing PPGTT for CHV, but keep full PPGTT still disabled until
it gets enabled for BDW.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
During the initial power well enabling on the driver init/resume path
we can avoid initialzing part of the HW/SW state that will be
initialized anyway by the subsequent init/resume code. For some steps
like HPD initialization this redundancy is not only an overhead but an
actual problem, since they can't be run this early in the overall init
sequence.
Add a flag marking the init phase and skip reinitialzing state that is
not strictly necessary based on that.
This is also needed by the upcoming HPD init restructuring by Thierry
and Daniel.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I don't have any insight on what parts can do what. The docs do seem to
suggest WT caching works in at least the same manner as it does on
Haswell.
The addr = 0 is to shut up GCC:
drivers/gpu/drm/i915/i915_gem_gtt.c:80:7: warning: 'addr' may be used
uninitialized in this function [-Wmaybe-uninitialized]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This will be needed by the VLV runtime PM helpers too, so factor it out.
Also add a safety check for the case where the previous force-off is
still pending, since I'm not sure if Punit can handle a new setting
while the previous one hasn't settled yet.
v2:
- unchanged
v3:
- add a note to the commit message about the safety check (Ville)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
While checking the error capture path I noticed that we lacked the
power domain-on check for PIPESTAT so fix this by moving that to where
the rest of pipe registers are captured.
The move also revealed that we actually don't include this register in
the error report, so fix that too.
v2:
- patch introduced in v2 of the patchset
v3:
- add back !HAS_PCH_SPLIT check (Ville)
[ Ignore my previous comment about the gen<=5 || vlv check, I realized
that it's the same as !HAS_PCH_SPLIT. ]
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The BDW GT3 has two independent BSD rings, which can be used to process the
video commands. To be simpler, it is transparent to user-space driver/middle.
Instead the kernel driver will decide which ring is to dispatch the BSD video
command.
As every BSD ring is powerful, it is enough to dispatch the BSD video command
based on the drm fd. In such case it can play back video stream while encoding
another video stream. The coarse ping-pong mechanism is used to determine
which BSD ring is used to dispatch the BSD video command.
V1->V2: Follow Daniel's comment and use the simple ping-pong mechanism.
This is only to add the support of dual BSD rings on BDW GT3 machine.
The further optimization will be considered in another patch set.
V2->V3: Follow Daniel's comment to use the struct_mutext instead of
atomic_t during determining which ring can be used to dispatch Video command.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Based on the hardware spec, the BDW GT3 machine has two independent
BSD ring that can be used to dispatch the video commands.
So just initialize it.
V3->V4: Follow Imre's comment to do some minor updates. For example:
more comments are added to describe the semaphore between ring.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
[danvet: Fix up checkpatch error.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm-intel-next-2014-04-16:
- vlv infoframe fixes from Jesse
- dsi/mipi fixes from Shobhit
- gen8 pageflip fixes for LRI/SRM from Damien
- cmd parser fixes from Brad Volkin
- some prep patches for CHV, DRRS, ...
- and tons of little things all over
drm-intel-next-2014-04-04:
- cmd parser for gen7 but only in enforcing and not yet granting mode - the
batch copying stuff is still missing. Also performance is a bit ... rough
(Brad Volkin + OACONTROL fix from Ken).
- deprecate UMS harder (i.e. CONFIG_BROKEN)
- interrupt rework from Paulo Zanoni
- runtime PM support for bdw and snb, again from Paulo
- a pile of refactorings from various people all over the place to prep for new
stuff (irq reworks, power domain polish, ...)
drm-intel-next-2014-04-04:
- cmd parser for gen7 but only in enforcing and not yet granting mode - the
batch copying stuff is still missing. Also performance is a bit ... rough
(Brad Volkin + OACONTROL fix from Ken).
- deprecate UMS harder (i.e. CONFIG_BROKEN)
- interrupt rework from Paulo Zanoni
- runtime PM support for bdw and snb, again from Paulo
- a pile of refactorings from various people all over the place to prep for new
stuff (irq reworks, power domain polish, ...)
Conflicts:
drivers/gpu/drm/i915/i915_gem_context.c
Because the docs say ULX doesn't support it on HSW.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The parser extracts the config block(#52) and sequence(#53) data
and store in private data structures.
v2: Address review comments by Jani
- adjust code for the structure changes for bdb_mipi_config
- add boundry and buffer overflow checks as suggested
- use kmemdup instead of kmalloc and memcpy
v3: More strict check while parsing VBT
- Ensure that at anytime we do not go beyond sequence block
while parsing
- On unknown element fail the whole parsing
v4: Style changes and spell check mostly as suggested by Jani
Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we always initialize kref for the context, even if we are using fake
contexts for hangstats when there is no hw support, we can forgo the
dance to dereference the ctx->obj and inspect whether we are permitted
to use kref inside i915_gem_context_reference() and _unreference().
My ulterior motive here is to improve the debugging of a use-after-free
of ctx->obj. This patch avoids the dereference here and instead forces
the assertion checks associated with kref.
v2: Refactor the fake contexts to being even more like the real
contexts, so that there is much less duplicated and special case code.
v3: Tweaks.
v4: Tweaks, minor.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76671
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: lu hua <huax.lu@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[Jani: tiny change to backport to drm-intel-fixes.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The only supported types are none and PWM. Other values are obsolete or
reserved, don't add them.
Tested-by: Kamal Mostafa <kamal@canonical.com>
Tested-by: Martin <bugs@mrvanes.com>
Tested-by: jrg.otte@gmail.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
This patch computes and stored 2nd M/N/TU for switching to different
refresh rate dynamically. PIPECONF_EDP_RR_MODE_SWITCH bit helps toggle
between alternate refresh rates programmed in 2nd M/N/TU registers.
v2: Daniel's review comments
Computing M2/N2 in compute_config and storing it in crtc_config
v3: Modified reference to edp_downclock and edp_downclock_avail based on the
changes made to move them from dev_private to intel_panel.
v4: Modified references to is_drrs_supported based on the changes made to
rename it to drrs_support.
v5: Jani's review comments
Removed superfluous return statements. Changed support for Gen 7 and above.
Corrected indentation. Re-structured the code which finds crtc and connector
from encoder. Changed some logs to be less verbose.
v6: Modifying i915_drrs to include only intel connector as intel_dp can be
derived from intel connector when required.
v7: As per internal review comments, acquiring mutex just before accessing
drrs RR. As per Chris's review comments, added documentation about the use
of locking in the function.
v8: Incorporated Jani's review comments.
Removed reference to edp_downclock.
v9: Jani's review comments. Modified comment in set_drrs. Changed index to
type edp_drrs_refresh_rate_type. Check if PSR is enabled before setting
registers fo DRRS.
Signed-off-by: Pradeep Bhat <pradeep.bhat@intel.com>
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We will treat Cherryview like Valleyview for most parts. Add a macro
for cases when we need to tell the two apart.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Piglit runner and QA are both looking at the dmesg for
DRM_ERRORs with test cases. Add a flag to control those
when we they are expected from related test cases.
Also add flag to control if contexts should be banned
that introduced the hang. Hangcheck is timer based and
preventing bans by adding sleeps to testcases makes
testing slower.
v2: intel_ring_stopped(), readable comment (Chris)
v3: keep compatibility (Daniel)
References: https://bugs.freedesktop.org/show_bug.cgi?id=75876
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
VTd has a few too many "outright disable the damn thing" workarounds
accumulated and for validation we want a simple knob to make sure we
disable them all.
Since this is for bdw+ validation and atm we don't have any
workarounds for bdw this option currently does nothing. So currently
this is just a placeholder to make sure reality will match with the
documented process for our validation people.
v2: Fix up param description (Jani).
v3: Actually git add ...
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For error state, like the recent modification to ACTHD, FADD also gets
an upper dword. This is useful for debug to make sure the fetch address
and head are similar.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This sould be enough.
v2: BDW should also run hsw_runtime_resume (Ben).
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that PC8 is part of runtime PM, the check is useless.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just because I have a SNB machine and I can easily test it.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we don't keep the hotplug interrupts enabled anymore, we can
kill the regsave struct and just cal the normal IRQ preinstall,
postinstall and uninstall functions. This makes it easier to add
runtime PM support to non-HSW platforms.
The only downside is in case we get a request to update interrupts
while they are disabled, won't be able to update the regsave struct.
But this should never happen anyway, so we're not losing too much.
v2: - Rebase.
v3: - Rebase.
v4: - Rebase.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch reads the DRRS support and Mode type from VBT fields.
The read information will be stored in VBT struct during BIOS
parsing. The above functionality is needed for decision making
whether DRRS feature is supported in i915 driver for eDP panels.
This information helps us decide if seamless DRRS can be done
at runtime to support certain power saving features. This patch
was tested by setting necessary bit in VBT struct and merging
the new VBT with system BIOS so that we can read the value.
v2: Incorporated review comments from Chris Wilson
Removed "intel_" as a prefix for DRRS specific declarations.
v3: Incorporated Jani's review comments
Removed function which deducts drrs mode from panel_type. Modified some
print statements. Made changes to use DRRS_NOT_SUPPORTED as 0 instead of -1.
v4: Incorporated Jani's review comments.
Modifications around setting vbt drrs_type.
Signed-off-by: Pradeep Bhat <pradeep.bhat@intel.com>
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
[danvet: Drop the misleading/redundant comment about the added drrs
field in the vbt struct as discussed with Jani on irc.]
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There are no longer users of drm_i915_private_t. Drop the typedef. Good
riddance.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Chris Wilson <chris@chris-wislon.co.uk>
[danvet: Add the hunk in i915_cmd_parser.c here which had to be
relocated to the how this was merged.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
These defines are only used in intel_uncore.c.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
That function isn't used outside this file anymore.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Instead of reading out the CD clock rate from the HW at each modeset, do
this only during driver init and resume and use the cached value during
modeset. This moves things towards a state where the sw and hw side
setup is separated. It's also needed for VLV RPM, where we don't put
device into D0 state until modeset_global_resources is called and thus
can't access any display/gfx registers.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So userspace can query the kernel for command parser support.
v2: Add i915_cmd_parser_get_version(), history log, and kerneldoc
OTC-Tracker: AXIA-4631
Change-Id: I58af650db9f6753c2dcac9c54ab432fd31db302f
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Various commands that access memory have a bit to determine whether
the graphics address specified in the command should use the GGTT or
PPGTT for translation. These checks ensure that the bit indicates
PPGTT translation.
Most of these checks use the existing bit-checking infrastructure.
The PIPE_CONTROL and MI_FLUSH_DW commands, however, are multi-function
commands. The GGTT/PPGTT bit is only relevant for certain uses of the
command. As such, this change also extends the bit-checking code to
include a "condition" mask and offset. If the condition mask is non-zero
then the parser only performs the bit check when the bits specified by
the condition mask/offset are also non-zero.
NOTE: At this point in the series PPGTT must be enabled for the parser
to work correctly. If it's not enabled, userspace will not be setting
the PPGTT bits the way the parser requires. VLV is the only platform
where this is a problem, so at this point, we disable parsing for VLV.
v2: whitespace and trailing commas fixes, rebased
OTC-Tracker: AXIA-4631
Change-Id: I3f4c76b6734f1956ec47e698230f97d0998ff92b
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop the unecessary cast Jani spotted.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This file contains all necessary defines, prototypes and typesdefs for
manipulating GEN graphics address translation (this does not include the
legacy AGP driver)
Reiterating the comment in the header,
"Please try to maintain the following order within this file unless it
makes sense to do otherwise. From top to bottom:
1. typedefs
2. #defines, and macros
3. structure definitions
4. function prototypes
Within each section, please try to order by generation in ascending
order, from top to bottom (ie. GEN6 on the top, GEN8 on the bottom)."
I've made some minor cleanups, and fixed a couple of typos while here -
but there should be no functional changes.
The purpose of the patch is to reduce clutter in our main header file,
making room for new growth, and make documentation of our interfaces
easier by splitting things out.
With a little more work, like making i915_gtt a pointer, we could
potentially completely isolate this header from i915_drv.h. At the
moment however, I don't think it's worth the effort.
Personally, I would have liked to put the PTE encoding functions in this
file too, but I didn't want to rock the boat too much.
A similar patch has been in use on my machine for some time. This exact
patch though has only been compile tested.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Remove the rest of the references to drm_i915_private_t. No functional
changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop hunk in i915_cmd_parser.c]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This reverts commit 2754436913.
Conflicts:
drivers/gpu/drm/i915/i915_irq.c
The partial application of interrupt masking without regard to other
pathways for adjusting the RPS frequency results in completely disabling
the PM interrupts. This leads to excessive power consumption as the GPU
is kept at max clocks (until the failsafe mechanism fires of explicitly
downclocking the GPU when all requests are idle). Or equally as bad for
the UX, the GPU is kept at minimum clocks and prevented from upclocking
in response to a requirement for more power.
Testcase: pm_rps/blocking
Cc: Deepak S <deepak.s@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by:Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As Broadwell has an increased virtual address size, it requires more
than 32 bits to store offsets into its address space. This includes the
debug registers to track the current HEAD of the individual rings, which
may be anywhere within the per-process address spaces. In order to find
the full location, we need to read the high bits from a second register.
We then also need to expand our storage to keep track of the larger
address.
v2: Carefully read the two registers to catch wraparound between
the reads.
v3: Use a WARN_ON rather than loop indefinitely on an unstable
register read.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Drop spurious hunk which conflicted.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When we use different rps events for different platforms or due to wa,
we might end up needing this logic in a lot of places. Instead of
this let's use a variable in dev_priv to track the enabled PM
interrupts.
v2: Initialize pm_rps_events in intel_irq_init() (Ville).
Signed-off-by: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Frob the commit message a bit since the English was a bit too
garbled ;-) ]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It is important that the user is fully aware that the seemingly atomic
read/write of a 64-bit value from MMIO space, may in fact be 2 separate
operations of 32-bits. This can lead to hilarity, such as
commit d18b961903
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Jul 10 13:36:23 2013 +0100
drm/i915: Fix incoherence with fence updates on Sandybridge+
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The idea of printing objects used by each process is to judge how each
process is using them. This means that we need to evaluate whether the
object is bound for that particular process, rather than just whether it
is bound into the global GTT.
v2: Restore the non-full-ppgtt path for simplicity as we may not even
create vma with older hardware.
v3: Tweak handling of global entries and default context entries.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The names of the struct members for RPS are stupid. Every time I need to
do anything in this code I have to spend a significant amount of time to
remember what it all means. By renaming the variables (and adding the
comments) I hope to clear up the situation. Indeed doing this make some
upcoming patches more readable.
I've avoided ILK because it's possible that the naming used for Ironlake
matches what is in the docs. I believe the ILK power docs were never
published, and I am too lazy to dig them up.
v2: leave rp0, and rp1 in the names. It is useful to have these limits
available at times. min_freq and max_freq (which may be equal to rp0, or
rp1 depending on the platform) represent the actual HW min and max.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
this leaves a temporarily awkward min_delay (the soft limit) with the
new min_freq (the hardware limit). It's fixed in the next patch.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that PC8 got much simpler, there are less things to document.
Also, runtime PM already has a nice documentation, so we don't need to
re-explain it on our driver.
v2: - Rebase.
- Fix typo (Jesse).
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The only remaining field of the struct was the lock, which was
useless.
v2: - Rebase.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When other platforms add runtime PM support they will also need to
disable interrupts, so move the variable to the runtime PM struct.
Also notice that the longer-term goal is to completely kill the
regsave struct, and I even have patches for that.
v2: - Rebase.
v3: - Rebase.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It was just being used on debugfs and on a WARN inside
hsw_set_power_well. But now that we PC8 is part of runtime PM and we
get/put runtime PM when we get/put any power domain, we shouldn't need
the WARN anymore.
v2: - Rebase.
v3: - Rebase.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since after the latest patches it's only being used to prevent
getting/putting the runtime PM refcount.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The requirements_met variable was used to track two things: enabled
CRTCs and the power well. After the latest chagnes, we get a runtime
PM reference whenever we get any of the power domains, and we get
power domains when we enable CRTCs or the power well, so we should
already be covered, not needing this specific tracking.
v2: - Rebase.
v3: - Rebase.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently, when our driver becomes idle for i915.pc8_timeout (default:
5s) we enable PC8, so we save some power, but not everything we can.
Then, while PC8 is enabled, if we stay idle for more
autosuspend_delay_ms (default: 10s) we'll enter runtime PM and put the
graphics device in D3 state, saving even more power. The two features
are separate things with increasing levels of power savings, but if we
disable PC8 we'll never get into D3.
While from the modularity point of view it would be nice to keep these
features as separate, we have reasons to merge them:
- We are not aware of anybody wanting a "PC8 without D3" environment.
- If we keep both features as separate, we'll have to to test both
PC8 and PC8+D3 code paths. We're already having a major pain to
make QA do automated testing of just one thing, testing both paths
will cost even more.
- Only Haswell+ supports PC8, so if we want to add runtime PM support
to, for example, IVB, we'll have to copy some code from the PC8
feature to runtime PM, so merging both features as a single thing
will make it easier for enabling runtime PM on other platforms.
This patch only does the very basic steps required to have PC8 and
runtime PM merged on a single feature: the next patches will take care
of cleaning up everything.
v2: - Rebase.
v3: - Rebase.
- Fully remove the deprecated i915 params since Daniel doesn't
consider them as part of the ABI.
v4: - Rebase.
- Fix typo in the commit message.
v5: - Rebase, again.
- Add a huge comment explaining the different forcewake usage
(Chris, Daniel).
- Use open-coded forcewake functions (Daniel).
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The name 'update_plane' was used both for the primary plane functions in
intel_display.c and the sprite/overlay functions in intel_sprite.c.
Rename the primary plane functions to 'update_primary_plane' to avoid
confusion.
On a similar note, intel_display.c already had a function called
intel_disable_primary_plane() that programs the hardware to disable a
pipe's primary plane. When we hook up primary planes through the DRM
plane interface, one of the natural handler names will be
intel_primary_plane_disable(), which is very similar. To avoid
confusion, rename the existing intel_disable_primary_plane() to
intel_disable_primary_hw_plane() to make the two names a little more
distinct.
Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
[danvet: Fix up conflicts.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>