Merge tag 'drm-intel-gt-next-2021-08-06-1' of ssh://git.freedesktop.org/git/drm/drm-intel into drm-next

UAPI Changes:

- Add I915_MMAP_OFFSET_FIXED

  On devices with local memory `I915_MMAP_OFFSET_FIXED` is the only valid type. On devices without local memory, this caching mode is invalid.

  As caching mode when specifying `I915_MMAP_OFFSET_FIXED`, WC or WB will be used, depending on the object placement on creation. WB will be used when the object can only exist in system memory, WC otherwise.

  Userspace: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11888

- Reinstate the mmap ioctl for (already released) integrated Gen12 platforms

  Rationale: Otherwise media driver breaks eg. for ADL-P. Long term goal is still to sunset the IOCTL even for integrated and require using mmap_offset.

- Reject caching/set_domain IOCTLs on discrete

  Expected to become immutable property of the BO

- Disallow changing context parameters after first use on Gen12 and earlier
- Require setting context parameters at creation on platforms after Gen12

  Rationale (for both): Allow less dynamic changes to the context to simplify the implementation and avoid user shooting themselves in the foot.

- Drop I915_CONTEXT_PARAM_RINGSIZE

  Userspace PR for compute-driver has not been merged

- Drop I915_CONTEXT_PARAM_NO_ZEROMAP

  Userspace PR for libdrm / Beignet was never landed

- Drop CONTEXT_CLONE API

  Userspace PR for Mesa was never landed

- Drop getparam support for I915_CONTEXT_PARAM_ENGINES

  Only existed for symmetry wrt. setparam, never used.

- Disallow bonding of virtual engines

  Drop the prep work, no hardware has been released needing it.

- (Implicit) Disable gpu relocations

  Media userspace was the last userspace to still use them. They have converted so performance can be regained with an update.

Core Changes:

- Merge topic branch 'topic/i915-ttm-2021-06-11' (from Maarten)
- Merge topic branch 'topic/revid_steppings' (from Matt R)
- Merge topic branch 'topic/xehp-dg2-definitions-2021-07-21' (from Matt R)
- Backmerges drm-next (Rodrigo)

Driver Changes:

- Initial workarounds for ADL-P (Clint)
- Preliminary code for XeHP/DG2 (Stuart, Umesh, Matt R, Prathap, Ram, Venkata, Akeem, Tvrtko, John, Lucas)
- Fix ADL-S DMA mask size to 39 bits (Tejas)
- Remove code for CNL (Lucas)
- Add ADL-P GuC/HuC firmwares (John)
- Update HuC to 7.9.3 for TGL/ADL-S/RKL (John)
- Fix -EDEADLK handling regression (Ville)
- Implement Wa_1508744258 for DG1 and Gen12 iGFX (Jose)
- Extend Wa_1406941453 to ADL-S (Jose)
- Drop unnecessary workarounds per stepping for SKL/BXT/ICL (Matt R)
- Use fuse info to enable SFC on Gen12 (Venkata)
- Unconditionally flush the pages on acquire on EHL/JSL (Matt A)
- Probe existence of backing struct pages upon userptr creation (Chris, Matt A)
- Add an intermediate GEM proto-context to delay real context creation (Jason)
- Implement SINGLE_TIMELINE with a syncobj (Jason)
- Set the watchdog timeout directly in intel_context_set_gem (Jason)
- Disallow userspace from creating contexts with too many engines (Jason)
- Revert "drm/i915/gem: Asynchronous cmdparser" (Jason)
- Revert "drm/i915: Propagate errors on awaiting already signaled fences" (Jason)
- Revert "drm/i915: Skip over MI_NOOP when parsing" (Jason)
- Revert "drm/i915: Shrink the GEM kmem_caches upon idling" (Daniel)
- Always let TTM handle object migration (Jason)
- Correct the locking and pin pattern for dma-buf (Thomas H, Michael R, Jason)
- Migrate to system at dma-buf attach time (Thomas, Michael R)
- MAJOR refactoring of the GuC backend code to allow for enabling on Gen11+ (Matt B, John, Michal Wa., Fernando, Daniele, Vinay)
- Update GuC firmware interface to v62.0.0 (John, Michal Wa., Matt B)
- Add GuCRC feature to hand over the control of HW RC6 to the GuC on Gen12+ when GuC submission is enabled (Vinay, Sujaritha, Daniele, John, Tvrtko)
- Use the correct IRQ during resume and eliminate DRM IRQ midlayer (Thomas Z)
- Add pipelined page migration and clearing (Chris, Thomas H)
- Use TTM for system memory on discrete (Thomas H)
- Implement object migration for display vs. dma-buf (Thomas H)
- Perform execbuffer object locking as a separate step (Thomas H)
- Add support for explicit L3BANK steering (Matt, Daniele)
- Remove duplicated call to ops->pread (Daniel)
- Fix pagefault disabling in the first execbuf slowpath (Daniel)
- Simplify userptr locking (Thomas H)
- Improvements to the GuC CTB code (Matt B, John)
- Make GT workaround upper bounds exclusive (Matt R)
- Check for nomodeset in i915_init() first (Daniel)
- Delete now unused gpu reloc code (Daniel)
- Document RFC plans for GuC submission, DRM scheduler and new parallel submit uAPI (Matt B)
- Reintroduce buddy allocator this time with TTM (Matt A)
- Support forcing page size with LMEM (Matt A)
- Add i915_sched_engine to abstract a submission queue between backends (Matt B)
- Use accelerated move in TTM (Ram)
- Fix memory leaks from TTM backend (Thomas H)
- Introduce WW transaction helper (Thomas H)
- Improve debug Kconfig texts a bit (Daniel)
- Unify user object creation code (Jason)
- Use a table for i915_init/exit (Jason)
- Move slabs to module init/exit (Daniel)
- Remove now unused i915_globals (Daniel)
- Extract i915_module.c (Daniel)
- Consistently use adl-p/adl-s in WA comments (Jose)
- Finish INTEL_GEN and friends conversion (Lucas)
- Correct variable/function namings (Lucas)
- Code checker fixes (Wan, Matt A)
- Tracepoint improvements (Matt B)
- Kerneldoc improvements (Tvrtko, Jason, Matt A, Maarten)
- Selftest improvements (Chris, Matt A, Tejas, Thomas H, John, Matt B, Rahul, Vinay)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YQ0JmYiXhGskNcrI@jlahtine-mobl.ger.corp.intel.com
commit
25fed6b324
|
@ -422,9 +422,16 @@ Batchbuffer Parsing
|
|||
User Batchbuffer Execution
|
||||
--------------------------
|
||||
|
||||
.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_context_types.h
|
||||
|
||||
.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
|
||||
:doc: User command execution
|
||||
|
||||
Scheduling
|
||||
----------
|
||||
.. kernel-doc:: drivers/gpu/drm/i915/i915_scheduler_types.h
|
||||
:functions: i915_sched_engine
|
||||
|
||||
Logical Rings, Logical Ring Contexts and Execlists
|
||||
--------------------------------------------------
|
||||
|
||||
|
@ -518,6 +525,14 @@ GuC-based command submission
|
|||
.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
|
||||
:doc: GuC-based command submission
|
||||
|
||||
GuC ABI
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
|
||||
.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
|
||||
.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
|
||||
.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
|
||||
|
||||
HuC
|
||||
---
|
||||
.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/intel_huc.c
|
||||
|
|
|
@ -0,0 +1,122 @@
|
|||
/* SPDX-License-Identifier: MIT */
|
||||
/*
|
||||
* Copyright © 2021 Intel Corporation
|
||||
*/
|
||||
|
||||
#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see i915_context_engines_parallel_submit */
|
||||
|
||||
/**
|
||||
* struct drm_i915_context_engines_parallel_submit - Configure engine for
|
||||
* parallel submission.
|
||||
*
|
||||
* Setup a slot in the context engine map to allow multiple BBs to be submitted
|
||||
* in a single execbuf IOCTL. Those BBs will then be scheduled to run on the GPU
|
||||
* in parallel. Multiple hardware contexts are created internally in the i915
|
||||
* to run these BBs. Once a slot is configured for N BBs only N BBs can be
|
||||
* submitted in each execbuf IOCTL and this is implicit behavior e.g. The user
|
||||
* doesn't tell the execbuf IOCTL there are N BBs, the execbuf IOCTL knows how
|
||||
* many BBs there are based on the slot's configuration. The N BBs are the last
|
||||
* N buffer objects or first N if I915_EXEC_BATCH_FIRST is set.
|
||||
*
|
||||
* The default placement behavior is to create implicit bonds between each
|
||||
* context if each context maps to more than 1 physical engine (e.g. context is
|
||||
* a virtual engine). Also we only allow contexts of same engine class and these
|
||||
* contexts must be in logically contiguous order. Examples of the placement
|
||||
* behavior described below. Lastly, the default is to not allow BBs to
|
||||
* be preempted mid-BB but rather insert coordinated preemption on all hardware
|
||||
* contexts between each set of BBs. Flags may be added in the future to change
|
||||
* both of these default behaviors.
|
||||
*
|
||||
* Returns -EINVAL if hardware context placement configuration is invalid or if
|
||||
* the placement configuration isn't supported on the platform / submission
|
||||
* interface.
|
||||
* Returns -ENODEV if extension isn't supported on the platform / submission
|
||||
* interface.
|
||||
*
|
||||
* .. code-block:: none
|
||||
*
|
||||
* Example 1 pseudo code:
|
||||
* CS[X] = generic engine of same class, logical instance X
|
||||
* INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
|
||||
* set_engines(INVALID)
|
||||
* set_parallel(engine_index=0, width=2, num_siblings=1,
|
||||
* engines=CS[0],CS[1])
|
||||
*
|
||||
* Results in the following valid placement:
|
||||
* CS[0], CS[1]
|
||||
*
|
||||
* Example 2 pseudo code:
|
||||
* CS[X] = generic engine of same class, logical instance X
|
||||
* INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
|
||||
* set_engines(INVALID)
|
||||
* set_parallel(engine_index=0, width=2, num_siblings=2,
|
||||
* engines=CS[0],CS[2],CS[1],CS[3])
|
||||
*
|
||||
* Results in the following valid placements:
|
||||
* CS[0], CS[1]
|
||||
* CS[2], CS[3]
|
||||
*
|
||||
* This can also be thought of as 2 virtual engines described by 2-D array
|
||||
* in the engines field with bonds placed between each index of the
|
||||
* virtual engines. e.g. CS[0] is bonded to CS[1], CS[2] is bonded to
|
||||
* CS[3].
|
||||
* VE[0] = CS[0], CS[2]
|
||||
* VE[1] = CS[1], CS[3]
|
||||
*
|
||||
* Example 3 pseudo code:
|
||||
* CS[X] = generic engine of same class, logical instance X
|
||||
* INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
|
||||
* set_engines(INVALID)
|
||||
* set_parallel(engine_index=0, width=2, num_siblings=2,
|
||||
* engines=CS[0],CS[1],CS[1],CS[3])
|
||||
*
|
||||
* Results in the following valid and invalid placements:
|
||||
* CS[0], CS[1]
|
||||
* CS[1], CS[3] - Not logically contiguous, return -EINVAL
|
||||
*/
|
||||
struct drm_i915_context_engines_parallel_submit {
|
||||
/**
|
||||
* @base: base user extension.
|
||||
*/
|
||||
struct i915_user_extension base;
|
||||
|
||||
/**
|
||||
* @engine_index: slot for parallel engine
|
||||
*/
|
||||
__u16 engine_index;
|
||||
|
||||
/**
|
||||
* @width: number of contexts per parallel engine
|
||||
*/
|
||||
__u16 width;
|
||||
|
||||
/**
|
||||
* @num_siblings: number of siblings per context
|
||||
*/
|
||||
__u16 num_siblings;
|
||||
|
||||
/**
|
||||
* @mbz16: reserved for future use; must be zero
|
||||
*/
|
||||
__u16 mbz16;
|
||||
|
||||
/**
|
||||
* @flags: all undefined flags must be zero; currently no flags are defined
|
||||
*/
|
||||
__u64 flags;
|
||||
|
||||
/**
|
||||
* @mbz64: reserved for future use; must be zero
|
||||
*/
|
||||
__u64 mbz64[3];
|
||||
|
||||
/**
|
||||
* @engines: 2-d array of engine instances to configure parallel engine
|
||||
*
|
||||
* length = width (i) * num_siblings (j)
|
||||
* index = j + i * num_siblings
|
||||
*/
|
||||
struct i915_engine_class_instance engines[0];
|
||||
|
||||
} __packed;
|
||||
|
|
@ -0,0 +1,148 @@
|
|||
=========================================
|
||||
I915 GuC Submission/DRM Scheduler Section
|
||||
=========================================
|
||||
|
||||
Upstream plan
|
||||
=============
|
||||
For upstream the overall plan for landing GuC submission and integrating the
|
||||
i915 with the DRM scheduler is:
|
||||
|
||||
* Merge basic GuC submission
|
||||
* Basic submission support for all gen11+ platforms
|
||||
* Not enabled by default on any current platforms but can be enabled via
|
||||
modparam enable_guc
|
||||
* Lots of rework will need to be done to integrate with DRM scheduler so
|
||||
no need to nit pick everything in the code, it just should be
|
||||
functional, no major coding style / layering errors, and not regress
|
||||
execlists
|
||||
* Update IGTs / selftests as needed to work with GuC submission
|
||||
* Enable CI on supported platforms for a baseline
|
||||
* Rework / get CI healthy for GuC submission in place as needed
|
||||
* Merge new parallel submission uAPI
|
||||
* Bonding uAPI completely incompatible with GuC submission, plus it has
|
||||
severe design issues in general, which is why we want to retire it no
|
||||
matter what
|
||||
* New uAPI adds I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT context setup step
|
||||
which configures a slot with N contexts
|
||||
* After I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT a user can submit N batches to
|
||||
a slot in a single execbuf IOCTL and the batches run on the GPU in
|
||||
parallel
|
||||
* Initially only for GuC submission but execlists can be supported if
|
||||
needed
|
||||
* Convert the i915 to use the DRM scheduler
|
||||
* GuC submission backend fully integrated with DRM scheduler
|
||||
* All request queues removed from backend (e.g. all backpressure
|
||||
handled in DRM scheduler)
|
||||
* Resets / cancels hook in DRM scheduler
|
||||
* Watchdog hooks into DRM scheduler
|
||||
* Lots of complexity of the GuC backend can be pulled out once
|
||||
integrated with DRM scheduler (e.g. state machine gets
|
||||
simpler, locking gets simpler, etc...)
|
||||
* Execlists backend will do the minimum required to hook into the DRM scheduler
|
||||
* Legacy interface
|
||||
* Features like timeslicing / preemption / virtual engines would
|
||||
be difficult to integrate with the DRM scheduler and these
|
||||
features are not required for GuC submission as the GuC does
|
||||
these things for us
|
||||
* ROI low on fully integrating into DRM scheduler
|
||||
* Fully integrating would add lots of complexity to DRM
|
||||
scheduler
|
||||
* Port i915 priority inheritance / boosting feature in DRM scheduler
|
||||
* Used for i915 page flip, may be useful to other DRM drivers as
|
||||
well
|
||||
* Will be an optional feature in the DRM scheduler
|
||||
* Remove in-order completion assumptions from DRM scheduler
|
||||
* Even when using the DRM scheduler the backends will handle
|
||||
preemption, timeslicing, etc... so it is possible for jobs to
|
||||
finish out of order
|
||||
* Pull out i915 priority levels and use DRM priority levels
|
||||
* Optimize DRM scheduler as needed
|
||||
|
||||
TODOs for GuC submission upstream
|
||||
=================================
|
||||
|
||||
* Need an update to GuC firmware / i915 to enable error state capture
|
||||
* Open source tool to decode GuC logs
|
||||
* Public GuC spec
|
||||
|
||||
New uAPI for basic GuC submission
|
||||
=================================
|
||||
No major changes are required to the uAPI for basic GuC submission. The only
|
||||
change is a new scheduler attribute: I915_SCHEDULER_CAP_STATIC_PRIORITY_MAP.
|
||||
This attribute indicates the 2k i915 user priority levels are statically mapped
|
||||
into 3 levels as follows:
|
||||
|
||||
* -1k to -1 Low priority
|
||||
* 0 Medium priority
|
||||
* 1 to 1k High priority
|
||||
|
||||
This is needed because the GuC only has 4 priority bands. The highest priority
|
||||
band is reserved for the kernel. This aligns with the DRM scheduler priority
|
||||
levels too.
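
The static mapping described above can be sketched in C as follows. This is only an illustration of the three-band rule stated in this section; the enum and function names are hypothetical and are not the identifiers used by the i915 or the GuC interface.

```c
/* Hypothetical band names, for illustration only. */
enum prio_band_sketch {
	BAND_LOW,
	BAND_MEDIUM,
	BAND_HIGH,
};

/*
 * Map an i915 user priority onto one of the three user-visible bands:
 * negative values are low, zero is medium, positive values are high.
 * The fourth (highest) GuC band is reserved for the kernel and is
 * therefore never returned here.
 */
enum prio_band_sketch map_user_priority(int user_prio)
{
	if (user_prio < 0)
		return BAND_LOW;
	if (user_prio == 0)
		return BAND_MEDIUM;
	return BAND_HIGH;
}
```

With this sketch, two contexts at user priorities 1 and 1000 land in the same band, which is the behavior the new scheduler attribute is meant to advertise.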
|
||||
|
||||
Spec references:
|
||||
----------------
|
||||
* https://www.khronos.org/registry/EGL/extensions/IMG/EGL_IMG_context_priority.txt
|
||||
* https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/chap5.html#devsandqueues-priority
|
||||
* https://spec.oneapi.com/level-zero/latest/core/api.html#ze-command-queue-priority-t
|
||||
|
||||
New parallel submission uAPI
|
||||
============================
|
||||
The existing bonding uAPI is completely broken with GuC submission because
|
||||
whether a submission is a single context submit or parallel submit isn't known
|
||||
until execbuf time activated via the I915_SUBMIT_FENCE. To submit multiple
|
||||
contexts in parallel with the GuC the context must be explicitly registered with
|
||||
N contexts and all N contexts must be submitted in a single command to the GuC.
|
||||
The GuC interfaces do not support dynamically changing between N contexts as the
|
||||
bonding uAPI does. Hence the need for a new parallel submission interface. Also
|
||||
the legacy bonding uAPI is quite confusing and not intuitive at all. Furthermore
|
||||
I915_SUBMIT_FENCE is by design a future fence, so not really something we should
|
||||
continue to support.
|
||||
|
||||
The new parallel submission uAPI consists of 3 parts:
|
||||
|
||||
* Export engines logical mapping
|
||||
* A 'set_parallel' extension to configure contexts for parallel
|
||||
submission
|
||||
* Extend execbuf2 IOCTL to support submitting N BBs in a single IOCTL
|
||||
|
||||
Export engines logical mapping
|
||||
------------------------------
|
||||
Certain use cases require BBs to be placed on engine instances in logical order
|
||||
(e.g. split-frame on gen11+). The logical mapping of engine instances can change
|
||||
based on fusing. Rather than making UMDs be aware of fusing, simply expose the
|
||||
logical mapping with the existing query engine info IOCTL. Also the GuC
|
||||
submission interface currently only supports submitting multiple contexts to
|
||||
engines in logical order which is a new requirement compared to execlists.
|
||||
Lastly, all current platforms have at most 2 engine instances and the logical
|
||||
order is the same as uAPI order. This will change on platforms with more than 2
|
||||
engine instances.
|
||||
|
||||
A single bit will be added to drm_i915_engine_info.flags indicating that the
|
||||
logical instance has been returned and a new field,
|
||||
drm_i915_engine_info.logical_instance, returns the logical instance.
|
||||
|
||||
A 'set_parallel' extension to configure contexts for parallel submission
|
||||
------------------------------------------------------------------------
|
||||
The 'set_parallel' extension configures a slot for parallel submission of N BBs.
|
||||
It is a setup step that must be called before using any of the contexts. See
|
||||
I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE or I915_CONTEXT_ENGINES_EXT_BOND for
|
||||
similar existing examples. Once a slot is configured for parallel submission the
|
||||
execbuf2 IOCTL can be called submitting N BBs in a single IOCTL. Initially only
|
||||
supports GuC submission. Execlists support can be added later if needed.
|
||||
|
||||
Add I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT and
|
||||
drm_i915_context_engines_parallel_submit to the uAPI to implement this
|
||||
extension.
|
||||
|
||||
.. kernel-doc:: Documentation/gpu/rfc/i915_parallel_execbuf.h
|
||||
:functions: drm_i915_context_engines_parallel_submit
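
The layout of the engines array and the "logically contiguous" rule can be sketched in C as below. The indexing formula (index = j + i * num_siblings) comes from the structure documentation; the helper names, the use of plain logical instance numbers in place of struct i915_engine_class_instance, and the exact form of the check are assumptions made for illustration and may differ from the kernel's validation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flat index of sibling j of context i, as documented for @engines. */
size_t parallel_engine_index(size_t i, size_t j, size_t num_siblings)
{
	return j + i * num_siblings;
}

/*
 * Check every placement (one sibling j taken from each context i) and
 * require the logical instances to increase by exactly one from one
 * context to the next. "logical" holds width * num_siblings entries.
 */
bool placements_contiguous(const unsigned int *logical, size_t width,
			   size_t num_siblings)
{
	for (size_t j = 0; j < num_siblings; j++) {
		for (size_t i = 1; i < width; i++) {
			unsigned int prev =
				logical[parallel_engine_index(i - 1, j, num_siblings)];
			unsigned int cur =
				logical[parallel_engine_index(i, j, num_siblings)];

			if (cur != prev + 1)
				return false;
		}
	}
	return true;
}
```

Applied to the examples in the structure documentation, the instance list 0, 2, 1, 3 (Example 2) passes the check, while 0, 1, 1, 3 (Example 3) fails on the second placement because CS[1] and CS[3] are not adjacent.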
|
||||
|
||||
Extend execbuf2 IOCTL to support submitting N BBs in a single IOCTL
|
||||
-------------------------------------------------------------------
|
||||
Contexts that have been configured with the 'set_parallel' extension can only
|
||||
submit N BBs in a single execbuf2 IOCTL. The BBs are either the last N objects
|
||||
in the drm_i915_gem_exec_object2 list or the first N if I915_EXEC_BATCH_FIRST is
|
||||
set. The number of BBs is implicit based on the slot submitted and how it has
|
||||
been configured by 'set_parallel' or other extensions. No uAPI changes are
|
||||
required to the execbuf2 IOCTL.
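
The implicit selection of batch buffers described above can be sketched in C as follows. The function name and its error convention are hypothetical; the sketch only encodes the rule from this section (last N objects, or first N when I915_EXEC_BATCH_FIRST is set) and is not taken from the execbuf2 implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the index of the first batch buffer in the exec object list
 * for a slot configured with n_batches BBs, or -1 if the list cannot
 * hold that many batches. batch_first stands in for the caller having
 * set I915_EXEC_BATCH_FIRST.
 */
long first_batch_index(size_t buffer_count, size_t n_batches, bool batch_first)
{
	if (n_batches == 0 || n_batches > buffer_count)
		return -1;

	return batch_first ? 0 : (long)(buffer_count - n_batches);
}
```

For a list of five objects on a slot configured for two BBs, the batches would be objects 3 and 4 by default, or objects 0 and 1 with the batch-first flag.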
|
|
@ -19,3 +19,7 @@ host such documentation:
|
|||
.. toctree::
|
||||
|
||||
i915_gem_lmem.rst
|
||||
|
||||
.. toctree::
|
||||
|
||||
i915_scheduler.rst
|
||||
|
|
|
@ -207,6 +207,8 @@ config DRM_I915_LOW_LEVEL_TRACEPOINTS
|
|||
This provides the ability to precisely monitor engine utilisation
|
||||
and also analyze the request dependency resolving timeline.
|
||||
|
||||
Recommended for driver developers only.
|
||||
|
||||
If in doubt, say "N".
|
||||
|
||||
config DRM_I915_DEBUG_VBLANK_EVADE
|
||||
|
@ -220,6 +222,8 @@ config DRM_I915_DEBUG_VBLANK_EVADE
|
|||
is exceeded, even if there isn't an actual risk of missing
|
||||
the vblank.
|
||||
|
||||
Recommended for driver developers only.
|
||||
|
||||
If in doubt, say "N".
|
||||
|
||||
config DRM_I915_DEBUG_RUNTIME_PM
|
||||
|
@ -232,4 +236,6 @@ config DRM_I915_DEBUG_RUNTIME_PM
|
|||
runtime PM functionality. This may introduce overhead during
|
||||
driver loading, suspend and resume operations.
|
||||
|
||||
Recommended for driver developers only.
|
||||
|
||||
If in doubt, say "N"
|
||||
|
|
|
@ -38,6 +38,7 @@ i915-y += i915_drv.o \
|
|||
i915_irq.o \
|
||||
i915_getparam.o \
|
||||
i915_mitigations.o \
|
||||
i915_module.o \
|
||||
i915_params.o \
|
||||
i915_pci.o \
|
||||
i915_scatterlist.o \
|
||||
|
@ -89,7 +90,6 @@ gt-y += \
|
|||
gt/gen8_ppgtt.o \
|
||||
gt/intel_breadcrumbs.o \
|
||||
gt/intel_context.o \
|
||||
gt/intel_context_param.o \
|
||||
gt/intel_context_sseu.o \
|
||||
gt/intel_engine_cs.o \
|
||||
gt/intel_engine_heartbeat.o \
|
||||
|
@ -108,6 +108,7 @@ gt-y += \
|
|||
gt/intel_gtt.o \
|
||||
gt/intel_llc.o \
|
||||
gt/intel_lrc.o \
|
||||
gt/intel_migrate.o \
|
||||
gt/intel_mocs.o \
|
||||
gt/intel_ppgtt.o \
|
||||
gt/intel_rc6.o \
|
||||
|
@ -135,7 +136,6 @@ i915-y += $(gt-y)
|
|||
gem-y += \
|
||||
gem/i915_gem_busy.o \
|
||||
gem/i915_gem_clflush.o \
|
||||
gem/i915_gem_client_blt.o \
|
||||
gem/i915_gem_context.o \
|
||||
gem/i915_gem_create.o \
|
||||
gem/i915_gem_dmabuf.o \
|
||||
|
@ -143,7 +143,6 @@ gem-y += \
|
|||
gem/i915_gem_execbuffer.o \
|
||||
gem/i915_gem_internal.o \
|
||||
gem/i915_gem_object.o \
|
||||
gem/i915_gem_object_blt.o \
|
||||
gem/i915_gem_lmem.o \
|
||||
gem/i915_gem_mman.o \
|
||||
gem/i915_gem_pages.o \
|
||||
|
@ -162,15 +161,17 @@ gem-y += \
|
|||
i915-y += \
|
||||
$(gem-y) \
|
||||
i915_active.o \
|
||||
i915_buddy.o \
|
||||
i915_cmd_parser.o \
|
||||
i915_gem_evict.o \
|
||||
i915_gem_gtt.o \
|
||||
i915_gem_ww.o \
|
||||
i915_gem.o \
|
||||
i915_globals.o \
|
||||
i915_query.o \
|
||||
i915_request.o \
|
||||
i915_scheduler.o \
|
||||
i915_trace_points.o \
|
||||
i915_ttm_buddy_manager.o \
|
||||
i915_vma.o \
|
||||
intel_wopcm.o
|
||||
|
||||
|
@ -185,6 +186,8 @@ i915-y += gt/uc/intel_uc.o \
|
|||
gt/uc/intel_guc_fw.o \
|
||||
gt/uc/intel_guc_log.o \
|
||||
gt/uc/intel_guc_log_debugfs.o \
|
||||
gt/uc/intel_guc_rc.o \
|
||||
gt/uc/intel_guc_slpc.o \
|
||||
gt/uc/intel_guc_submission.o \
|
||||
gt/uc/intel_huc.o \
|
||||
gt/uc/intel_huc_debugfs.o \
|
||||
|
@ -277,7 +280,9 @@ i915-y += i915_perf.o
|
|||
# Post-mortem debug and GPU hang state capture
|
||||
i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o
|
||||
i915-$(CONFIG_DRM_I915_SELFTEST) += \
|
||||
gem/selftests/i915_gem_client_blt.o \
|
||||
gem/selftests/igt_gem_utils.o \
|
||||
selftests/intel_scheduler_helpers.o \
|
||||
selftests/i915_random.o \
|
||||
selftests/i915_selftest.o \
|
||||
selftests/igt_atomic.o \
|
||||
|
|
|
@ -1331,6 +1331,9 @@ retry:
|
|||
ret = i915_gem_object_lock(obj, &ww);
|
||||
if (!ret && phys_cursor)
|
||||
ret = i915_gem_object_attach_phys(obj, alignment);
|
||||
else if (!ret && HAS_LMEM(dev_priv))
|
||||
ret = i915_gem_object_migrate(obj, &ww, INTEL_REGION_LMEM);
|
||||
/* TODO: Do we need to sync when migration becomes async? */
|
||||
if (!ret)
|
||||
ret = i915_gem_object_pin_pages(obj);
|
||||
if (ret)
|
||||
|
@ -11778,7 +11781,7 @@ intel_user_framebuffer_create(struct drm_device *dev,
|
|||
|
||||
/* object is backed with LMEM for discrete */
|
||||
i915 = to_i915(obj->base.dev);
|
||||
if (HAS_LMEM(i915) && !i915_gem_object_validates_to_lmem(obj)) {
|
||||
if (HAS_LMEM(i915) && !i915_gem_object_can_migrate(obj, INTEL_REGION_LMEM)) {
|
||||
/* object is "remote", not in local memory */
|
||||
i915_gem_object_put(obj);
|
||||
return ERR_PTR(-EREMOTE);
|
||||
|
|
|
@ -5799,7 +5799,7 @@ static void tgl_bw_buddy_init(struct drm_i915_private *dev_priv)
|
|||
int config, i;
|
||||
|
||||
if (IS_ALDERLAKE_S(dev_priv) ||
|
||||
IS_DG1_REVID(dev_priv, DG1_REVID_A0, DG1_REVID_A0) ||
|
||||
IS_DG1_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
|
||||
IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_B0))
|
||||
/* Wa_1409767108:tgl,dg1,adl-s */
|
||||
table = wa_1409767108_buddy_page_masks;
|
||||
|
|
|
@ -2667,15 +2667,15 @@ static bool cnl_ddi_hdmi_pll_dividers(struct intel_crtc_state *crtc_state)
|
|||
}
|
||||
|
||||
/*
|
||||
* Display WA #22010492432: ehl, tgl
|
||||
* Display WA #22010492432: ehl, tgl, adl-p
|
||||
* Program half of the nominal DCO divider fraction value.
|
||||
*/
|
||||
static bool
|
||||
ehl_combo_pll_div_frac_wa_needed(struct drm_i915_private *i915)
|
||||
{
|
||||
return ((IS_PLATFORM(i915, INTEL_ELKHARTLAKE) &&
|
||||
IS_JSL_EHL_REVID(i915, EHL_REVID_B0, REVID_FOREVER)) ||
|
||||
IS_TIGERLAKE(i915)) &&
|
||||
IS_JSL_EHL_DISPLAY_STEP(i915, STEP_B0, STEP_FOREVER)) ||
|
||||
IS_TIGERLAKE(i915) || IS_ALDERLAKE_P(i915)) &&
|
||||
i915->dpll.ref_clks.nssc == 38400;
|
||||
}
|
||||
|
||||
|
|
|
@ -594,7 +594,7 @@ static void hsw_activate_psr2(struct intel_dp *intel_dp)
|
|||
if (intel_dp->psr.psr2_sel_fetch_enabled) {
|
||||
/* WA 1408330847 */
|
||||
if (IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
|
||||
IS_RKL_REVID(dev_priv, RKL_REVID_A0, RKL_REVID_A0))
|
||||
IS_RKL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0))
|
||||
intel_de_rmw(dev_priv, CHICKEN_PAR1_1,
|
||||
DIS_RAM_BYPASS_PSR2_MAN_TRACK,
|
||||
DIS_RAM_BYPASS_PSR2_MAN_TRACK);
|
||||
|
@ -1342,7 +1342,7 @@ static void intel_psr_disable_locked(struct intel_dp *intel_dp)
|
|||
/* WA 1408330847 */
|
||||
if (intel_dp->psr.psr2_sel_fetch_enabled &&
|
||||
(IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
|
||||
IS_RKL_REVID(dev_priv, RKL_REVID_A0, RKL_REVID_A0)))
|
||||
IS_RKL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0)))
|
||||
intel_de_rmw(dev_priv, CHICKEN_PAR1_1,
|
||||
DIS_RAM_BYPASS_PSR2_MAN_TRACK, 0);
|
||||
|
||||
|
|
|
@ -24,13 +24,11 @@ static void __do_clflush(struct drm_i915_gem_object *obj)
|
|||
i915_gem_object_flush_frontbuffer(obj, ORIGIN_CPU);
|
||||
}
|
||||
|
||||
static int clflush_work(struct dma_fence_work *base)
|
||||
static void clflush_work(struct dma_fence_work *base)
|
||||
{
|
||||
struct clflush *clflush = container_of(base, typeof(*clflush), base);
|
||||
|
||||
__do_clflush(clflush->obj);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void clflush_release(struct dma_fence_work *base)
|
||||
|
|
|
@ -1,355 +0,0 @@
|
|||
// SPDX-License-Identifier: MIT
|
||||
/*
|
||||
* Copyright © 2019 Intel Corporation
|
||||
*/
|
||||
|
||||
#include "i915_drv.h"
|
||||
#include "gt/intel_context.h"
|
||||
#include "gt/intel_engine_pm.h"
|
||||
#include "i915_gem_client_blt.h"
|
||||
#include "i915_gem_object_blt.h"
|
||||
|
||||
struct i915_sleeve {
|
||||
struct i915_vma *vma;
|
||||
struct drm_i915_gem_object *obj;
|
||||
struct sg_table *pages;
|
||||
struct i915_page_sizes page_sizes;
|
||||
};
|
||||
|
||||
static int vma_set_pages(struct i915_vma *vma)
|
||||
{
|
||||
struct i915_sleeve *sleeve = vma->private;
|
||||
|
||||
vma->pages = sleeve->pages;
|
||||
vma->page_sizes = sleeve->page_sizes;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void vma_clear_pages(struct i915_vma *vma)
|
||||
{
|
||||
GEM_BUG_ON(!vma->pages);
|
||||
vma->pages = NULL;
|
||||
}
|
||||
|
||||
static void vma_bind(struct i915_address_space *vm,
|
||||
struct i915_vm_pt_stash *stash,
|
||||
struct i915_vma *vma,
|
||||
enum i915_cache_level cache_level,
|
||||
u32 flags)
|
||||
{
|
||||
vm->vma_ops.bind_vma(vm, stash, vma, cache_level, flags);
|
||||
}
|
||||
|
||||
static void vma_unbind(struct i915_address_space *vm, struct i915_vma *vma)
|
||||
{
|
||||
vm->vma_ops.unbind_vma(vm, vma);
|
||||
}
|
||||
|
||||
static const struct i915_vma_ops proxy_vma_ops = {
|
||||
.set_pages = vma_set_pages,
|
||||
.clear_pages = vma_clear_pages,
|
||||
.bind_vma = vma_bind,
|
||||
.unbind_vma = vma_unbind,
|
||||
};
|
||||
|
||||
static struct i915_sleeve *create_sleeve(struct i915_address_space *vm,
|
||||
struct drm_i915_gem_object *obj,
|
||||
struct sg_table *pages,
|
||||
struct i915_page_sizes *page_sizes)
|
||||
{
|
||||
struct i915_sleeve *sleeve;
|
||||
struct i915_vma *vma;
|
||||
int err;
|
||||
|
||||
sleeve = kzalloc(sizeof(*sleeve), GFP_KERNEL);
|
||||
if (!sleeve)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
vma = i915_vma_instance(obj, vm, NULL);
|
||||
if (IS_ERR(vma)) {
|
||||
err = PTR_ERR(vma);
|
||||
goto err_free;
|
||||
}
|
||||
|
||||
vma->private = sleeve;
|
||||
vma->ops = &proxy_vma_ops;
|
||||
|
||||
sleeve->vma = vma;
|
||||
sleeve->pages = pages;
|
||||
sleeve->page_sizes = *page_sizes;
|
||||
|
||||
return sleeve;
|
||||
|
||||
err_free:
|
||||
kfree(sleeve);
|
||||
return ERR_PTR(err);
|
||||
}
|
||||
|
||||
static void destroy_sleeve(struct i915_sleeve *sleeve)
|
||||
{
|
||||
kfree(sleeve);
|
||||
}
|
||||
|
||||
struct clear_pages_work {
|
||||
struct dma_fence dma;
|
||||
struct dma_fence_cb cb;
|
||||
struct i915_sw_fence wait;
|
||||
struct work_struct work;
|
||||
struct irq_work irq_work;
|
||||
struct i915_sleeve *sleeve;
|
||||
struct intel_context *ce;
|
||||
u32 value;
|
||||
};
|
||||
|
||||
static const char *clear_pages_work_driver_name(struct dma_fence *fence)
|
||||
{
|
||||
return DRIVER_NAME;
|
||||
}
|
||||
|
||||
static const char *clear_pages_work_timeline_name(struct dma_fence *fence)
|
||||
{
|
||||
return "clear";
|
||||
}
|
||||
|
||||
static void clear_pages_work_release(struct dma_fence *fence)
|
||||
{
|
||||
struct clear_pages_work *w = container_of(fence, typeof(*w), dma);
|
||||
|
||||
destroy_sleeve(w->sleeve);
|
||||
|
||||
i915_sw_fence_fini(&w->wait);
|
||||
|
||||
BUILD_BUG_ON(offsetof(typeof(*w), dma));
|
||||
dma_fence_free(&w->dma);
|
||||
}
|
||||
|
||||
static const struct dma_fence_ops clear_pages_work_ops = {
|
||||
.get_driver_name = clear_pages_work_driver_name,
|
||||
.get_timeline_name = clear_pages_work_timeline_name,
|
||||
.release = clear_pages_work_release,
|
||||
};
|
||||
|
||||
static void clear_pages_signal_irq_worker(struct irq_work *work)
|
||||
{
|
||||
struct clear_pages_work *w = container_of(work, typeof(*w), irq_work);
|
||||
|
||||
dma_fence_signal(&w->dma);
|
||||
dma_fence_put(&w->dma);
|
||||
}
|
||||
|
||||
static void clear_pages_dma_fence_cb(struct dma_fence *fence,
|
||||
struct dma_fence_cb *cb)
|
||||
{
|
||||
struct clear_pages_work *w = container_of(cb, typeof(*w), cb);
|
||||
|
||||
if (fence->error)
|
||||
dma_fence_set_error(&w->dma, fence->error);
|
||||
|
||||
/*
|
||||
* Push the signalling of the fence into yet another worker to avoid
|
||||
* the nightmare locking around the fence spinlock.
|
||||
*/
|
||||
irq_work_queue(&w->irq_work);
|
||||
}
|
||||
|
||||
static void clear_pages_worker(struct work_struct *work)
|
||||
{
|
||||
struct clear_pages_work *w = container_of(work, typeof(*w), work);
|
||||
struct drm_i915_gem_object *obj = w->sleeve->vma->obj;
|
||||
struct i915_vma *vma = w->sleeve->vma;
|
||||
struct i915_gem_ww_ctx ww;
|
||||
struct i915_request *rq;
|
||||
struct i915_vma *batch;
|
||||
int err = w->dma.error;
|
||||
|
||||
if (unlikely(err))
|
||||
goto out_signal;
|
||||
|
||||
if (obj->cache_dirty) {
|
||||
if (i915_gem_object_has_struct_page(obj))
|
||||
drm_clflush_sg(w->sleeve->pages);
|
||||
obj->cache_dirty = false;
|
||||
}
|
||||
obj->read_domains = I915_GEM_GPU_DOMAINS;
|
||||
obj->write_domain = 0;
|
||||
|
||||
i915_gem_ww_ctx_init(&ww, false);
|
||||
intel_engine_pm_get(w->ce->engine);
|
||||
retry:
|
||||
err = intel_context_pin_ww(w->ce, &ww);
|
||||
if (err)
|
||||
goto out_signal;
|
||||
|
||||
batch = intel_emit_vma_fill_blt(w->ce, vma, &ww, w->value);
|
||||
if (IS_ERR(batch)) {
|
||||
err = PTR_ERR(batch);
|
||||
goto out_ctx;
|
||||
}
|
||||
|
||||
rq = i915_request_create(w->ce);
|
||||
if (IS_ERR(rq)) {
|
||||
err = PTR_ERR(rq);
|
||||
goto out_batch;
|
||||
}
|
||||
|
||||
/* There's no way the fence has signalled */
|
||||
if (dma_fence_add_callback(&rq->fence, &w->cb,
|
||||
clear_pages_dma_fence_cb))
|
||||
GEM_BUG_ON(1);
|
||||
|
||||
err = intel_emit_vma_mark_active(batch, rq);
|
||||
if (unlikely(err))
|
||||
goto out_request;
|
||||
|
||||
/*
|
||||
* w->dma is already exported via (vma|obj)->resv we need only
|
||||
* keep track of the GPU activity within this vma/request, and
|
||||
* propagate the signal from the request to w->dma.
|
||||
*/
|
||||
err = __i915_vma_move_to_active(vma, rq);
|
||||
if (err)
|
||||
goto out_request;
|
||||
|
||||
if (rq->engine->emit_init_breadcrumb) {
|
||||
err = rq->engine->emit_init_breadcrumb(rq);
|
||||
if (unlikely(err))
|
||||
goto out_request;
|
||||
}
|
||||
|
||||
err = rq->engine->emit_bb_start(rq,
|
||||
batch->node.start, batch->node.size,
|
||||
0);
|
||||
out_request:
|
||||
if (unlikely(err)) {
|
||||
i915_request_set_error_once(rq, err);
|
||||
err = 0;
|
||||
}
|
||||
|
||||
i915_request_add(rq);
|
||||
out_batch:
|
||||
intel_emit_vma_release(w->ce, batch);
|
||||
out_ctx:
|
||||
intel_context_unpin(w->ce);
|
||||
out_signal:
|
||||
if (err == -EDEADLK) {
|
||||
err = i915_gem_ww_ctx_backoff(&ww);
|
||||
if (!err)
|
||||
goto retry;
|
||||
}
|
||||
i915_gem_ww_ctx_fini(&ww);
|
||||
|
||||
i915_vma_unpin(w->sleeve->vma);
|
||||
intel_engine_pm_put(w->ce->engine);
|
||||
|
||||
if (unlikely(err)) {
|
||||
dma_fence_set_error(&w->dma, err);
|
||||
dma_fence_signal(&w->dma);
|
||||
dma_fence_put(&w->dma);
|
||||
}
|
||||
}
|
||||
|
||||
static int pin_wait_clear_pages_work(struct clear_pages_work *w,
|
||||
struct intel_context *ce)
|
||||
{
|
||||
struct i915_vma *vma = w->sleeve->vma;
|
||||
struct i915_gem_ww_ctx ww;
|
||||
int err;
|
||||
|
||||
i915_gem_ww_ctx_init(&ww, false);
|
||||
retry:
|
||||
err = i915_gem_object_lock(vma->obj, &ww);
|
||||
if (err)
|
||||
goto out;
|
||||
|
||||
err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER);
|
||||
if (unlikely(err))
|
||||
goto out;
|
||||
|
||||
err = i915_sw_fence_await_reservation(&w->wait,
|
||||
vma->obj->base.resv, NULL,
|
||||
true, 0, I915_FENCE_GFP);
|
||||
if (err)
|
||||
goto err_unpin_vma;
|
||||
|
||||
dma_resv_add_excl_fence(vma->obj->base.resv, &w->dma);
|
||||
|
||||
err_unpin_vma:
|
||||
if (err)
|
||||
i915_vma_unpin(vma);
|
||||
out:
|
||||
if (err == -EDEADLK) {
|
||||
err = i915_gem_ww_ctx_backoff(&ww);
|
||||
if (!err)
|
||||
goto retry;
|
||||
}
|
||||
i915_gem_ww_ctx_fini(&ww);
|
||||
return err;
|
||||
}
|
||||
|
||||
static int __i915_sw_fence_call
|
||||
clear_pages_work_notify(struct i915_sw_fence *fence,
|
||||
enum i915_sw_fence_notify state)
|
||||
{
|
||||
struct clear_pages_work *w = container_of(fence, typeof(*w), wait);
|
||||
|
||||
switch (state) {
|
||||
case FENCE_COMPLETE:
|
||||
schedule_work(&w->work);
|
||||
break;
|
||||
|
||||
case FENCE_FREE:
|
||||
dma_fence_put(&w->dma);
|
||||
break;
|
||||
}
|
||||
|
||||
return NOTIFY_DONE;
|
||||
}
|
||||
|
||||
static DEFINE_SPINLOCK(fence_lock);
|
||||
|
||||
/* XXX: better name please */
|
||||
int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj,
|
||||
struct intel_context *ce,
|
||||
struct sg_table *pages,
|
||||
struct i915_page_sizes *page_sizes,
|
||||
u32 value)
|
||||
{
|
||||
struct clear_pages_work *work;
|
||||
struct i915_sleeve *sleeve;
|
||||
int err;
|
||||
|
||||
sleeve = create_sleeve(ce->vm, obj, pages, page_sizes);
|
||||
if (IS_ERR(sleeve))
|
||||
return PTR_ERR(sleeve);
|
||||
|
||||
work = kmalloc(sizeof(*work), GFP_KERNEL);
|
||||
if (!work) {
|
||||
destroy_sleeve(sleeve);
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
work->value = value;
|
||||
work->sleeve = sleeve;
|
||||
work->ce = ce;
|
||||
|
||||
INIT_WORK(&work->work, clear_pages_worker);
|
||||
|
||||
init_irq_work(&work->irq_work, clear_pages_signal_irq_worker);
|
||||
|
||||
dma_fence_init(&work->dma, &clear_pages_work_ops, &fence_lock, 0, 0);
|
||||
i915_sw_fence_init(&work->wait, clear_pages_work_notify);
|
||||
|
||||
err = pin_wait_clear_pages_work(work, ce);
|
||||
if (err < 0)
|
||||
dma_fence_set_error(&work->dma, err);
|
||||
|
||||
dma_fence_get(&work->dma);
|
||||
i915_sw_fence_commit(&work->wait);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
|
||||
#include "selftests/i915_gem_client_blt.c"
|
||||
#endif
|
|
@ -1,21 +0,0 @@
|
|||
/* SPDX-License-Identifier: MIT */
|
||||
/*
|
||||
* Copyright © 2019 Intel Corporation
|
||||
*/
|
||||
#ifndef __I915_GEM_CLIENT_BLT_H__
|
||||
#define __I915_GEM_CLIENT_BLT_H__
|
||||
|
||||
#include <linux/types.h>
|
||||
|
||||
struct drm_i915_gem_object;
|
||||
struct i915_page_sizes;
|
||||
struct intel_context;
|
||||
struct sg_table;
|
||||
|
||||
int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj,
|
||||
struct intel_context *ce,
|
||||
struct sg_table *pages,
|
||||
struct i915_page_sizes *page_sizes,
|
||||
u32 value);
|
||||
|
||||
#endif
|
File diff suppressed because it is too large
|
@ -133,6 +133,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
|
|||
int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
|
||||
struct drm_file *file);
|
||||
|
||||
struct i915_gem_context *
|
||||
i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
|
||||
|
||||
static inline struct i915_gem_context *
|
||||
i915_gem_context_get(struct i915_gem_context *ctx)
|
||||
{
|
||||
|
@ -221,6 +224,9 @@ i915_gem_engines_iter_next(struct i915_gem_engines_iter *it);
|
|||
for (i915_gem_engines_iter_init(&(it), (engines)); \
|
||||
((ce) = i915_gem_engines_iter_next(&(it)));)
|
||||
|
||||
void i915_gem_context_module_exit(void);
|
||||
int i915_gem_context_module_init(void);
|
||||
|
||||
struct i915_lut_handle *i915_lut_handle_alloc(void);
|
||||
void i915_lut_handle_free(struct i915_lut_handle *lut);
|
||||
|
||||
|
|
|
@ -30,22 +30,176 @@ struct i915_address_space;
|
|||
struct intel_timeline;
|
||||
struct intel_ring;
|
||||
|
||||
/**
|
||||
* struct i915_gem_engines - A set of engines
|
||||
*/
|
||||
struct i915_gem_engines {
|
||||
union {
|
||||
/** @link: Link in i915_gem_context::stale::engines */
|
||||
struct list_head link;
|
||||
|
||||
/** @rcu: RCU to use when freeing */
|
||||
struct rcu_head rcu;
|
||||
};
|
||||
|
||||
/** @fence: Fence used for delayed destruction of engines */
|
||||
struct i915_sw_fence fence;
|
||||
|
||||
/** @ctx: i915_gem_context backpointer */
|
||||
struct i915_gem_context *ctx;
|
||||
|
||||
/** @num_engines: Number of engines in this set */
|
||||
unsigned int num_engines;
|
||||
|
||||
/** @engines: Array of engines */
|
||||
struct intel_context *engines[];
|
||||
};
|
||||
|
||||
/**
|
||||
* struct i915_gem_engines_iter - Iterator for an i915_gem_engines set
|
||||
*/
|
||||
struct i915_gem_engines_iter {
|
||||
/** @idx: Index into i915_gem_engines::engines */
|
||||
unsigned int idx;
|
||||
|
||||
/** @engines: Engine set being iterated */
|
||||
const struct i915_gem_engines *engines;
|
||||
};
|
||||
|
||||
/**
|
||||
* enum i915_gem_engine_type - Describes the type of an i915_gem_proto_engine
|
||||
*/
|
||||
enum i915_gem_engine_type {
|
||||
/** @I915_GEM_ENGINE_TYPE_INVALID: An invalid engine */
|
||||
I915_GEM_ENGINE_TYPE_INVALID = 0,
|
||||
|
||||
/** @I915_GEM_ENGINE_TYPE_PHYSICAL: A single physical engine */
|
||||
I915_GEM_ENGINE_TYPE_PHYSICAL,
|
||||
|
||||
/** @I915_GEM_ENGINE_TYPE_BALANCED: A load-balanced engine set */
|
||||
I915_GEM_ENGINE_TYPE_BALANCED,
|
||||
};
|
||||
|
||||
/**
|
||||
* struct i915_gem_proto_engine - prototype engine
|
||||
*
|
||||
* This struct describes an engine that a context may contain. Engines
|
||||
* have three types:
|
||||
*
|
||||
* - I915_GEM_ENGINE_TYPE_INVALID: Invalid engines can be created but they
|
||||
* show up as a NULL in i915_gem_engines::engines[i] and any attempt to
|
||||
* use them by the user results in -EINVAL. They are also useful during
|
||||
* proto-context construction because the client may create invalid
|
||||
* engines and then set them up later as virtual engines.
|
||||
*
|
||||
* - I915_GEM_ENGINE_TYPE_PHYSICAL: A single physical engine, described by
|
||||
* i915_gem_proto_engine::engine.
|
||||
*
|
||||
* - I915_GEM_ENGINE_TYPE_BALANCED: A load-balanced engine set, described by
|
||||
* i915_gem_proto_engine::num_siblings and i915_gem_proto_engine::siblings.
|
||||
*/
|
||||
struct i915_gem_proto_engine {
|
||||
/** @type: Type of this engine */
|
||||
enum i915_gem_engine_type type;
|
||||
|
||||
/** @engine: Engine, for physical */
|
||||
struct intel_engine_cs *engine;
|
||||
|
||||
/** @num_siblings: Number of balanced siblings */
|
||||
unsigned int num_siblings;
|
||||
|
||||
/** @siblings: Balanced siblings */
|
||||
struct intel_engine_cs **siblings;
|
||||
|
||||
/** @sseu: Client-set SSEU parameters */
|
||||
struct intel_sseu sseu;
|
||||
};
|
||||
|
||||
/**
|
||||
* struct i915_gem_proto_context - prototype context
|
||||
*
|
||||
* The struct i915_gem_proto_context represents the creation parameters for
|
||||
* a struct i915_gem_context. This is used to gather parameters provided
|
||||
* either through creation flags or via SET_CONTEXT_PARAM so that, when we
|
||||
* create the final i915_gem_context, those parameters can be immutable.
|
||||
*
|
||||
* The context uAPI allows for two methods of setting context parameters:
|
||||
* SET_CONTEXT_PARAM and CONTEXT_CREATE_EXT_SETPARAM. The former is
|
||||
* allowed to be called at any time while the latter happens as part of
|
||||
* GEM_CONTEXT_CREATE. Currently,
|
||||
* everything settable via one is settable via the other. While some
|
||||
* params are fairly simple and setting them on a live context is harmless
|
||||
* such as the context priority, others are far trickier such as the VM or the
|
||||
* set of engines. To avoid some truly nasty race conditions, we don't
|
||||
* allow setting the VM or the set of engines on live contexts.
|
||||
*
|
||||
* The way we dealt with this without breaking older userspace that sets
|
||||
* the VM or engine set via SET_CONTEXT_PARAM is to delay the creation of
|
||||
* the actual context until after the client is done configuring it with
|
||||
* SET_CONTEXT_PARAM. From the perspective of the client, it has the same
|
||||
* u32 context ID the whole time. From the perspective of i915, however,
|
||||
* it's an i915_gem_proto_context right up until the point where we attempt
|
||||
* to do something which the proto-context can't handle at which point the
|
||||
* real context gets created.
|
||||
*
|
||||
* This is accomplished via a little xarray dance. When GEM_CONTEXT_CREATE
|
||||
* is called, we create a proto-context, reserve a slot in context_xa but
|
||||
* leave it NULL, and put the proto-context in the corresponding slot in
|
||||
* proto_context_xa. Then, whenever we go to look up a context, we first
|
||||
* check context_xa. If it's there, we return the i915_gem_context and
|
||||
* we're done. If it's not, we look in proto_context_xa and, if we find it
|
||||
* there, we create the actual context and kill the proto-context.
|
||||
*
|
||||
* At the time we made this change (April, 2021), we did a fairly complete
|
||||
* audit of existing userspace to ensure this wouldn't break anything:
|
||||
*
|
||||
* - Mesa/i965 didn't use the engines or VM APIs at all
|
||||
*
|
||||
* - Mesa/ANV used the engines API but via CONTEXT_CREATE_EXT_SETPARAM and
|
||||
* didn't use the VM API.
|
||||
*
|
||||
* - Mesa/iris didn't use the engines or VM APIs at all
|
||||
*
|
||||
* - The open-source compute-runtime didn't yet use the engines API but
|
||||
* did use the VM API via SET_CONTEXT_PARAM. However, CONTEXT_SETPARAM
|
||||
* was always the second ioctl on that context, immediately following
|
||||
* GEM_CONTEXT_CREATE.
|
||||
*
|
||||
* - The media driver sets engines and bonding/balancing via
|
||||
* SET_CONTEXT_PARAM. However, CONTEXT_SETPARAM to set the VM was
|
||||
* always the second ioctl on that context, immediately following
|
||||
* GEM_CONTEXT_CREATE and setting engines immediately followed that.
|
||||
*
|
||||
* In order for this dance to work properly, any modification to an
|
||||
* i915_gem_proto_context that is exposed to the client via
|
||||
* drm_i915_file_private::proto_context_xa must be guarded by
|
||||
* drm_i915_file_private::proto_context_lock. The exception is when a
|
||||
* proto-context has not yet been exposed such as when handling
|
||||
* CONTEXT_CREATE_SET_PARAM during GEM_CONTEXT_CREATE.
|
||||
*/
|
||||
struct i915_gem_proto_context {
|
||||
/** @vm: See &i915_gem_context.vm */
|
||||
struct i915_address_space *vm;
|
||||
|
||||
/** @user_flags: See &i915_gem_context.user_flags */
|
||||
unsigned long user_flags;
|
||||
|
||||
/** @sched: See &i915_gem_context.sched */
|
||||
struct i915_sched_attr sched;
|
||||
|
||||
/** @num_user_engines: Number of user-specified engines or -1 */
|
||||
int num_user_engines;
|
||||
|
||||
/** @user_engines: User-specified engines */
|
||||
struct i915_gem_proto_engine *user_engines;
|
||||
|
||||
/** @legacy_rcs_sseu: Client-set SSEU parameters for the legacy RCS */
|
||||
struct intel_sseu legacy_rcs_sseu;
|
||||
|
||||
/** @single_timeline: See &i915_gem_context.syncobj */
|
||||
bool single_timeline;
|
||||
};
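The lookup dance described in the kernel-doc comment above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the two fixed-size arrays stand in for `context_xa` and `proto_context_xa`, every type and function name below (`file_priv`, `ctx_create`, `ctx_set_vm`, `ctx_lookup`) is hypothetical, and the locking that `proto_context_lock` provides in the driver is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the real i915 types. */
#define MAX_IDS 8

struct proto_ctx { int vm_id; };	/* mutable creation parameters */
struct real_ctx  { int vm_id; };	/* treated as immutable once created */

struct file_priv {
	struct real_ctx  *context_slots[MAX_IDS];	/* plays the role of context_xa */
	struct proto_ctx *proto_slots[MAX_IDS];		/* plays the role of proto_context_xa */
};

/* Create analogue: only the proto-context exists; the real slot stays NULL. */
static int ctx_create(struct file_priv *fp, unsigned int id)
{
	struct proto_ctx *pc;

	if (id >= MAX_IDS || fp->context_slots[id] || fp->proto_slots[id])
		return -1;

	pc = calloc(1, sizeof(*pc));
	if (!pc)
		return -1;

	fp->proto_slots[id] = pc;
	return 0;
}

/* Set-param analogue: the VM may only change while the ID is still a proto-context. */
static int ctx_set_vm(struct file_priv *fp, unsigned int id, int vm_id)
{
	if (id >= MAX_IDS || !fp->proto_slots[id])
		return -1;

	fp->proto_slots[id]->vm_id = vm_id;
	return 0;
}

/* Lookup: return the real context if present, else finalize the proto-context. */
static struct real_ctx *ctx_lookup(struct file_priv *fp, unsigned int id)
{
	struct proto_ctx *pc;
	struct real_ctx *ctx;

	if (id >= MAX_IDS)
		return NULL;
	if (fp->context_slots[id])
		return fp->context_slots[id];

	pc = fp->proto_slots[id];
	if (!pc)
		return NULL;

	ctx = malloc(sizeof(*ctx));
	if (!ctx)
		return NULL;

	ctx->vm_id = pc->vm_id;		/* copy the gathered parameters */
	fp->context_slots[id] = ctx;	/* publish the real context */
	fp->proto_slots[id] = NULL;	/* kill the proto-context */
	free(pc);
	return ctx;
}
```

From the caller's point of view the ID never changes: `ctx_create(&fp, 3)` followed by `ctx_set_vm(&fp, 3, 42)` and `ctx_lookup(&fp, 3)` yields a context with `vm_id == 42`, and any later `ctx_set_vm` on the same ID fails because the proto-context is gone.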
|
||||
|
||||
/**
|
||||
* struct i915_gem_context - client state
|
||||
*
|
||||
|
@ -53,10 +207,10 @@ struct i915_gem_engines_iter {
|
|||
* logical hardware state for a particular client.
|
||||
*/
|
||||
struct i915_gem_context {
|
||||
/** i915: i915 device backpointer */
|
||||
/** @i915: i915 device backpointer */
|
||||
struct drm_i915_private *i915;
|
||||
|
||||
/** file_priv: owning file descriptor */
|
||||
/** @file_priv: owning file descriptor */
|
||||
struct drm_i915_file_private *file_priv;
|
||||
|
||||
/**
|
||||
|
@ -81,9 +235,23 @@ struct i915_gem_context {
|
|||
* CONTEXT_USER_ENGINES flag is set).
|
||||
*/
|
||||
struct i915_gem_engines __rcu *engines;
|
||||
struct mutex engines_mutex; /* guards writes to engines */
|
||||
|
||||
struct intel_timeline *timeline;
|
||||
/** @engines_mutex: guards writes to engines */
|
||||
struct mutex engines_mutex;
|
||||
|
||||
/**
|
||||
* @syncobj: Shared timeline syncobj
|
||||
*
|
||||
* When the SINGLE_TIMELINE flag is set on context creation, we
|
||||
* emulate a single timeline across all engines using this syncobj.
|
||||
* For every execbuffer2 call, this syncobj is used as both an in-
|
||||
* and out-fence. Unlike the real intel_timeline, this doesn't
|
||||
* provide perfect atomic in-order guarantees if the client races
|
||||
* with itself by calling execbuffer2 twice concurrently. However,
|
||||
* if userspace races with itself, that's not likely to yield well-
|
||||
* defined results anyway so we choose to not care.
|
||||
*/
|
||||
struct drm_syncobj *syncobj;
|
||||
|
||||
/**
|
||||
* @vm: unique address space (GTT)
|
||||
|
@ -106,7 +274,7 @@ struct i915_gem_context {
|
|||
*/
|
||||
struct pid *pid;
|
||||
|
||||
/** link: place with &drm_i915_private.context_list */
|
||||
/** @link: place with &drm_i915_private.context_list */
|
||||
struct list_head link;
|
||||
|
||||
/**
|
||||
|
@ -129,7 +297,6 @@ struct i915_gem_context {
|
|||
* @user_flags: small set of booleans controlled by the user
|
||||
*/
|
||||
unsigned long user_flags;
|
||||
#define UCONTEXT_NO_ZEROMAP 0
|
||||
#define UCONTEXT_NO_ERROR_CAPTURE 1
|
||||
#define UCONTEXT_BANNABLE 2
|
||||
#define UCONTEXT_RECOVERABLE 3
|
||||
|
@ -142,11 +309,13 @@ struct i915_gem_context {
|
|||
#define CONTEXT_CLOSED 0
|
||||
#define CONTEXT_USER_ENGINES 1
|
||||
|
||||
/** @mutex: guards everything that isn't engines or handles_vma */
|
||||
struct mutex mutex;
|
||||
|
||||
/** @sched: scheduler parameters */
|
||||
struct i915_sched_attr sched;
|
||||
|
||||
/** guilty_count: How many times this context has caused a GPU hang. */
|
||||
/** @guilty_count: How many times this context has caused a GPU hang. */
|
||||
atomic_t guilty_count;
|
||||
/**
|
||||
* @active_count: How many times this context was active during a GPU
|
||||
|
@ -154,25 +323,23 @@ struct i915_gem_context {
|
|||
*/
|
||||
atomic_t active_count;
|
||||
|
||||
struct {
|
||||
u64 timeout_us;
|
||||
} watchdog;
|
||||
|
||||
/**
|
||||
* @hang_timestamp: The last time(s) this context caused a GPU hang
|
||||
*/
|
||||
unsigned long hang_timestamp[2];
|
||||
#define CONTEXT_FAST_HANG_JIFFIES (120 * HZ) /* 3 hangs within 120s? Banned! */
|
||||
|
||||
/** remap_slice: Bitmask of cache lines that need remapping */
|
||||
/** @remap_slice: Bitmask of cache lines that need remapping */
|
||||
u8 remap_slice;
|
||||
|
||||
/**
|
||||
* handles_vma: rbtree to look up our context specific obj/vma for
|
||||
* @handles_vma: rbtree to look up our context specific obj/vma for
|
||||
* the user handle. (user handles are per fd, but the binding is
|
||||
* per vm, which may be one per context or shared with the global GTT)
|
||||
*/
|
||||
struct radix_tree_root handles_vma;
|
||||
|
||||
/** @lut_mutex: Locks handles_vma */
|
||||
struct mutex lut_mutex;
|
||||
|
||||
/**
|
||||
|
@ -184,8 +351,11 @@ struct i915_gem_context {
|
|||
*/
|
||||
char name[TASK_COMM_LEN + 8];
|
||||
|
||||
/** @stale: tracks stale engines to be destroyed */
|
||||
struct {
|
||||
/** @lock: guards engines */
|
||||
spinlock_t lock;
|
||||
/** @engines: list of stale engines */
|
||||
struct list_head engines;
|
||||
} stale;
|
||||
};
|
||||
|
|
|
@ -11,13 +11,14 @@
|
|||
#include "i915_trace.h"
|
||||
#include "i915_user_extensions.h"
|
||||
|
||||
static u32 object_max_page_size(struct drm_i915_gem_object *obj)
|
||||
static u32 object_max_page_size(struct intel_memory_region **placements,
|
||||
unsigned int n_placements)
|
||||
{
|
||||
u32 max_page_size = 0;
|
||||
int i;
|
||||
|
||||
for (i = 0; i < obj->mm.n_placements; i++) {
|
||||
struct intel_memory_region *mr = obj->mm.placements[i];
|
||||
for (i = 0; i < n_placements; i++) {
|
||||
struct intel_memory_region *mr = placements[i];
|
||||
|
||||
GEM_BUG_ON(!is_power_of_2(mr->min_page_size));
|
||||
max_page_size = max_t(u32, max_page_size, mr->min_page_size);
|
||||
|
@ -27,10 +28,13 @@ static u32 object_max_page_size(struct drm_i915_gem_object *obj)
|
|||
return max_page_size;
|
||||
}
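The size rounding that `object_max_page_size()` feeds into can be illustrated outside the kernel. The sketch below is an assumption-laden stand-in, not the driver code: `struct region`, `max_min_page_size()` and `round_up_pow2()` are hypothetical names, and the page sizes used in the usage note are example values only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory region; only the field we need. */
struct region {
	uint32_t min_page_size;	/* assumed to be a power of two */
};

/* Largest minimum page size across all candidate placements. */
static uint32_t max_min_page_size(struct region **placements, size_t n)
{
	uint32_t max = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (placements[i]->min_page_size > max)
			max = placements[i]->min_page_size;

	return max;
}

/* Round size up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t size, uint32_t align)
{
	return (size + align - 1) & ~((uint64_t)align - 1);
}
```

With two placements whose minimum page sizes are 4096 and 65536, a request for 5000 bytes is rounded up to 65536, so the object fits whichever region it ends up in. A request of 0 stays 0, which is the case the caller rejects with `-EINVAL`.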
|
||||
|
||||
static void object_set_placements(struct drm_i915_gem_object *obj,
|
||||
struct intel_memory_region **placements,
|
||||
unsigned int n_placements)
|
||||
static int object_set_placements(struct drm_i915_gem_object *obj,
|
||||
struct intel_memory_region **placements,
|
||||
unsigned int n_placements)
|
||||
{
|
||||
struct intel_memory_region **arr;
|
||||
unsigned int i;
|
||||
|
||||
GEM_BUG_ON(!n_placements);
|
||||
|
||||
/*
|
||||
|
@ -44,9 +48,20 @@ static void object_set_placements(struct drm_i915_gem_object *obj,
|
|||
obj->mm.placements = &i915->mm.regions[mr->id];
|
||||
obj->mm.n_placements = 1;
|
||||
} else {
|
||||
obj->mm.placements = placements;
|
||||
arr = kmalloc_array(n_placements,
|
||||
sizeof(struct intel_memory_region *),
|
||||
GFP_KERNEL);
|
||||
if (!arr)
|
||||
return -ENOMEM;
|
||||
|
||||
for (i = 0; i < n_placements; i++)
|
||||
arr[i] = placements[i];
|
||||
|
||||
obj->mm.placements = arr;
|
||||
obj->mm.n_placements = n_placements;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int i915_gem_publish(struct drm_i915_gem_object *obj,
|
||||
|
@ -67,22 +82,46 @@ static int i915_gem_publish(struct drm_i915_gem_object *obj,
|
|||
return 0;
|
||||
}
|
||||
|
||||
static int
|
||||
i915_gem_setup(struct drm_i915_gem_object *obj, u64 size)
|
||||
/**
|
||||
* Creates a new object using the same path as DRM_I915_GEM_CREATE_EXT
|
||||
* @i915: i915 private
|
||||
* @size: size of the buffer, in bytes
|
||||
* @placements: possible placement regions, in priority order
|
||||
* @n_placements: number of possible placement regions
|
||||
*
|
||||
* This function is exposed primarily for selftests and does very little
|
||||
* error checking. It is assumed that the set of placement regions has
|
||||
* already been verified to be valid.
|
||||
*/
|
||||
struct drm_i915_gem_object *
|
||||
__i915_gem_object_create_user(struct drm_i915_private *i915, u64 size,
|
||||
struct intel_memory_region **placements,
|
||||
unsigned int n_placements)
|
||||
{
|
||||
struct intel_memory_region *mr = obj->mm.placements[0];
|
||||
struct intel_memory_region *mr = placements[0];
|
||||
struct drm_i915_gem_object *obj;
|
||||
unsigned int flags;
|
||||
int ret;
|
||||
|
||||
size = round_up(size, object_max_page_size(obj));
|
||||
i915_gem_flush_free_objects(i915);
|
||||
|
||||
size = round_up(size, object_max_page_size(placements, n_placements));
|
||||
if (size == 0)
|
||||
return -EINVAL;
|
||||
return ERR_PTR(-EINVAL);
|
||||
|
||||
/* For most of the ABI (e.g. mmap) we think in system pages */
|
||||
GEM_BUG_ON(!IS_ALIGNED(size, PAGE_SIZE));
|
||||
|
||||
if (i915_gem_object_size_2big(size))
|
||||
return -E2BIG;
|
||||
return ERR_PTR(-E2BIG);
|
||||
|
||||
obj = i915_gem_object_alloc();
|
||||
if (!obj)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
ret = object_set_placements(obj, placements, n_placements);
|
||||
if (ret)
|
||||
goto object_free;
|
||||
|
||||
/*
|
||||
* I915_BO_ALLOC_USER will make sure the object is cleared before
|
||||
|
@ -90,14 +129,20 @@ i915_gem_setup(struct drm_i915_gem_object *obj, u64 size)
|
|||
*/
|
||||
flags = I915_BO_ALLOC_USER;
|
||||
|
||||
ret = mr->ops->init_object(mr, obj, size, flags);
|
||||
ret = mr->ops->init_object(mr, obj, size, 0, flags);
|
||||
if (ret)
|
||||
return ret;
|
||||
goto object_free;
|
||||
|
||||
GEM_BUG_ON(size != obj->base.size);
|
||||
|
||||
trace_i915_gem_object_create(obj);
|
||||
return 0;
|
||||
return obj;
|
||||
|
||||
object_free:
|
||||
if (obj->mm.n_placements > 1)
|
||||
kfree(obj->mm.placements);
|
||||
i915_gem_object_free(obj);
|
||||
return ERR_PTR(ret);
|
||||
}
|
||||
|
||||
int
|
||||
|
@ -110,7 +155,6 @@ i915_gem_dumb_create(struct drm_file *file,
|
|||
enum intel_memory_type mem_type;
|
||||
int cpp = DIV_ROUND_UP(args->bpp, 8);
|
||||
u32 format;
|
||||
int ret;
|
||||
|
||||
switch (cpp) {
|
||||
case 1:
|
||||
|
@ -143,22 +187,13 @@ i915_gem_dumb_create(struct drm_file *file,
|
|||
if (HAS_LMEM(to_i915(dev)))
|
||||
mem_type = INTEL_MEMORY_LOCAL;
|
||||
|
||||
obj = i915_gem_object_alloc();
|
||||
if (!obj)
|
||||
return -ENOMEM;
|
||||
|
||||
mr = intel_memory_region_by_type(to_i915(dev), mem_type);
|
||||
object_set_placements(obj, &mr, 1);
|
||||
|
||||
ret = i915_gem_setup(obj, args->size);
|
||||
if (ret)
|
||||
goto object_free;
|
||||
obj = __i915_gem_object_create_user(to_i915(dev), args->size, &mr, 1);
|
||||
if (IS_ERR(obj))
|
||||
return PTR_ERR(obj);
|
||||
|
||||
return i915_gem_publish(obj, file, &args->size, &args->handle);
|
||||
|
||||
object_free:
|
||||
i915_gem_object_free(obj);
|
||||
return ret;
|
||||
}
|
||||
|
||||
/**
|
||||
|
@ -175,31 +210,20 @@ i915_gem_create_ioctl(struct drm_device *dev, void *data,
|
|||
struct drm_i915_gem_create *args = data;
|
||||
struct drm_i915_gem_object *obj;
|
||||
struct intel_memory_region *mr;
|
||||
int ret;
|
||||
|
||||
i915_gem_flush_free_objects(i915);
|
||||
|
||||
obj = i915_gem_object_alloc();
|
||||
if (!obj)
|
||||
return -ENOMEM;
|
||||
|
||||
mr = intel_memory_region_by_type(i915, INTEL_MEMORY_SYSTEM);
|
||||
object_set_placements(obj, &mr, 1);
|
||||
|
||||
ret = i915_gem_setup(obj, args->size);
|
||||
if (ret)
|
||||
goto object_free;
|
||||
obj = __i915_gem_object_create_user(i915, args->size, &mr, 1);
|
||||
if (IS_ERR(obj))
|
||||
return PTR_ERR(obj);
|
||||
|
||||
return i915_gem_publish(obj, file, &args->size, &args->handle);
|
||||
|
||||
object_free:
|
||||
i915_gem_object_free(obj);
|
||||
return ret;
|
||||
}
|
||||
|
||||
struct create_ext {
|
||||
struct drm_i915_private *i915;
|
||||
struct drm_i915_gem_object *vanilla_object;
|
||||
struct intel_memory_region *placements[INTEL_REGION_UNKNOWN];
|
||||
unsigned int n_placements;
|
||||
};
|
||||
|
||||
static void repr_placements(char *buf, size_t size,
|
||||
|
@ -230,8 +254,7 @@ static int set_placements(struct drm_i915_gem_create_ext_memory_regions *args,
|
|||
struct drm_i915_private *i915 = ext_data->i915;
|
||||
struct drm_i915_gem_memory_class_instance __user *uregions =
|
||||
u64_to_user_ptr(args->regions);
|
||||
struct drm_i915_gem_object *obj = ext_data->vanilla_object;
|
||||
struct intel_memory_region **placements;
|
||||
struct intel_memory_region *placements[INTEL_REGION_UNKNOWN];
|
||||
u32 mask;
|
||||
int i, ret = 0;
|
||||
|
||||
|
@ -245,6 +268,8 @@ static int set_placements(struct drm_i915_gem_create_ext_memory_regions *args,
|
|||
ret = -EINVAL;
|
||||
}
|
||||
|
||||
BUILD_BUG_ON(ARRAY_SIZE(i915->mm.regions) != ARRAY_SIZE(placements));
|
||||
BUILD_BUG_ON(ARRAY_SIZE(ext_data->placements) != ARRAY_SIZE(placements));
|
||||
if (args->num_regions > ARRAY_SIZE(i915->mm.regions)) {
|
||||
drm_dbg(&i915->drm, "num_regions is too large\n");
|
||||
ret = -EINVAL;
|
||||
|
@ -253,21 +278,13 @@ static int set_placements(struct drm_i915_gem_create_ext_memory_regions *args,
|
|||
if (ret)
|
||||
return ret;
|
||||
|
||||
placements = kmalloc_array(args->num_regions,
|
||||
sizeof(struct intel_memory_region *),
|
||||
GFP_KERNEL);
|
||||
if (!placements)
|
||||
return -ENOMEM;
|
||||
|
||||
mask = 0;
|
||||
for (i = 0; i < args->num_regions; i++) {
|
||||
struct drm_i915_gem_memory_class_instance region;
|
||||
struct intel_memory_region *mr;
|
||||
|
||||
if (copy_from_user(&region, uregions, sizeof(region))) {
|
||||
ret = -EFAULT;
|
||||
goto out_free;
|
||||
}
|
||||
if (copy_from_user(&region, uregions, sizeof(region)))
|
||||
return -EFAULT;
|
||||
|
||||
mr = intel_memory_region_lookup(i915,
|
||||
region.memory_class,
|
||||
|
@ -293,14 +310,14 @@ static int set_placements(struct drm_i915_gem_create_ext_memory_regions *args,
|
|||
++uregions;
|
||||
}
|
||||
|
||||
if (obj->mm.placements) {
|
||||
if (ext_data->n_placements) {
|
||||
ret = -EINVAL;
|
||||
goto out_dump;
|
||||
}
|
||||
|
||||
object_set_placements(obj, placements, args->num_regions);
|
||||
if (args->num_regions == 1)
|
||||
kfree(placements);
|
||||
ext_data->n_placements = args->num_regions;
|
||||
for (i = 0; i < args->num_regions; i++)
|
||||
ext_data->placements[i] = placements[i];
|
||||
|
||||
return 0;
|
||||
|
||||
|
@ -308,11 +325,11 @@ out_dump:
|
|||
if (1) {
|
||||
char buf[256];
|
||||
|
||||
if (obj->mm.placements) {
|
||||
if (ext_data->n_placements) {
|
||||
repr_placements(buf,
|
||||
sizeof(buf),
|
||||
obj->mm.placements,
|
||||
obj->mm.n_placements);
|
||||
ext_data->placements,
|
||||
ext_data->n_placements);
|
||||
drm_dbg(&i915->drm,
|
||||
"Placements were already set in previous EXT. Existing placements: %s\n",
|
||||
buf);
|
||||
|
@ -322,8 +339,6 @@ out_dump:
|
|||
drm_dbg(&i915->drm, "New placements(so far validated): %s\n", buf);
|
||||
}
|
||||
|
||||
out_free:
|
||||
kfree(placements);
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
@ -358,44 +373,30 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
|
|||
struct drm_i915_private *i915 = to_i915(dev);
|
||||
struct drm_i915_gem_create_ext *args = data;
|
||||
struct create_ext ext_data = { .i915 = i915 };
|
||||
struct intel_memory_region **placements_ext;
|
||||
struct drm_i915_gem_object *obj;
|
||||
int ret;
|
||||
|
||||
if (args->flags)
|
||||
return -EINVAL;
|
||||
|
||||
i915_gem_flush_free_objects(i915);
|
||||
|
||||
obj = i915_gem_object_alloc();
|
||||
if (!obj)
|
||||
return -ENOMEM;
|
||||
|
||||
ext_data.vanilla_object = obj;
|
||||
ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
|
||||
create_extensions,
|
||||
ARRAY_SIZE(create_extensions),
|
||||
&ext_data);
|
||||
placements_ext = obj->mm.placements;
|
||||
if (ret)
|
||||
goto object_free;
|
||||
return ret;
|
||||
|
||||
if (!placements_ext) {
|
||||
struct intel_memory_region *mr =
|
||||
if (!ext_data.n_placements) {
|
||||
ext_data.placements[0] =
|
||||
intel_memory_region_by_type(i915, INTEL_MEMORY_SYSTEM);
|
||||
|
||||
object_set_placements(obj, &mr, 1);
|
||||
ext_data.n_placements = 1;
|
||||
}
|
||||
|
||||
ret = i915_gem_setup(obj, args->size);
|
||||
if (ret)
|
||||
goto object_free;
|
||||
obj = __i915_gem_object_create_user(i915, args->size,
|
||||
ext_data.placements,
|
||||
ext_data.n_placements);
|
||||
if (IS_ERR(obj))
|
||||
return PTR_ERR(obj);
|
||||
|
||||
return i915_gem_publish(obj, file, &args->size, &args->handle);
|
||||
|
||||
object_free:
|
||||
if (obj->mm.n_placements > 1)
|
||||
kfree(placements_ext);
|
||||
i915_gem_object_free(obj);
|
||||
return ret;
|
||||
}
|
||||
|
|
|
@ -12,6 +12,8 @@
|
|||
#include "i915_gem_object.h"
|
||||
#include "i915_scatterlist.h"
|
||||
|
||||
I915_SELFTEST_DECLARE(static bool force_different_devices;)
|
||||
|
||||
static struct drm_i915_gem_object *dma_buf_to_obj(struct dma_buf *buf)
|
||||
{
|
||||
return to_intel_bo(buf->priv);
|
||||
|
@ -25,15 +27,11 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme
|
|||
struct scatterlist *src, *dst;
|
||||
int ret, i;
|
||||
|
||||
ret = i915_gem_object_pin_pages_unlocked(obj);
|
||||
if (ret)
|
||||
goto err;
|
||||
|
||||
/* Copy sg so that we make an independent mapping */
|
||||
st = kmalloc(sizeof(struct sg_table), GFP_KERNEL);
|
||||
if (st == NULL) {
|
||||
ret = -ENOMEM;
|
||||
goto err_unpin_pages;
|
||||
goto err;
|
||||
}
|
||||
|
||||
ret = sg_alloc_table(st, obj->mm.pages->nents, GFP_KERNEL);
|
||||
|
@ -58,8 +56,6 @@ err_free_sg:
|
|||
sg_free_table(st);
|
||||
err_free:
|
||||
kfree(st);
|
||||
err_unpin_pages:
|
||||
i915_gem_object_unpin_pages(obj);
|
||||
err:
|
||||
return ERR_PTR(ret);
|
||||
}
|
||||
|
@ -68,13 +64,9 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,
|
|||
struct sg_table *sg,
|
||||
enum dma_data_direction dir)
|
||||
{
|
||||
struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
|
||||
|
||||
dma_unmap_sgtable(attachment->dev, sg, dir, DMA_ATTR_SKIP_CPU_SYNC);
|
||||
sg_free_table(sg);
|
||||
kfree(sg);
|
||||
|
||||
i915_gem_object_unpin_pages(obj);
|
||||
}
|
||||
|
||||
static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct dma_buf_map *map)
|
||||
|
@ -168,7 +160,46 @@ retry:
|
|||
return err;
|
||||
}
|
||||
|
||||
static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf,
|
||||
struct dma_buf_attachment *attach)
|
||||
{
|
||||
struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
|
||||
struct i915_gem_ww_ctx ww;
|
||||
int err;
|
||||
|
||||
if (!i915_gem_object_can_migrate(obj, INTEL_REGION_SMEM))
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
for_i915_gem_ww(&ww, err, true) {
|
||||
err = i915_gem_object_lock(obj, &ww);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
err = i915_gem_object_migrate(obj, &ww, INTEL_REGION_SMEM);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
err = i915_gem_object_wait_migration(obj, 0);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
err = i915_gem_object_pin_pages(obj);
|
||||
}
|
||||
|
||||
return err;
|
||||
}
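The attach path above relies on the usual wound/wait retry pattern: when a step reports `-EDEADLK`, back off and run the whole sequence again. A minimal userspace sketch of that control flow follows, assuming a simulated lock step that loses arbitration a fixed number of times. `struct ww_ctx`, `try_lock_and_migrate()`, `ww_backoff()` and `attach_with_retry()` are all hypothetical names, not the i915 helpers.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical acquire context; it only counts how often we backed off. */
struct ww_ctx {
	int backoffs;
};

/* Simulated lock + migrate step: fails with -EDEADLK while still contended. */
static int try_lock_and_migrate(struct ww_ctx *ww, int contended_rounds)
{
	if (ww->backoffs < contended_rounds)
		return -EDEADLK;
	return 0;
}

/* Simulated backoff: a real one would drop held locks and wait for the contended lock. */
static int ww_backoff(struct ww_ctx *ww)
{
	ww->backoffs++;
	return 0;
}

/* Run the sequence, restarting from the top on every -EDEADLK. */
static int attach_with_retry(struct ww_ctx *ww, int contended_rounds)
{
	int err;

	for (;;) {
		err = try_lock_and_migrate(ww, contended_rounds);
		if (err != -EDEADLK)
			break;		/* success or a real error */

		err = ww_backoff(ww);
		if (err)
			break;		/* backoff itself failed */
	}

	return err;
}
```

Any error other than `-EDEADLK` leaves the loop immediately, which matches how each `continue` in the attach function hands the error to the loop macro instead of retrying blindly.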
|
||||
|
||||
static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf,
|
||||
struct dma_buf_attachment *attach)
|
||||
{
|
||||
struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
|
||||
|
||||
i915_gem_object_unpin_pages(obj);
|
||||
}
|
||||
|
||||
static const struct dma_buf_ops i915_dmabuf_ops = {
|
||||
.attach = i915_gem_dmabuf_attach,
|
||||
.detach = i915_gem_dmabuf_detach,
|
||||
.map_dma_buf = i915_gem_map_dma_buf,
|
||||
.unmap_dma_buf = i915_gem_unmap_dma_buf,
|
||||
.release = drm_gem_dmabuf_release,
|
||||
@@ -204,6 +235,8 @@ static int i915_gem_object_get_pages_dmabuf(struct drm_i915_gem_object *obj)
 	struct sg_table *pages;
 	unsigned int sg_page_sizes;
 
+	assert_object_held(obj);
+
 	pages = dma_buf_map_attachment(obj->base.import_attach,
 				       DMA_BIDIRECTIONAL);
 	if (IS_ERR(pages))
@@ -241,7 +274,8 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 	if (dma_buf->ops == &i915_dmabuf_ops) {
 		obj = dma_buf_to_obj(dma_buf);
 		/* is it from our device? */
-		if (obj->base.dev == dev) {
+		if (obj->base.dev == dev &&
+		    !I915_SELFTEST_ONLY(force_different_devices)) {
 			/*
 			 * Importing dmabuf exported from out own gem increases
 			 * refcount on gem itself instead of f_count of dmabuf.
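The new `i915_gem_dmabuf_attach()` runs lock, migrate, wait and pin inside a `for_i915_gem_ww()` block and uses `continue` to bail out of the current attempt. The sketch below is a user-space model of that control flow only, under the assumption that the whole sequence is restarted when a step reports `-EDEADLK` and that any other error ends the loop. `struct fake_lock`, `fake_lock_acquire()` and `attach_model()` are hypothetical names invented for illustration; none of them are driver API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical lock that reports a conflict a fixed number of times. */
struct fake_lock {
	int conflicts_left;
};

static int fake_lock_acquire(struct fake_lock *l)
{
	if (l->conflicts_left > 0) {
		l->conflicts_left--;
		return -EDEADLK;
	}
	return 0;
}

/*
 * Model of the attach sequence: lock -> migrate -> pin.
 * A step that fails uses `continue`, which jumps to the loop condition;
 * the sequence restarts only when the error is -EDEADLK.
 */
static int attach_model(struct fake_lock *l, int migrate_err, int *attempts)
{
	int err;

	do {
		(*attempts)++;

		err = fake_lock_acquire(l);
		if (err)
			continue;

		err = migrate_err;	/* stand-in for the migrate step */
		if (err)
			continue;

		err = 0;		/* stand-in for pinning the pages */
	} while (err == -EDEADLK);

	return err;
}
```

In this model two lock conflicts cost two extra passes, while a non-deadlock failure such as `-ENOMEM` is returned after a single pass.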
@@ -268,6 +268,9 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
 	struct drm_i915_gem_object *obj;
 	int err = 0;
 
+	if (IS_DGFX(to_i915(dev)))
+		return -ENODEV;
+
 	rcu_read_lock();
 	obj = i915_gem_object_lookup_rcu(file, args->handle);
 	if (!obj) {
@@ -303,6 +306,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 	enum i915_cache_level level;
 	int ret = 0;
 
+	if (IS_DGFX(i915))
+		return -ENODEV;
+
 	switch (args->caching) {
 	case I915_CACHING_NONE:
 		level = I915_CACHE_NONE;
@@ -375,7 +381,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	struct i915_vma *vma;
 	int ret;
 
-	/* Frame buffer must be in LMEM (no migration yet) */
+	/* Frame buffer must be in LMEM */
 	if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj))
 		return ERR_PTR(-EINVAL);
 
@@ -484,6 +490,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	u32 write_domain = args->write_domain;
 	int err;
 
+	if (IS_DGFX(to_i915(dev)))
+		return -ENODEV;
+
 	/* Only handle setting domains to types used by the CPU. */
 	if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
 		return -EINVAL;
@@ -277,18 +277,9 @@ struct i915_execbuffer {
 		bool has_llc : 1;
 		bool has_fence : 1;
 		bool needs_unfenced : 1;
-
-		struct i915_request *rq;
-		u32 *rq_cmd;
-		unsigned int rq_size;
-		struct intel_gt_buffer_pool_node *pool;
 	} reloc_cache;
 
-	struct intel_gt_buffer_pool_node *reloc_pool; /** relocation pool for -EDEADLK handling */
-	struct intel_context *reloc_context;
-
 	u64 invalid_flags; /** Set of execobj.flags that are invalid */
-	u32 context_flags; /** Set of execobj.flags to insert from the ctx */
 
 	u64 batch_len; /** Length of batch within object */
 	u32 batch_start_offset; /** Location within object of batch */
@@ -539,9 +530,6 @@ eb_validate_vma(struct i915_execbuffer *eb,
 		entry->flags |= EXEC_OBJECT_NEEDS_GTT | __EXEC_OBJECT_NEEDS_MAP;
 	}
 
-	if (!(entry->flags & EXEC_OBJECT_PINNED))
-		entry->flags |= eb->context_flags;
-
 	return 0;
 }
 
@@ -741,17 +729,13 @@ static int eb_select_context(struct i915_execbuffer *eb)
 	struct i915_gem_context *ctx;
 
 	ctx = i915_gem_context_lookup(eb->file->driver_priv, eb->args->rsvd1);
-	if (unlikely(!ctx))
-		return -ENOENT;
+	if (unlikely(IS_ERR(ctx)))
+		return PTR_ERR(ctx);
 
 	eb->gem_context = ctx;
 	if (rcu_access_pointer(ctx->vm))
 		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
 
-	eb->context_flags = 0;
-	if (test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags))
-		eb->context_flags |= __EXEC_OBJECT_NEEDS_BIAS;
-
 	return 0;
 }
 
@@ -920,6 +904,23 @@ err:
 	return err;
 }
 
+static int eb_lock_vmas(struct i915_execbuffer *eb)
+{
+	unsigned int i;
+	int err;
+
+	for (i = 0; i < eb->buffer_count; i++) {
+		struct eb_vma *ev = &eb->vma[i];
+		struct i915_vma *vma = ev->vma;
+
+		err = i915_gem_object_lock(vma->obj, &eb->ww);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 static int eb_validate_vmas(struct i915_execbuffer *eb)
 {
 	unsigned int i;
@@ -927,15 +928,15 @@ static int eb_validate_vmas(struct i915_execbuffer *eb)
 
 	INIT_LIST_HEAD(&eb->unbound);
 
+	err = eb_lock_vmas(eb);
+	if (err)
+		return err;
+
 	for (i = 0; i < eb->buffer_count; i++) {
 		struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
 		struct eb_vma *ev = &eb->vma[i];
 		struct i915_vma *vma = ev->vma;
 
-		err = i915_gem_object_lock(vma->obj, &eb->ww);
-		if (err)
-			return err;
-
 		err = eb_pin_vma(eb, entry, ev);
 		if (err == -EDEADLK)
 			return err;
@@ -992,7 +993,7 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned long handle)
 	}
 }
 
-static void eb_release_vmas(struct i915_execbuffer *eb, bool final, bool release_userptr)
+static void eb_release_vmas(struct i915_execbuffer *eb, bool final)
 {
 	const unsigned int count = eb->buffer_count;
 	unsigned int i;
@@ -1006,11 +1007,6 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final, bool release
 
 		eb_unreserve_vma(ev);
 
-		if (release_userptr && ev->flags & __EXEC_OBJECT_USERPTR_INIT) {
-			ev->flags &= ~__EXEC_OBJECT_USERPTR_INIT;
-			i915_gem_object_userptr_submit_fini(vma->obj);
-		}
-
 		if (final)
 			i915_vma_put(vma);
 	}
@@ -1020,8 +1016,6 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final, bool release
 
 static void eb_destroy(const struct i915_execbuffer *eb)
 {
-	GEM_BUG_ON(eb->reloc_cache.rq);
-
 	if (eb->lut_size > 0)
 		kfree(eb->buckets);
 }
@@ -1033,14 +1027,6 @@ relocation_target(const struct drm_i915_gem_relocation_entry *reloc,
 	return gen8_canonical_addr((int)reloc->delta + target->node.start);
 }
 
-static void reloc_cache_clear(struct reloc_cache *cache)
-{
-	cache->rq = NULL;
-	cache->rq_cmd = NULL;
-	cache->pool = NULL;
-	cache->rq_size = 0;
-}
-
 static void reloc_cache_init(struct reloc_cache *cache,
 			     struct drm_i915_private *i915)
 {
@@ -1053,7 +1039,6 @@ static void reloc_cache_init(struct reloc_cache *cache,
 	cache->has_fence = cache->graphics_ver < 4;
 	cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
 	cache->node.flags = 0;
-	reloc_cache_clear(cache);
 }
 
 static inline void *unmask_page(unsigned long p)
@@ -1075,48 +1060,10 @@ static inline struct i915_ggtt *cache_to_ggtt(struct reloc_cache *cache)
 	return &i915->ggtt;
 }
 
-static void reloc_cache_put_pool(struct i915_execbuffer *eb, struct reloc_cache *cache)
-{
-	if (!cache->pool)
-		return;
-
-	/*
-	 * This is a bit nasty, normally we keep objects locked until the end
-	 * of execbuffer, but we already submit this, and have to unlock before
-	 * dropping the reference. Fortunately we can only hold 1 pool node at
-	 * a time, so this should be harmless.
-	 */
-	i915_gem_ww_unlock_single(cache->pool->obj);
-	intel_gt_buffer_pool_put(cache->pool);
-	cache->pool = NULL;
-}
-
-static void reloc_gpu_flush(struct i915_execbuffer *eb, struct reloc_cache *cache)
-{
-	struct drm_i915_gem_object *obj = cache->rq->batch->obj;
-
-	GEM_BUG_ON(cache->rq_size >= obj->base.size / sizeof(u32));
-	cache->rq_cmd[cache->rq_size] = MI_BATCH_BUFFER_END;
-
-	i915_gem_object_flush_map(obj);
-	i915_gem_object_unpin_map(obj);
-
-	intel_gt_chipset_flush(cache->rq->engine->gt);
-
-	i915_request_add(cache->rq);
-	reloc_cache_put_pool(eb, cache);
-	reloc_cache_clear(cache);
-
-	eb->reloc_pool = NULL;
-}
-
 static void reloc_cache_reset(struct reloc_cache *cache, struct i915_execbuffer *eb)
 {
 	void *vaddr;
 
-	if (cache->rq)
-		reloc_gpu_flush(eb, cache);
-
 	if (!cache->vaddr)
 		return;
 
@@ -1298,295 +1245,6 @@ static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
 		*addr = value;
 }
 
-static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma)
-{
-	struct drm_i915_gem_object *obj = vma->obj;
-	int err;
-
-	assert_vma_held(vma);
-
-	if (obj->cache_dirty & ~obj->cache_coherent)
-		i915_gem_clflush_object(obj, 0);
-	obj->write_domain = 0;
-
-	err = i915_request_await_object(rq, vma->obj, true);
-	if (err == 0)
-		err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
-
-	return err;
-}
-
-static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
-			     struct intel_engine_cs *engine,
-			     struct i915_vma *vma,
-			     unsigned int len)
-{
-	struct reloc_cache *cache = &eb->reloc_cache;
-	struct intel_gt_buffer_pool_node *pool = eb->reloc_pool;
-	struct i915_request *rq;
-	struct i915_vma *batch;
-	u32 *cmd;
-	int err;
-
-	if (!pool) {
-		pool = intel_gt_get_buffer_pool(engine->gt, PAGE_SIZE,
-						cache->has_llc ?
-						I915_MAP_WB :
-						I915_MAP_WC);
-		if (IS_ERR(pool))
-			return PTR_ERR(pool);
-	}
-	eb->reloc_pool = NULL;
-
-	err = i915_gem_object_lock(pool->obj, &eb->ww);
-	if (err)
-		goto err_pool;
-
-	cmd = i915_gem_object_pin_map(pool->obj, pool->type);
-	if (IS_ERR(cmd)) {
-		err = PTR_ERR(cmd);
-		goto err_pool;
-	}
-	intel_gt_buffer_pool_mark_used(pool);
-
-	memset32(cmd, 0, pool->obj->base.size / sizeof(u32));
-
-	batch = i915_vma_instance(pool->obj, vma->vm, NULL);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto err_unmap;
-	}
-
-	err = i915_vma_pin_ww(batch, &eb->ww, 0, 0, PIN_USER | PIN_NONBLOCK);
-	if (err)
-		goto err_unmap;
-
-	if (engine == eb->context->engine) {
-		rq = i915_request_create(eb->context);
-	} else {
-		struct intel_context *ce = eb->reloc_context;
-
-		if (!ce) {
-			ce = intel_context_create(engine);
-			if (IS_ERR(ce)) {
-				err = PTR_ERR(ce);
-				goto err_unpin;
-			}
-
-			i915_vm_put(ce->vm);
-			ce->vm = i915_vm_get(eb->context->vm);
-			eb->reloc_context = ce;
-		}
-
-		err = intel_context_pin_ww(ce, &eb->ww);
-		if (err)
-			goto err_unpin;
-
-		rq = i915_request_create(ce);
-		intel_context_unpin(ce);
-	}
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto err_unpin;
-	}
-
-	err = intel_gt_buffer_pool_mark_active(pool, rq);
-	if (err)
-		goto err_request;
-
-	err = reloc_move_to_gpu(rq, vma);
-	if (err)
-		goto err_request;
-
-	err = eb->engine->emit_bb_start(rq,
-					batch->node.start, PAGE_SIZE,
-					cache->graphics_ver > 5 ? 0 : I915_DISPATCH_SECURE);
-	if (err)
-		goto skip_request;
-
-	assert_vma_held(batch);
-	err = i915_request_await_object(rq, batch->obj, false);
-	if (err == 0)
-		err = i915_vma_move_to_active(batch, rq, 0);
-	if (err)
-		goto skip_request;
-
-	rq->batch = batch;
-	i915_vma_unpin(batch);
-
-	cache->rq = rq;
-	cache->rq_cmd = cmd;
-	cache->rq_size = 0;
-	cache->pool = pool;
-
-	/* Return with batch mapping (cmd) still pinned */
-	return 0;
-
-skip_request:
-	i915_request_set_error_once(rq, err);
-err_request:
-	i915_request_add(rq);
-err_unpin:
-	i915_vma_unpin(batch);
-err_unmap:
-	i915_gem_object_unpin_map(pool->obj);
-err_pool:
-	eb->reloc_pool = pool;
-	return err;
-}
-
-static bool reloc_can_use_engine(const struct intel_engine_cs *engine)
-{
-	return engine->class != VIDEO_DECODE_CLASS || GRAPHICS_VER(engine->i915) != 6;
-}
-
-static u32 *reloc_gpu(struct i915_execbuffer *eb,
-		      struct i915_vma *vma,
-		      unsigned int len)
-{
-	struct reloc_cache *cache = &eb->reloc_cache;
-	u32 *cmd;
-
-	if (cache->rq_size > PAGE_SIZE/sizeof(u32) - (len + 1))
-		reloc_gpu_flush(eb, cache);
-
-	if (unlikely(!cache->rq)) {
-		int err;
-		struct intel_engine_cs *engine = eb->engine;
-
-		/* If we need to copy for the cmdparser, we will stall anyway */
-		if (eb_use_cmdparser(eb))
-			return ERR_PTR(-EWOULDBLOCK);
-
-		if (!reloc_can_use_engine(engine)) {
-			engine = engine->gt->engine_class[COPY_ENGINE_CLASS][0];
-			if (!engine)
-				return ERR_PTR(-ENODEV);
-		}
-
-		err = __reloc_gpu_alloc(eb, engine, vma, len);
-		if (unlikely(err))
-			return ERR_PTR(err);
-	}
-
-	cmd = cache->rq_cmd + cache->rq_size;
-	cache->rq_size += len;
-
-	return cmd;
-}
-
-static inline bool use_reloc_gpu(struct i915_vma *vma)
-{
-	if (DBG_FORCE_RELOC == FORCE_GPU_RELOC)
-		return true;
-
-	if (DBG_FORCE_RELOC)
-		return false;
-
-	return !dma_resv_test_signaled(vma->resv, true);
-}
-
-static unsigned long vma_phys_addr(struct i915_vma *vma, u32 offset)
-{
-	struct page *page;
-	unsigned long addr;
-
-	GEM_BUG_ON(vma->pages != vma->obj->mm.pages);
-
-	page = i915_gem_object_get_page(vma->obj, offset >> PAGE_SHIFT);
-	addr = PFN_PHYS(page_to_pfn(page));
-	GEM_BUG_ON(overflows_type(addr, u32)); /* expected dma32 */
-
-	return addr + offset_in_page(offset);
-}
-
-static int __reloc_entry_gpu(struct i915_execbuffer *eb,
-			     struct i915_vma *vma,
-			     u64 offset,
-			     u64 target_addr)
-{
-	const unsigned int ver = eb->reloc_cache.graphics_ver;
-	unsigned int len;
-	u32 *batch;
-	u64 addr;
-
-	if (ver >= 8)
-		len = offset & 7 ? 8 : 5;
-	else if (ver >= 4)
-		len = 4;
-	else
-		len = 3;
-
-	batch = reloc_gpu(eb, vma, len);
-	if (batch == ERR_PTR(-EDEADLK))
-		return -EDEADLK;
-	else if (IS_ERR(batch))
-		return false;
-
-	addr = gen8_canonical_addr(vma->node.start + offset);
-	if (ver >= 8) {
-		if (offset & 7) {
-			*batch++ = MI_STORE_DWORD_IMM_GEN4;
-			*batch++ = lower_32_bits(addr);
-			*batch++ = upper_32_bits(addr);
-			*batch++ = lower_32_bits(target_addr);
-
-			addr = gen8_canonical_addr(addr + 4);
-
-			*batch++ = MI_STORE_DWORD_IMM_GEN4;
-			*batch++ = lower_32_bits(addr);
-			*batch++ = upper_32_bits(addr);
-			*batch++ = upper_32_bits(target_addr);
-		} else {
-			*batch++ = (MI_STORE_DWORD_IMM_GEN4 | (1 << 21)) + 1;
-			*batch++ = lower_32_bits(addr);
-			*batch++ = upper_32_bits(addr);
-			*batch++ = lower_32_bits(target_addr);
-			*batch++ = upper_32_bits(target_addr);
-		}
-	} else if (ver >= 6) {
-		*batch++ = MI_STORE_DWORD_IMM_GEN4;
-		*batch++ = 0;
-		*batch++ = addr;
-		*batch++ = target_addr;
-	} else if (IS_I965G(eb->i915)) {
-		*batch++ = MI_STORE_DWORD_IMM_GEN4;
-		*batch++ = 0;
-		*batch++ = vma_phys_addr(vma, offset);
-		*batch++ = target_addr;
-	} else if (ver >= 4) {
-		*batch++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
-		*batch++ = 0;
-		*batch++ = addr;
-		*batch++ = target_addr;
-	} else if (ver >= 3 &&
-		   !(IS_I915G(eb->i915) || IS_I915GM(eb->i915))) {
-		*batch++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
-		*batch++ = addr;
-		*batch++ = target_addr;
-	} else {
-		*batch++ = MI_STORE_DWORD_IMM;
-		*batch++ = vma_phys_addr(vma, offset);
-		*batch++ = target_addr;
-	}
-
-	return true;
-}
-
-static int reloc_entry_gpu(struct i915_execbuffer *eb,
-			   struct i915_vma *vma,
-			   u64 offset,
-			   u64 target_addr)
-{
-	if (eb->reloc_cache.vaddr)
-		return false;
-
-	if (!use_reloc_gpu(vma))
-		return false;
-
-	return __reloc_entry_gpu(eb, vma, offset, target_addr);
-}
-
 static u64
 relocate_entry(struct i915_vma *vma,
 	       const struct drm_i915_gem_relocation_entry *reloc,
@@ -1595,32 +1253,25 @@ relocate_entry(struct i915_vma *vma,
 {
 	u64 target_addr = relocation_target(reloc, target);
 	u64 offset = reloc->offset;
-	int reloc_gpu = reloc_entry_gpu(eb, vma, offset, target_addr);
-
-	if (reloc_gpu < 0)
-		return reloc_gpu;
-
-	if (!reloc_gpu) {
-		bool wide = eb->reloc_cache.use_64bit_reloc;
-		void *vaddr;
+	bool wide = eb->reloc_cache.use_64bit_reloc;
+	void *vaddr;
 
 repeat:
-		vaddr = reloc_vaddr(vma->obj, eb,
-				    offset >> PAGE_SHIFT);
-		if (IS_ERR(vaddr))
-			return PTR_ERR(vaddr);
+	vaddr = reloc_vaddr(vma->obj, eb,
+			    offset >> PAGE_SHIFT);
+	if (IS_ERR(vaddr))
+		return PTR_ERR(vaddr);
 
-		GEM_BUG_ON(!IS_ALIGNED(offset, sizeof(u32)));
-		clflush_write32(vaddr + offset_in_page(offset),
-				lower_32_bits(target_addr),
-				eb->reloc_cache.vaddr);
+	GEM_BUG_ON(!IS_ALIGNED(offset, sizeof(u32)));
+	clflush_write32(vaddr + offset_in_page(offset),
+			lower_32_bits(target_addr),
+			eb->reloc_cache.vaddr);
 
-		if (wide) {
-			offset += sizeof(u32);
-			target_addr >>= 32;
-			wide = false;
-			goto repeat;
-		}
+	if (wide) {
+		offset += sizeof(u32);
+		target_addr >>= 32;
+		wide = false;
+		goto repeat;
 	}
 
 	return target->node.start | UPDATE;
@@ -1992,7 +1643,7 @@ repeat:
 	}
 
 	/* We may process another execbuffer during the unlock... */
-	eb_release_vmas(eb, false, true);
+	eb_release_vmas(eb, false);
 	i915_gem_ww_ctx_fini(&eb->ww);
 
 	if (rq) {
@@ -2061,9 +1712,7 @@ repeat_validate:
 
 	list_for_each_entry(ev, &eb->relocs, reloc_link) {
 		if (!have_copy) {
-			pagefault_disable();
 			err = eb_relocate_vma(eb, ev);
-			pagefault_enable();
 			if (err)
 				break;
 		} else {
@@ -2096,7 +1745,7 @@ repeat_validate:
 
 err:
 	if (err == -EDEADLK) {
-		eb_release_vmas(eb, false, false);
+		eb_release_vmas(eb, false);
 		err = i915_gem_ww_ctx_backoff(&eb->ww);
 		if (!err)
 			goto repeat_validate;
@@ -2193,7 +1842,7 @@ retry:
 
 err:
 	if (err == -EDEADLK) {
-		eb_release_vmas(eb, false, false);
+		eb_release_vmas(eb, false);
 		err = i915_gem_ww_ctx_backoff(&eb->ww);
 		if (!err)
 			goto retry;
@@ -2270,7 +1919,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 
 #ifdef CONFIG_MMU_NOTIFIER
 	if (!err && (eb->args->flags & __EXEC_USERPTR_USED)) {
-		spin_lock(&eb->i915->mm.notifier_lock);
+		read_lock(&eb->i915->mm.notifier_lock);
 
 		/*
 		 * count is always at least 1, otherwise __EXEC_USERPTR_USED
@@ -2288,7 +1937,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 				break;
 		}
 
-		spin_unlock(&eb->i915->mm.notifier_lock);
+		read_unlock(&eb->i915->mm.notifier_lock);
 	}
 #endif
 
@@ -3156,8 +2805,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	eb.exec = exec;
 	eb.vma = (struct eb_vma *)(exec + args->buffer_count + 1);
 	eb.vma[0].vma = NULL;
-	eb.reloc_pool = eb.batch_pool = NULL;
-	eb.reloc_context = NULL;
+	eb.batch_pool = NULL;
 
 	eb.invalid_flags = __EXEC_OBJECT_UNKNOWN_FLAGS;
 	reloc_cache_init(&eb.reloc_cache, eb.i915);
@@ -3232,7 +2880,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 
 	err = eb_lookup_vmas(&eb);
 	if (err) {
-		eb_release_vmas(&eb, true, true);
+		eb_release_vmas(&eb, true);
 		goto err_engine;
 	}
 
@@ -3255,9 +2903,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 
 	batch = eb.batch->vma;
 
-	/* All GPU relocation batches must be submitted prior to the user rq */
-	GEM_BUG_ON(eb.reloc_cache.rq);
-
 	/* Allocate a request for this batch buffer nice and early. */
 	eb.request = i915_request_create(eb.context);
 	if (IS_ERR(eb.request)) {
@@ -3265,11 +2910,20 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 		goto err_vma;
 	}
 
+	if (unlikely(eb.gem_context->syncobj)) {
+		struct dma_fence *fence;
+
+		fence = drm_syncobj_fence_get(eb.gem_context->syncobj);
+		err = i915_request_await_dma_fence(eb.request, fence);
+		dma_fence_put(fence);
+		if (err)
+			goto err_ext;
+	}
+
 	if (in_fence) {
 		if (args->flags & I915_EXEC_FENCE_SUBMIT)
 			err = i915_request_await_execution(eb.request,
-							   in_fence,
-							   eb.engine->bond_execute);
+							   in_fence);
 		else
 			err = i915_request_await_dma_fence(eb.request,
 							   in_fence);
@@ -3322,10 +2976,16 @@ err_request:
 			fput(out_fence->file);
 		}
 	}
+
+	if (unlikely(eb.gem_context->syncobj)) {
+		drm_syncobj_replace_fence(eb.gem_context->syncobj,
+					  &eb.request->fence);
+	}
+
 	i915_request_put(eb.request);
 
 err_vma:
-	eb_release_vmas(&eb, true, true);
+	eb_release_vmas(&eb, true);
 	if (eb.trampoline)
 		i915_vma_unpin(eb.trampoline);
 	WARN_ON(err == -EDEADLK);
@@ -3333,10 +2993,6 @@ err_vma:
 
 	if (eb.batch_pool)
 		intel_gt_buffer_pool_put(eb.batch_pool);
-	if (eb.reloc_pool)
-		intel_gt_buffer_pool_put(eb.reloc_pool);
-	if (eb.reloc_context)
-		intel_context_put(eb.reloc_context);
 err_engine:
 	eb_put_engine(&eb);
 err_context:
@@ -3450,7 +3106,3 @@ end:;
 	kvfree(exec2_list);
 	return err;
 }
-
-#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-#include "selftests/i915_gem_execbuffer.c"
-#endif
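With the GPU relocation path removed, `relocate_entry()` is left with the CPU write loop: store the low 32 bits of the target address, then, for a 64-bit relocation, advance one dword and store the high 32 bits. A minimal user-space sketch of that write pattern is given below. `write_reloc()` and the flat `uint32_t` buffer are hypothetical stand-ins for `reloc_vaddr()` and `clflush_write32()`; page mapping and cache flushing are deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the CPU relocation write: patch a (possibly
 * 64-bit) address into a batch at a dword-aligned byte offset.
 */
static void write_reloc(uint32_t *batch, size_t offset,
			uint64_t target_addr, bool wide)
{
	for (;;) {
		/* Mirrors GEM_BUG_ON(!IS_ALIGNED(offset, sizeof(u32))). */
		assert(offset % sizeof(uint32_t) == 0);

		/* Store the low 32 bits of whatever is left to write. */
		batch[offset / sizeof(uint32_t)] = (uint32_t)target_addr;

		if (!wide)
			break;

		/* Second pass: the upper half goes into the next dword. */
		offset += sizeof(uint32_t);
		target_addr >>= 32;
		wide = false;
	}
}
```

The loop plays the role of the `goto repeat` in the driver: at most two stores happen, and the second one is skipped entirely for 32-bit relocations.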
@@ -177,8 +177,8 @@ i915_gem_object_create_internal(struct drm_i915_private *i915,
 		return ERR_PTR(-ENOMEM);
 
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
-	i915_gem_object_init(obj, &i915_gem_object_internal_ops, &lock_class,
-			     I915_BO_ALLOC_STRUCT_PAGE);
+	i915_gem_object_init(obj, &i915_gem_object_internal_ops, &lock_class, 0);
+	obj->mem_flags |= I915_BO_FLAG_STRUCT_PAGE;
 
 	/*
 	 * Mark the object as volatile, such that the pages are marked as
@@ -23,27 +23,6 @@ i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
 	return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
 }
 
-/**
- * i915_gem_object_validates_to_lmem - Whether the object is resident in
- * lmem when pages are present.
- * @obj: The object to check.
- *
- * Migratable objects residency may change from under us if the object is
- * not pinned or locked. This function is intended to be used to check whether
- * the object can only reside in lmem when pages are present.
- *
- * Return: Whether the object is always resident in lmem when pages are
- * present.
- */
-bool i915_gem_object_validates_to_lmem(struct drm_i915_gem_object *obj)
-{
-	struct intel_memory_region *mr = READ_ONCE(obj->mm.region);
-
-	return !i915_gem_object_migratable(obj) &&
-		mr && (mr->type == INTEL_MEMORY_LOCAL ||
-		       mr->type == INTEL_MEMORY_STOLEN_LOCAL);
-}
-
 /**
  * i915_gem_object_is_lmem - Whether the object is resident in
  * lmem
@@ -71,11 +50,64 @@ bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 		      mr->type == INTEL_MEMORY_STOLEN_LOCAL);
 }
 
+/**
+ * __i915_gem_object_is_lmem - Whether the object is resident in
+ * lmem while in the fence signaling critical path.
+ * @obj: The object to check.
+ *
+ * This function is intended to be called from within the fence signaling
+ * path where the fence keeps the object from being migrated. For example
+ * during gpu reset or similar.
+ *
+ * Return: Whether the object is resident in lmem.
+ */
+bool __i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
+{
+	struct intel_memory_region *mr = READ_ONCE(obj->mm.region);
+
+#ifdef CONFIG_LOCKDEP
+	GEM_WARN_ON(dma_resv_test_signaled(obj->base.resv, true));
+#endif
+	return mr && (mr->type == INTEL_MEMORY_LOCAL ||
+		      mr->type == INTEL_MEMORY_STOLEN_LOCAL);
+}
+
+/**
+ * __i915_gem_object_create_lmem_with_ps - Create lmem object and force the
+ * minimum page size for the backing pages.
+ * @i915: The i915 instance.
+ * @size: The size in bytes for the object. Note that we need to round the size
+ * up depending on the @page_size. The final object size can be fished out from
+ * the drm GEM object.
+ * @page_size: The requested minimum page size in bytes for this object. This is
+ * useful if we need something bigger than the regions min_page_size due to some
+ * hw restriction, or in some very specialised cases where it needs to be
+ * smaller, where the internal fragmentation cost is too great when rounding up
+ * the object size.
+ * @flags: The optional BO allocation flags.
+ *
+ * Note that this interface assumes you know what you are doing when forcing the
+ * @page_size. If this is smaller than the regions min_page_size then it can
+ * never be inserted into any GTT, otherwise it might lead to undefined
+ * behaviour.
+ *
+ * Return: The object pointer, which might be an ERR_PTR in the case of failure.
+ */
+struct drm_i915_gem_object *
+__i915_gem_object_create_lmem_with_ps(struct drm_i915_private *i915,
+				      resource_size_t size,
+				      resource_size_t page_size,
+				      unsigned int flags)
+{
+	return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_LMEM],
+					     size, page_size, flags);
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create_lmem(struct drm_i915_private *i915,
 			    resource_size_t size,
 			    unsigned int flags)
 {
 	return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_LMEM],
-					     size, flags);
+					     size, 0, flags);
 }
@@ -21,6 +21,13 @@ i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
 
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
 
+bool __i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
+
+struct drm_i915_gem_object *
+__i915_gem_object_create_lmem_with_ps(struct drm_i915_private *i915,
+				      resource_size_t size,
+				      resource_size_t page_size,
+				      unsigned int flags);
 struct drm_i915_gem_object *
 i915_gem_object_create_lmem(struct drm_i915_private *i915,
 			    resource_size_t size,
@@ -645,7 +645,8 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
 		goto insert;
 
 	/* Attempt to reap some mmap space from dead objects */
-	err = intel_gt_retire_requests_timeout(&i915->gt, MAX_SCHEDULE_TIMEOUT);
+	err = intel_gt_retire_requests_timeout(&i915->gt, MAX_SCHEDULE_TIMEOUT,
+					       NULL);
 	if (err)
 		goto err;
 
@@ -679,13 +680,19 @@ __assign_mmap_offset(struct drm_i915_gem_object *obj,
 		return -ENODEV;
 
 	if (obj->ops->mmap_offset) {
+		if (mmap_type != I915_MMAP_TYPE_FIXED)
+			return -ENODEV;
+
 		*offset = obj->ops->mmap_offset(obj);
 		return 0;
 	}
 
+	if (mmap_type == I915_MMAP_TYPE_FIXED)
+		return -ENODEV;
+
 	if (mmap_type != I915_MMAP_TYPE_GTT &&
 	    !i915_gem_object_has_struct_page(obj) &&
-	    !i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM))
+	    !i915_gem_object_has_iomem(obj))
 		return -ENODEV;
 
 	mmo = mmap_offset_attach(obj, mmap_type, file);
@@ -709,7 +716,12 @@ __assign_mmap_offset_handle(struct drm_file *file,
 	if (!obj)
 		return -ENOENT;
 
+	err = i915_gem_object_lock_interruptible(obj, NULL);
+	if (err)
+		goto out_put;
 	err = __assign_mmap_offset(obj, mmap_type, offset, file);
+	i915_gem_object_unlock(obj);
+out_put:
 	i915_gem_object_put(obj);
 	return err;
 }
@@ -722,7 +734,9 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
 {
 	enum i915_mmap_type mmap_type;
 
-	if (boot_cpu_has(X86_FEATURE_PAT))
+	if (HAS_LMEM(to_i915(dev)))
+		mmap_type = I915_MMAP_TYPE_FIXED;
+	else if (boot_cpu_has(X86_FEATURE_PAT))
 		mmap_type = I915_MMAP_TYPE_WC;
 	else if (!i915_ggtt_has_aperture(&to_i915(dev)->ggtt))
 		return -ENODEV;
@@ -793,6 +807,10 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
 		type = I915_MMAP_TYPE_UC;
 		break;
 
+	case I915_MMAP_OFFSET_FIXED:
+		type = I915_MMAP_TYPE_FIXED;
+		break;
+
 	default:
 		return -EINVAL;
 	}
@@ -933,10 +951,7 @@ int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 		return PTR_ERR(anon);
 	}
 
-	vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
-
-	if (i915_gem_object_has_iomem(obj))
-		vma->vm_flags |= VM_IO;
+	vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO;
 
 	/*
 	 * We keep the ref on mmo->obj, not vm_file, but we require
@@ -966,6 +981,9 @@ int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 		vma->vm_ops = &vm_ops_cpu;
 		break;
 
+	case I915_MMAP_TYPE_FIXED:
+		GEM_WARN_ON(1);
+		fallthrough;
 	case I915_MMAP_TYPE_WB:
 		vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
 		vma->vm_ops = &vm_ops_cpu;
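The `__assign_mmap_offset()` hunk above makes `I915_MMAP_TYPE_FIXED` mandatory for objects whose backend provides an `mmap_offset` hook and invalid for every other object. The following is a small user-space model of just those checks, offered as a sketch rather than the driver's code. `enum mmap_type`, `struct obj_model` and `check_mmap_type()` are hypothetical names, and the earlier `-ENODEV` check that appears as context at the top of the hunk is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's mmap types. */
enum mmap_type { MMAP_GTT, MMAP_WC, MMAP_WB, MMAP_UC, MMAP_FIXED };

/* Hypothetical summary of the object properties the checks consult. */
struct obj_model {
	bool has_mmap_offset_hook;	/* obj->ops->mmap_offset is set */
	bool has_struct_page;
	bool has_iomem;
};

/* Returns 0 if the requested type is acceptable, -ENODEV otherwise. */
static int check_mmap_type(const struct obj_model *obj, enum mmap_type type)
{
	/* Backends with their own offset hook accept FIXED only. */
	if (obj->has_mmap_offset_hook)
		return type == MMAP_FIXED ? 0 : -ENODEV;

	/* Everything else must not ask for FIXED. */
	if (type == MMAP_FIXED)
		return -ENODEV;

	/* CPU mappings need struct pages or iomem behind the object. */
	if (type != MMAP_GTT && !obj->has_struct_page && !obj->has_iomem)
		return -ENODEV;

	return 0;
}
```

This matches the UAPI note in the merge message: on devices with local memory `I915_MMAP_OFFSET_FIXED` is the only valid type, and on devices without it the type is rejected.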
@@ -30,14 +30,10 @@
 #include "i915_gem_context.h"
 #include "i915_gem_mman.h"
 #include "i915_gem_object.h"
-#include "i915_globals.h"
 #include "i915_memcpy.h"
 #include "i915_trace.h"
 
-static struct i915_global_object {
-	struct i915_global base;
-	struct kmem_cache *slab_objects;
-} global;
+static struct kmem_cache *slab_objects;
 
 static const struct drm_gem_object_funcs i915_gem_object_funcs;
 
@@ -45,7 +41,7 @@ struct drm_i915_gem_object *i915_gem_object_alloc(void)
 {
 	struct drm_i915_gem_object *obj;
 
-	obj = kmem_cache_zalloc(global.slab_objects, GFP_KERNEL);
+	obj = kmem_cache_zalloc(slab_objects, GFP_KERNEL);
 	if (!obj)
 		return NULL;
 	obj->base.funcs = &i915_gem_object_funcs;
@@ -55,7 +51,7 @@ struct drm_i915_gem_object *i915_gem_object_alloc(void)
 
 void i915_gem_object_free(struct drm_i915_gem_object *obj)
 {
-	return kmem_cache_free(global.slab_objects, obj);
+	return kmem_cache_free(slab_objects, obj);
 }
 
 void i915_gem_object_init(struct drm_i915_gem_object *obj,
|
@@ -475,34 +471,200 @@ bool i915_gem_object_migratable(struct drm_i915_gem_object *obj)
|
|||
return obj->mm.n_placements > 1;
|
||||
}
|
||||
|
||||
/**
|
||||
* i915_gem_object_has_struct_page - Whether the object is page-backed
|
||||
* @obj: The object to query.
|
||||
*
|
||||
* This function should only be called while the object is locked or pinned,
|
||||
* otherwise the page backing may change under the caller.
|
||||
*
|
||||
* Return: True if page-backed, false otherwise.
|
||||
*/
|
||||
bool i915_gem_object_has_struct_page(const struct drm_i915_gem_object *obj)
|
||||
{
|
||||
#ifdef CONFIG_LOCKDEP
|
||||
if (IS_DGFX(to_i915(obj->base.dev)) &&
|
||||
i915_gem_object_evictable((void __force *)obj))
|
||||
assert_object_held_shared(obj);
|
||||
#endif
|
||||
return obj->mem_flags & I915_BO_FLAG_STRUCT_PAGE;
|
||||
}
|
||||
|
||||
/**
|
||||
* i915_gem_object_has_iomem - Whether the object is iomem-backed
|
||||
* @obj: The object to query.
|
||||
*
|
||||
* This function should only be called while the object is locked or pinned,
|
||||
* otherwise the iomem backing may change under the caller.
|
||||
*
|
||||
* Return: True if iomem-backed, false otherwise.
|
||||
*/
|
||||
bool i915_gem_object_has_iomem(const struct drm_i915_gem_object *obj)
|
||||
{
|
||||
#ifdef CONFIG_LOCKDEP
|
||||
if (IS_DGFX(to_i915(obj->base.dev)) &&
|
||||
i915_gem_object_evictable((void __force *)obj))
|
||||
assert_object_held_shared(obj);
|
||||
#endif
|
||||
return obj->mem_flags & I915_BO_FLAG_IOMEM;
|
||||
}
|
||||
|
||||
/**
|
||||
* i915_gem_object_can_migrate - Whether an object likely can be migrated
|
||||
*
|
||||
* @obj: The object to migrate
|
||||
* @id: The region intended to migrate to
|
||||
*
|
||||
* Check whether the object backend supports migration to the
|
||||
* given region. Note that pinning may affect the ability to migrate as
|
||||
* returned by this function.
|
||||
*
|
||||
* This function is primarily intended as a helper for checking the
|
||||
* possibility to migrate objects and might be slightly less permissive
|
||||
* than i915_gem_object_migrate() when it comes to objects with the
|
||||
* I915_BO_ALLOC_USER flag set.
|
||||
*
|
||||
* Return: true if migration is possible, false otherwise.
|
||||
*/
|
||||
bool i915_gem_object_can_migrate(struct drm_i915_gem_object *obj,
|
||||
enum intel_region_id id)
|
||||
{
|
||||
struct drm_i915_private *i915 = to_i915(obj->base.dev);
|
||||
unsigned int num_allowed = obj->mm.n_placements;
|
||||
struct intel_memory_region *mr;
|
||||
unsigned int i;
|
||||
|
||||
GEM_BUG_ON(id >= INTEL_REGION_UNKNOWN);
|
||||
GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
|
||||
|
||||
mr = i915->mm.regions[id];
|
||||
if (!mr)
|
||||
return false;
|
||||
|
||||
if (obj->mm.region == mr)
|
||||
return true;
|
||||
|
||||
if (!i915_gem_object_evictable(obj))
|
||||
return false;
|
||||
|
||||
if (!obj->ops->migrate)
|
||||
return false;
|
||||
|
||||
if (!(obj->flags & I915_BO_ALLOC_USER))
|
||||
return true;
|
||||
|
||||
if (num_allowed == 0)
|
||||
return false;
|
||||
|
||||
for (i = 0; i < num_allowed; ++i) {
|
||||
if (mr == obj->mm.placements[i])
|
||||
return true;
|
||||
}
|
||||
|
||||
return false;
|
||||
}
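The decision order in i915_gem_object_can_migrate() above can be modelled in plain userspace C. This is a minimal sketch, not the driver code: `struct region`, `struct object`, and every field name below are hypothetical stand-ins for the kernel types, and the boolean fields replace calls such as i915_gem_object_evictable() and the `obj->ops->migrate` pointer check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct intel_memory_region. */
struct region { int id; };

/* Hypothetical stand-in for the fields the kernel function consults. */
struct object {
	const struct region *cur;          /* models obj->mm.region */
	const struct region **placements;  /* models obj->mm.placements */
	unsigned int n_placements;         /* models obj->mm.n_placements */
	bool evictable;                    /* models i915_gem_object_evictable() */
	bool has_migrate_op;               /* models obj->ops->migrate != NULL */
	bool user_created;                 /* models I915_BO_ALLOC_USER */
};

/* Same order of checks as the kernel function in the hunk above. */
static bool can_migrate(const struct object *obj, const struct region *target)
{
	unsigned int i;

	assert(obj != NULL);

	if (!target)                       /* region not present on this device */
		return false;
	if (obj->cur == target)            /* already there */
		return true;
	if (!obj->evictable || !obj->has_migrate_op)
		return false;
	if (!obj->user_created)            /* kernel-internal objects: no list check */
		return true;

	/* User objects may only move to a region in their placement list. */
	for (i = 0; i < obj->n_placements; i++)
		if (obj->placements[i] == target)
			return true;

	return false;
}
```

In this model a user-created object with an empty placement list can only "migrate" to the region it already occupies, which matches the `num_allowed == 0` early return in the original.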
|
||||
|
||||
/**
|
||||
* i915_gem_object_migrate - Migrate an object to the desired region id
|
||||
* @obj: The object to migrate.
|
||||
* @ww: An optional struct i915_gem_ww_ctx. If NULL, the backend may
|
||||
* not be successful in evicting other objects to make room for this object.
|
||||
* @id: The region id to migrate to.
|
||||
*
|
||||
* Attempt to migrate the object to the desired memory region. The
|
||||
* object backend must support migration and the object may not be
|
||||
* pinned, (explicitly pinned pages or pinned vmas). The object must
|
||||
* be locked.
|
||||
* On successful completion, the object will have pages pointing to
|
||||
* memory in the new region, but an async migration task may not have
|
||||
* completed yet, and to accomplish that, i915_gem_object_wait_migration()
|
||||
* must be called.
|
||||
*
|
||||
* Note: the @ww parameter is not used yet, but included to make sure
|
||||
* callers put some effort into obtaining a valid ww ctx if one is
|
||||
* available.
|
||||
*
|
||||
* Return: 0 on success. Negative error code on failure. In particular may
|
||||
* return -ENXIO on lack of region space, -EDEADLK for deadlock avoidance
|
||||
* if @ww is set, -EINTR or -ERESTARTSYS if signal pending, and
|
||||
* -EBUSY if the object is pinned.
|
||||
*/
|
||||
int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
|
||||
struct i915_gem_ww_ctx *ww,
|
||||
enum intel_region_id id)
|
||||
{
|
||||
struct drm_i915_private *i915 = to_i915(obj->base.dev);
|
||||
struct intel_memory_region *mr;
|
||||
|
||||
GEM_BUG_ON(id >= INTEL_REGION_UNKNOWN);
|
||||
GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
|
||||
assert_object_held(obj);
|
||||
|
||||
mr = i915->mm.regions[id];
|
||||
GEM_BUG_ON(!mr);
|
||||
|
||||
if (!i915_gem_object_can_migrate(obj, id))
|
||||
return -EINVAL;
|
||||
|
||||
if (!obj->ops->migrate) {
|
||||
if (GEM_WARN_ON(obj->mm.region != mr))
|
||||
return -EINVAL;
|
||||
return 0;
|
||||
}
|
||||
|
||||
return obj->ops->migrate(obj, mr);
|
||||
}
|
||||
|
||||
/**
|
||||
* i915_gem_object_placement_possible - Check whether the object can be
|
||||
* placed at certain memory type
|
||||
* @obj: Pointer to the object
|
||||
* @type: The memory type to check
|
||||
*
|
||||
* Return: True if the object can be placed in @type. False otherwise.
|
||||
*/
|
||||
bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
|
||||
enum intel_memory_type type)
|
||||
{
|
||||
unsigned int i;
|
||||
|
||||
if (!obj->mm.n_placements) {
|
||||
switch (type) {
|
||||
case INTEL_MEMORY_LOCAL:
|
||||
return i915_gem_object_has_iomem(obj);
|
||||
case INTEL_MEMORY_SYSTEM:
|
||||
return i915_gem_object_has_pages(obj);
|
||||
default:
|
||||
/* Ignore stolen for now */
|
||||
GEM_BUG_ON(1);
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
for (i = 0; i < obj->mm.n_placements; i++) {
|
||||
if (obj->mm.placements[i]->type == type)
|
||||
return true;
|
||||
}
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
void i915_gem_init__objects(struct drm_i915_private *i915)
|
||||
{
|
||||
INIT_WORK(&i915->mm.free_work, __i915_gem_free_work);
|
||||
}
|
||||
|
||||
static void i915_global_objects_shrink(void)
|
||||
void i915_objects_module_exit(void)
|
||||
{
|
||||
kmem_cache_shrink(global.slab_objects);
|
||||
kmem_cache_destroy(slab_objects);
|
||||
}
|
||||
|
||||
static void i915_global_objects_exit(void)
|
||||
int __init i915_objects_module_init(void)
|
||||
{
|
||||
kmem_cache_destroy(global.slab_objects);
|
||||
}
|
||||
|
||||
static struct i915_global_object global = { {
|
||||
.shrink = i915_global_objects_shrink,
|
||||
.exit = i915_global_objects_exit,
|
||||
} };
|
||||
|
||||
int __init i915_global_objects_init(void)
|
||||
{
|
||||
global.slab_objects =
|
||||
KMEM_CACHE(drm_i915_gem_object, SLAB_HWCACHE_ALIGN);
|
||||
if (!global.slab_objects)
|
||||
slab_objects = KMEM_CACHE(drm_i915_gem_object, SLAB_HWCACHE_ALIGN);
|
||||
if (!slab_objects)
|
||||
return -ENOMEM;
|
||||
|
||||
i915_global_register(&global.base);
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
@@ -515,6 +677,7 @@ static const struct drm_gem_object_funcs i915_gem_object_funcs = {
|
|||
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
|
||||
#include "selftests/huge_gem_object.c"
|
||||
#include "selftests/huge_pages.c"
|
||||
#include "selftests/i915_gem_migrate.c"
|
||||
#include "selftests/i915_gem_object.c"
|
||||
#include "selftests/i915_gem_coherency.c"
|
||||
#endif
|
||||
|
|
|
@@ -12,10 +12,14 @@
|
|||
#include <drm/drm_device.h>
|
||||
|
||||
#include "display/intel_frontbuffer.h"
|
||||
#include "intel_memory_region.h"
|
||||
#include "i915_gem_object_types.h"
|
||||
#include "i915_gem_gtt.h"
|
||||
#include "i915_gem_ww.h"
|
||||
#include "i915_vma_types.h"
|
||||
|
||||
enum intel_region_id;
|
||||
|
||||
/*
|
||||
* XXX: There is a prevalence of the assumption that we fit the
|
||||
* object's page count inside a 32bit _signed_ variable. Let's document
|
||||
|
@@ -44,6 +48,9 @@ static inline bool i915_gem_object_size_2big(u64 size)
|
|||
|
||||
void i915_gem_init__objects(struct drm_i915_private *i915);
|
||||
|
||||
void i915_objects_module_exit(void);
|
||||
int i915_objects_module_init(void);
|
||||
|
||||
struct drm_i915_gem_object *i915_gem_object_alloc(void);
|
||||
void i915_gem_object_free(struct drm_i915_gem_object *obj);
|
||||
|
||||
|
@@ -57,6 +64,10 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915,
|
|||
struct drm_i915_gem_object *
|
||||
i915_gem_object_create_shmem_from_data(struct drm_i915_private *i915,
|
||||
const void *data, resource_size_t size);
|
||||
struct drm_i915_gem_object *
|
||||
__i915_gem_object_create_user(struct drm_i915_private *i915, u64 size,
|
||||
struct intel_memory_region **placements,
|
||||
unsigned int n_placements);
|
||||
|
||||
extern const struct drm_i915_gem_object_ops i915_gem_shmem_ops;
|
||||
|
||||
|
@@ -147,7 +158,7 @@ i915_gem_object_put(struct drm_i915_gem_object *obj)
|
|||
/*
|
||||
* If more than one potential simultaneous locker, assert held.
|
||||
*/
|
||||
static inline void assert_object_held_shared(struct drm_i915_gem_object *obj)
|
||||
static inline void assert_object_held_shared(const struct drm_i915_gem_object *obj)
|
||||
{
|
||||
/*
|
||||
* Note mm list lookup is protected by
|
||||
|
@@ -169,13 +180,17 @@ static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj,
|
|||
else
|
||||
ret = dma_resv_lock(obj->base.resv, ww ? &ww->ctx : NULL);
|
||||
|
||||
if (!ret && ww)
|
||||
if (!ret && ww) {
|
||||
i915_gem_object_get(obj);
|
||||
list_add_tail(&obj->obj_link, &ww->obj_list);
|
||||
}
|
||||
if (ret == -EALREADY)
|
||||
ret = 0;
|
||||
|
||||
if (ret == -EDEADLK)
|
||||
if (ret == -EDEADLK) {
|
||||
i915_gem_object_get(obj);
|
||||
ww->contended = obj;
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
@@ -261,17 +276,9 @@ i915_gem_object_type_has(const struct drm_i915_gem_object *obj,
|
|||
return obj->ops->flags & flags;
|
||||
}
|
||||
|
||||
static inline bool
|
||||
i915_gem_object_has_struct_page(const struct drm_i915_gem_object *obj)
|
||||
{
|
||||
return obj->flags & I915_BO_ALLOC_STRUCT_PAGE;
|
||||
}
|
||||
bool i915_gem_object_has_struct_page(const struct drm_i915_gem_object *obj);
|
||||
|
||||
static inline bool
|
||||
i915_gem_object_has_iomem(const struct drm_i915_gem_object *obj)
|
||||
{
|
||||
return i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM);
|
||||
}
|
||||
bool i915_gem_object_has_iomem(const struct drm_i915_gem_object *obj);
|
||||
|
||||
static inline bool
|
||||
i915_gem_object_is_shrinkable(const struct drm_i915_gem_object *obj)
|
||||
|
@@ -342,22 +349,22 @@ struct scatterlist *
|
|||
__i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
|
||||
struct i915_gem_object_page_iter *iter,
|
||||
unsigned int n,
|
||||
unsigned int *offset, bool allow_alloc, bool dma);
|
||||
unsigned int *offset, bool dma);
|
||||
|
||||
static inline struct scatterlist *
|
||||
i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
|
||||
unsigned int n,
|
||||
unsigned int *offset, bool allow_alloc)
|
||||
unsigned int *offset)
|
||||
{
|
||||
return __i915_gem_object_get_sg(obj, &obj->mm.get_page, n, offset, allow_alloc, false);
|
||||
return __i915_gem_object_get_sg(obj, &obj->mm.get_page, n, offset, false);
|
||||
}
|
||||
|
||||
static inline struct scatterlist *
|
||||
i915_gem_object_get_sg_dma(struct drm_i915_gem_object *obj,
|
||||
unsigned int n,
|
||||
unsigned int *offset, bool allow_alloc)
|
||||
unsigned int *offset)
|
||||
{
|
||||
return __i915_gem_object_get_sg(obj, &obj->mm.get_dma_page, n, offset, allow_alloc, true);
|
||||
return __i915_gem_object_get_sg(obj, &obj->mm.get_dma_page, n, offset, true);
|
||||
}
|
||||
|
||||
struct page *
|
||||
|
@@ -598,7 +605,18 @@ bool i915_gem_object_evictable(struct drm_i915_gem_object *obj);
|
|||
|
||||
bool i915_gem_object_migratable(struct drm_i915_gem_object *obj);
|
||||
|
||||
bool i915_gem_object_validates_to_lmem(struct drm_i915_gem_object *obj);
|
||||
int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
|
||||
struct i915_gem_ww_ctx *ww,
|
||||
enum intel_region_id id);
|
||||
|
||||
bool i915_gem_object_can_migrate(struct drm_i915_gem_object *obj,
|
||||
enum intel_region_id id);
|
||||
|
||||
int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,
|
||||
unsigned int flags);
|
||||
|
||||
bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
|
||||
enum intel_memory_type type);
|
||||
|
||||
#ifdef CONFIG_MMU_NOTIFIER
|
||||
static inline bool
|
||||
|
@@ -609,14 +627,12 @@ i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
|
|||
|
||||
int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj);
|
||||
int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj);
|
||||
void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj);
|
||||
int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj);
|
||||
#else
|
||||
static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) { return false; }
|
||||
|
||||
static inline int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); return -ENODEV; }
|
||||
static inline int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); return -ENODEV; }
|
||||
static inline void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); }
|
||||
static inline int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); return -ENODEV; }
|
||||
|
||||
#endif
|
||||
|
|
|
@@ -1,461 +0,0 @@
|
|||
// SPDX-License-Identifier: MIT
|
||||
/*
|
||||
* Copyright © 2019 Intel Corporation
|
||||
*/
|
||||
|
||||
#include "i915_drv.h"
|
||||
#include "gt/intel_context.h"
|
||||
#include "gt/intel_engine_pm.h"
|
||||
#include "gt/intel_gpu_commands.h"
|
||||
#include "gt/intel_gt.h"
|
||||
#include "gt/intel_gt_buffer_pool.h"
|
||||
#include "gt/intel_ring.h"
|
||||
#include "i915_gem_clflush.h"
|
||||
#include "i915_gem_object_blt.h"
|
||||
|
||||
struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
|
||||
struct i915_vma *vma,
|
||||
struct i915_gem_ww_ctx *ww,
|
||||
u32 value)
|
||||
{
|
||||
struct drm_i915_private *i915 = ce->vm->i915;
|
||||
const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
|
||||
struct intel_gt_buffer_pool_node *pool;
|
||||
struct i915_vma *batch;
|
||||
u64 offset;
|
||||
u64 count;
|
||||
u64 rem;
|
||||
u32 size;
|
||||
u32 *cmd;
|
||||
int err;
|
||||
|
||||
GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
|
||||
intel_engine_pm_get(ce->engine);
|
||||
|
||||
count = div_u64(round_up(vma->size, block_size), block_size);
|
||||
size = (1 + 8 * count) * sizeof(u32);
|
||||
size = round_up(size, PAGE_SIZE);
|
||||
pool = intel_gt_get_buffer_pool(ce->engine->gt, size, I915_MAP_WC);
|
||||
if (IS_ERR(pool)) {
|
||||
err = PTR_ERR(pool);
|
||||
goto out_pm;
|
||||
}
|
||||
|
||||
err = i915_gem_object_lock(pool->obj, ww);
|
||||
if (err)
|
||||
goto out_put;
|
||||
|
||||
batch = i915_vma_instance(pool->obj, ce->vm, NULL);
|
||||
if (IS_ERR(batch)) {
|
||||
err = PTR_ERR(batch);
|
||||
goto out_put;
|
||||
}
|
||||
|
||||
err = i915_vma_pin_ww(batch, ww, 0, 0, PIN_USER);
|
||||
if (unlikely(err))
|
||||
goto out_put;
|
||||
|
||||
/* we pinned the pool, mark it as such */
|
||||
intel_gt_buffer_pool_mark_used(pool);
|
||||
|
||||
cmd = i915_gem_object_pin_map(pool->obj, pool->type);
|
||||
if (IS_ERR(cmd)) {
|
||||
err = PTR_ERR(cmd);
|
||||
goto out_unpin;
|
||||
}
|
||||
|
||||
rem = vma->size;
|
||||
offset = vma->node.start;
|
||||
|
||||
do {
|
||||
u32 size = min_t(u64, rem, block_size);
|
||||
|
||||
GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
|
||||
|
||||
if (GRAPHICS_VER(i915) >= 8) {
|
||||
*cmd++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7 - 2);
|
||||
*cmd++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
|
||||
*cmd++ = 0;
|
||||
*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
|
||||
*cmd++ = lower_32_bits(offset);
|
||||
*cmd++ = upper_32_bits(offset);
|
||||
*cmd++ = value;
|
||||
} else {
|
||||
*cmd++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
|
||||
*cmd++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
|
||||
*cmd++ = 0;
|
||||
*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
|
||||
*cmd++ = offset;
|
||||
*cmd++ = value;
|
||||
}
|
||||
|
||||
/* Allow ourselves to be preempted in between blocks. */
|
||||
*cmd++ = MI_ARB_CHECK;
|
||||
|
||||
offset += size;
|
||||
rem -= size;
|
||||
} while (rem);
|
||||
|
||||
*cmd = MI_BATCH_BUFFER_END;
|
||||
|
||||
i915_gem_object_flush_map(pool->obj);
|
||||
i915_gem_object_unpin_map(pool->obj);
|
||||
|
||||
intel_gt_chipset_flush(ce->vm->gt);
|
||||
|
||||
batch->private = pool;
|
||||
return batch;
|
||||
|
||||
out_unpin:
|
||||
i915_vma_unpin(batch);
|
||||
out_put:
|
||||
intel_gt_buffer_pool_put(pool);
|
||||
out_pm:
|
||||
intel_engine_pm_put(ce->engine);
|
||||
return ERR_PTR(err);
|
||||
}
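The batch sizing at the top of intel_emit_vma_fill_blt() above is plain arithmetic: one block per 8 MiB of the target, eight dwords reserved per block (the blit command plus MI_ARB_CHECK), and one trailing dword for MI_BATCH_BUFFER_END, rounded up to a whole page. A minimal sketch of that calculation follows; the helper names and the 4 KiB page size are assumptions for illustration, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE        (8u << 20)  /* SZ_8M in the kernel code */
#define PAGE_BYTES        4096u       /* assumed page size */
#define DWORDS_PER_BLOCK  8u          /* blit dwords + MI_ARB_CHECK */

/* Number of blocks the fill is split into (rounds up). */
static uint64_t fill_blt_block_count(uint64_t vma_size)
{
	assert(vma_size > 0);
	return (vma_size + BLOCK_SIZE - 1) / BLOCK_SIZE;
}

/* Bytes requested from the buffer pool for the batch, page aligned. */
static uint64_t fill_blt_batch_bytes(uint64_t vma_size)
{
	uint64_t count = fill_blt_block_count(vma_size);
	/* +1 dword for the terminating MI_BATCH_BUFFER_END */
	uint64_t bytes = (1 + DWORDS_PER_BLOCK * count) * sizeof(uint32_t);

	return (bytes + PAGE_BYTES - 1) / PAGE_BYTES * PAGE_BYTES;
}
```

Under these assumptions a single page of batch covers up to 127 blocks, so only targets approaching 1 GiB need a second page.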
|
||||
|
||||
int intel_emit_vma_mark_active(struct i915_vma *vma, struct i915_request *rq)
|
||||
{
|
||||
int err;
|
||||
|
||||
err = i915_request_await_object(rq, vma->obj, false);
|
||||
if (err == 0)
|
||||
err = i915_vma_move_to_active(vma, rq, 0);
|
||||
if (unlikely(err))
|
||||
return err;
|
||||
|
||||
return intel_gt_buffer_pool_mark_active(vma->private, rq);
|
||||
}
|
||||
|
||||
void intel_emit_vma_release(struct intel_context *ce, struct i915_vma *vma)
|
||||
{
|
||||
i915_vma_unpin(vma);
|
||||
intel_gt_buffer_pool_put(vma->private);
|
||||
intel_engine_pm_put(ce->engine);
|
||||
}
|
||||
|
||||
static int
|
||||
move_obj_to_gpu(struct drm_i915_gem_object *obj,
|
||||
struct i915_request *rq,
|
||||
bool write)
|
||||
{
|
||||
if (obj->cache_dirty & ~obj->cache_coherent)
|
||||
i915_gem_clflush_object(obj, 0);
|
||||
|
||||
return i915_request_await_object(rq, obj, write);
|
||||
}
|
||||
|
||||
int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
|
||||
struct intel_context *ce,
|
||||
u32 value)
|
||||
{
|
||||
struct i915_gem_ww_ctx ww;
|
||||
struct i915_request *rq;
|
||||
struct i915_vma *batch;
|
||||
struct i915_vma *vma;
|
||||
int err;
|
||||
|
||||
vma = i915_vma_instance(obj, ce->vm, NULL);
|
||||
if (IS_ERR(vma))
|
||||
return PTR_ERR(vma);
|
||||
|
||||
i915_gem_ww_ctx_init(&ww, true);
|
||||
intel_engine_pm_get(ce->engine);
|
||||
retry:
|
||||
err = i915_gem_object_lock(obj, &ww);
|
||||
if (err)
|
||||
goto out;
|
||||
|
||||
err = intel_context_pin_ww(ce, &ww);
|
||||
if (err)
|
||||
goto out;
|
||||
|
||||
err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER);
|
||||
if (err)
|
||||
goto out_ctx;
|
||||
|
||||
batch = intel_emit_vma_fill_blt(ce, vma, &ww, value);
|
||||
if (IS_ERR(batch)) {
|
||||
err = PTR_ERR(batch);
|
||||
goto out_vma;
|
||||
}
|
||||
|
||||
rq = i915_request_create(ce);
|
||||
if (IS_ERR(rq)) {
|
||||
err = PTR_ERR(rq);
|
||||
goto out_batch;
|
||||
}
|
||||
|
||||
err = intel_emit_vma_mark_active(batch, rq);
|
||||
if (unlikely(err))
|
||||
goto out_request;
|
||||
|
||||
err = move_obj_to_gpu(vma->obj, rq, true);
|
||||
if (err == 0)
|
||||
err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
|
||||
if (unlikely(err))
|
||||
goto out_request;
|
||||
|
||||
if (ce->engine->emit_init_breadcrumb)
|
||||
err = ce->engine->emit_init_breadcrumb(rq);
|
||||
|
||||
if (likely(!err))
|
||||
err = ce->engine->emit_bb_start(rq,
|
||||
batch->node.start,
|
||||
batch->node.size,
|
||||
0);
|
||||
out_request:
|
||||
if (unlikely(err))
|
||||
i915_request_set_error_once(rq, err);
|
||||
|
||||
i915_request_add(rq);
|
||||
out_batch:
|
||||
intel_emit_vma_release(ce, batch);
|
||||
out_vma:
|
||||
i915_vma_unpin(vma);
|
||||
out_ctx:
|
||||
intel_context_unpin(ce);
|
||||
out:
|
||||
if (err == -EDEADLK) {
|
||||
err = i915_gem_ww_ctx_backoff(&ww);
|
||||
if (!err)
|
||||
goto retry;
|
||||
}
|
||||
i915_gem_ww_ctx_fini(&ww);
|
||||
intel_engine_pm_put(ce->engine);
|
||||
return err;
|
||||
}
|
||||
|
||||
/* Wa_1209644611:icl,ehl */
|
||||
static bool wa_1209644611_applies(struct drm_i915_private *i915, u32 size)
|
||||
{
|
||||
u32 height = size >> PAGE_SHIFT;
|
||||
|
||||
if (GRAPHICS_VER(i915) != 11)
|
||||
return false;
|
||||
|
||||
return height % 4 == 3 && height <= 8;
|
||||
}
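The predicate in wa_1209644611_applies() above selects a small set of copy heights on graphics version 11. A standalone sketch of the same test is shown here; the function name and the 4 KiB page shift are assumptions, and the graphics version is passed as a plain integer instead of being read from the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_ASSUMED 12  /* assumes 4 KiB pages */

/* Mirrors the check above: version 11 only, height in pages. */
static bool wa_applies(int graphics_ver, uint32_t size)
{
	uint32_t height = size >> PAGE_SHIFT_ASSUMED;

	if (graphics_ver != 11)
		return false;

	/* Only heights of 3 and 7 pages satisfy both conditions. */
	return height % 4 == 3 && height <= 8;
}
```

In the copy path shown further down in this file, a size that matches the workaround skips GEN9_XY_FAST_COPY_BLT_CMD and falls through to the XY_SRC_COPY_BLT_CMD branch.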
|
||||
|
||||
struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
|
||||
struct i915_gem_ww_ctx *ww,
|
||||
struct i915_vma *src,
|
||||
struct i915_vma *dst)
|
||||
{
|
||||
struct drm_i915_private *i915 = ce->vm->i915;
|
||||
const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
|
||||
struct intel_gt_buffer_pool_node *pool;
|
||||
struct i915_vma *batch;
|
||||
u64 src_offset, dst_offset;
|
||||
u64 count, rem;
|
||||
u32 size, *cmd;
|
||||
int err;
|
||||
|
||||
GEM_BUG_ON(src->size != dst->size);
|
||||
|
||||
GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
|
||||
intel_engine_pm_get(ce->engine);
|
||||
|
||||
count = div_u64(round_up(dst->size, block_size), block_size);
|
||||
size = (1 + 11 * count) * sizeof(u32);
|
||||
size = round_up(size, PAGE_SIZE);
|
||||
pool = intel_gt_get_buffer_pool(ce->engine->gt, size, I915_MAP_WC);
|
||||
if (IS_ERR(pool)) {
|
||||
err = PTR_ERR(pool);
|
||||
goto out_pm;
|
||||
}
|
||||
|
||||
err = i915_gem_object_lock(pool->obj, ww);
|
||||
if (err)
|
||||
goto out_put;
|
||||
|
||||
batch = i915_vma_instance(pool->obj, ce->vm, NULL);
|
||||
if (IS_ERR(batch)) {
|
||||
err = PTR_ERR(batch);
|
||||
goto out_put;
|
||||
}
|
||||
|
||||
err = i915_vma_pin_ww(batch, ww, 0, 0, PIN_USER);
|
||||
if (unlikely(err))
|
||||
goto out_put;
|
||||
|
||||
/* we pinned the pool, mark it as such */
|
||||
intel_gt_buffer_pool_mark_used(pool);
|
||||
|
||||
cmd = i915_gem_object_pin_map(pool->obj, pool->type);
|
||||
if (IS_ERR(cmd)) {
|
||||
err = PTR_ERR(cmd);
|
||||
goto out_unpin;
|
||||
}
|
||||
|
||||
rem = src->size;
|
||||
src_offset = src->node.start;
|
||||
dst_offset = dst->node.start;
|
||||
|
||||
do {
|
||||
size = min_t(u64, rem, block_size);
|
||||
GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
|
||||
|
||||
if (GRAPHICS_VER(i915) >= 9 &&
|
||||
!wa_1209644611_applies(i915, size)) {
|
||||
*cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
|
||||
*cmd++ = BLT_DEPTH_32 | PAGE_SIZE;
|
||||
*cmd++ = 0;
|
||||
*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
|
||||
*cmd++ = lower_32_bits(dst_offset);
|
||||
*cmd++ = upper_32_bits(dst_offset);
|
||||
*cmd++ = 0;
|
||||
*cmd++ = PAGE_SIZE;
|
||||
*cmd++ = lower_32_bits(src_offset);
|
||||
*cmd++ = upper_32_bits(src_offset);
|
||||
} else if (GRAPHICS_VER(i915) >= 8) {
|
||||
*cmd++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (10 - 2);
|
||||
*cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
|
||||
*cmd++ = 0;
|
||||
*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
|
||||
*cmd++ = lower_32_bits(dst_offset);
|
||||
*cmd++ = upper_32_bits(dst_offset);
|
||||
*cmd++ = 0;
|
||||
*cmd++ = PAGE_SIZE;
|
||||
*cmd++ = lower_32_bits(src_offset);
|
||||
*cmd++ = upper_32_bits(src_offset);
|
||||
} else {
|
||||
*cmd++ = SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
|
||||
*cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
|
||||
*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE;
|
||||
*cmd++ = dst_offset;
|
||||
*cmd++ = PAGE_SIZE;
|
||||
*cmd++ = src_offset;
|
||||
}
|
||||
|
||||
/* Allow ourselves to be preempted in between blocks. */
|
||||
*cmd++ = MI_ARB_CHECK;
|
||||
|
||||
src_offset += size;
|
||||
dst_offset += size;
|
||||
rem -= size;
|
||||
} while (rem);
|
||||
|
||||
*cmd = MI_BATCH_BUFFER_END;
|
||||
|
||||
i915_gem_object_flush_map(pool->obj);
|
||||
i915_gem_object_unpin_map(pool->obj);
|
||||
|
||||
intel_gt_chipset_flush(ce->vm->gt);
|
||||
batch->private = pool;
|
||||
return batch;
|
||||
|
||||
out_unpin:
|
||||
i915_vma_unpin(batch);
|
||||
out_put:
|
||||
intel_gt_buffer_pool_put(pool);
|
||||
out_pm:
|
||||
intel_engine_pm_put(ce->engine);
|
||||
return ERR_PTR(err);
|
||||
}
|
||||
|
||||
int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
|
||||
struct drm_i915_gem_object *dst,
|
||||
struct intel_context *ce)
|
||||
{
|
||||
struct i915_address_space *vm = ce->vm;
|
||||
struct i915_vma *vma[2], *batch;
|
||||
struct i915_gem_ww_ctx ww;
|
||||
struct i915_request *rq;
|
||||
int err, i;
|
||||
|
||||
vma[0] = i915_vma_instance(src, vm, NULL);
|
||||
if (IS_ERR(vma[0]))
|
||||
return PTR_ERR(vma[0]);
|
||||
|
||||
vma[1] = i915_vma_instance(dst, vm, NULL);
|
||||
if (IS_ERR(vma[1]))
|
||||
return PTR_ERR(vma[1]);
|
||||
|
||||
i915_gem_ww_ctx_init(&ww, true);
|
||||
intel_engine_pm_get(ce->engine);
|
||||
retry:
|
||||
err = i915_gem_object_lock(src, &ww);
|
||||
if (!err)
|
||||
err = i915_gem_object_lock(dst, &ww);
|
||||
if (!err)
|
||||
err = intel_context_pin_ww(ce, &ww);
|
||||
if (err)
|
||||
goto out;
|
||||
|
||||
err = i915_vma_pin_ww(vma[0], &ww, 0, 0, PIN_USER);
|
||||
if (err)
|
||||
goto out_ctx;
|
||||
|
||||
err = i915_vma_pin_ww(vma[1], &ww, 0, 0, PIN_USER);
|
||||
if (unlikely(err))
|
||||
goto out_unpin_src;
|
||||
|
||||
batch = intel_emit_vma_copy_blt(ce, &ww, vma[0], vma[1]);
|
||||
if (IS_ERR(batch)) {
|
||||
err = PTR_ERR(batch);
|
||||
goto out_unpin_dst;
|
||||
}
|
||||
|
||||
rq = i915_request_create(ce);
|
||||
if (IS_ERR(rq)) {
|
||||
err = PTR_ERR(rq);
|
||||
goto out_batch;
|
||||
}
|
||||
|
||||
err = intel_emit_vma_mark_active(batch, rq);
|
||||
if (unlikely(err))
|
||||
goto out_request;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(vma); i++) {
|
||||
err = move_obj_to_gpu(vma[i]->obj, rq, i);
|
||||
if (unlikely(err))
|
||||
goto out_request;
|
||||
}
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(vma); i++) {
|
||||
unsigned int flags = i ? EXEC_OBJECT_WRITE : 0;
|
||||
|
||||
err = i915_vma_move_to_active(vma[i], rq, flags);
|
||||
if (unlikely(err))
|
||||
goto out_request;
|
||||
}
|
||||
|
||||
if (rq->engine->emit_init_breadcrumb) {
|
||||
err = rq->engine->emit_init_breadcrumb(rq);
|
||||
if (unlikely(err))
|
||||
goto out_request;
|
||||
}
|
||||
|
||||
err = rq->engine->emit_bb_start(rq,
|
||||
batch->node.start, batch->node.size,
|
||||
0);
|
||||
|
||||
out_request:
|
||||
if (unlikely(err))
|
||||
i915_request_set_error_once(rq, err);
|
||||
|
||||
i915_request_add(rq);
|
||||
out_batch:
|
||||
intel_emit_vma_release(ce, batch);
|
||||
out_unpin_dst:
|
||||
i915_vma_unpin(vma[1]);
|
||||
out_unpin_src:
|
||||
i915_vma_unpin(vma[0]);
|
||||
out_ctx:
|
||||
intel_context_unpin(ce);
|
||||
out:
|
||||
if (err == -EDEADLK) {
|
||||
err = i915_gem_ww_ctx_backoff(&ww);
|
||||
if (!err)
|
||||
goto retry;
|
||||
}
|
||||
i915_gem_ww_ctx_fini(&ww);
|
||||
intel_engine_pm_put(ce->engine);
|
||||
return err;
|
||||
}
|
||||
|
||||
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
|
||||
#include "selftests/i915_gem_object_blt.c"
|
||||
#endif
|
|
@@ -1,39 +0,0 @@
|
|||
/* SPDX-License-Identifier: MIT */
|
||||
/*
|
||||
* Copyright © 2019 Intel Corporation
|
||||
*/
|
||||
|
||||
#ifndef __I915_GEM_OBJECT_BLT_H__
|
||||
#define __I915_GEM_OBJECT_BLT_H__
|
||||
|
||||
#include <linux/types.h>
|
||||
|
||||
#include "gt/intel_context.h"
|
||||
#include "gt/intel_engine_pm.h"
|
||||
#include "i915_vma.h"
|
||||
|
||||
struct drm_i915_gem_object;
|
||||
struct i915_gem_ww_ctx;
|
||||
|
||||
struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
|
||||
struct i915_vma *vma,
|
||||
struct i915_gem_ww_ctx *ww,
|
||||
u32 value);
|
||||
|
||||
struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
|
||||
struct i915_gem_ww_ctx *ww,
|
||||
struct i915_vma *src,
|
||||
struct i915_vma *dst);
|
||||
|
||||
int intel_emit_vma_mark_active(struct i915_vma *vma, struct i915_request *rq);
|
||||
void intel_emit_vma_release(struct intel_context *ce, struct i915_vma *vma);
|
||||
|
||||
int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
|
||||
struct intel_context *ce,
|
||||
u32 value);
|
||||
|
||||
int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
|
||||
struct drm_i915_gem_object *dst,
|
||||
struct intel_context *ce);
|
||||
|
||||
#endif
|
|
@@ -18,6 +18,7 @@
|
|||
|
||||
struct drm_i915_gem_object;
|
||||
struct intel_fronbuffer;
|
||||
struct intel_memory_region;
|
||||
|
||||
/*
|
||||
* struct i915_lut_handle tracks the fast lookups from handle to vma used
|
||||
|
@@ -33,10 +34,9 @@ struct i915_lut_handle {
|
|||
|
||||
struct drm_i915_gem_object_ops {
|
||||
unsigned int flags;
|
||||
#define I915_GEM_OBJECT_HAS_IOMEM BIT(1)
|
||||
#define I915_GEM_OBJECT_IS_SHRINKABLE BIT(2)
|
||||
#define I915_GEM_OBJECT_IS_PROXY BIT(3)
|
||||
#define I915_GEM_OBJECT_NO_MMAP BIT(4)
|
||||
#define I915_GEM_OBJECT_IS_SHRINKABLE BIT(1)
|
||||
#define I915_GEM_OBJECT_IS_PROXY BIT(2)
|
||||
#define I915_GEM_OBJECT_NO_MMAP BIT(3)
|
||||
|
||||
/* Interface between the GEM object and its backing storage.
|
||||
* get_pages() is called once prior to the use of the associated set
|
||||
|
@@ -78,12 +78,100 @@ struct drm_i915_gem_object_ops {
|
|||
* delayed_free - Override the default delayed free implementation
|
||||
*/
|
||||
void (*delayed_free)(struct drm_i915_gem_object *obj);
|
||||
|
||||
/**
|
||||
* migrate - Migrate object to a different region either for
|
||||
* pinning or for as long as the object lock is held.
|
||||
*/
|
||||
int (*migrate)(struct drm_i915_gem_object *obj,
|
||||
struct intel_memory_region *mr);
|
||||
|
||||
void (*release)(struct drm_i915_gem_object *obj);
|
||||
|
||||
const struct vm_operations_struct *mmap_ops;
|
||||
const char *name; /* friendly name for debug, e.g. lockdep classes */
|
||||
};
|
||||
|
||||
/**
|
||||
* enum i915_cache_level - The supported GTT caching values for system memory
|
||||
* pages.
|
||||
*
|
||||
* These translate to some special GTT PTE bits when binding pages into some
|
||||
* address space. It also determines whether an object, or rather its pages are
|
||||
* coherent with the GPU, when also reading or writing through the CPU cache
|
||||
* with those pages.
|
||||
*
|
||||
* Userspace can also control this through struct drm_i915_gem_caching.
|
||||
*/
|
||||
enum i915_cache_level {
|
||||
/**
|
||||
* @I915_CACHE_NONE:
|
||||
*
|
||||
* GPU access is not coherent with the CPU cache. If the cache is dirty
|
||||
* and we need the underlying pages to be coherent with some later GPU
|
||||
* access then we need to manually flush the pages.
|
||||
*
|
||||
* On shared LLC platforms reads and writes through the CPU cache are
|
||||
* still coherent even with this setting. See also
|
||||
* &drm_i915_gem_object.cache_coherent for more details. Due to this we
|
||||
* should only ever use uncached for scanout surfaces, otherwise we end
|
||||
* up over-flushing in some places.
|
||||
*
|
||||
* This is the default on non-LLC platforms.
|
||||
*/
|
||||
I915_CACHE_NONE = 0,
|
||||
/**
|
||||
* @I915_CACHE_LLC:
|
||||
*
|
||||
* GPU access is coherent with the CPU cache. If the cache is dirty,
|
||||
* then the GPU will ensure that access remains coherent, when both
|
||||
* reading and writing through the CPU cache. GPU writes can dirty the
|
||||
* CPU cache.
|
||||
*
|
||||
* Not used for scanout surfaces.
|
||||
*
|
||||
* Applies to both platforms with shared LLC(HAS_LLC), and snooping
|
||||
* based platforms(HAS_SNOOP).
|
||||
*
|
||||
* This is the default on shared LLC platforms. The only exception is
|
||||
* scanout objects, where the display engine is not coherent with the
|
||||
* CPU cache. For such objects I915_CACHE_NONE or I915_CACHE_WT is
|
||||
* automatically applied by the kernel in pin_for_display, if userspace
|
||||
* has not done so already.
|
||||
*/
|
||||
I915_CACHE_LLC,
|
||||
/**
|
||||
* @I915_CACHE_L3_LLC:
|
||||
*
|
||||
* Explicitly enable the Gfx L3 cache, with coherent LLC.
|
||||
*
|
||||
* The Gfx L3 sits between the domain specific caches, e.g.
|
||||
* sampler/render caches, and the larger LLC. LLC is coherent with the
|
||||
* GPU, but L3 is only visible to the GPU, so likely needs to be flushed
|
||||
* when the workload completes.
|
||||
*
|
||||
* Not used for scanout surfaces.
|
||||
*
|
||||
* Only exposed on some gen7 + GGTT. More recent hardware has dropped
|
||||
* this explicit setting, where it should now be enabled by default.
|
||||
*/
|
||||
I915_CACHE_L3_LLC,
|
||||
/**
|
||||
* @I915_CACHE_WT:
|
||||
*
|
||||
* Write-through. Used for scanout surfaces.
|
||||
*
|
||||
* The GPU can utilise the caches, while still having the display engine
|
||||
* be coherent with GPU writes, as a result we don't need to flush the
|
||||
* CPU caches when moving out of the render domain. This is the default
|
||||
* setting chosen by the kernel, if supported by the HW, otherwise we
|
||||
* fall back to I915_CACHE_NONE. On the CPU side writes through the CPU
|
||||
* cache still need to be flushed, to remain coherent with the display
|
||||
* engine.
|
||||
*/
|
||||
I915_CACHE_WT,
|
||||
};
|
||||
|
||||
enum i915_map_type {
|
||||
I915_MAP_WB = 0,
|
||||
I915_MAP_WC,
|
||||
|
@ -97,6 +185,7 @@ enum i915_mmap_type {
|
|||
I915_MMAP_TYPE_WC,
|
||||
I915_MMAP_TYPE_WB,
|
||||
I915_MMAP_TYPE_UC,
|
||||
I915_MMAP_TYPE_FIXED,
|
||||
};
|
||||
|
||||
struct i915_mmap_offset {
|
||||
|
@ -201,25 +290,138 @@ struct drm_i915_gem_object {
|
|||
unsigned long flags;
|
||||
#define I915_BO_ALLOC_CONTIGUOUS BIT(0)
|
||||
#define I915_BO_ALLOC_VOLATILE BIT(1)
|
||||
#define I915_BO_ALLOC_STRUCT_PAGE BIT(2)
|
||||
#define I915_BO_ALLOC_CPU_CLEAR BIT(3)
|
||||
#define I915_BO_ALLOC_USER BIT(4)
|
||||
#define I915_BO_ALLOC_CPU_CLEAR BIT(2)
|
||||
#define I915_BO_ALLOC_USER BIT(3)
|
||||
#define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS | \
|
||||
I915_BO_ALLOC_VOLATILE | \
|
||||
I915_BO_ALLOC_STRUCT_PAGE | \
|
||||
I915_BO_ALLOC_CPU_CLEAR | \
|
||||
I915_BO_ALLOC_USER)
|
||||
#define I915_BO_READONLY BIT(5)
|
||||
#define I915_TILING_QUIRK_BIT 6 /* unknown swizzling; do not release! */
|
||||
#define I915_BO_READONLY BIT(4)
|
||||
#define I915_TILING_QUIRK_BIT 5 /* unknown swizzling; do not release! */
|
||||
|
||||
/*
|
||||
* Is the object to be mapped as read-only to the GPU
|
||||
* Only honoured if hardware has relevant pte bit
|
||||
/**
|
||||
* @mem_flags - Mutable placement-related flags
|
||||
*
|
||||
* These are flags that indicate specifics of the memory region
|
||||
* the object is currently in. As such they are only stable
|
||||
* either under the object lock or if the object is pinned.
|
||||
*/
|
||||
unsigned int mem_flags;
|
||||
#define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */
|
||||
#define I915_BO_FLAG_IOMEM BIT(1) /* Object backed by IO memory */
|
||||
/**
|
||||
* @cache_level: The desired GTT caching level.
|
||||
*
|
||||
* See enum i915_cache_level for possible values, along with what
|
||||
* each does.
|
||||
*/
|
||||
unsigned int cache_level:3;
|
||||
unsigned int cache_coherent:2;
|
||||
/**
|
||||
* @cache_coherent:
|
||||
*
|
||||
* Track whether the pages are coherent with the GPU if reading or
|
||||
* writing through the CPU caches. This largely depends on the
|
||||
* @cache_level setting.
|
||||
*
|
||||
* On platforms which don't have the shared LLC(HAS_SNOOP), like on Atom
|
||||
* platforms, coherency must be explicitly requested with some special
|
||||
* GTT caching bits(see enum i915_cache_level). When enabling coherency
|
||||
* it does come at a performance and power cost on such platforms. On
|
||||
* the flip side the kernel does not need to manually flush any buffers
|
||||
* which need to be coherent with the GPU, if the object is not coherent
|
||||
* i.e @cache_coherent is zero.
|
||||
*
|
||||
* On platforms that share the LLC with the CPU(HAS_LLC), all GT memory
|
||||
* access will automatically snoop the CPU caches(even with CACHE_NONE).
|
||||
* The one exception is when dealing with the display engine, like with
|
||||
* scanout surfaces. To handle this the kernel will always flush the
|
||||
* surface out of the CPU caches when preparing it for scanout. Also
|
||||
* note that since scanout surfaces are only ever read by the display
|
||||
* engine we only need to care about flushing any writes through the CPU
|
||||
* cache, reads on the other hand will always be coherent.
|
||||
*
|
||||
* Something strange here is why @cache_coherent is not a simple
|
||||
* boolean, i.e coherent vs non-coherent. The reasoning for this is back
|
||||
* to the display engine not being fully coherent. As a result scanout
|
||||
* surfaces will either be marked as I915_CACHE_NONE or I915_CACHE_WT.
|
||||
* In the case of seeing I915_CACHE_NONE the kernel makes the assumption
|
||||
* that this is likely a scanout surface, and will set @cache_coherent
|
||||
* as only I915_BO_CACHE_COHERENT_FOR_READ, on platforms with the shared
|
||||
* LLC. The kernel uses this to always flush writes through the CPU
|
||||
* cache as early as possible, where it can, in effect keeping
|
||||
* @cache_dirty clean, so we can potentially avoid stalling when
|
||||
* flushing the surface just before doing the scanout. This does mean
|
||||
* we might unnecessarily flush non-scanout objects in some places, but
|
||||
* the default assumption is that all normal objects should be using
|
||||
* I915_CACHE_LLC, at least on platforms with the shared LLC.
|
||||
*
|
||||
* Supported values:
|
||||
*
|
||||
* I915_BO_CACHE_COHERENT_FOR_READ:
|
||||
*
|
||||
* On shared LLC platforms, we use this for special scanout surfaces,
|
||||
* where the display engine is not coherent with the CPU cache. As such
|
||||
* we need to ensure we flush any writes before doing the scanout. As an
|
||||
* optimisation we try to flush any writes as early as possible to avoid
|
||||
* stalling later.
|
||||
*
|
||||
* Thus for scanout surfaces using I915_CACHE_NONE, on shared LLC
|
||||
* platforms, we use:
|
||||
*
|
||||
* cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ
|
||||
*
|
||||
* While for normal objects that are fully coherent, including special
|
||||
* scanout surfaces marked as I915_CACHE_WT, we use:
|
||||
*
|
||||
* cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ |
|
||||
* I915_BO_CACHE_COHERENT_FOR_WRITE
|
||||
*
|
||||
* And then for objects that are not coherent at all we use:
|
||||
*
|
||||
* cache_coherent = 0
|
||||
*
|
||||
* I915_BO_CACHE_COHERENT_FOR_WRITE:
|
||||
*
|
||||
* When writing through the CPU cache, the GPU is still coherent. Note
|
||||
* that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
|
||||
*/
|
||||
#define I915_BO_CACHE_COHERENT_FOR_READ BIT(0)
|
||||
#define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1)
|
||||
unsigned int cache_coherent:2;
|
||||
|
||||
/**
|
||||
* @cache_dirty:
|
||||
*
|
||||
* Track if we are dirty with writes through the CPU cache for this
|
||||
* object. As a result reading directly from main memory might yield
|
||||
* stale data.
|
||||
*
|
||||
* This also ties into whether the kernel is tracking the object as
|
||||
* coherent with the GPU, as per @cache_coherent, as it determines if
|
||||
* flushing might be needed at various points.
|
||||
*
|
||||
* Another part of @cache_dirty is managing flushing when first
|
||||
* acquiring the pages for system memory, at this point the pages are
|
||||
* considered foreign, so the default assumption is that the cache is
|
||||
* dirty, for example the page zeroing done by the kernel might leave
|
||||
* writes through the CPU cache, or swapping-in, while the actual data in
|
||||
* main memory is potentially stale. Note that this is a potential
|
||||
* security issue when dealing with userspace objects and zeroing. Now,
|
||||
* whether we actually need to apply the big sledgehammer of flushing all
|
||||
* the pages on acquire depends on if @cache_coherent is marked as
|
||||
* I915_BO_CACHE_COHERENT_FOR_WRITE, i.e that the GPU will be coherent
|
||||
* for both reads and writes through the CPU cache.
|
||||
*
|
||||
* Note that on shared LLC platforms we still apply the heavy flush for
|
||||
* I915_CACHE_NONE objects, under the assumption that this is going to
|
||||
* be used for scanout.
|
||||
*
|
||||
* Update: On some hardware there is now also the 'Bypass LLC' MOCS
|
||||
* entry, which defeats our @cache_coherent tracking, since userspace
|
||||
* can freely bypass the CPU cache when touching the pages with the GPU,
|
||||
* where the kernel is completely unaware. On such platforms we need to
|
||||
* apply the sledgehammer-on-acquire regardless of the @cache_coherent.
|
||||
*/
|
||||
unsigned int cache_dirty:1;
|
||||
|
||||
/**
|
||||
|
@ -265,9 +467,10 @@ struct drm_i915_gem_object {
|
|||
struct intel_memory_region *region;
|
||||
|
||||
/**
|
||||
* Memory manager node allocated for this object.
|
||||
* Memory manager resource allocated for this object. Only
|
||||
* needed for the mock region.
|
||||
*/
|
||||
void *st_mm_node;
|
||||
struct ttm_resource *res;
|
||||
|
||||
/**
|
||||
* Element within memory_region->objects or region->purgeable
|
||||
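The @cache_coherent and @cache_dirty comments above describe a small decision table. The following is a minimal standalone sketch of that table, not the driver code: the enum, macro and function names are hypothetical stand-ins, and treating every level other than "none" as fully coherent is an assumption drawn from the comment text. It also ignores the 'Bypass LLC' exception mentioned at the end of the @cache_dirty comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel definitions shown in the diff. */
enum cache_level { CACHE_NONE = 0, CACHE_LLC, CACHE_L3_LLC, CACHE_WT };

#define COHERENT_FOR_READ  (1u << 0)
#define COHERENT_FOR_WRITE (1u << 1)

/*
 * Sketch of the mapping described in the @cache_coherent comment:
 *  - any level other than NONE: coherent for reads and writes (assumption)
 *  - NONE on a shared-LLC platform: coherent for reads only
 *  - NONE elsewhere: not coherent at all
 */
static unsigned int coherency_bits(enum cache_level level, bool has_llc)
{
	if (level != CACHE_NONE)
		return COHERENT_FOR_READ | COHERENT_FOR_WRITE;
	if (has_llc)
		return COHERENT_FOR_READ;
	return 0;
}

/*
 * Per the @cache_dirty comment: the heavy flush on first acquire is only
 * skipped when the object is coherent for writes through the CPU cache.
 */
static bool needs_flush_on_acquire(unsigned int bits)
{
	return !(bits & COHERENT_FOR_WRITE);
}
```

With this table, an uncached object on a shared-LLC platform is read-coherent only, so it still takes the flush on acquire, which matches the "assume it is a scanout surface" reasoning in the comment.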
|
|
|
@ -321,8 +321,7 @@ static void *i915_gem_object_map_pfn(struct drm_i915_gem_object *obj,
|
|||
dma_addr_t addr;
|
||||
void *vaddr;
|
||||
|
||||
if (type != I915_MAP_WC)
|
||||
return ERR_PTR(-ENODEV);
|
||||
GEM_BUG_ON(type != I915_MAP_WC);
|
||||
|
||||
if (n_pfn > ARRAY_SIZE(stack)) {
|
||||
/* Too big for stack -- allocate temporary array instead */
|
||||
|
@ -351,7 +350,7 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
|
|||
int err;
|
||||
|
||||
if (!i915_gem_object_has_struct_page(obj) &&
|
||||
!i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM))
|
||||
!i915_gem_object_has_iomem(obj))
|
||||
return ERR_PTR(-ENXIO);
|
||||
|
||||
assert_object_held(obj);
|
||||
|
@ -374,6 +373,34 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
|
|||
}
|
||||
GEM_BUG_ON(!i915_gem_object_has_pages(obj));
|
||||
|
||||
/*
|
||||
* For discrete, our CPU mappings need to be consistent in order to
|
||||
* function correctly on !x86. When mapping things through TTM, we use
|
||||
* the same rules to determine the caching type.
|
||||
*
|
||||
* The caching rules, starting from DG1:
|
||||
*
|
||||
* - If the object can be placed in device local-memory, then the
|
||||
* pages should be allocated and mapped as write-combined only.
|
||||
*
|
||||
* - Everything else is always allocated and mapped as write-back,
|
||||
* with the guarantee that everything is also coherent with the
|
||||
* GPU.
|
||||
*
|
||||
* Internal users of lmem are already expected to get this right, so no
|
||||
* fudging needed there.
|
||||
*/
|
||||
if (i915_gem_object_placement_possible(obj, INTEL_MEMORY_LOCAL)) {
|
||||
if (type != I915_MAP_WC && !obj->mm.n_placements) {
|
||||
ptr = ERR_PTR(-ENODEV);
|
||||
goto err_unpin;
|
||||
}
|
||||
|
||||
type = I915_MAP_WC;
|
||||
} else if (IS_DGFX(to_i915(obj->base.dev))) {
|
||||
type = I915_MAP_WB;
|
||||
}
|
||||
|
||||
ptr = page_unpack_bits(obj->mm.mapping, &has_type);
|
||||
if (ptr && has_type != type) {
|
||||
if (pinned) {
|
||||
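The caching rules added to i915_gem_object_pin_map() in the hunk above can be read as a pure function of three inputs. The sketch below restates them in standalone C under stated assumptions: `struct obj_info`, `resolve_map_type` and the `MAP_*` names are hypothetical, and reading `n_placements == 0` as "kernel-internal object" is an interpretation of the comment about internal lmem users, not something the diff states outright.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum map_type { MAP_WB = 0, MAP_WC };

/* Hypothetical, flattened view of the object state the hunk inspects. */
struct obj_info {
	bool lmem_possible;        /* object may be placed in device local memory */
	bool is_dgfx;              /* running on a discrete device */
	unsigned int n_placements; /* assumed 0 for kernel-internal objects */
};

/*
 * Returns 0 and stores the effective mapping type in *out, or -ENODEV when
 * an object without a placement list asks for a non-WC mapping of memory
 * that can live in lmem.
 */
static int resolve_map_type(const struct obj_info *o, enum map_type requested,
			    enum map_type *out)
{
	if (o->lmem_possible) {
		if (requested != MAP_WC && !o->n_placements)
			return -ENODEV;
		*out = MAP_WC;       /* lmem-capable: write-combined only */
	} else if (o->is_dgfx) {
		*out = MAP_WB;       /* system-only on discrete: write-back */
	} else {
		*out = requested;    /* integrated: caller's choice is kept */
	}
	return 0;
}
```

The point of the override is that callers on discrete hardware cannot end up with a CPU mapping whose caching disagrees with how the pages were allocated.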
|
@ -467,7 +494,7 @@ __i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
|
|||
struct i915_gem_object_page_iter *iter,
|
||||
unsigned int n,
|
||||
unsigned int *offset,
|
||||
bool allow_alloc, bool dma)
|
||||
bool dma)
|
||||
{
|
||||
struct scatterlist *sg;
|
||||
unsigned int idx, count;
|
||||
|
@ -489,9 +516,6 @@ __i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
|
|||
if (n < READ_ONCE(iter->sg_idx))
|
||||
goto lookup;
|
||||
|
||||
if (!allow_alloc)
|
||||
goto manual_lookup;
|
||||
|
||||
mutex_lock(&iter->lock);
|
||||
|
||||
/* We prefer to reuse the last sg so that repeated lookup of this
|
||||
|
@ -541,16 +565,7 @@ scan:
|
|||
if (unlikely(n < idx)) /* insertion completed by another thread */
|
||||
goto lookup;
|
||||
|
||||
goto manual_walk;
|
||||
|
||||
manual_lookup:
|
||||
idx = 0;
|
||||
sg = obj->mm.pages->sgl;
|
||||
count = __sg_page_count(sg);
|
||||
|
||||
manual_walk:
|
||||
/*
|
||||
* In case we failed to insert the entry into the radixtree, we need
|
||||
/* In case we failed to insert the entry into the radixtree, we need
|
||||
* to look beyond the current sg.
|
||||
*/
|
||||
while (idx + count <= n) {
|
||||
|
@ -597,7 +612,7 @@ i915_gem_object_get_page(struct drm_i915_gem_object *obj, unsigned int n)
|
|||
|
||||
GEM_BUG_ON(!i915_gem_object_has_struct_page(obj));
|
||||
|
||||
sg = i915_gem_object_get_sg(obj, n, &offset, true);
|
||||
sg = i915_gem_object_get_sg(obj, n, &offset);
|
||||
return nth_page(sg_page(sg), offset);
|
||||
}
|
||||
|
||||
|
@ -623,7 +638,7 @@ i915_gem_object_get_dma_address_len(struct drm_i915_gem_object *obj,
|
|||
struct scatterlist *sg;
|
||||
unsigned int offset;
|
||||
|
||||
sg = i915_gem_object_get_sg_dma(obj, n, &offset, true);
|
||||
sg = i915_gem_object_get_sg_dma(obj, n, &offset);
|
||||
|
||||
if (len)
|
||||
*len = sg_dma_len(sg) - (offset << PAGE_SHIFT);
|
||||
|
|
|
@ -76,7 +76,7 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
|
|||
intel_gt_chipset_flush(&to_i915(obj->base.dev)->gt);
|
||||
|
||||
/* We're no longer struct page backed */
|
||||
obj->flags &= ~I915_BO_ALLOC_STRUCT_PAGE;
|
||||
obj->mem_flags &= ~I915_BO_FLAG_STRUCT_PAGE;
|
||||
__i915_gem_object_set_pages(obj, st, sg->length);
|
||||
|
||||
return 0;
|
||||
|
|
|
@ -13,11 +13,7 @@ void i915_gem_object_init_memory_region(struct drm_i915_gem_object *obj,
|
|||
{
|
||||
obj->mm.region = intel_memory_region_get(mem);
|
||||
|
||||
if (obj->base.size <= mem->min_page_size)
|
||||
obj->flags |= I915_BO_ALLOC_CONTIGUOUS;
|
||||
|
||||
mutex_lock(&mem->objects.lock);
|
||||
|
||||
list_add(&obj->mm.region_link, &mem->objects.list);
|
||||
mutex_unlock(&mem->objects.lock);
|
||||
}
|
||||
|
@ -36,9 +32,11 @@ void i915_gem_object_release_memory_region(struct drm_i915_gem_object *obj)
|
|||
struct drm_i915_gem_object *
|
||||
i915_gem_object_create_region(struct intel_memory_region *mem,
|
||||
resource_size_t size,
|
||||
resource_size_t page_size,
|
||||
unsigned int flags)
|
||||
{
|
||||
struct drm_i915_gem_object *obj;
|
||||
resource_size_t default_page_size;
|
||||
int err;
|
||||
|
||||
/*
|
||||
|
@ -52,7 +50,14 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
|
|||
if (!mem)
|
||||
return ERR_PTR(-ENODEV);
|
||||
|
||||
size = round_up(size, mem->min_page_size);
|
||||
default_page_size = mem->min_page_size;
|
||||
if (page_size)
|
||||
default_page_size = page_size;
|
||||
|
||||
GEM_BUG_ON(!is_power_of_2_u64(default_page_size));
|
||||
GEM_BUG_ON(default_page_size < PAGE_SIZE);
|
||||
|
||||
size = round_up(size, default_page_size);
|
||||
|
||||
GEM_BUG_ON(!size);
|
||||
GEM_BUG_ON(!IS_ALIGNED(size, I915_GTT_MIN_ALIGNMENT));
|
||||
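The new page_size parameter changes how the object size is rounded in i915_gem_object_create_region(). A minimal sketch of that arithmetic follows; the helper names and the 4096-byte `SKETCH_PAGE_SIZE` are assumptions made for illustration, and the kernel's GEM_BUG_ON checks are approximated with assert().

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed CPU page size for this sketch */

static int is_pow2(uint64_t v)
{
	return v && !(v & (v - 1));
}

/* Round size up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t size, uint64_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/*
 * Mirrors the hunk: a caller-supplied page_size overrides the region's
 * minimum page size, and the object size is rounded up to that value.
 */
static uint64_t region_object_size(uint64_t size, uint64_t page_size,
				   uint64_t min_page_size)
{
	uint64_t default_page_size = page_size ? page_size : min_page_size;

	assert(is_pow2(default_page_size));
	assert(default_page_size >= SKETCH_PAGE_SIZE);

	return round_up_pow2(size, default_page_size);
}
```

Passing 0 for page_size, as the shmem and stolen callers in this diff do, keeps the previous behaviour of rounding to the region minimum.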
|
@ -64,7 +69,7 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
|
|||
if (!obj)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
err = mem->ops->init_object(mem, obj, size, flags);
|
||||
err = mem->ops->init_object(mem, obj, size, page_size, flags);
|
||||
if (err)
|
||||
goto err_object_free;
|
||||
|
||||
|
|
|
@ -19,6 +19,7 @@ void i915_gem_object_release_memory_region(struct drm_i915_gem_object *obj);
|
|||
struct drm_i915_gem_object *
|
||||
i915_gem_object_create_region(struct intel_memory_region *mem,
|
||||
resource_size_t size,
|
||||
resource_size_t page_size,
|
||||
unsigned int flags);
|
||||
|
||||
#endif
|
||||
|
|
|
@ -182,6 +182,24 @@ rebuild_st:
|
|||
if (i915_gem_object_needs_bit17_swizzle(obj))
|
||||
i915_gem_object_do_bit_17_swizzle(obj, st);
|
||||
|
||||
/*
|
||||
* EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
|
||||
* possible for userspace to bypass the GTT caching bits set by the
|
||||
* kernel, as per the given object cache_level. This is troublesome
|
||||
* since the heavy flush we apply when first gathering the pages is
|
||||
* skipped if the kernel thinks the object is coherent with the GPU. As
|
||||
* a result it might be possible to bypass the cache and read the
|
||||
* contents of the page directly, which could be stale data. If it's
|
||||
* just a case of userspace shooting themselves in the foot then so be
|
||||
* it, but since i915 takes the stance of always zeroing memory before
|
||||
* handing it to userspace, we need to prevent this.
|
||||
*
|
||||
* By setting cache_dirty here we make the clflush in set_pages
|
||||
* unconditional on such platforms.
|
||||
*/
|
||||
if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER)
|
||||
obj->cache_dirty = true;
|
||||
|
||||
__i915_gem_object_set_pages(obj, st, sg_page_sizes);
|
||||
|
||||
return 0;
|
||||
|
@ -302,6 +320,7 @@ void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_
|
|||
struct pagevec pvec;
|
||||
struct page *page;
|
||||
|
||||
GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev)));
|
||||
__i915_gem_object_release_shmem(obj, pages, true);
|
||||
|
||||
i915_gem_gtt_finish_pages(obj, pages);
|
||||
|
@ -444,7 +463,7 @@ shmem_pread(struct drm_i915_gem_object *obj,
|
|||
|
||||
static void shmem_release(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
if (obj->flags & I915_BO_ALLOC_STRUCT_PAGE)
|
||||
if (i915_gem_object_has_struct_page(obj))
|
||||
i915_gem_object_release_memory_region(obj);
|
||||
|
||||
fput(obj->base.filp);
|
||||
|
@ -489,6 +508,7 @@ static int __create_shmem(struct drm_i915_private *i915,
|
|||
static int shmem_object_init(struct intel_memory_region *mem,
|
||||
struct drm_i915_gem_object *obj,
|
||||
resource_size_t size,
|
||||
resource_size_t page_size,
|
||||
unsigned int flags)
|
||||
{
|
||||
static struct lock_class_key lock_class;
|
||||
|
@ -513,9 +533,8 @@ static int shmem_object_init(struct intel_memory_region *mem,
|
|||
mapping_set_gfp_mask(mapping, mask);
|
||||
GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));
|
||||
|
||||
i915_gem_object_init(obj, &i915_gem_shmem_ops, &lock_class,
|
||||
I915_BO_ALLOC_STRUCT_PAGE);
|
||||
|
||||
i915_gem_object_init(obj, &i915_gem_shmem_ops, &lock_class, 0);
|
||||
obj->mem_flags |= I915_BO_FLAG_STRUCT_PAGE;
|
||||
obj->write_domain = I915_GEM_DOMAIN_CPU;
|
||||
obj->read_domains = I915_GEM_DOMAIN_CPU;
|
||||
|
||||
|
@ -548,7 +567,7 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915,
|
|||
resource_size_t size)
|
||||
{
|
||||
return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_SMEM],
|
||||
size, 0);
|
||||
size, 0, 0);
|
||||
}
|
||||
|
||||
/* Allocate a new GEM object and fill it with the supplied data */
|
||||
|
@ -561,6 +580,7 @@ i915_gem_object_create_shmem_from_data(struct drm_i915_private *dev_priv,
|
|||
resource_size_t offset;
|
||||
int err;
|
||||
|
||||
GEM_WARN_ON(IS_DGFX(dev_priv));
|
||||
obj = i915_gem_object_create_shmem(dev_priv, round_up(size, PAGE_SIZE));
|
||||
if (IS_ERR(obj))
|
||||
return obj;
|
||||
|
|
|
@ -670,6 +670,7 @@ static int __i915_gem_object_create_stolen(struct intel_memory_region *mem,
|
|||
static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
|
||||
struct drm_i915_gem_object *obj,
|
||||
resource_size_t size,
|
||||
resource_size_t page_size,
|
||||
unsigned int flags)
|
||||
{
|
||||
struct drm_i915_private *i915 = mem->i915;
|
||||
|
@ -708,7 +709,7 @@ struct drm_i915_gem_object *
|
|||
i915_gem_object_create_stolen(struct drm_i915_private *i915,
|
||||
resource_size_t size)
|
||||
{
|
||||
return i915_gem_object_create_region(i915->mm.stolen_region, size, 0);
|
||||
return i915_gem_object_create_region(i915->mm.stolen_region, size, 0, 0);
|
||||
}
|
||||
|
||||
static int init_stolen_smem(struct intel_memory_region *mem)
|
||||
|
|
|
@ -15,6 +15,9 @@
|
|||
#include "gem/i915_gem_ttm.h"
|
||||
#include "gem/i915_gem_mman.h"
|
||||
|
||||
#include "gt/intel_migrate.h"
|
||||
#include "gt/intel_engine_pm.h"
|
||||
|
||||
#define I915_PL_LMEM0 TTM_PL_PRIV
|
||||
#define I915_PL_SYSTEM TTM_PL_SYSTEM
|
||||
#define I915_PL_STOLEN TTM_PL_VRAM
|
||||
|
@ -24,6 +27,11 @@
|
|||
#define I915_TTM_PRIO_NO_PAGES 1
|
||||
#define I915_TTM_PRIO_HAS_PAGES 2
|
||||
|
||||
/*
|
||||
* Size of struct ttm_place vector in on-stack struct ttm_placement allocs
|
||||
*/
|
||||
#define I915_TTM_MAX_PLACEMENTS INTEL_REGION_UNKNOWN
|
||||
|
||||
/**
|
||||
* struct i915_ttm_tt - TTM page vector with additional private information
|
||||
* @ttm: The base TTM page vector.
|
||||
|
@ -42,36 +50,123 @@ struct i915_ttm_tt {
|
|||
struct sg_table *cached_st;
|
||||
};
|
||||
|
||||
static const struct ttm_place lmem0_sys_placement_flags[] = {
|
||||
{
|
||||
.fpfn = 0,
|
||||
.lpfn = 0,
|
||||
.mem_type = I915_PL_LMEM0,
|
||||
.flags = 0,
|
||||
}, {
|
||||
.fpfn = 0,
|
||||
.lpfn = 0,
|
||||
.mem_type = I915_PL_SYSTEM,
|
||||
.flags = 0,
|
||||
}
|
||||
};
|
||||
|
||||
static struct ttm_placement i915_lmem0_placement = {
|
||||
.num_placement = 1,
|
||||
.placement = &lmem0_sys_placement_flags[0],
|
||||
.num_busy_placement = 1,
|
||||
.busy_placement = &lmem0_sys_placement_flags[0],
|
||||
static const struct ttm_place sys_placement_flags = {
|
||||
.fpfn = 0,
|
||||
.lpfn = 0,
|
||||
.mem_type = I915_PL_SYSTEM,
|
||||
.flags = 0,
|
||||
};
|
||||
|
||||
static struct ttm_placement i915_sys_placement = {
|
||||
.num_placement = 1,
|
||||
.placement = &lmem0_sys_placement_flags[1],
|
||||
.placement = &sys_placement_flags,
|
||||
.num_busy_placement = 1,
|
||||
.busy_placement = &lmem0_sys_placement_flags[1],
|
||||
.busy_placement = &sys_placement_flags,
|
||||
};
|
||||
|
||||
static int i915_ttm_err_to_gem(int err)
|
||||
{
|
||||
/* Fastpath */
|
||||
if (likely(!err))
|
||||
return 0;
|
||||
|
||||
switch (err) {
|
||||
case -EBUSY:
|
||||
/*
|
||||
* TTM likes to convert -EDEADLK to -EBUSY, and wants us to
|
||||
* restart the operation, since we don't record the contending
|
||||
* lock. We use -EAGAIN to restart.
|
||||
*/
|
||||
return -EAGAIN;
|
||||
case -ENOSPC:
|
||||
/*
|
||||
* Memory type / region is full, and we can't evict.
|
||||
* Except possibly system, that returns -ENOMEM;
|
||||
*/
|
||||
return -ENXIO;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
|
||||
return err;
|
||||
}
|
||||
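The error translation in i915_ttm_err_to_gem() is easy to exercise in isolation. The sketch below reproduces the mapping in standalone C and pairs it with a retry loop. The loop, `run_with_restart` and `busy_then_ok` are purely illustrative assumptions: the diff does not show how callers react to -EAGAIN, only that it is used to signal a restart.

```c
#include <assert.h>
#include <errno.h>

/* Same mapping as the function above, restated outside the kernel. */
static int ttm_err_to_gem(int err)
{
	switch (err) {
	case -EBUSY:
		return -EAGAIN; /* contended lock: ask the caller to restart */
	case -ENOSPC:
		return -ENXIO;  /* region full and nothing could be evicted */
	default:
		return err;     /* 0 and all other errors pass through */
	}
}

/* Hypothetical caller: retry while the backend asks for a restart. */
static int run_with_restart(int (*op)(void *), void *arg, int max_tries)
{
	int err = -EAGAIN;

	while (max_tries-- > 0) {
		err = ttm_err_to_gem(op(arg));
		if (err != -EAGAIN)
			break;
	}
	return err;
}

/* Test operation: reports -EBUSY on its first two calls, then succeeds. */
static int busy_then_ok(void *arg)
{
	int *calls = arg;

	return (*calls)++ < 2 ? -EBUSY : 0;
}
```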
|
||||
static bool gpu_binds_iomem(struct ttm_resource *mem)
|
||||
{
|
||||
return mem->mem_type != TTM_PL_SYSTEM;
|
||||
}
|
||||
|
||||
static bool cpu_maps_iomem(struct ttm_resource *mem)
|
||||
{
|
||||
/* Once / if we support GGTT, this is also false for cached ttm_tts */
|
||||
return mem->mem_type != TTM_PL_SYSTEM;
|
||||
}
|
||||
|
||||
static enum i915_cache_level
|
||||
i915_ttm_cache_level(struct drm_i915_private *i915, struct ttm_resource *res,
|
||||
struct ttm_tt *ttm)
|
||||
{
|
||||
return ((HAS_LLC(i915) || HAS_SNOOP(i915)) && !gpu_binds_iomem(res) &&
|
||||
ttm->caching == ttm_cached) ? I915_CACHE_LLC :
|
||||
I915_CACHE_NONE;
|
||||
}
|
||||
|
||||
static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj);
|
||||
|
||||
static enum ttm_caching
|
||||
i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj)
|
||||
{
|
||||
/*
|
||||
* Objects only allowed in system get cached cpu-mappings.
|
||||
* Other objects get WC mapping for now. Even if in system.
|
||||
*/
|
||||
if (obj->mm.region->type == INTEL_MEMORY_SYSTEM &&
|
||||
obj->mm.n_placements <= 1)
|
||||
return ttm_cached;
|
||||
|
||||
return ttm_write_combined;
|
||||
}
|
||||
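Taken together, i915_ttm_select_tt_caching() and i915_ttm_cache_level() decide first how the CPU maps the pages and then which GTT cache level follows from that. A minimal sketch of the two decisions, with the kernel structures replaced by plain booleans and hypothetical enum names:

```c
#include <assert.h>
#include <stdbool.h>

enum tt_caching { TT_WRITE_COMBINED, TT_CACHED };
enum region_type { REGION_SYSTEM, REGION_LOCAL };
enum gtt_level { GTT_NONE, GTT_LLC };

/* Only objects restricted to system memory get cached CPU mappings. */
static enum tt_caching select_tt_caching(enum region_type region,
					 unsigned int n_placements)
{
	if (region == REGION_SYSTEM && n_placements <= 1)
		return TT_CACHED;
	return TT_WRITE_COMBINED;
}

/*
 * A coherent level is chosen only when the platform can snoop (LLC or
 * snooping), the GPU binds system pages rather than IO memory, and the
 * CPU mapping is cached.
 */
static enum gtt_level ttm_cache_level(bool has_llc_or_snoop,
				      bool gpu_binds_iomem,
				      enum tt_caching caching)
{
	return (has_llc_or_snoop && !gpu_binds_iomem && caching == TT_CACHED) ?
		GTT_LLC : GTT_NONE;
}
```

One consequence visible in the sketch: an object that may migrate to local memory is mapped write-combined even while it sits in system memory, so it never gets the coherent level.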
|
||||
static void
|
||||
i915_ttm_place_from_region(const struct intel_memory_region *mr,
|
||||
struct ttm_place *place,
|
||||
unsigned int flags)
|
||||
{
|
||||
memset(place, 0, sizeof(*place));
|
||||
place->mem_type = intel_region_to_ttm_type(mr);
|
||||
|
||||
if (flags & I915_BO_ALLOC_CONTIGUOUS)
|
||||
place->flags = TTM_PL_FLAG_CONTIGUOUS;
|
||||
}
|
||||
|
||||
static void
|
||||
i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,
|
||||
struct ttm_place *requested,
|
||||
struct ttm_place *busy,
|
||||
struct ttm_placement *placement)
|
||||
{
|
||||
unsigned int num_allowed = obj->mm.n_placements;
|
||||
unsigned int flags = obj->flags;
|
||||
unsigned int i;
|
||||
|
||||
placement->num_placement = 1;
|
||||
i915_ttm_place_from_region(num_allowed ? obj->mm.placements[0] :
|
||||
obj->mm.region, requested, flags);
|
||||
|
||||
/* Cache this on object? */
|
||||
placement->num_busy_placement = num_allowed;
|
||||
for (i = 0; i < placement->num_busy_placement; ++i)
|
||||
i915_ttm_place_from_region(obj->mm.placements[i], busy + i, flags);
|
||||
|
||||
if (num_allowed == 0) {
|
||||
*busy = *requested;
|
||||
placement->num_busy_placement = 1;
|
||||
}
|
||||
|
||||
placement->placement = requested;
|
||||
placement->busy_placement = busy;
|
||||
}
|
||||
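i915_ttm_placement_from_obj() builds one requested placement plus a list of busy (fallback) placements. The sketch below models the same selection with memory regions reduced to integer ids. `struct placement_sketch`, `build_placement` and `MAX_PLACEMENTS` are hypothetical, and the contiguous flag handling is omitted.

```c
#include <assert.h>

#define MAX_PLACEMENTS 4 /* arbitrary bound for this sketch */

struct placement_sketch {
	int requested;            /* preferred region id */
	int busy[MAX_PLACEMENTS]; /* fallback region ids, in priority order */
	unsigned int num_busy;
};

/*
 * With a placement list, the first entry is requested and every entry is
 * an allowed fallback. Without one, the object's current region is used
 * for both.
 */
static void build_placement(int region, const int *placements,
			    unsigned int n, struct placement_sketch *out)
{
	unsigned int i;

	assert(n <= MAX_PLACEMENTS);

	out->requested = n ? placements[0] : region;

	out->num_busy = n;
	for (i = 0; i < n; i++)
		out->busy[i] = placements[i];

	if (n == 0) {
		out->busy[0] = out->requested;
		out->num_busy = 1;
	}
}
```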
|
||||
static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
|
||||
uint32_t page_flags)
|
||||
{
|
||||
|
@ -89,7 +184,8 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
|
|||
man->use_tt)
|
||||
page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC;
|
||||
|
||||
ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, ttm_write_combined);
|
||||
ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
|
||||
i915_ttm_select_tt_caching(obj));
|
||||
if (ret) {
|
||||
kfree(i915_tt);
|
||||
return NULL;
|
||||
|
@ -119,6 +215,7 @@ static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm)
|
|||
struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
|
||||
|
||||
ttm_tt_destroy_common(bdev, ttm);
|
||||
ttm_tt_fini(ttm);
|
||||
kfree(i915_tt);
|
||||
}
|
||||
|
||||
|
@ -128,11 +225,7 @@ static bool i915_ttm_eviction_valuable(struct ttm_buffer_object *bo,
|
|||
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
|
||||
|
||||
/* Will do for now. Our pinned objects are still on TTM's LRU lists */
|
||||
if (!i915_gem_object_evictable(obj))
|
||||
return false;
|
||||
|
||||
/* This isn't valid with a buddy allocator */
|
||||
return ttm_bo_eviction_valuable(bo, place);
|
||||
return i915_gem_object_evictable(obj);
|
||||
}
|
||||
|
||||
static void i915_ttm_evict_flags(struct ttm_buffer_object *bo,
|
||||
|
@ -175,6 +268,55 @@ static void i915_ttm_free_cached_io_st(struct drm_i915_gem_object *obj)
|
|||
obj->ttm.cached_io_st = NULL;
|
||||
}
|
||||
|
||||
static void
|
||||
i915_ttm_adjust_domains_after_move(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
|
||||
|
||||
if (cpu_maps_iomem(bo->resource) || bo->ttm->caching != ttm_cached) {
|
||||
obj->write_domain = I915_GEM_DOMAIN_WC;
|
||||
obj->read_domains = I915_GEM_DOMAIN_WC;
|
||||
} else {
|
||||
obj->write_domain = I915_GEM_DOMAIN_CPU;
|
||||
obj->read_domains = I915_GEM_DOMAIN_CPU;
|
||||
}
|
||||
}
|
||||
|
||||
static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
|
||||
unsigned int cache_level;
|
||||
unsigned int i;
|
||||
|
||||
/*
|
||||
* If object was moved to an allowable region, update the object
|
||||
* region to consider it migrated. Note that if it's currently not
|
||||
* in an allowable region, it's evicted and we don't update the
|
||||
* object region.
|
||||
*/
|
||||
if (intel_region_to_ttm_type(obj->mm.region) != bo->resource->mem_type) {
|
||||
for (i = 0; i < obj->mm.n_placements; ++i) {
|
||||
struct intel_memory_region *mr = obj->mm.placements[i];
|
||||
|
||||
if (intel_region_to_ttm_type(mr) == bo->resource->mem_type &&
|
||||
mr != obj->mm.region) {
|
||||
i915_gem_object_release_memory_region(obj);
|
||||
i915_gem_object_init_memory_region(obj, mr);
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
obj->mem_flags &= ~(I915_BO_FLAG_STRUCT_PAGE | I915_BO_FLAG_IOMEM);
|
||||
|
||||
obj->mem_flags |= cpu_maps_iomem(bo->resource) ? I915_BO_FLAG_IOMEM :
|
||||
I915_BO_FLAG_STRUCT_PAGE;
|
||||
|
||||
cache_level = i915_ttm_cache_level(to_i915(bo->base.dev), bo->resource,
|
||||
bo->ttm);
|
||||
i915_gem_object_set_cache_coherency(obj, cache_level);
|
||||
}
|
||||
|
||||
static void i915_ttm_purge(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
|
||||
|
@ -190,8 +332,10 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj)
|
|||
|
||||
/* TTM's purge interface. Note that we might be reentering. */
|
||||
ret = ttm_bo_validate(bo, &place, &ctx);
|
||||
|
||||
if (!ret) {
|
||||
obj->write_domain = 0;
|
||||
obj->read_domains = 0;
|
||||
i915_ttm_adjust_gem_after_move(obj);
|
||||
i915_ttm_free_cached_io_st(obj);
|
||||
obj->mm.madv = __I915_MADV_PURGED;
|
||||
}
|
||||
|
@ -214,6 +358,7 @@ static void i915_ttm_delete_mem_notify(struct ttm_buffer_object *bo)
|
|||
|
||||
if (likely(obj)) {
|
||||
/* This releases all gem object bindings to the backend. */
|
||||
i915_ttm_free_cached_io_st(obj);
|
||||
__i915_gem_free_object(obj);
|
||||
}
|
||||
}
|
||||
|
@ -273,13 +418,75 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
|
|||
struct ttm_resource *res)
|
||||
{
|
||||
struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
|
||||
struct ttm_resource_manager *man =
|
||||
ttm_manager_type(bo->bdev, res->mem_type);
|
||||
|
||||
if (man->use_tt)
|
||||
if (!gpu_binds_iomem(res))
|
||||
return i915_ttm_tt_get_st(bo->ttm);
|
||||
|
||||
return intel_region_ttm_node_to_st(obj->mm.region, res);
|
||||
/*
|
||||
* If CPU mapping differs, we need to add the ttm_tt pages to
|
||||
* the resulting st. Might make sense for GGTT.
|
||||
*/
|
||||
GEM_WARN_ON(!cpu_maps_iomem(res));
|
||||
return intel_region_ttm_resource_to_st(obj->mm.region, res);
|
||||
}
|
||||
|
||||
static int i915_ttm_accel_move(struct ttm_buffer_object *bo,
|
||||
struct ttm_resource *dst_mem,
|
||||
struct sg_table *dst_st)
|
||||
{
|
||||
struct drm_i915_private *i915 = container_of(bo->bdev, typeof(*i915),
|
||||
bdev);
|
||||
struct ttm_resource_manager *src_man =
|
||||
ttm_manager_type(bo->bdev, bo->resource->mem_type);
|
||||
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
|
||||
struct sg_table *src_st;
|
||||
struct i915_request *rq;
|
||||
struct ttm_tt *ttm = bo->ttm;
|
||||
enum i915_cache_level src_level, dst_level;
|
||||
int ret;
|
||||
|
||||
if (!i915->gt.migrate.context)
|
||||
return -EINVAL;
|
||||
|
||||
dst_level = i915_ttm_cache_level(i915, dst_mem, ttm);
|
||||
if (!ttm || !ttm_tt_is_populated(ttm)) {
|
||||
if (bo->type == ttm_bo_type_kernel)
|
||||
return -EINVAL;
|
||||
|
||||
if (ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC))
|
||||
return 0;
|
||||
|
||||
intel_engine_pm_get(i915->gt.migrate.context->engine);
|
||||
ret = intel_context_migrate_clear(i915->gt.migrate.context, NULL,
|
||||
dst_st->sgl, dst_level,
|
||||
gpu_binds_iomem(dst_mem),
|
||||
0, &rq);
|
||||
|
||||
if (!ret && rq) {
|
||||
i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT);
|
||||
i915_request_put(rq);
|
||||
}
|
||||
intel_engine_pm_put(i915->gt.migrate.context->engine);
|
||||
} else {
|
||||
src_st = src_man->use_tt ? i915_ttm_tt_get_st(ttm) :
|
||||
obj->ttm.cached_io_st;
|
||||
|
||||
src_level = i915_ttm_cache_level(i915, bo->resource, ttm);
|
||||
intel_engine_pm_get(i915->gt.migrate.context->engine);
|
||||
ret = intel_context_migrate_copy(i915->gt.migrate.context,
|
||||
NULL, src_st->sgl, src_level,
|
||||
gpu_binds_iomem(bo->resource),
|
||||
dst_st->sgl, dst_level,
|
||||
gpu_binds_iomem(dst_mem),
|
||||
&rq);
|
||||
if (!ret && rq) {
|
||||
i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT);
|
||||
i915_request_put(rq);
|
||||
}
|
||||
intel_engine_pm_put(i915->gt.migrate.context->engine);
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
|
||||
|
@ -290,8 +497,6 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
|
|||
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
|
||||
struct ttm_resource_manager *dst_man =
|
||||
ttm_manager_type(bo->bdev, dst_mem->mem_type);
|
||||
struct ttm_resource_manager *src_man =
|
||||
ttm_manager_type(bo->bdev, bo->resource->mem_type);
|
||||
struct intel_memory_region *dst_reg, *src_reg;
|
||||
union {
|
||||
struct ttm_kmap_iter_tt tt;
|
||||
|
@ -332,34 +537,40 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
|
|||
if (IS_ERR(dst_st))
|
||||
return PTR_ERR(dst_st);
|
||||
|
||||
/* If we start mapping GGTT, we can no longer use man::use_tt here. */
|
||||
dst_iter = dst_man->use_tt ?
|
||||
ttm_kmap_iter_tt_init(&_dst_iter.tt, bo->ttm) :
|
||||
ttm_kmap_iter_iomap_init(&_dst_iter.io, &dst_reg->iomap,
|
||||
dst_st, dst_reg->region.start);
|
||||
ret = i915_ttm_accel_move(bo, dst_mem, dst_st);
|
||||
if (ret) {
|
||||
/* If we start mapping GGTT, we can no longer use man::use_tt here. */
|
||||
dst_iter = !cpu_maps_iomem(dst_mem) ?
|
||||
ttm_kmap_iter_tt_init(&_dst_iter.tt, bo->ttm) :
|
||||
ttm_kmap_iter_iomap_init(&_dst_iter.io, &dst_reg->iomap,
|
||||
dst_st, dst_reg->region.start);
|
||||
|
||||
src_iter = src_man->use_tt ?
|
||||
ttm_kmap_iter_tt_init(&_src_iter.tt, bo->ttm) :
|
||||
ttm_kmap_iter_iomap_init(&_src_iter.io, &src_reg->iomap,
|
||||
obj->ttm.cached_io_st,
|
||||
src_reg->region.start);
|
||||
src_iter = !cpu_maps_iomem(bo->resource) ?
|
||||
ttm_kmap_iter_tt_init(&_src_iter.tt, bo->ttm) :
|
||||
ttm_kmap_iter_iomap_init(&_src_iter.io, &src_reg->iomap,
|
||||
obj->ttm.cached_io_st,
|
||||
src_reg->region.start);
|
||||
|
||||
ttm_move_memcpy(bo, dst_mem->num_pages, dst_iter, src_iter);
|
||||
ttm_move_memcpy(bo, dst_mem->num_pages, dst_iter, src_iter);
|
||||
}
|
||||
/* Below dst_mem becomes bo->resource. */
|
||||
ttm_bo_move_sync_cleanup(bo, dst_mem);
|
||||
i915_ttm_adjust_domains_after_move(obj);
|
||||
i915_ttm_free_cached_io_st(obj);
|
||||
|
||||
if (!dst_man->use_tt) {
|
||||
if (gpu_binds_iomem(dst_mem) || cpu_maps_iomem(dst_mem)) {
|
||||
obj->ttm.cached_io_st = dst_st;
|
||||
obj->ttm.get_io_page.sg_pos = dst_st->sgl;
|
||||
obj->ttm.get_io_page.sg_idx = 0;
|
||||
}
|
||||
|
||||
i915_ttm_adjust_gem_after_move(obj);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int i915_ttm_io_mem_reserve(struct ttm_device *bdev, struct ttm_resource *mem)
|
||||
{
|
||||
if (mem->mem_type < I915_PL_LMEM0)
|
||||
if (!cpu_maps_iomem(mem))
|
||||
return 0;
|
||||
|
||||
mem->bus.caching = ttm_write_combined;
|
||||
|
@ -378,7 +589,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,
|
|||
|
||||
GEM_WARN_ON(bo->ttm);
|
||||
|
||||
sg = __i915_gem_object_get_sg(obj, &obj->ttm.get_io_page, page_offset, &ofs, true, true);
|
||||
sg = __i915_gem_object_get_sg(obj, &obj->ttm.get_io_page, page_offset, &ofs, true);
|
||||
|
||||
return ((base + sg_dma_address(sg)) >> PAGE_SHIFT) + ofs;
|
||||
}
|
||||
|
@ -406,7 +617,8 @@ struct ttm_device_funcs *i915_ttm_driver(void)
|
|||
return &i915_ttm_bo_driver;
|
||||
}
|
||||
|
||||
static int i915_ttm_get_pages(struct drm_i915_gem_object *obj)
|
||||
static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
|
||||
struct ttm_placement *placement)
|
||||
{
|
||||
struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
|
||||
struct ttm_operation_ctx ctx = {
|
||||
|
@ -414,25 +626,111 @@ static int i915_ttm_get_pages(struct drm_i915_gem_object *obj)
|
|||
.no_wait_gpu = false,
|
||||
};
|
||||
struct sg_table *st;
|
||||
int real_num_busy;
|
||||
int ret;
|
||||
|
||||
/* Move to the requested placement. */
|
||||
ret = ttm_bo_validate(bo, &i915_lmem0_placement, &ctx);
|
||||
if (ret)
|
||||
return ret == -ENOSPC ? -ENXIO : ret;
|
||||
/* First try only the requested placement. No eviction. */
|
||||
real_num_busy = fetch_and_zero(&placement->num_busy_placement);
|
||||
ret = ttm_bo_validate(bo, placement, &ctx);
|
||||
if (ret) {
|
||||
ret = i915_ttm_err_to_gem(ret);
|
||||
/*
|
||||
* Anything that wants to restart the operation gets to
|
||||
* do that.
|
||||
*/
|
||||
if (ret == -EDEADLK || ret == -EINTR || ret == -ERESTARTSYS ||
|
||||
ret == -EAGAIN)
|
||||
return ret;
|
||||
|
||||
/* Object either has a page vector or is an iomem object */
|
||||
st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
|
||||
if (IS_ERR(st))
|
||||
return PTR_ERR(st);
|
||||
|
||||
__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
|
||||
/*
|
||||
* If the initial attempt fails, allow all accepted placements,
|
||||
* evicting if necessary.
|
||||
*/
|
||||
placement->num_busy_placement = real_num_busy;
|
||||
ret = ttm_bo_validate(bo, placement, &ctx);
|
||||
if (ret)
|
||||
return i915_ttm_err_to_gem(ret);
|
||||
}
|
||||
|
||||
i915_ttm_adjust_lru(obj);
|
||||
if (bo->ttm && !ttm_tt_is_populated(bo->ttm)) {
|
||||
ret = ttm_tt_populate(bo->bdev, bo->ttm, &ctx);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
i915_ttm_adjust_domains_after_move(obj);
|
||||
i915_ttm_adjust_gem_after_move(obj);
|
||||
}
|
||||
|
||||
if (!i915_gem_object_has_pages(obj)) {
|
||||
/* Object either has a page vector or is an iomem object */
|
||||
st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
|
||||
if (IS_ERR(st))
|
||||
return PTR_ERR(st);
|
||||
|
||||
__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int i915_ttm_get_pages(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
struct ttm_place requested, busy[I915_TTM_MAX_PLACEMENTS];
|
||||
struct ttm_placement placement;
|
||||
|
||||
GEM_BUG_ON(obj->mm.n_placements > I915_TTM_MAX_PLACEMENTS);
|
||||
|
||||
/* Move to the requested placement. */
|
||||
i915_ttm_placement_from_obj(obj, &requested, busy, &placement);
|
||||
|
||||
return __i915_ttm_get_pages(obj, &placement);
|
||||
}
|
||||
|
||||
/**
|
||||
* DOC: Migration vs eviction
|
||||
*
|
||||
* GEM migration may not be the same as TTM migration / eviction. If
|
||||
* the TTM core decides to evict an object it may be evicted to a
|
||||
* TTM memory type that is not in the object's allowable GEM regions, or
|
||||
* in fact theoretically to a TTM memory type that doesn't correspond to
|
||||
* a GEM memory region. In that case the object's GEM region is not
|
||||
* updated, and the data is migrated back to the GEM region at
|
||||
* get_pages time. TTM may however set up CPU ptes to the object even
|
||||
* when it is evicted.
|
||||
* GEM forced migration using the i915_ttm_migrate() op is allowed even
|
||||
* to regions that are not in the object's list of allowable placements.
|
||||
*/
|
||||
static int i915_ttm_migrate(struct drm_i915_gem_object *obj,
|
||||
struct intel_memory_region *mr)
|
||||
{
|
||||
struct ttm_place requested;
|
||||
struct ttm_placement placement;
|
||||
int ret;
|
||||
|
||||
i915_ttm_place_from_region(mr, &requested, obj->flags);
|
||||
placement.num_placement = 1;
|
||||
placement.num_busy_placement = 1;
|
||||
placement.placement = &requested;
|
||||
placement.busy_placement = &requested;
|
||||
|
||||
ret = __i915_ttm_get_pages(obj, &placement);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
/*
|
||||
* Reinitialize the region bindings. This is primarily
|
||||
* required for objects where the new region is not in
|
||||
* its allowable placements.
|
||||
*/
|
||||
if (obj->mm.region != mr) {
|
||||
i915_gem_object_release_memory_region(obj);
|
||||
i915_gem_object_init_memory_region(obj, mr);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void i915_ttm_put_pages(struct drm_i915_gem_object *obj,
|
||||
struct sg_table *st)
|
||||
{
|
||||
|
@ -561,15 +859,15 @@ static u64 i915_ttm_mmap_offset(struct drm_i915_gem_object *obj)
|
|||
return drm_vma_node_offset_addr(&obj->base.vma_node);
|
||||
}
|
||||
|
||||
const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
|
||||
static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
|
||||
.name = "i915_gem_object_ttm",
|
||||
.flags = I915_GEM_OBJECT_HAS_IOMEM,
|
||||
|
||||
.get_pages = i915_ttm_get_pages,
|
||||
.put_pages = i915_ttm_put_pages,
|
||||
.truncate = i915_ttm_purge,
|
||||
.adjust_lru = i915_ttm_adjust_lru,
|
||||
.delayed_free = i915_ttm_delayed_free,
|
||||
.migrate = i915_ttm_migrate,
|
||||
.mmap_offset = i915_ttm_mmap_offset,
|
||||
.mmap_ops = &vm_ops_ttm,
|
||||
};
|
||||
|
@ -596,37 +894,32 @@ void i915_ttm_bo_destroy(struct ttm_buffer_object *bo)
|
|||
int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
|
||||
struct drm_i915_gem_object *obj,
|
||||
resource_size_t size,
|
||||
resource_size_t page_size,
|
||||
unsigned int flags)
|
||||
{
|
||||
static struct lock_class_key lock_class;
|
||||
struct drm_i915_private *i915 = mem->i915;
|
||||
struct ttm_operation_ctx ctx = {
|
||||
.interruptible = true,
|
||||
.no_wait_gpu = false,
|
||||
};
|
||||
enum ttm_bo_type bo_type;
|
||||
size_t alignment = 0;
|
||||
int ret;
|
||||
|
||||
/* Adjust alignment to GPU- and CPU huge page sizes. */
|
||||
|
||||
if (mem->is_range_manager) {
|
||||
if (size >= SZ_1G)
|
||||
alignment = SZ_1G >> PAGE_SHIFT;
|
||||
else if (size >= SZ_2M)
|
||||
alignment = SZ_2M >> PAGE_SHIFT;
|
||||
else if (size >= SZ_64K)
|
||||
alignment = SZ_64K >> PAGE_SHIFT;
|
||||
}
|
||||
|
||||
drm_gem_private_object_init(&i915->drm, &obj->base, size);
|
||||
i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags);
|
||||
i915_gem_object_init_memory_region(obj, mem);
|
||||
i915_gem_object_make_unshrinkable(obj);
|
||||
obj->read_domains = I915_GEM_DOMAIN_WC | I915_GEM_DOMAIN_GTT;
|
||||
i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
|
||||
INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN);
|
||||
mutex_init(&obj->ttm.get_io_page.lock);
|
||||
|
||||
bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device :
|
||||
ttm_bo_type_kernel;
|
||||
|
||||
obj->base.vma_node.driver_private = i915_gem_to_ttm(obj);
|
||||
|
||||
/* Forcing the page size is kernel internal only */
|
||||
GEM_BUG_ON(page_size && obj->mm.n_placements);
|
||||
|
||||
/*
|
||||
* If this function fails, it will call the destructor, but
|
||||
* our caller still owns the object. So no freeing in the
|
||||
|
@ -634,14 +927,39 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
|
|||
* Similarly, in delayed_destroy, we can't call ttm_bo_put()
|
||||
* until successful initialization.
|
||||
*/
|
||||
obj->base.vma_node.driver_private = i915_gem_to_ttm(obj);
|
||||
ret = ttm_bo_init(&i915->bdev, i915_gem_to_ttm(obj), size,
|
||||
bo_type, &i915_sys_placement, alignment,
|
||||
true, NULL, NULL, i915_ttm_bo_destroy);
|
||||
ret = ttm_bo_init_reserved(&i915->bdev, i915_gem_to_ttm(obj), size,
|
||||
bo_type, &i915_sys_placement,
|
||||
page_size >> PAGE_SHIFT,
|
||||
&ctx, NULL, NULL, i915_ttm_bo_destroy);
|
||||
if (ret)
|
||||
return i915_ttm_err_to_gem(ret);
|
||||
|
||||
if (!ret)
|
||||
obj->ttm.created = true;
|
||||
obj->ttm.created = true;
|
||||
i915_ttm_adjust_domains_after_move(obj);
|
||||
i915_ttm_adjust_gem_after_move(obj);
|
||||
i915_gem_object_unlock(obj);
|
||||
|
||||
/* i915 wants -ENXIO when out of memory region space. */
|
||||
return (ret == -ENOSPC) ? -ENXIO : ret;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static const struct intel_memory_region_ops ttm_system_region_ops = {
|
||||
.init_object = __i915_gem_ttm_object_init,
|
||||
};
|
||||
|
||||
struct intel_memory_region *
|
||||
i915_gem_ttm_system_setup(struct drm_i915_private *i915,
|
||||
u16 type, u16 instance)
|
||||
{
|
||||
struct intel_memory_region *mr;
|
||||
|
||||
mr = intel_memory_region_create(i915, 0,
|
||||
totalram_pages() << PAGE_SHIFT,
|
||||
PAGE_SIZE, 0,
|
||||
type, instance,
|
||||
&ttm_system_region_ops);
|
||||
if (IS_ERR(mr))
|
||||
return mr;
|
||||
|
||||
intel_memory_region_set_name(mr, "system-ttm");
|
||||
return mr;
|
||||
}
|
||||
|
|
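The two-pass placement logic in __i915_ttm_get_pages() above (try only the requested placement first, then allow all accepted placements with eviction) can be modelled outside the kernel. The following is a minimal userspace sketch, not the driver code: `struct placement_sketch`, `validate_fn`, `get_pages_two_pass()` and the `fake_*` stub are hypothetical names introduced for illustration, and the kernel-only -ERESTARTSYS code is left out because it is not available in userspace `<errno.h>`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ttm_placement: only the counters matter here. */
struct placement_sketch {
	unsigned int num_placement;
	unsigned int num_busy_placement;
};

/* Hypothetical stand-in for ttm_bo_validate(): returns 0 or a negative errno. */
typedef int (*validate_fn)(const struct placement_sketch *p, void *ctx);

/* Errors that mean "restart the whole operation" are passed straight up. */
static bool is_restart_error(int err)
{
	return err == -EDEADLK || err == -EINTR || err == -EAGAIN;
}

static int get_pages_two_pass(struct placement_sketch *p,
			      validate_fn validate, void *ctx)
{
	/* Pass 1: hide the busy placements so that nothing gets evicted. */
	unsigned int real_num_busy = p->num_busy_placement;
	int ret;

	p->num_busy_placement = 0;
	ret = validate(p, ctx);
	/* Assumption of this sketch: always restore the caller's counter. */
	p->num_busy_placement = real_num_busy;

	if (!ret)
		return 0;
	if (is_restart_error(ret))
		return ret;

	/* Pass 2: allow every accepted placement, evicting if necessary. */
	return validate(p, ctx);
}

/* Test stub: records what each call saw and returns scripted results. */
struct fake_backend {
	int calls;
	int first_result;
	int second_result;
	unsigned int seen_busy[2];
};

static int fake_validate(const struct placement_sketch *p, void *ctx)
{
	struct fake_backend *b = ctx;
	int idx = b->calls++;

	if (idx < 2)
		b->seen_busy[idx] = p->num_busy_placement;
	return idx == 0 ? b->first_result : b->second_result;
}
```

The point of the split is that the cheap attempt never disturbs other buffers; only a genuine placement failure, as opposed to a restart request, triggers the more expensive pass that may evict.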
|
@ -44,5 +44,6 @@ i915_ttm_to_gem(struct ttm_buffer_object *bo)
|
|||
int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
|
||||
struct drm_i915_gem_object *obj,
|
||||
resource_size_t size,
|
||||
resource_size_t page_size,
|
||||
unsigned int flags);
|
||||
#endif
|
||||
|
|
|
@ -67,11 +67,11 @@ static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni,
|
|||
if (!mmu_notifier_range_blockable(range))
|
||||
return false;
|
||||
|
||||
spin_lock(&i915->mm.notifier_lock);
|
||||
write_lock(&i915->mm.notifier_lock);
|
||||
|
||||
mmu_interval_set_seq(mni, cur_seq);
|
||||
|
||||
spin_unlock(&i915->mm.notifier_lock);
|
||||
write_unlock(&i915->mm.notifier_lock);
|
||||
|
||||
/*
|
||||
* We don't wait when the process is exiting. This is valid
|
||||
|
@ -107,16 +107,15 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj)
|
|||
|
||||
static void i915_gem_object_userptr_drop_ref(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
struct drm_i915_private *i915 = to_i915(obj->base.dev);
|
||||
struct page **pvec = NULL;
|
||||
|
||||
spin_lock(&i915->mm.notifier_lock);
|
||||
assert_object_held_shared(obj);
|
||||
|
||||
if (!--obj->userptr.page_ref) {
|
||||
pvec = obj->userptr.pvec;
|
||||
obj->userptr.pvec = NULL;
|
||||
}
|
||||
GEM_BUG_ON(obj->userptr.page_ref < 0);
|
||||
spin_unlock(&i915->mm.notifier_lock);
|
||||
|
||||
if (pvec) {
|
||||
const unsigned long num_pages = obj->base.size >> PAGE_SHIFT;
|
||||
|
@ -128,7 +127,6 @@ static void i915_gem_object_userptr_drop_ref(struct drm_i915_gem_object *obj)
|
|||
|
||||
static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
struct drm_i915_private *i915 = to_i915(obj->base.dev);
|
||||
const unsigned long num_pages = obj->base.size >> PAGE_SHIFT;
|
||||
unsigned int max_segment = i915_sg_segment_size();
|
||||
struct sg_table *st;
|
||||
|
@ -141,16 +139,13 @@ static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
|
|||
if (!st)
|
||||
return -ENOMEM;
|
||||
|
||||
spin_lock(&i915->mm.notifier_lock);
|
||||
if (GEM_WARN_ON(!obj->userptr.page_ref)) {
|
||||
spin_unlock(&i915->mm.notifier_lock);
|
||||
ret = -EFAULT;
|
||||
if (!obj->userptr.page_ref) {
|
||||
ret = -EAGAIN;
|
||||
goto err_free;
|
||||
}
|
||||
|
||||
obj->userptr.page_ref++;
|
||||
pvec = obj->userptr.pvec;
|
||||
spin_unlock(&i915->mm.notifier_lock);
|
||||
|
||||
alloc_table:
|
||||
sg = __sg_alloc_table_from_pages(st, pvec, num_pages, 0,
|
||||
|
@ -241,7 +236,7 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
|
|||
i915_gem_object_userptr_drop_ref(obj);
|
||||
}
|
||||
|
||||
static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj, bool get_pages)
|
||||
static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
struct sg_table *pages;
|
||||
int err;
|
||||
|
@ -259,15 +254,11 @@ static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj, bool
|
|||
if (!IS_ERR_OR_NULL(pages))
|
||||
i915_gem_userptr_put_pages(obj, pages);
|
||||
|
||||
if (get_pages)
|
||||
err = ____i915_gem_object_get_pages(obj);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
struct drm_i915_private *i915 = to_i915(obj->base.dev);
|
||||
const unsigned long num_pages = obj->base.size >> PAGE_SHIFT;
|
||||
struct page **pvec;
|
||||
unsigned int gup_flags = 0;
|
||||
|
@ -277,38 +268,21 @@ int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj)
|
|||
if (obj->userptr.notifier.mm != current->mm)
|
||||
return -EFAULT;
|
||||
|
||||
notifier_seq = mmu_interval_read_begin(&obj->userptr.notifier);
|
||||
|
||||
ret = i915_gem_object_lock_interruptible(obj, NULL);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
/* optimistically try to preserve current pages while unlocked */
|
||||
if (i915_gem_object_has_pages(obj) &&
|
||||
!mmu_interval_check_retry(&obj->userptr.notifier,
|
||||
obj->userptr.notifier_seq)) {
|
||||
spin_lock(&i915->mm.notifier_lock);
|
||||
if (obj->userptr.pvec &&
|
||||
!mmu_interval_read_retry(&obj->userptr.notifier,
|
||||
obj->userptr.notifier_seq)) {
|
||||
obj->userptr.page_ref++;
|
||||
|
||||
/* We can keep using the current binding, this is the fastpath */
|
||||
ret = 1;
|
||||
}
|
||||
spin_unlock(&i915->mm.notifier_lock);
|
||||
}
|
||||
|
||||
if (!ret) {
|
||||
/* Make sure userptr is unbound for next attempt, so we don't use stale pages. */
|
||||
ret = i915_gem_object_userptr_unbind(obj, false);
|
||||
}
|
||||
i915_gem_object_unlock(obj);
|
||||
if (ret < 0)
|
||||
return ret;
|
||||
|
||||
if (ret > 0)
|
||||
if (notifier_seq == obj->userptr.notifier_seq && obj->userptr.pvec) {
|
||||
i915_gem_object_unlock(obj);
|
||||
return 0;
|
||||
}
|
||||
|
||||
notifier_seq = mmu_interval_read_begin(&obj->userptr.notifier);
|
||||
ret = i915_gem_object_userptr_unbind(obj);
|
||||
i915_gem_object_unlock(obj);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
pvec = kvmalloc_array(num_pages, sizeof(struct page *), GFP_KERNEL);
|
||||
if (!pvec)
|
||||
|
@ -329,7 +303,9 @@ int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj)
|
|||
}
|
||||
ret = 0;
|
||||
|
||||
spin_lock(&i915->mm.notifier_lock);
|
||||
ret = i915_gem_object_lock_interruptible(obj, NULL);
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
if (mmu_interval_read_retry(&obj->userptr.notifier,
|
||||
!obj->userptr.page_ref ? notifier_seq :
|
||||
|
@ -341,12 +317,14 @@ int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj)
|
|||
if (!obj->userptr.page_ref++) {
|
||||
obj->userptr.pvec = pvec;
|
||||
obj->userptr.notifier_seq = notifier_seq;
|
||||
|
||||
pvec = NULL;
|
||||
ret = ____i915_gem_object_get_pages(obj);
|
||||
}
|
||||
|
||||
obj->userptr.page_ref--;
|
||||
|
||||
out_unlock:
|
||||
spin_unlock(&i915->mm.notifier_lock);
|
||||
i915_gem_object_unlock(obj);
|
||||
|
||||
out:
|
||||
if (pvec) {
|
||||
|
@ -369,11 +347,6 @@ int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj)
|
|||
return 0;
|
||||
}
|
||||
|
||||
void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
i915_gem_object_userptr_drop_ref(obj);
|
||||
}
|
||||
|
||||
int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
int err;
|
||||
|
@ -396,7 +369,6 @@ int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj)
|
|||
i915_gem_object_unlock(obj);
|
||||
}
|
||||
|
||||
i915_gem_object_userptr_submit_fini(obj);
|
||||
return err;
|
||||
}
|
||||
|
||||
|
@ -450,6 +422,34 @@ static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
|
|||
|
||||
#endif
|
||||
|
||||
static int
|
||||
probe_range(struct mm_struct *mm, unsigned long addr, unsigned long len)
|
||||
{
|
||||
const unsigned long end = addr + len;
|
||||
struct vm_area_struct *vma;
|
||||
int ret = -EFAULT;
|
||||
|
||||
mmap_read_lock(mm);
|
||||
for (vma = find_vma(mm, addr); vma; vma = vma->vm_next) {
|
||||
/* Check for holes, note that we also update the addr below */
|
||||
if (vma->vm_start > addr)
|
||||
break;
|
||||
|
||||
if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
|
||||
break;
|
||||
|
||||
if (vma->vm_end >= end) {
|
||||
ret = 0;
|
||||
break;
|
||||
}
|
||||
|
||||
addr = vma->vm_end;
|
||||
}
|
||||
mmap_read_unlock(mm);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
/*
|
||||
* Creates a new mm object that wraps some normal memory from the process
|
||||
* context - user memory.
|
||||
|
@ -505,7 +505,8 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
|
|||
}
|
||||
|
||||
if (args->flags & ~(I915_USERPTR_READ_ONLY |
|
||||
I915_USERPTR_UNSYNCHRONIZED))
|
||||
I915_USERPTR_UNSYNCHRONIZED |
|
||||
I915_USERPTR_PROBE))
|
||||
return -EINVAL;
|
||||
|
||||
if (i915_gem_object_size_2big(args->user_size))
|
||||
|
@ -532,14 +533,24 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
|
|||
return -ENODEV;
|
||||
}
|
||||
|
||||
if (args->flags & I915_USERPTR_PROBE) {
|
||||
/*
|
||||
* Check that the range pointed to represents real struct
|
||||
* pages and not iomappings (at this moment in time!)
|
||||
*/
|
||||
ret = probe_range(current->mm, args->user_ptr, args->user_size);
|
||||
if (ret)
|
||||
return ret;
|
||||
}
|
||||
|
||||
#ifdef CONFIG_MMU_NOTIFIER
|
||||
obj = i915_gem_object_alloc();
|
||||
if (obj == NULL)
|
||||
return -ENOMEM;
|
||||
|
||||
drm_gem_private_object_init(dev, &obj->base, args->user_size);
|
||||
i915_gem_object_init(obj, &i915_gem_userptr_ops, &lock_class,
|
||||
I915_BO_ALLOC_STRUCT_PAGE);
|
||||
i915_gem_object_init(obj, &i915_gem_userptr_ops, &lock_class, 0);
|
||||
obj->mem_flags = I915_BO_FLAG_STRUCT_PAGE;
|
||||
obj->read_domains = I915_GEM_DOMAIN_CPU;
|
||||
obj->write_domain = I915_GEM_DOMAIN_CPU;
|
||||
i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
|
||||
|
@ -572,7 +583,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
|
|||
int i915_gem_init_userptr(struct drm_i915_private *dev_priv)
|
||||
{
|
||||
#ifdef CONFIG_MMU_NOTIFIER
|
||||
spin_lock_init(&dev_priv->mm.notifier_lock);
|
||||
rwlock_init(&dev_priv->mm.notifier_lock);
|
||||
#endif
|
||||
|
||||
return 0;
|
||||
|
|
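The hole check performed by probe_range() above can be illustrated without the kernel's VMA machinery. This is a sketch under stated assumptions: `struct range_sketch` and `probe_range_sketch()` are hypothetical names, the sorted array stands in for the process's VMA list, the initial skip loop emulates find_vma() (first mapping whose end lies above the address), and the `special` flag stands in for VM_PFNMAP | VM_MIXEDMAP.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: [start, end) plus a "not struct pages" flag. */
struct range_sketch {
	unsigned long start;
	unsigned long end;
	bool special;
};

/*
 * Returns 0 if [addr, addr + len) is fully covered by contiguous,
 * non-special mappings, -EFAULT otherwise. maps[] must be sorted by
 * address and non-overlapping.
 */
static int probe_range_sketch(const struct range_sketch *maps, size_t n,
			      unsigned long addr, unsigned long len)
{
	const unsigned long end = addr + len;
	size_t i = 0;

	/* Emulate find_vma(): skip mappings that end at or below addr. */
	while (i < n && maps[i].end <= addr)
		i++;

	for (; i < n; i++) {
		/* A mapping starting past addr means there is a hole. */
		if (maps[i].start > addr)
			break;

		/* I/O-style mappings are rejected. */
		if (maps[i].special)
			break;

		/* This mapping reaches the end of the requested range. */
		if (maps[i].end >= end)
			return 0;

		/* Continue the walk from where this mapping stops. */
		addr = maps[i].end;
	}

	return -EFAULT;
}
```

As the comment in the ioctl notes, such a check only describes the address space at the moment of the call; mappings may change afterwards, so it works as an early sanity check rather than a guarantee.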
|
@ -104,8 +104,8 @@ static void fence_set_priority(struct dma_fence *fence,
|
|||
engine = rq->engine;
|
||||
|
||||
rcu_read_lock(); /* RCU serialisation for set-wedged protection */
|
||||
if (engine->schedule)
|
||||
engine->schedule(rq, attr);
|
||||
if (engine->sched_engine->schedule)
|
||||
engine->sched_engine->schedule(rq, attr);
|
||||
rcu_read_unlock();
|
||||
}
|
||||
|
||||
|
@ -290,3 +290,22 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
|
|||
i915_gem_object_put(obj);
|
||||
return ret;
|
||||
}
|
||||
|
||||
/**
|
||||
* i915_gem_object_wait_migration - Sync an accelerated migration operation
|
||||
* @obj: The migrating object.
|
||||
* @flags: waiting flags. Currently supports only I915_WAIT_INTERRUPTIBLE.
|
||||
*
|
||||
* Wait for any pending async migration operation on the object,
|
||||
* whether it's explicitly (i915_gem_object_migrate()) or implicitly
|
||||
* (swapin, initial clearing) initiated.
|
||||
*
|
||||
* Return: 0 if successful, -ERESTARTSYS if a signal was hit during waiting.
|
||||
*/
|
||||
int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,
|
||||
unsigned int flags)
|
||||
{
|
||||
might_sleep();
|
||||
/* NOP for now. */
|
||||
return 0;
|
||||
}
|
||||
|
|
|
@ -114,8 +114,8 @@ huge_gem_object(struct drm_i915_private *i915,
|
|||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
drm_gem_private_object_init(&i915->drm, &obj->base, dma_size);
|
||||
i915_gem_object_init(obj, &huge_ops, &lock_class,
|
||||
I915_BO_ALLOC_STRUCT_PAGE);
|
||||
i915_gem_object_init(obj, &huge_ops, &lock_class, 0);
|
||||
obj->mem_flags |= I915_BO_FLAG_STRUCT_PAGE;
|
||||
|
||||
obj->read_domains = I915_GEM_DOMAIN_CPU;
|
||||
obj->write_domain = I915_GEM_DOMAIN_CPU;
|
||||
|
|
|
@ -167,9 +167,8 @@ huge_pages_object(struct drm_i915_private *i915,
|
|||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
drm_gem_private_object_init(&i915->drm, &obj->base, size);
|
||||
i915_gem_object_init(obj, &huge_page_ops, &lock_class,
|
||||
I915_BO_ALLOC_STRUCT_PAGE);
|
||||
|
||||
i915_gem_object_init(obj, &huge_page_ops, &lock_class, 0);
|
||||
obj->mem_flags |= I915_BO_FLAG_STRUCT_PAGE;
|
||||
i915_gem_object_set_volatile(obj);
|
||||
|
||||
obj->write_domain = I915_GEM_DOMAIN_CPU;
|
||||
|
@ -497,7 +496,8 @@ static int igt_mock_memory_region_huge_pages(void *arg)
|
|||
int i;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(flags); ++i) {
|
||||
obj = i915_gem_object_create_region(mem, page_size,
|
||||
obj = i915_gem_object_create_region(mem,
|
||||
page_size, page_size,
|
||||
flags[i]);
|
||||
if (IS_ERR(obj)) {
|
||||
err = PTR_ERR(obj);
|
||||
|
|
|
@ -5,6 +5,7 @@
|
|||
|
||||
#include "i915_selftest.h"
|
||||
|
||||
#include "gt/intel_context.h"
|
||||
#include "gt/intel_engine_user.h"
|
||||
#include "gt/intel_gt.h"
|
||||
#include "gt/intel_gpu_commands.h"
|
||||
|
@ -16,118 +17,6 @@
|
|||
#include "huge_gem_object.h"
|
||||
#include "mock_context.h"
|
||||
|
||||
static int __igt_client_fill(struct intel_engine_cs *engine)
|
||||
{
|
||||
struct intel_context *ce = engine->kernel_context;
|
||||
struct drm_i915_gem_object *obj;
|
||||
I915_RND_STATE(prng);
|
||||
IGT_TIMEOUT(end);
|
||||
u32 *vaddr;
|
||||
int err = 0;
|
||||
|
||||
intel_engine_pm_get(engine);
|
||||
do {
|
||||
const u32 max_block_size = S16_MAX * PAGE_SIZE;
|
||||
u32 sz = min_t(u64, ce->vm->total >> 4, prandom_u32_state(&prng));
|
||||
u32 phys_sz = sz % (max_block_size + 1);
|
||||
u32 val = prandom_u32_state(&prng);
|
||||
u32 i;
|
||||
|
||||
sz = round_up(sz, PAGE_SIZE);
|
||||
phys_sz = round_up(phys_sz, PAGE_SIZE);
|
||||
|
||||
pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
|
||||
phys_sz, sz, val);
|
||||
|
||||
obj = huge_gem_object(engine->i915, phys_sz, sz);
|
||||
if (IS_ERR(obj)) {
|
||||
err = PTR_ERR(obj);
|
||||
goto err_flush;
|
||||
}
|
||||
|
||||
vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
|
||||
if (IS_ERR(vaddr)) {
|
||||
err = PTR_ERR(vaddr);
|
||||
goto err_put;
|
||||
}
|
||||
|
||||
/*
|
||||
* XXX: The goal is move this to get_pages, so try to dirty the
|
||||
* CPU cache first to check that we do the required clflush
|
||||
* before scheduling the blt for !llc platforms. This matches
|
||||
* some version of reality where at get_pages the pages
|
||||
* themselves may not yet be coherent with the GPU(swap-in). If
|
||||
* we are missing the flush then we should see the stale cache
|
||||
* values after we do the set_to_cpu_domain and pick it up as a
|
||||
* test failure.
|
||||
*/
|
||||
memset32(vaddr, val ^ 0xdeadbeaf,
|
||||
huge_gem_object_phys_size(obj) / sizeof(u32));
|
||||
|
||||
if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
|
||||
obj->cache_dirty = true;
|
||||
|
||||
err = i915_gem_schedule_fill_pages_blt(obj, ce, obj->mm.pages,
|
||||
&obj->mm.page_sizes,
|
||||
val);
|
||||
if (err)
|
||||
goto err_unpin;
|
||||
|
||||
i915_gem_object_lock(obj, NULL);
|
||||
err = i915_gem_object_set_to_cpu_domain(obj, false);
|
||||
i915_gem_object_unlock(obj);
|
||||
if (err)
|
||||
goto err_unpin;
|
||||
|
||||
for (i = 0; i < huge_gem_object_phys_size(obj) / sizeof(u32); ++i) {
|
||||
if (vaddr[i] != val) {
|
||||
pr_err("vaddr[%u]=%x, expected=%x\n", i,
|
||||
vaddr[i], val);
|
||||
err = -EINVAL;
|
||||
goto err_unpin;
|
||||
}
|
||||
}
|
||||
|
||||
i915_gem_object_unpin_map(obj);
|
||||
i915_gem_object_put(obj);
|
||||
} while (!time_after(jiffies, end));
|
||||
|
||||
goto err_flush;
|
||||
|
||||
err_unpin:
|
||||
i915_gem_object_unpin_map(obj);
|
||||
err_put:
|
||||
i915_gem_object_put(obj);
|
||||
err_flush:
|
||||
if (err == -ENOMEM)
|
||||
err = 0;
|
||||
intel_engine_pm_put(engine);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static int igt_client_fill(void *arg)
|
||||
{
|
||||
int inst = 0;
|
||||
|
||||
do {
|
||||
struct intel_engine_cs *engine;
|
||||
int err;
|
||||
|
||||
engine = intel_engine_lookup_user(arg,
|
||||
I915_ENGINE_CLASS_COPY,
|
||||
inst++);
|
||||
if (!engine)
|
||||
return 0;
|
||||
|
||||
err = __igt_client_fill(engine);
|
||||
if (err == -ENOMEM)
|
||||
err = 0;
|
||||
if (err)
|
||||
return err;
|
||||
} while (1);
|
||||
}
|
||||
|
||||
#define WIDTH 512
|
||||
#define HEIGHT 32
|
||||
|
||||
|
@ -693,7 +582,6 @@ static int igt_client_tiled_blits(void *arg)
|
|||
int i915_gem_client_blt_live_selftests(struct drm_i915_private *i915)
|
||||
{
|
||||
static const struct i915_subtest tests[] = {
|
||||
SUBTEST(igt_client_fill),
|
||||
SUBTEST(igt_client_tiled_blits),
|
||||
};
|
||||
|
||||
|
|
|
@ -680,7 +680,7 @@ static int igt_ctx_exec(void *arg)
|
|||
struct i915_gem_context *ctx;
|
||||
struct intel_context *ce;
|
||||
|
||||
ctx = kernel_context(i915);
|
||||
ctx = kernel_context(i915, NULL);
|
||||
if (IS_ERR(ctx)) {
|
||||
err = PTR_ERR(ctx);
|
||||
goto out_file;
|
||||
|
@ -813,16 +813,12 @@ static int igt_shared_ctx_exec(void *arg)
|
|||
struct i915_gem_context *ctx;
|
||||
struct intel_context *ce;
|
||||
|
||||
ctx = kernel_context(i915);
|
||||
ctx = kernel_context(i915, ctx_vm(parent));
|
||||
if (IS_ERR(ctx)) {
|
||||
err = PTR_ERR(ctx);
|
||||
goto out_test;
|
||||
}
|
||||
|
||||
mutex_lock(&ctx->mutex);
|
||||
__assign_ppgtt(ctx, ctx_vm(parent));
|
||||
mutex_unlock(&ctx->mutex);
|
||||
|
||||
ce = i915_gem_context_get_engine(ctx, engine->legacy_idx);
|
||||
GEM_BUG_ON(IS_ERR(ce));
|
||||
|
||||
|
@ -1875,125 +1871,6 @@ out_file:
|
|||
return err;
|
||||
}
|
||||
|
||||
static bool skip_unused_engines(struct intel_context *ce, void *data)
|
||||
{
|
||||
return !ce->state;
|
||||
}
|
||||
|
||||
static void mock_barrier_task(void *data)
|
||||
{
|
||||
unsigned int *counter = data;
|
||||
|
||||
++*counter;
|
||||
}
|
||||
|
||||
static int mock_context_barrier(void *arg)
|
||||
{
|
||||
#undef pr_fmt
|
||||
#define pr_fmt(x) "context_barrier_task():" # x
|
||||
struct drm_i915_private *i915 = arg;
|
||||
struct i915_gem_context *ctx;
|
||||
struct i915_request *rq;
|
||||
unsigned int counter;
|
||||
int err;
|
||||
|
||||
/*
|
||||
* The context barrier provides us with a callback after it emits
|
||||
* a request; useful for retiring old state after loading new.
|
||||
*/
|
||||
|
||||
ctx = mock_context(i915, "mock");
|
||||
if (!ctx)
|
||||
return -ENOMEM;
|
||||
|
||||
counter = 0;
|
||||
err = context_barrier_task(ctx, 0, NULL, NULL, NULL,
|
||||
mock_barrier_task, &counter);
|
||||
if (err) {
|
||||
pr_err("Failed at line %d, err=%d\n", __LINE__, err);
|
||||
goto out;
|
||||
}
|
||||
if (counter == 0) {
|
||||
pr_err("Did not retire immediately with 0 engines\n");
|
||||
err = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
|
||||
counter = 0;
|
||||
err = context_barrier_task(ctx, ALL_ENGINES, skip_unused_engines,
|
||||
NULL, NULL, mock_barrier_task, &counter);
|
||||
if (err) {
|
||||
pr_err("Failed at line %d, err=%d\n", __LINE__, err);
|
||||
goto out;
|
||||
}
|
||||
if (counter == 0) {
|
||||
pr_err("Did not retire immediately for all unused engines\n");
|
||||
err = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
|
||||
rq = igt_request_alloc(ctx, i915->gt.engine[RCS0]);
|
||||
if (IS_ERR(rq)) {
|
||||
pr_err("Request allocation failed!\n");
|
||||
goto out;
|
||||
}
|
||||
i915_request_add(rq);
|
||||
|
||||
counter = 0;
|
||||
context_barrier_inject_fault = BIT(RCS0);
|
||||
err = context_barrier_task(ctx, ALL_ENGINES, NULL, NULL, NULL,
|
||||
mock_barrier_task, &counter);
|
||||
context_barrier_inject_fault = 0;
|
||||
if (err == -ENXIO)
|
||||
err = 0;
|
||||
else
|
||||
pr_err("Did not hit fault injection!\n");
|
||||
if (counter != 0) {
|
||||
pr_err("Invoked callback on error!\n");
|
||||
err = -EIO;
|
||||
}
|
||||
if (err)
|
||||
goto out;
|
||||
|
||||
counter = 0;
|
||||
err = context_barrier_task(ctx, ALL_ENGINES, skip_unused_engines,
|
||||
NULL, NULL, mock_barrier_task, &counter);
|
||||
if (err) {
|
||||
pr_err("Failed at line %d, err=%d\n", __LINE__, err);
|
||||
goto out;
|
||||
}
|
||||
mock_device_flush(i915);
|
||||
if (counter == 0) {
|
||||
pr_err("Did not retire on each active engines\n");
|
||||
err = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
|
||||
out:
|
||||
mock_context_close(ctx);
|
||||
return err;
|
||||
#undef pr_fmt
|
||||
#define pr_fmt(x) x
|
||||
}
|
||||
|
||||
int i915_gem_context_mock_selftests(void)
|
||||
{
|
||||
static const struct i915_subtest tests[] = {
|
||||
SUBTEST(mock_context_barrier),
|
||||
};
|
||||
struct drm_i915_private *i915;
|
||||
int err;
|
||||
|
||||
i915 = mock_gem_device();
|
||||
if (!i915)
|
||||
return -ENOMEM;
|
||||
|
||||
err = i915_subtests(tests, i915);
|
||||
|
||||
mock_destroy_device(i915);
|
||||
return err;
|
||||
}
|
||||
|
||||
int i915_gem_context_live_selftests(struct drm_i915_private *i915)
|
||||
{
|
||||
static const struct i915_subtest tests[] = {
|
||||
|
|
|
@ -35,7 +35,7 @@ static int igt_dmabuf_export(void *arg)
|
|||
static int igt_dmabuf_import_self(void *arg)
|
||||
{
|
||||
struct drm_i915_private *i915 = arg;
|
||||
struct drm_i915_gem_object *obj;
|
||||
struct drm_i915_gem_object *obj, *import_obj;
|
||||
struct drm_gem_object *import;
|
||||
struct dma_buf *dmabuf;
|
||||
int err;
|
||||
|
@@ -65,10 +65,19 @@ static int igt_dmabuf_import_self(void *arg)
|
|||
err = -EINVAL;
|
||||
goto out_import;
|
||||
}
|
||||
import_obj = to_intel_bo(import);
|
||||
|
||||
i915_gem_object_lock(import_obj, NULL);
|
||||
err = __i915_gem_object_get_pages(import_obj);
|
||||
i915_gem_object_unlock(import_obj);
|
||||
if (err) {
|
||||
pr_err("Same object dma-buf get_pages failed!\n");
|
||||
goto out_import;
|
||||
}
|
||||
|
||||
err = 0;
|
||||
out_import:
|
||||
i915_gem_object_put(to_intel_bo(import));
|
||||
i915_gem_object_put(import_obj);
|
||||
out_dmabuf:
|
||||
dma_buf_put(dmabuf);
|
||||
out:
|
||||
|
@@ -76,6 +85,180 @@ out:
|
|||
return err;
|
||||
}
|
||||
|
||||
static int igt_dmabuf_import_same_driver_lmem(void *arg)
|
||||
{
|
||||
struct drm_i915_private *i915 = arg;
|
||||
struct intel_memory_region *lmem = i915->mm.regions[INTEL_REGION_LMEM];
|
||||
struct drm_i915_gem_object *obj;
|
||||
struct drm_gem_object *import;
|
||||
struct dma_buf *dmabuf;
|
||||
int err;
|
||||
|
||||
if (!lmem)
|
||||
return 0;
|
||||
|
||||
force_different_devices = true;
|
||||
|
||||
obj = __i915_gem_object_create_user(i915, PAGE_SIZE, &lmem, 1);
|
||||
if (IS_ERR(obj)) {
|
||||
pr_err("__i915_gem_object_create_user failed with err=%ld\n",
|
||||
PTR_ERR(obj));
|
||||
err = PTR_ERR(obj);
|
||||
goto out_ret;
|
||||
}
|
||||
|
||||
dmabuf = i915_gem_prime_export(&obj->base, 0);
|
||||
if (IS_ERR(dmabuf)) {
|
||||
pr_err("i915_gem_prime_export failed with err=%ld\n",
|
||||
PTR_ERR(dmabuf));
|
||||
err = PTR_ERR(dmabuf);
|
||||
goto out;
|
||||
}
|
||||
|
||||
/*
|
||||
* We expect an import of an LMEM-only object to fail with
|
||||
* -EOPNOTSUPP because it can't be migrated to SMEM.
|
||||
*/
|
||||
import = i915_gem_prime_import(&i915->drm, dmabuf);
|
||||
if (!IS_ERR(import)) {
|
||||
drm_gem_object_put(import);
|
||||
pr_err("i915_gem_prime_import succeeded when it shouldn't have\n");
|
||||
err = -EINVAL;
|
||||
} else if (PTR_ERR(import) != -EOPNOTSUPP) {
|
||||
pr_err("i915_gem_prime_import failed with the wrong err=%ld\n",
|
||||
PTR_ERR(import));
|
||||
err = PTR_ERR(import);
|
||||
}
|
||||
|
||||
dma_buf_put(dmabuf);
|
||||
out:
|
||||
i915_gem_object_put(obj);
|
||||
out_ret:
|
||||
force_different_devices = false;
|
||||
return err;
|
||||
}
|
||||
|
||||
static int igt_dmabuf_import_same_driver(struct drm_i915_private *i915,
|
||||
struct intel_memory_region **regions,
|
||||
unsigned int num_regions)
|
||||
{
|
||||
struct drm_i915_gem_object *obj, *import_obj;
|
||||
struct drm_gem_object *import;
|
||||
struct dma_buf *dmabuf;
|
||||
struct dma_buf_attachment *import_attach;
|
||||
struct sg_table *st;
|
||||
long timeout;
|
||||
int err;
|
||||
|
||||
force_different_devices = true;
|
||||
|
||||
obj = __i915_gem_object_create_user(i915, PAGE_SIZE,
|
||||
regions, num_regions);
|
||||
if (IS_ERR(obj)) {
|
||||
pr_err("__i915_gem_object_create_user failed with err=%ld\n",
|
||||
PTR_ERR(obj));
|
||||
err = PTR_ERR(obj);
|
||||
goto out_ret;
|
||||
}
|
||||
|
||||
dmabuf = i915_gem_prime_export(&obj->base, 0);
|
||||
if (IS_ERR(dmabuf)) {
|
||||
pr_err("i915_gem_prime_export failed with err=%ld\n",
|
||||
PTR_ERR(dmabuf));
|
||||
err = PTR_ERR(dmabuf);
|
||||
goto out;
|
||||
}
|
||||
|
||||
import = i915_gem_prime_import(&i915->drm, dmabuf);
|
||||
if (IS_ERR(import)) {
|
||||
pr_err("i915_gem_prime_import failed with err=%ld\n",
|
||||
PTR_ERR(import));
|
||||
err = PTR_ERR(import);
|
||||
goto out_dmabuf;
|
||||
}
|
||||
|
||||
if (import == &obj->base) {
|
||||
pr_err("i915_gem_prime_import reused gem object!\n");
|
||||
err = -EINVAL;
|
||||
goto out_import;
|
||||
}
|
||||
|
||||
import_obj = to_intel_bo(import);
|
||||
|
||||
i915_gem_object_lock(import_obj, NULL);
|
||||
err = __i915_gem_object_get_pages(import_obj);
|
||||
if (err) {
|
||||
pr_err("Different objects dma-buf get_pages failed!\n");
|
||||
i915_gem_object_unlock(import_obj);
|
||||
goto out_import;
|
||||
}
|
||||
|
||||
/*
|
||||
* If the exported object is not in system memory, something
|
||||
* weird is going on. TODO: When p2p is supported, this is no
|
||||
* longer considered weird.
|
||||
*/
|
||||
if (obj->mm.region != i915->mm.regions[INTEL_REGION_SMEM]) {
|
||||
pr_err("Exported dma-buf is not in system memory\n");
|
||||
err = -EINVAL;
|
||||
}
|
||||
|
||||
i915_gem_object_unlock(import_obj);
|
||||
|
||||
/* Now try a fake an importer */
|
||||
import_attach = dma_buf_attach(dmabuf, obj->base.dev->dev);
|
||||
if (IS_ERR(import_attach)) {
|
||||
err = PTR_ERR(import_attach);
|
||||
goto out_import;
|
||||
}
|
||||
|
||||
st = dma_buf_map_attachment(import_attach, DMA_BIDIRECTIONAL);
|
||||
if (IS_ERR(st)) {
|
||||
err = PTR_ERR(st);
|
||||
goto out_detach;
|
||||
}
|
||||
|
||||
timeout = dma_resv_wait_timeout(dmabuf->resv, false, true, 5 * HZ);
|
||||
if (!timeout) {
|
||||
pr_err("dmabuf wait for exclusive fence timed out.\n");
|
||||
timeout = -ETIME;
|
||||
}
|
||||
err = timeout > 0 ? 0 : timeout;
|
||||
dma_buf_unmap_attachment(import_attach, st, DMA_BIDIRECTIONAL);
|
||||
out_detach:
|
||||
dma_buf_detach(dmabuf, import_attach);
|
||||
out_import:
|
||||
i915_gem_object_put(import_obj);
|
||||
out_dmabuf:
|
||||
dma_buf_put(dmabuf);
|
||||
out:
|
||||
i915_gem_object_put(obj);
|
||||
out_ret:
|
||||
force_different_devices = false;
|
||||
return err;
|
||||
}
|
||||
|
||||
static int igt_dmabuf_import_same_driver_smem(void *arg)
|
||||
{
|
||||
struct drm_i915_private *i915 = arg;
|
||||
struct intel_memory_region *smem = i915->mm.regions[INTEL_REGION_SMEM];
|
||||
|
||||
return igt_dmabuf_import_same_driver(i915, &smem, 1);
|
||||
}
|
||||
|
||||
static int igt_dmabuf_import_same_driver_lmem_smem(void *arg)
|
||||
{
|
||||
struct drm_i915_private *i915 = arg;
|
||||
struct intel_memory_region *regions[2];
|
||||
|
||||
if (!i915->mm.regions[INTEL_REGION_LMEM])
|
||||
return 0;
|
||||
|
||||
regions[0] = i915->mm.regions[INTEL_REGION_LMEM];
|
||||
regions[1] = i915->mm.regions[INTEL_REGION_SMEM];
|
||||
return igt_dmabuf_import_same_driver(i915, regions, 2);
|
||||
}
|
||||
|
||||
static int igt_dmabuf_import(void *arg)
|
||||
{
|
||||
struct drm_i915_private *i915 = arg;
|
||||
|
@@ -286,6 +469,9 @@ int i915_gem_dmabuf_live_selftests(struct drm_i915_private *i915)
|
|||
{
|
||||
static const struct i915_subtest tests[] = {
|
||||
SUBTEST(igt_dmabuf_export),
|
||||
SUBTEST(igt_dmabuf_import_same_driver_lmem),
|
||||
SUBTEST(igt_dmabuf_import_same_driver_smem),
|
||||
SUBTEST(igt_dmabuf_import_same_driver_lmem_smem),
|
||||
};
|
||||
|
||||
return i915_subtests(tests, i915);
|
||||
|
|
|
@@ -0,0 +1,243 @@
|
|||
// SPDX-License-Identifier: MIT
|
||||
/*
|
||||
* Copyright © 2020-2021 Intel Corporation
|
||||
*/
|
||||
|
||||
#include "gt/intel_migrate.h"
|
||||
|
||||
static int igt_fill_check_buffer(struct drm_i915_gem_object *obj,
|
||||
bool fill)
|
||||
{
|
||||
struct drm_i915_private *i915 = to_i915(obj->base.dev);
|
||||
unsigned int i, count = obj->base.size / sizeof(u32);
|
||||
enum i915_map_type map_type =
|
||||
i915_coherent_map_type(i915, obj, false);
|
||||
u32 *cur;
|
||||
int err = 0;
|
||||
|
||||
assert_object_held(obj);
|
||||
cur = i915_gem_object_pin_map(obj, map_type);
|
||||
if (IS_ERR(cur))
|
||||
return PTR_ERR(cur);
|
||||
|
||||
if (fill)
|
||||
for (i = 0; i < count; ++i)
|
||||
*cur++ = i;
|
||||
else
|
||||
for (i = 0; i < count; ++i)
|
||||
if (*cur++ != i) {
|
||||
pr_err("Object content mismatch at location %d of %d\n", i, count);
|
||||
err = -EINVAL;
|
||||
break;
|
||||
}
|
||||
|
||||
i915_gem_object_unpin_map(obj);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static int igt_create_migrate(struct intel_gt *gt, enum intel_region_id src,
|
||||
enum intel_region_id dst)
|
||||
{
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
struct intel_memory_region *src_mr = i915->mm.regions[src];
|
||||
struct drm_i915_gem_object *obj;
|
||||
struct i915_gem_ww_ctx ww;
|
||||
int err = 0;
|
||||
|
||||
GEM_BUG_ON(!src_mr);
|
||||
|
||||
/* Switch object backing-store on create */
|
||||
obj = i915_gem_object_create_region(src_mr, PAGE_SIZE, 0, 0);
|
||||
if (IS_ERR(obj))
|
||||
return PTR_ERR(obj);
|
||||
|
||||
for_i915_gem_ww(&ww, err, true) {
|
||||
err = i915_gem_object_lock(obj, &ww);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
err = igt_fill_check_buffer(obj, true);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
err = i915_gem_object_migrate(obj, &ww, dst);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
err = i915_gem_object_pin_pages(obj);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
if (i915_gem_object_can_migrate(obj, src))
|
||||
err = -EINVAL;
|
||||
|
||||
i915_gem_object_unpin_pages(obj);
|
||||
err = i915_gem_object_wait_migration(obj, true);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
err = igt_fill_check_buffer(obj, false);
|
||||
}
|
||||
i915_gem_object_put(obj);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static int igt_smem_create_migrate(void *arg)
|
||||
{
|
||||
return igt_create_migrate(arg, INTEL_REGION_LMEM, INTEL_REGION_SMEM);
|
||||
}
|
||||
|
||||
static int igt_lmem_create_migrate(void *arg)
|
||||
{
|
||||
return igt_create_migrate(arg, INTEL_REGION_SMEM, INTEL_REGION_LMEM);
|
||||
}
|
||||
|
||||
static int igt_same_create_migrate(void *arg)
|
||||
{
|
||||
return igt_create_migrate(arg, INTEL_REGION_LMEM, INTEL_REGION_LMEM);
|
||||
}
|
||||
|
||||
static int lmem_pages_migrate_one(struct i915_gem_ww_ctx *ww,
|
||||
struct drm_i915_gem_object *obj)
|
||||
{
|
||||
int err;
|
||||
|
||||
err = i915_gem_object_lock(obj, ww);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
if (i915_gem_object_is_lmem(obj)) {
|
||||
err = i915_gem_object_migrate(obj, ww, INTEL_REGION_SMEM);
|
||||
if (err) {
|
||||
pr_err("Object failed migration to smem\n");
|
||||
if (err)
|
||||
return err;
|
||||
}
|
||||
|
||||
if (i915_gem_object_is_lmem(obj)) {
|
||||
pr_err("object still backed by lmem\n");
|
||||
err = -EINVAL;
|
||||
}
|
||||
|
||||
if (!i915_gem_object_has_struct_page(obj)) {
|
||||
pr_err("object not backed by struct page\n");
|
||||
err = -EINVAL;
|
||||
}
|
||||
|
||||
} else {
|
||||
err = i915_gem_object_migrate(obj, ww, INTEL_REGION_LMEM);
|
||||
if (err) {
|
||||
pr_err("Object failed migration to lmem\n");
|
||||
if (err)
|
||||
return err;
|
||||
}
|
||||
|
||||
if (i915_gem_object_has_struct_page(obj)) {
|
||||
pr_err("object still backed by struct page\n");
|
||||
err = -EINVAL;
|
||||
}
|
||||
|
||||
if (!i915_gem_object_is_lmem(obj)) {
|
||||
pr_err("object not backed by lmem\n");
|
||||
err = -EINVAL;
|
||||
}
|
||||
}
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static int igt_lmem_pages_migrate(void *arg)
|
||||
{
|
||||
struct intel_gt *gt = arg;
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
struct drm_i915_gem_object *obj;
|
||||
struct i915_gem_ww_ctx ww;
|
||||
struct i915_request *rq;
|
||||
int err;
|
||||
int i;
|
||||
|
||||
/* From LMEM to shmem and back again */
|
||||
|
||||
obj = i915_gem_object_create_lmem(i915, SZ_2M, 0);
|
||||
if (IS_ERR(obj))
|
||||
return PTR_ERR(obj);
|
||||
|
||||
/* Initial GPU fill, sync, CPU initialization. */
|
||||
for_i915_gem_ww(&ww, err, true) {
|
||||
err = i915_gem_object_lock(obj, &ww);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
err = ____i915_gem_object_get_pages(obj);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
err = intel_migrate_clear(&gt->migrate, &ww, NULL,
|
||||
obj->mm.pages->sgl, obj->cache_level,
|
||||
i915_gem_object_is_lmem(obj),
|
||||
0xdeadbeaf, &rq);
|
||||
if (rq) {
|
||||
dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
|
||||
i915_request_put(rq);
|
||||
}
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
err = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE,
|
||||
5 * HZ);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
err = igt_fill_check_buffer(obj, true);
|
||||
if (err)
|
||||
continue;
|
||||
}
|
||||
if (err)
|
||||
goto out_put;
|
||||
|
||||
/*
|
||||
* Migrate to and from smem without explicitly syncing.
|
||||
* Finalize with data in smem for fast readout.
|
||||
*/
|
||||
for (i = 1; i <= 5; ++i) {
|
||||
for_i915_gem_ww(&ww, err, true)
|
||||
err = lmem_pages_migrate_one(&ww, obj);
|
||||
if (err)
|
||||
goto out_put;
|
||||
}
|
||||
|
||||
err = i915_gem_object_lock_interruptible(obj, NULL);
|
||||
if (err)
|
||||
goto out_put;
|
||||
|
||||
/* Finally sync migration and check content. */
|
||||
err = i915_gem_object_wait_migration(obj, true);
|
||||
if (err)
|
||||
goto out_unlock;
|
||||
|
||||
err = igt_fill_check_buffer(obj, false);
|
||||
|
||||
out_unlock:
|
||||
i915_gem_object_unlock(obj);
|
||||
out_put:
|
||||
i915_gem_object_put(obj);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
int i915_gem_migrate_live_selftests(struct drm_i915_private *i915)
|
||||
{
|
||||
static const struct i915_subtest tests[] = {
|
||||
SUBTEST(igt_smem_create_migrate),
|
||||
SUBTEST(igt_lmem_create_migrate),
|
||||
SUBTEST(igt_same_create_migrate),
|
||||
SUBTEST(igt_lmem_pages_migrate),
|
||||
};
|
||||
|
||||
if (!HAS_LMEM(i915))
|
||||
return 0;
|
||||
|
||||
return intel_gt_live_subtests(tests, &i915->gt);
|
||||
}
|
|
@@ -573,6 +573,14 @@ err:
|
|||
return 0;
|
||||
}
|
||||
|
||||
static enum i915_mmap_type default_mapping(struct drm_i915_private *i915)
|
||||
{
|
||||
if (HAS_LMEM(i915))
|
||||
return I915_MMAP_TYPE_FIXED;
|
||||
|
||||
return I915_MMAP_TYPE_GTT;
|
||||
}
|
||||
|
||||
static bool assert_mmap_offset(struct drm_i915_private *i915,
|
||||
unsigned long size,
|
||||
int expected)
|
||||
|
@@ -585,7 +593,7 @@ static bool assert_mmap_offset(struct drm_i915_private *i915,
|
|||
if (IS_ERR(obj))
|
||||
return expected && expected == PTR_ERR(obj);
|
||||
|
||||
ret = __assign_mmap_offset(obj, I915_MMAP_TYPE_GTT, &offset, NULL);
|
||||
ret = __assign_mmap_offset(obj, default_mapping(i915), &offset, NULL);
|
||||
i915_gem_object_put(obj);
|
||||
|
||||
return ret == expected;
|
||||
|
@@ -689,7 +697,7 @@ static int igt_mmap_offset_exhaustion(void *arg)
|
|||
goto out;
|
||||
}
|
||||
|
||||
err = __assign_mmap_offset(obj, I915_MMAP_TYPE_GTT, &offset, NULL);
|
||||
err = __assign_mmap_offset(obj, default_mapping(i915), &offset, NULL);
|
||||
if (err) {
|
||||
pr_err("Unable to insert object into reclaimed hole\n");
|
||||
goto err_obj;
|
||||
|
@@ -831,34 +839,25 @@ static int wc_check(struct drm_i915_gem_object *obj)
|
|||
|
||||
static bool can_mmap(struct drm_i915_gem_object *obj, enum i915_mmap_type type)
|
||||
{
|
||||
struct drm_i915_private *i915 = to_i915(obj->base.dev);
|
||||
bool no_map;
|
||||
|
||||
if (HAS_LMEM(i915))
|
||||
return type == I915_MMAP_TYPE_FIXED;
|
||||
else if (type == I915_MMAP_TYPE_FIXED)
|
||||
return false;
|
||||
|
||||
if (type == I915_MMAP_TYPE_GTT &&
|
||||
!i915_ggtt_has_aperture(&to_i915(obj->base.dev)->ggtt))
|
||||
return false;
|
||||
|
||||
if (type != I915_MMAP_TYPE_GTT &&
|
||||
!i915_gem_object_has_struct_page(obj) &&
|
||||
!i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM))
|
||||
return false;
|
||||
i915_gem_object_lock(obj, NULL);
|
||||
no_map = (type != I915_MMAP_TYPE_GTT &&
|
||||
!i915_gem_object_has_struct_page(obj) &&
|
||||
!i915_gem_object_has_iomem(obj));
|
||||
i915_gem_object_unlock(obj);
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
static void object_set_placements(struct drm_i915_gem_object *obj,
|
||||
struct intel_memory_region **placements,
|
||||
unsigned int n_placements)
|
||||
{
|
||||
GEM_BUG_ON(!n_placements);
|
||||
|
||||
if (n_placements == 1) {
|
||||
struct drm_i915_private *i915 = to_i915(obj->base.dev);
|
||||
struct intel_memory_region *mr = placements[0];
|
||||
|
||||
obj->mm.placements = &i915->mm.regions[mr->id];
|
||||
obj->mm.n_placements = 1;
|
||||
} else {
|
||||
obj->mm.placements = placements;
|
||||
obj->mm.n_placements = n_placements;
|
||||
}
|
||||
return !no_map;
|
||||
}
|
||||
|
||||
#define expand32(x) (((x) << 0) | ((x) << 8) | ((x) << 16) | ((x) << 24))
|
||||
|
@@ -955,18 +954,18 @@ static int igt_mmap(void *arg)
|
|||
struct drm_i915_gem_object *obj;
|
||||
int err;
|
||||
|
||||
obj = i915_gem_object_create_region(mr, sizes[i], I915_BO_ALLOC_USER);
|
||||
obj = __i915_gem_object_create_user(i915, sizes[i], &mr, 1);
|
||||
if (obj == ERR_PTR(-ENODEV))
|
||||
continue;
|
||||
|
||||
if (IS_ERR(obj))
|
||||
return PTR_ERR(obj);
|
||||
|
||||
object_set_placements(obj, &mr, 1);
|
||||
|
||||
err = __igt_mmap(i915, obj, I915_MMAP_TYPE_GTT);
|
||||
if (err == 0)
|
||||
err = __igt_mmap(i915, obj, I915_MMAP_TYPE_WC);
|
||||
if (err == 0)
|
||||
err = __igt_mmap(i915, obj, I915_MMAP_TYPE_FIXED);
|
||||
|
||||
i915_gem_object_put(obj);
|
||||
if (err)
|
||||
|
@@ -984,14 +983,21 @@ static const char *repr_mmap_type(enum i915_mmap_type type)
|
|||
case I915_MMAP_TYPE_WB: return "wb";
|
||||
case I915_MMAP_TYPE_WC: return "wc";
|
||||
case I915_MMAP_TYPE_UC: return "uc";
|
||||
case I915_MMAP_TYPE_FIXED: return "fixed";
|
||||
default: return "unknown";
|
||||
}
|
||||
}
|
||||
|
||||
static bool can_access(const struct drm_i915_gem_object *obj)
|
||||
static bool can_access(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
return i915_gem_object_has_struct_page(obj) ||
|
||||
i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM);
|
||||
bool access;
|
||||
|
||||
i915_gem_object_lock(obj, NULL);
|
||||
access = i915_gem_object_has_struct_page(obj) ||
|
||||
i915_gem_object_has_iomem(obj);
|
||||
i915_gem_object_unlock(obj);
|
||||
|
||||
return access;
|
||||
}
|
||||
|
||||
static int __igt_mmap_access(struct drm_i915_private *i915,
|
||||
|
@@ -1075,15 +1081,13 @@ static int igt_mmap_access(void *arg)
|
|||
struct drm_i915_gem_object *obj;
|
||||
int err;
|
||||
|
||||
obj = i915_gem_object_create_region(mr, PAGE_SIZE, I915_BO_ALLOC_USER);
|
||||
obj = __i915_gem_object_create_user(i915, PAGE_SIZE, &mr, 1);
|
||||
if (obj == ERR_PTR(-ENODEV))
|
||||
continue;
|
||||
|
||||
if (IS_ERR(obj))
|
||||
return PTR_ERR(obj);
|
||||
|
||||
object_set_placements(obj, &mr, 1);
|
||||
|
||||
err = __igt_mmap_access(i915, obj, I915_MMAP_TYPE_GTT);
|
||||
if (err == 0)
|
||||
err = __igt_mmap_access(i915, obj, I915_MMAP_TYPE_WB);
|
||||
|
@@ -1091,6 +1095,8 @@ static int igt_mmap_access(void *arg)
|
|||
err = __igt_mmap_access(i915, obj, I915_MMAP_TYPE_WC);
|
||||
if (err == 0)
|
||||
err = __igt_mmap_access(i915, obj, I915_MMAP_TYPE_UC);
|
||||
if (err == 0)
|
||||
err = __igt_mmap_access(i915, obj, I915_MMAP_TYPE_FIXED);
|
||||
|
||||
i915_gem_object_put(obj);
|
||||
if (err)
|
||||
|
@@ -1220,18 +1226,18 @@ static int igt_mmap_gpu(void *arg)
|
|||
struct drm_i915_gem_object *obj;
|
||||
int err;
|
||||
|
||||
obj = i915_gem_object_create_region(mr, PAGE_SIZE, I915_BO_ALLOC_USER);
|
||||
obj = __i915_gem_object_create_user(i915, PAGE_SIZE, &mr, 1);
|
||||
if (obj == ERR_PTR(-ENODEV))
|
||||
continue;
|
||||
|
||||
if (IS_ERR(obj))
|
||||
return PTR_ERR(obj);
|
||||
|
||||
object_set_placements(obj, &mr, 1);
|
||||
|
||||
err = __igt_mmap_gpu(i915, obj, I915_MMAP_TYPE_GTT);
|
||||
if (err == 0)
|
||||
err = __igt_mmap_gpu(i915, obj, I915_MMAP_TYPE_WC);
|
||||
if (err == 0)
|
||||
err = __igt_mmap_gpu(i915, obj, I915_MMAP_TYPE_FIXED);
|
||||
|
||||
i915_gem_object_put(obj);
|
||||
if (err)
|
||||
|
@@ -1375,18 +1381,18 @@ static int igt_mmap_revoke(void *arg)
|
|||
struct drm_i915_gem_object *obj;
|
||||
int err;
|
||||
|
||||
obj = i915_gem_object_create_region(mr, PAGE_SIZE, I915_BO_ALLOC_USER);
|
||||
obj = __i915_gem_object_create_user(i915, PAGE_SIZE, &mr, 1);
|
||||
if (obj == ERR_PTR(-ENODEV))
|
||||
continue;
|
||||
|
||||
if (IS_ERR(obj))
|
||||
return PTR_ERR(obj);
|
||||
|
||||
object_set_placements(obj, &mr, 1);
|
||||
|
||||
err = __igt_mmap_revoke(i915, obj, I915_MMAP_TYPE_GTT);
|
||||
if (err == 0)
|
||||
err = __igt_mmap_revoke(i915, obj, I915_MMAP_TYPE_WC);
|
||||
if (err == 0)
|
||||
err = __igt_mmap_revoke(i915, obj, I915_MMAP_TYPE_FIXED);
|
||||
|
||||
i915_gem_object_put(obj);
|
||||
if (err)
|
||||
|
|
|
@@ -1,597 +0,0 @@
|
|||
// SPDX-License-Identifier: MIT
|
||||
/*
|
||||
* Copyright © 2019 Intel Corporation
|
||||
*/
|
||||
|
||||
#include <linux/sort.h>
|
||||
|
||||
#include "gt/intel_gt.h"
|
||||
#include "gt/intel_engine_user.h"
|
||||
|
||||
#include "i915_selftest.h"
|
||||
|
||||
#include "gem/i915_gem_context.h"
|
||||
#include "selftests/igt_flush_test.h"
|
||||
#include "selftests/i915_random.h"
|
||||
#include "selftests/mock_drm.h"
|
||||
#include "huge_gem_object.h"
|
||||
#include "mock_context.h"
|
||||
|
||||
static int wrap_ktime_compare(const void *A, const void *B)
|
||||
{
|
||||
const ktime_t *a = A, *b = B;
|
||||
|
||||
return ktime_compare(*a, *b);
|
||||
}
|
||||
|
||||
static int __perf_fill_blt(struct drm_i915_gem_object *obj)
|
||||
{
|
||||
struct drm_i915_private *i915 = to_i915(obj->base.dev);
|
||||
int inst = 0;
|
||||
|
||||
do {
|
||||
struct intel_engine_cs *engine;
|
||||
ktime_t t[5];
|
||||
int pass;
|
||||
int err;
|
||||
|
||||
engine = intel_engine_lookup_user(i915,
|
||||
I915_ENGINE_CLASS_COPY,
|
||||
inst++);
|
||||
if (!engine)
|
||||
return 0;
|
||||
|
||||
intel_engine_pm_get(engine);
|
||||
for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
|
||||
struct intel_context *ce = engine->kernel_context;
|
||||
ktime_t t0, t1;
|
||||
|
||||
t0 = ktime_get();
|
||||
|
||||
err = i915_gem_object_fill_blt(obj, ce, 0);
|
||||
if (err)
|
||||
break;
|
||||
|
||||
err = i915_gem_object_wait(obj,
|
||||
I915_WAIT_ALL,
|
||||
MAX_SCHEDULE_TIMEOUT);
|
||||
if (err)
|
||||
break;
|
||||
|
||||
t1 = ktime_get();
|
||||
t[pass] = ktime_sub(t1, t0);
|
||||
}
|
||||
intel_engine_pm_put(engine);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
|
||||
pr_info("%s: blt %zd KiB fill: %lld MiB/s\n",
|
||||
engine->name,
|
||||
obj->base.size >> 10,
|
||||
div64_u64(mul_u32_u32(4 * obj->base.size,
|
||||
1000 * 1000 * 1000),
|
||||
t[1] + 2 * t[2] + t[3]) >> 20);
|
||||
} while (1);
|
||||
}
|
||||
|
||||
static int perf_fill_blt(void *arg)
|
||||
{
|
||||
struct drm_i915_private *i915 = arg;
|
||||
static const unsigned long sizes[] = {
|
||||
SZ_4K,
|
||||
SZ_64K,
|
||||
SZ_2M,
|
||||
SZ_64M
|
||||
};
|
||||
int i;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(sizes); i++) {
|
||||
struct drm_i915_gem_object *obj;
|
||||
int err;
|
||||
|
||||
obj = i915_gem_object_create_internal(i915, sizes[i]);
|
||||
if (IS_ERR(obj))
|
||||
return PTR_ERR(obj);
|
||||
|
||||
err = __perf_fill_blt(obj);
|
||||
i915_gem_object_put(obj);
|
||||
if (err)
|
||||
return err;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int __perf_copy_blt(struct drm_i915_gem_object *src,
|
||||
struct drm_i915_gem_object *dst)
|
||||
{
|
||||
struct drm_i915_private *i915 = to_i915(src->base.dev);
|
||||
int inst = 0;
|
||||
|
||||
do {
|
||||
struct intel_engine_cs *engine;
|
||||
ktime_t t[5];
|
||||
int pass;
|
||||
int err = 0;
|
||||
|
||||
engine = intel_engine_lookup_user(i915,
|
||||
I915_ENGINE_CLASS_COPY,
|
||||
inst++);
|
||||
if (!engine)
|
||||
return 0;
|
||||
|
||||
intel_engine_pm_get(engine);
|
||||
for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
|
||||
struct intel_context *ce = engine->kernel_context;
|
||||
ktime_t t0, t1;
|
||||
|
||||
t0 = ktime_get();
|
||||
|
||||
err = i915_gem_object_copy_blt(src, dst, ce);
|
||||
if (err)
|
||||
break;
|
||||
|
||||
err = i915_gem_object_wait(dst,
|
||||
I915_WAIT_ALL,
|
||||
MAX_SCHEDULE_TIMEOUT);
|
||||
if (err)
|
||||
break;
|
||||
|
||||
t1 = ktime_get();
|
||||
t[pass] = ktime_sub(t1, t0);
|
||||
}
|
||||
intel_engine_pm_put(engine);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
|
||||
pr_info("%s: blt %zd KiB copy: %lld MiB/s\n",
|
||||
engine->name,
|
||||
src->base.size >> 10,
|
||||
div64_u64(mul_u32_u32(4 * src->base.size,
|
||||
1000 * 1000 * 1000),
|
||||
t[1] + 2 * t[2] + t[3]) >> 20);
|
||||
} while (1);
|
||||
}
|
||||
|
||||
static int perf_copy_blt(void *arg)
|
||||
{
|
||||
struct drm_i915_private *i915 = arg;
|
||||
static const unsigned long sizes[] = {
|
||||
SZ_4K,
|
||||
SZ_64K,
|
||||
SZ_2M,
|
||||
SZ_64M
|
||||
};
|
||||
int i;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(sizes); i++) {
|
||||
struct drm_i915_gem_object *src, *dst;
|
||||
int err;
|
||||
|
||||
src = i915_gem_object_create_internal(i915, sizes[i]);
|
||||
if (IS_ERR(src))
|
||||
return PTR_ERR(src);
|
||||
|
||||
dst = i915_gem_object_create_internal(i915, sizes[i]);
|
||||
if (IS_ERR(dst)) {
|
||||
err = PTR_ERR(dst);
|
||||
goto err_src;
|
||||
}
|
||||
|
||||
err = __perf_copy_blt(src, dst);
|
||||
|
||||
i915_gem_object_put(dst);
|
||||
err_src:
|
||||
i915_gem_object_put(src);
|
||||
if (err)
|
||||
return err;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
struct igt_thread_arg {
|
||||
struct intel_engine_cs *engine;
|
||||
struct i915_gem_context *ctx;
|
||||
struct file *file;
|
||||
struct rnd_state prng;
|
||||
unsigned int n_cpus;
|
||||
};
|
||||
|
||||
static int igt_fill_blt_thread(void *arg)
|
||||
{
|
||||
struct igt_thread_arg *thread = arg;
|
||||
struct intel_engine_cs *engine = thread->engine;
|
||||
struct rnd_state *prng = &thread->prng;
|
||||
struct drm_i915_gem_object *obj;
|
||||
struct i915_gem_context *ctx;
|
||||
struct intel_context *ce;
|
||||
unsigned int prio;
|
||||
IGT_TIMEOUT(end);
|
||||
u64 total, max;
|
||||
int err;
|
||||
|
||||
ctx = thread->ctx;
|
||||
if (!ctx) {
|
||||
ctx = live_context_for_engine(engine, thread->file);
|
||||
if (IS_ERR(ctx))
|
||||
return PTR_ERR(ctx);
|
||||
|
||||
prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
|
||||
ctx->sched.priority = prio;
|
||||
}
|
||||
|
||||
ce = i915_gem_context_get_engine(ctx, 0);
|
||||
GEM_BUG_ON(IS_ERR(ce));
|
||||
|
||||
/*
|
||||
* If we have a tiny shared address space, like for the GGTT
|
||||
* then we can't be too greedy.
|
||||
*/
|
||||
max = ce->vm->total;
|
||||
if (i915_is_ggtt(ce->vm) || thread->ctx)
|
||||
max = div_u64(max, thread->n_cpus);
|
||||
max >>= 4;
|
||||
|
||||
total = PAGE_SIZE;
|
||||
do {
|
||||
/* Aim to keep the runtime under reasonable bounds! */
|
||||
const u32 max_phys_size = SZ_64K;
|
||||
u32 val = prandom_u32_state(prng);
|
||||
u32 phys_sz;
|
||||
u32 sz;
|
||||
u32 *vaddr;
|
||||
u32 i;
|
||||
|
||||
total = min(total, max);
|
||||
sz = i915_prandom_u32_max_state(total, prng) + 1;
|
||||
phys_sz = sz % max_phys_size + 1;
|
||||
|
||||
sz = round_up(sz, PAGE_SIZE);
|
||||
phys_sz = round_up(phys_sz, PAGE_SIZE);
|
||||
phys_sz = min(phys_sz, sz);
|
||||
|
||||
pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
|
||||
phys_sz, sz, val);
|
||||
|
||||
obj = huge_gem_object(engine->i915, phys_sz, sz);
|
||||
if (IS_ERR(obj)) {
|
||||
err = PTR_ERR(obj);
|
||||
goto err_flush;
|
||||
}
|
||||
|
||||
vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
|
||||
if (IS_ERR(vaddr)) {
|
||||
err = PTR_ERR(vaddr);
|
||||
goto err_put;
|
||||
}
|
||||
|
||||
/*
|
||||
* Make sure the potentially async clflush does its job, if
|
||||
* required.
|
||||
*/
|
||||
memset32(vaddr, val ^ 0xdeadbeaf,
|
||||
huge_gem_object_phys_size(obj) / sizeof(u32));
|
||||
|
||||
if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
|
||||
obj->cache_dirty = true;
|
||||
|
||||
err = i915_gem_object_fill_blt(obj, ce, val);
|
||||
if (err)
|
||||
goto err_unpin;
|
||||
|
||||
err = i915_gem_object_wait(obj, 0, MAX_SCHEDULE_TIMEOUT);
|
||||
if (err)
|
||||
goto err_unpin;
|
||||
|
||||
for (i = 0; i < huge_gem_object_phys_size(obj) / sizeof(u32); i += 17) {
|
||||
if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
|
||||
drm_clflush_virt_range(&vaddr[i], sizeof(vaddr[i]));
|
||||
|
||||
if (vaddr[i] != val) {
|
||||
pr_err("vaddr[%u]=%x, expected=%x\n", i,
|
||||
vaddr[i], val);
|
||||
err = -EINVAL;
|
||||
goto err_unpin;
|
||||
}
|
||||
}
|
||||
|
||||
i915_gem_object_unpin_map(obj);
|
||||
i915_gem_object_put(obj);
|
||||
|
||||
total <<= 1;
|
||||
} while (!time_after(jiffies, end));
|
||||
|
||||
goto err_flush;
|
||||
|
||||
err_unpin:
|
||||
i915_gem_object_unpin_map(obj);
|
||||
err_put:
|
||||
i915_gem_object_put(obj);
|
||||
err_flush:
|
||||
if (err == -ENOMEM)
|
||||
err = 0;
|
||||
|
||||
intel_context_put(ce);
|
||||
return err;
|
||||
}
|
||||
|
||||
static int igt_copy_blt_thread(void *arg)
|
||||
{
|
||||
struct igt_thread_arg *thread = arg;
|
||||
struct intel_engine_cs *engine = thread->engine;
|
||||
struct rnd_state *prng = &thread->prng;
|
||||
struct drm_i915_gem_object *src, *dst;
|
||||
struct i915_gem_context *ctx;
|
||||
struct intel_context *ce;
|
||||
unsigned int prio;
|
||||
IGT_TIMEOUT(end);
|
||||
u64 total, max;
|
||||
int err;
|
||||
|
||||
ctx = thread->ctx;
|
||||
if (!ctx) {
|
||||
ctx = live_context_for_engine(engine, thread->file);
|
||||
if (IS_ERR(ctx))
|
||||
return PTR_ERR(ctx);
|
||||
|
||||
prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
|
||||
ctx->sched.priority = prio;
|
||||
}
|
||||
|
||||
ce = i915_gem_context_get_engine(ctx, 0);
|
||||
GEM_BUG_ON(IS_ERR(ce));
|
||||
|
||||
/*
|
||||
* If we have a tiny shared address space, like for the GGTT
|
||||
* then we can't be too greedy.
|
||||
*/
|
||||
max = ce->vm->total;
|
||||
if (i915_is_ggtt(ce->vm) || thread->ctx)
|
||||
max = div_u64(max, thread->n_cpus);
|
||||
max >>= 4;
|
||||
|
||||
total = PAGE_SIZE;
|
||||
do {
|
||||
/* Aim to keep the runtime under reasonable bounds! */
|
||||
const u32 max_phys_size = SZ_64K;
|
||||
u32 val = prandom_u32_state(prng);
|
||||
u32 phys_sz;
|
||||
u32 sz;
|
||||
u32 *vaddr;
|
||||
u32 i;
|
||||
|
||||
total = min(total, max);
|
||||
sz = i915_prandom_u32_max_state(total, prng) + 1;
|
||||
		phys_sz = sz % max_phys_size + 1;

		sz = round_up(sz, PAGE_SIZE);
		phys_sz = round_up(phys_sz, PAGE_SIZE);
		phys_sz = min(phys_sz, sz);

		pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
			 phys_sz, sz, val);

		src = huge_gem_object(engine->i915, phys_sz, sz);
		if (IS_ERR(src)) {
			err = PTR_ERR(src);
			goto err_flush;
		}

		vaddr = i915_gem_object_pin_map_unlocked(src, I915_MAP_WB);
		if (IS_ERR(vaddr)) {
			err = PTR_ERR(vaddr);
			goto err_put_src;
		}

		memset32(vaddr, val,
			 huge_gem_object_phys_size(src) / sizeof(u32));

		i915_gem_object_unpin_map(src);

		if (!(src->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
			src->cache_dirty = true;

		dst = huge_gem_object(engine->i915, phys_sz, sz);
		if (IS_ERR(dst)) {
			err = PTR_ERR(dst);
			goto err_put_src;
		}

		vaddr = i915_gem_object_pin_map_unlocked(dst, I915_MAP_WB);
		if (IS_ERR(vaddr)) {
			err = PTR_ERR(vaddr);
			goto err_put_dst;
		}

		memset32(vaddr, val ^ 0xdeadbeaf,
			 huge_gem_object_phys_size(dst) / sizeof(u32));

		if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
			dst->cache_dirty = true;

		err = i915_gem_object_copy_blt(src, dst, ce);
		if (err)
			goto err_unpin;

		err = i915_gem_object_wait(dst, 0, MAX_SCHEDULE_TIMEOUT);
		if (err)
			goto err_unpin;

		for (i = 0; i < huge_gem_object_phys_size(dst) / sizeof(u32); i += 17) {
			if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
				drm_clflush_virt_range(&vaddr[i], sizeof(vaddr[i]));

			if (vaddr[i] != val) {
				pr_err("vaddr[%u]=%x, expected=%x\n", i,
				       vaddr[i], val);
				err = -EINVAL;
				goto err_unpin;
			}
		}

		i915_gem_object_unpin_map(dst);

		i915_gem_object_put(src);
		i915_gem_object_put(dst);

		total <<= 1;
	} while (!time_after(jiffies, end));

	goto err_flush;

err_unpin:
	i915_gem_object_unpin_map(dst);
err_put_dst:
	i915_gem_object_put(dst);
err_put_src:
	i915_gem_object_put(src);
err_flush:
	if (err == -ENOMEM)
		err = 0;

	intel_context_put(ce);
	return err;
}

static int igt_threaded_blt(struct intel_engine_cs *engine,
			    int (*blt_fn)(void *arg),
			    unsigned int flags)
#define SINGLE_CTX BIT(0)
{
	struct igt_thread_arg *thread;
	struct task_struct **tsk;
	unsigned int n_cpus, i;
	I915_RND_STATE(prng);
	int err = 0;

	n_cpus = num_online_cpus() + 1;

	tsk = kcalloc(n_cpus, sizeof(struct task_struct *), GFP_KERNEL);
	if (!tsk)
		return 0;

	thread = kcalloc(n_cpus, sizeof(struct igt_thread_arg), GFP_KERNEL);
	if (!thread)
		goto out_tsk;

	thread[0].file = mock_file(engine->i915);
	if (IS_ERR(thread[0].file)) {
		err = PTR_ERR(thread[0].file);
		goto out_thread;
	}

	if (flags & SINGLE_CTX) {
		thread[0].ctx = live_context_for_engine(engine, thread[0].file);
		if (IS_ERR(thread[0].ctx)) {
			err = PTR_ERR(thread[0].ctx);
			goto out_file;
		}
	}

	for (i = 0; i < n_cpus; ++i) {
		thread[i].engine = engine;
		thread[i].file = thread[0].file;
		thread[i].ctx = thread[0].ctx;
		thread[i].n_cpus = n_cpus;
		thread[i].prng =
			I915_RND_STATE_INITIALIZER(prandom_u32_state(&prng));

		tsk[i] = kthread_run(blt_fn, &thread[i], "igt/blt-%d", i);
		if (IS_ERR(tsk[i])) {
			err = PTR_ERR(tsk[i]);
			break;
		}

		get_task_struct(tsk[i]);
	}

	yield(); /* start all threads before we kthread_stop() */

	for (i = 0; i < n_cpus; ++i) {
		int status;

		if (IS_ERR_OR_NULL(tsk[i]))
			continue;

		status = kthread_stop(tsk[i]);
		if (status && !err)
			err = status;

		put_task_struct(tsk[i]);
	}

out_file:
	fput(thread[0].file);
out_thread:
	kfree(thread);
out_tsk:
	kfree(tsk);
	return err;
}

static int test_copy_engines(struct drm_i915_private *i915,
			     int (*fn)(void *arg),
			     unsigned int flags)
{
	struct intel_engine_cs *engine;
	int ret;

	for_each_uabi_class_engine(engine, I915_ENGINE_CLASS_COPY, i915) {
		ret = igt_threaded_blt(engine, fn, flags);
		if (ret)
			return ret;
	}

	return 0;
}

static int igt_fill_blt(void *arg)
{
	return test_copy_engines(arg, igt_fill_blt_thread, 0);
}

static int igt_fill_blt_ctx0(void *arg)
{
	return test_copy_engines(arg, igt_fill_blt_thread, SINGLE_CTX);
}

static int igt_copy_blt(void *arg)
{
	return test_copy_engines(arg, igt_copy_blt_thread, 0);
}

static int igt_copy_blt_ctx0(void *arg)
{
	return test_copy_engines(arg, igt_copy_blt_thread, SINGLE_CTX);
}

int i915_gem_object_blt_live_selftests(struct drm_i915_private *i915)
{
	static const struct i915_subtest tests[] = {
		SUBTEST(igt_fill_blt),
		SUBTEST(igt_fill_blt_ctx0),
		SUBTEST(igt_copy_blt),
		SUBTEST(igt_copy_blt_ctx0),
	};

	if (intel_gt_is_wedged(&i915->gt))
		return 0;

	return i915_live_subtests(tests, i915);
}

int i915_gem_object_blt_perf_selftests(struct drm_i915_private *i915)
{
	static const struct i915_subtest tests[] = {
		SUBTEST(perf_fill_blt),
		SUBTEST(perf_copy_blt),
	};

	if (intel_gt_is_wedged(&i915->gt))
		return 0;

	return i915_live_subtests(tests, i915);
}

@@ -25,13 +25,14 @@ static int mock_phys_object(void *arg)
 		goto out;
 	}
 
+	i915_gem_object_lock(obj, NULL);
 	if (!i915_gem_object_has_struct_page(obj)) {
+		i915_gem_object_unlock(obj);
 		err = -EINVAL;
 		pr_err("shmem has no struct page\n");
 		goto out_obj;
 	}
 
-	i915_gem_object_lock(obj, NULL);
 	err = i915_gem_object_attach_phys(obj, PAGE_SIZE);
 	i915_gem_object_unlock(obj);
 	if (err) {

@@ -14,6 +14,7 @@ mock_context(struct drm_i915_private *i915,
 {
 	struct i915_gem_context *ctx;
 	struct i915_gem_engines *e;
+	struct intel_sseu null_sseu = {};
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 	if (!ctx)
@@ -30,15 +31,6 @@ mock_context(struct drm_i915_private *i915,
 
 	i915_gem_context_set_persistence(ctx);
 
-	mutex_init(&ctx->engines_mutex);
-	e = default_engines(ctx);
-	if (IS_ERR(e))
-		goto err_free;
-	RCU_INIT_POINTER(ctx->engines, e);
-
-	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
-	mutex_init(&ctx->lut_mutex);
-
 	if (name) {
 		struct i915_ppgtt *ppgtt;
 
@@ -46,25 +38,29 @@ mock_context(struct drm_i915_private *i915,
 
 		ppgtt = mock_ppgtt(i915, name);
 		if (!ppgtt)
-			goto err_put;
-
-		mutex_lock(&ctx->mutex);
-		__set_ppgtt(ctx, &ppgtt->vm);
-		mutex_unlock(&ctx->mutex);
+			goto err_free;
 
+		ctx->vm = i915_vm_open(&ppgtt->vm);
 		i915_vm_put(&ppgtt->vm);
 	}
 
+	mutex_init(&ctx->engines_mutex);
+	e = default_engines(ctx, null_sseu);
+	if (IS_ERR(e))
+		goto err_vm;
+	RCU_INIT_POINTER(ctx->engines, e);
+
+	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
+	mutex_init(&ctx->lut_mutex);
+
 	return ctx;
 
+err_vm:
+	if (ctx->vm)
+		i915_vm_close(ctx->vm);
 err_free:
 	kfree(ctx);
 	return NULL;
-
-err_put:
-	i915_gem_context_set_closed(ctx);
-	i915_gem_context_put(ctx);
-	return NULL;
 }
 
 void mock_context_close(struct i915_gem_context *ctx)
@@ -80,20 +76,29 @@ void mock_init_contexts(struct drm_i915_private *i915)
 struct i915_gem_context *
 live_context(struct drm_i915_private *i915, struct file *file)
 {
+	struct drm_i915_file_private *fpriv = to_drm_file(file)->driver_priv;
+	struct i915_gem_proto_context *pc;
 	struct i915_gem_context *ctx;
 	int err;
 	u32 id;
 
-	ctx = i915_gem_create_context(i915, 0);
+	pc = proto_context_create(i915, 0);
+	if (IS_ERR(pc))
+		return ERR_CAST(pc);
+
+	ctx = i915_gem_create_context(i915, pc);
+	proto_context_close(pc);
 	if (IS_ERR(ctx))
 		return ctx;
 
 	i915_gem_context_set_no_error_capture(ctx);
 
-	err = gem_context_register(ctx, to_drm_file(file)->driver_priv, &id);
+	err = xa_alloc(&fpriv->context_xa, &id, NULL, xa_limit_32b, GFP_KERNEL);
 	if (err < 0)
 		goto err_ctx;
 
+	gem_context_register(ctx, fpriv, id);
+
 	return ctx;
 
 err_ctx:
@@ -106,6 +111,7 @@ live_context_for_engine(struct intel_engine_cs *engine, struct file *file)
 {
 	struct i915_gem_engines *engines;
 	struct i915_gem_context *ctx;
+	struct intel_sseu null_sseu = {};
 	struct intel_context *ce;
 
 	engines = alloc_engines(1);
@@ -124,7 +130,7 @@ live_context_for_engine(struct intel_engine_cs *engine, struct file *file)
 		return ERR_CAST(ce);
 	}
 
-	intel_context_set_gem(ce, ctx);
+	intel_context_set_gem(ce, ctx, null_sseu);
 	engines->engines[0] = ce;
 	engines->num_engines = 1;
 
@@ -139,11 +145,24 @@ live_context_for_engine(struct intel_engine_cs *engine, struct file *file)
 }
 
 struct i915_gem_context *
-kernel_context(struct drm_i915_private *i915)
+kernel_context(struct drm_i915_private *i915,
+	       struct i915_address_space *vm)
 {
 	struct i915_gem_context *ctx;
+	struct i915_gem_proto_context *pc;
 
-	ctx = i915_gem_create_context(i915, 0);
+	pc = proto_context_create(i915, 0);
+	if (IS_ERR(pc))
+		return ERR_CAST(pc);
+
+	if (vm) {
+		if (pc->vm)
+			i915_vm_put(pc->vm);
+		pc->vm = i915_vm_get(vm);
+	}
+
+	ctx = i915_gem_create_context(i915, pc);
+	proto_context_close(pc);
 	if (IS_ERR(ctx))
 		return ctx;
 

@@ -10,6 +10,7 @@
 struct file;
 struct drm_i915_private;
 struct intel_engine_cs;
+struct i915_address_space;
 
 void mock_init_contexts(struct drm_i915_private *i915);
 
@@ -25,7 +26,8 @@ live_context(struct drm_i915_private *i915, struct file *file);
 struct i915_gem_context *
 live_context_for_engine(struct intel_engine_cs *engine, struct file *file);
 
-struct i915_gem_context *kernel_context(struct drm_i915_private *i915);
+struct i915_gem_context *kernel_context(struct drm_i915_private *i915,
+					struct i915_address_space *vm);
 void kernel_context_close(struct i915_gem_context *ctx);
 
 #endif /* !__MOCK_CONTEXT_H */

@@ -437,20 +437,20 @@ static int frequency_show(struct seq_file *m, void *unused)
 		max_freq = (IS_GEN9_LP(i915) ? rp_state_cap >> 0 :
 			    rp_state_cap >> 16) & 0xff;
 		max_freq *= (IS_GEN9_BC(i915) ||
-			     GRAPHICS_VER(i915) >= 10 ? GEN9_FREQ_SCALER : 1);
+			     GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
 		seq_printf(m, "Lowest (RPN) frequency: %dMHz\n",
 			   intel_gpu_freq(rps, max_freq));
 
 		max_freq = (rp_state_cap & 0xff00) >> 8;
 		max_freq *= (IS_GEN9_BC(i915) ||
-			     GRAPHICS_VER(i915) >= 10 ? GEN9_FREQ_SCALER : 1);
+			     GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
 		seq_printf(m, "Nominal (RP1) frequency: %dMHz\n",
 			   intel_gpu_freq(rps, max_freq));
 
 		max_freq = (IS_GEN9_LP(i915) ? rp_state_cap >> 16 :
 			    rp_state_cap >> 0) & 0xff;
 		max_freq *= (IS_GEN9_BC(i915) ||
-			     GRAPHICS_VER(i915) >= 10 ? GEN9_FREQ_SCALER : 1);
+			     GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
 		seq_printf(m, "Max non-overclocked (RP0) frequency: %dMHz\n",
 			   intel_gpu_freq(rps, max_freq));
 		seq_printf(m, "Max overclocked frequency: %dMHz\n",
@@ -500,7 +500,7 @@ static int llc_show(struct seq_file *m, void *data)
 
 	min_gpu_freq = rps->min_freq;
 	max_gpu_freq = rps->max_freq;
-	if (IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 10) {
+	if (IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 11) {
 		/* Convert GT frequency to 50 HZ units */
 		min_gpu_freq /= GEN9_FREQ_SCALER;
 		max_gpu_freq /= GEN9_FREQ_SCALER;
@@ -518,7 +518,7 @@ static int llc_show(struct seq_file *m, void *data)
 			   intel_gpu_freq(rps,
 					  (gpu_freq *
 					   (IS_GEN9_BC(i915) ||
-					    GRAPHICS_VER(i915) >= 10 ?
+					    GRAPHICS_VER(i915) >= 11 ?
 					    GEN9_FREQ_SCALER : 1))),
 			   ((ia_freq >> 0) & 0xff) * 100,
 			   ((ia_freq >> 8) & 0xff) * 100);

@@ -42,7 +42,7 @@ int gen8_emit_flush_rcs(struct i915_request *rq, u32 mode)
 			vf_flush_wa = true;
 
 		/* WaForGAMHang:kbl */
-		if (IS_KBL_GT_STEP(rq->engine->i915, 0, STEP_B0))
+		if (IS_KBL_GT_STEP(rq->engine->i915, 0, STEP_C0))
 			dc_flush_wa = true;
 	}
 
@@ -208,7 +208,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 		flags |= PIPE_CONTROL_FLUSH_L3;
 		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
 		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
-		/* Wa_1409600907:tgl */
+		/* Wa_1409600907:tgl,adl-p */
 		flags |= PIPE_CONTROL_DEPTH_STALL;
 		flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
 		flags |= PIPE_CONTROL_FLUSH_ENABLE;
@@ -279,7 +279,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 	if (mode & EMIT_INVALIDATE)
 		aux_inv = rq->engine->mask & ~BIT(BCS0);
 	if (aux_inv)
-		cmd += 2 * hweight8(aux_inv) + 2;
+		cmd += 2 * hweight32(aux_inv) + 2;
 
 	cs = intel_ring_begin(rq, cmd);
 	if (IS_ERR(cs))
@@ -313,9 +313,8 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 		struct intel_engine_cs *engine;
 		unsigned int tmp;
 
-		*cs++ = MI_LOAD_REGISTER_IMM(hweight8(aux_inv));
-		for_each_engine_masked(engine, rq->engine->gt,
-				       aux_inv, tmp) {
+		*cs++ = MI_LOAD_REGISTER_IMM(hweight32(aux_inv));
+		for_each_engine_masked(engine, rq->engine->gt, aux_inv, tmp) {
 			*cs++ = i915_mmio_reg_offset(aux_inv_reg(engine));
 			*cs++ = AUX_INV;
 		}
@@ -506,7 +505,8 @@ gen8_emit_fini_breadcrumb_tail(struct i915_request *rq, u32 *cs)
 	*cs++ = MI_USER_INTERRUPT;
 
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-	if (intel_engine_has_semaphores(rq->engine))
+	if (intel_engine_has_semaphores(rq->engine) &&
+	    !intel_uc_uses_guc_submission(&rq->engine->gt->uc))
 		cs = emit_preempt_busywait(rq, cs);
 
 	rq->tail = intel_ring_offset(rq, cs);
@@ -598,7 +598,8 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, u32 *cs)
 	*cs++ = MI_USER_INTERRUPT;
 
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-	if (intel_engine_has_semaphores(rq->engine))
+	if (intel_engine_has_semaphores(rq->engine) &&
+	    !intel_uc_uses_guc_submission(&rq->engine->gt->uc))
 		cs = gen12_emit_preempt_busywait(rq, cs);
 
 	rq->tail = intel_ring_offset(rq, cs);

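The `hweight8()` to `hweight32()` change in `gen12_emit_flush_xcs` only matters once the engine mask has bits set above bit 7, at which point an 8-bit population count under-reserves ring space. The following user-space sketch illustrates that effect. The helpers `weight8`, `weight32` and `lri_dwords` are hypothetical stand-ins written for this note, not kernel functions; only the `2 * weight + 2` expression is taken from the hunk above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a 32-bit population count. */
static unsigned int weight32(uint32_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Hypothetical 8-bit variant: bits above bit 7 are silently ignored. */
static unsigned int weight8(uint32_t mask)
{
	return weight32(mask & 0xffu);
}

/* Dword count using the "2 * weight + 2" expression from the diff. */
static unsigned int lri_dwords(uint32_t aux_inv, int use_32bit_weight)
{
	unsigned int regs;

	assert(aux_inv != 0);
	regs = use_32bit_weight ? weight32(aux_inv) : weight8(aux_inv);
	return 2 * regs + 2;
}
```

With a mask such as `0x101`, the 8-bit count sees one engine and the 32-bit count sees two, so the reserved space differs by two dwords.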
@@ -358,6 +358,54 @@ static void gen8_ppgtt_alloc(struct i915_address_space *vm,
 			   &start, start + length, vm->top);
 }
 
+static void __gen8_ppgtt_foreach(struct i915_address_space *vm,
+				 struct i915_page_directory *pd,
+				 u64 *start, u64 end, int lvl,
+				 void (*fn)(struct i915_address_space *vm,
+					    struct i915_page_table *pt,
+					    void *data),
+				 void *data)
+{
+	unsigned int idx, len;
+
+	len = gen8_pd_range(*start, end, lvl--, &idx);
+
+	spin_lock(&pd->lock);
+	do {
+		struct i915_page_table *pt = pd->entry[idx];
+
+		atomic_inc(&pt->used);
+		spin_unlock(&pd->lock);
+
+		if (lvl) {
+			__gen8_ppgtt_foreach(vm, as_pd(pt), start, end, lvl,
+					     fn, data);
+		} else {
+			fn(vm, pt, data);
+			*start += gen8_pt_count(*start, end);
+		}
+
+		spin_lock(&pd->lock);
+		atomic_dec(&pt->used);
+	} while (idx++, --len);
+	spin_unlock(&pd->lock);
+}
+
+static void gen8_ppgtt_foreach(struct i915_address_space *vm,
+			       u64 start, u64 length,
+			       void (*fn)(struct i915_address_space *vm,
+					  struct i915_page_table *pt,
+					  void *data),
+			       void *data)
+{
+	start >>= GEN8_PTE_SHIFT;
+	length >>= GEN8_PTE_SHIFT;
+
+	__gen8_ppgtt_foreach(vm, i915_vm_to_ppgtt(vm)->pd,
+			     &start, start + length, vm->top,
+			     fn, data);
+}
+
 static __always_inline u64
 gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 		      struct i915_page_directory *pdp,
@@ -552,6 +600,24 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
 	}
 }
 
+static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
+				    dma_addr_t addr,
+				    u64 offset,
+				    enum i915_cache_level level,
+				    u32 flags)
+{
+	u64 idx = offset >> GEN8_PTE_SHIFT;
+	struct i915_page_directory * const pdp =
+		gen8_pdp_for_page_index(vm, idx);
+	struct i915_page_directory *pd =
+		i915_pd_entry(pdp, gen8_pd_index(idx, 2));
+	gen8_pte_t *vaddr;
+
+	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+	vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
+	clflush_cache_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
+}
+
 static int gen8_init_scratch(struct i915_address_space *vm)
 {
 	u32 pte_flags;
@@ -731,8 +797,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 
 	ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
 	ppgtt->vm.insert_entries = gen8_ppgtt_insert;
+	ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
 	ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
 	ppgtt->vm.clear_range = gen8_ppgtt_clear;
+	ppgtt->vm.foreach = gen8_ppgtt_foreach;
 
 	ppgtt->vm.pte_encode = gen8_pte_encode;
 

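`gen8_ppgtt_insert_entry` turns a GPU virtual offset into a page index and then picks one table slot per level with `gen8_pd_index(idx, lvl)`. The sketch below shows that kind of decomposition in user space. It assumes 4 KiB pages (a shift of 12) and 512 entries per table (9 index bits per level); both constants and the `sketch_*` names are assumptions made for illustration, since the diff does not show the definitions of `GEN8_PTE_SHIFT` or `gen8_pd_index`.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PTE_SHIFT 12	/* assumed: 4 KiB pages */
#define SKETCH_LVL_BITS   9	/* assumed: 512 entries per table */

/* Page index for a virtual offset, as in "offset >> GEN8_PTE_SHIFT". */
static uint64_t sketch_page_index(uint64_t offset)
{
	return offset >> SKETCH_PTE_SHIFT;
}

/* Hypothetical analogue of gen8_pd_index(): slot within the table at lvl. */
static unsigned int sketch_pd_index(uint64_t page_idx, unsigned int lvl)
{
	assert(lvl < 4);
	return (page_idx >> (lvl * SKETCH_LVL_BITS)) &
	       ((1u << SKETCH_LVL_BITS) - 1);
}
```

Under these assumptions, an offset of `(1 << 30) + (1 << 21) + (1 << 12)` selects slot 1 at levels 2, 1 and 0.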
@@ -15,28 +15,14 @@
 #include "intel_gt_pm.h"
 #include "intel_gt_requests.h"
 
-static bool irq_enable(struct intel_engine_cs *engine)
+static bool irq_enable(struct intel_breadcrumbs *b)
 {
-	if (!engine->irq_enable)
-		return false;
-
-	/* Caller disables interrupts */
-	spin_lock(&engine->gt->irq_lock);
-	engine->irq_enable(engine);
-	spin_unlock(&engine->gt->irq_lock);
-
-	return true;
+	return intel_engine_irq_enable(b->irq_engine);
 }
 
-static void irq_disable(struct intel_engine_cs *engine)
+static void irq_disable(struct intel_breadcrumbs *b)
 {
-	if (!engine->irq_disable)
-		return;
-
-	/* Caller disables interrupts */
-	spin_lock(&engine->gt->irq_lock);
-	engine->irq_disable(engine);
-	spin_unlock(&engine->gt->irq_lock);
+	intel_engine_irq_disable(b->irq_engine);
 }
 
 static void __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b)
@@ -57,7 +43,7 @@ static void __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b)
 	WRITE_ONCE(b->irq_armed, true);
 
 	/* Requests may have completed before we could enable the interrupt. */
-	if (!b->irq_enabled++ && irq_enable(b->irq_engine))
+	if (!b->irq_enabled++ && b->irq_enable(b))
 		irq_work_queue(&b->irq_work);
 }
 
@@ -76,7 +62,7 @@ static void __intel_breadcrumbs_disarm_irq(struct intel_breadcrumbs *b)
 {
 	GEM_BUG_ON(!b->irq_enabled);
 	if (!--b->irq_enabled)
-		irq_disable(b->irq_engine);
+		b->irq_disable(b);
 
 	WRITE_ONCE(b->irq_armed, false);
 	intel_gt_pm_put_async(b->irq_engine->gt);
@@ -259,6 +245,9 @@ static void signal_irq_work(struct irq_work *work)
 			llist_entry(signal, typeof(*rq), signal_node);
 		struct list_head cb_list;
 
+		if (rq->engine->sched_engine->retire_inflight_request_prio)
+			rq->engine->sched_engine->retire_inflight_request_prio(rq);
+
 		spin_lock(&rq->lock);
 		list_replace(&rq->fence.cb_list, &cb_list);
 		__dma_fence_signal__timestamp(&rq->fence, timestamp);
@@ -281,7 +270,7 @@ intel_breadcrumbs_create(struct intel_engine_cs *irq_engine)
 	if (!b)
 		return NULL;
 
-	b->irq_engine = irq_engine;
+	kref_init(&b->ref);
 
 	spin_lock_init(&b->signalers_lock);
 	INIT_LIST_HEAD(&b->signalers);
@@ -290,6 +279,10 @@ intel_breadcrumbs_create(struct intel_engine_cs *irq_engine)
 	spin_lock_init(&b->irq_lock);
 	init_irq_work(&b->irq_work, signal_irq_work);
 
+	b->irq_engine = irq_engine;
+	b->irq_enable = irq_enable;
+	b->irq_disable = irq_disable;
+
 	return b;
 }
 
@@ -303,9 +296,9 @@ void intel_breadcrumbs_reset(struct intel_breadcrumbs *b)
 	spin_lock_irqsave(&b->irq_lock, flags);
 
 	if (b->irq_enabled)
-		irq_enable(b->irq_engine);
+		b->irq_enable(b);
 	else
-		irq_disable(b->irq_engine);
+		b->irq_disable(b);
 
 	spin_unlock_irqrestore(&b->irq_lock, flags);
 }
@@ -325,11 +318,14 @@ void __intel_breadcrumbs_park(struct intel_breadcrumbs *b)
 	}
 }
 
-void intel_breadcrumbs_free(struct intel_breadcrumbs *b)
+void intel_breadcrumbs_free(struct kref *kref)
 {
+	struct intel_breadcrumbs *b = container_of(kref, typeof(*b), ref);
+
+	irq_work_sync(&b->irq_work);
 	GEM_BUG_ON(!list_empty(&b->signalers));
 	GEM_BUG_ON(b->irq_armed);
 
 	kfree(b);
 }
 

@@ -9,7 +9,7 @@
 #include <linux/atomic.h>
 #include <linux/irq_work.h>
 
-#include "intel_engine_types.h"
+#include "intel_breadcrumbs_types.h"
 
 struct drm_printer;
 struct i915_request;
@@ -17,7 +17,7 @@ struct intel_breadcrumbs;
 
 struct intel_breadcrumbs *
 intel_breadcrumbs_create(struct intel_engine_cs *irq_engine);
-void intel_breadcrumbs_free(struct intel_breadcrumbs *b);
+void intel_breadcrumbs_free(struct kref *kref);
 
 void intel_breadcrumbs_reset(struct intel_breadcrumbs *b);
 void __intel_breadcrumbs_park(struct intel_breadcrumbs *b);
@@ -48,4 +48,16 @@ void i915_request_cancel_breadcrumb(struct i915_request *request);
 void intel_context_remove_breadcrumbs(struct intel_context *ce,
 				      struct intel_breadcrumbs *b);
 
+static inline struct intel_breadcrumbs *
+intel_breadcrumbs_get(struct intel_breadcrumbs *b)
+{
+	kref_get(&b->ref);
+	return b;
+}
+
+static inline void intel_breadcrumbs_put(struct intel_breadcrumbs *b)
+{
+	kref_put(&b->ref, intel_breadcrumbs_free);
+}
+
 #endif /* __INTEL_BREADCRUMBS__ */

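The breadcrumbs hunks above move `intel_breadcrumbs` to reference counting: `kref_init()` at creation, `intel_breadcrumbs_get()`/`intel_breadcrumbs_put()` around users, and a release callback that recovers the object from the embedded counter with `container_of()`. The sketch below reproduces that pattern in user space with C11 atomics. All `sketch_*` names are hypothetical and the counter is a simplified stand-in for `struct kref`, not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space analogue of struct kref. */
struct sketch_ref {
	atomic_int count;
};

struct sketch_breadcrumbs {
	struct sketch_ref ref;
	int irq_armed;
};

static int sketch_freed;	/* counts release calls, for demonstration only */

static void sketch_ref_get(struct sketch_ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

/* Drop one reference; run release() when the last reference goes away. */
static int sketch_ref_put(struct sketch_ref *r,
			  void (*release)(struct sketch_ref *))
{
	if (atomic_fetch_sub(&r->count, 1) == 1) {
		release(r);
		return 1;
	}
	return 0;
}

/* Mirrors the container_of() step of the release callback in the diff. */
static void sketch_breadcrumbs_free(struct sketch_ref *r)
{
	struct sketch_breadcrumbs *b = (struct sketch_breadcrumbs *)
		((char *)r - offsetof(struct sketch_breadcrumbs, ref));

	assert(!b->irq_armed);
	free(b);
	sketch_freed++;
}

static struct sketch_breadcrumbs *sketch_breadcrumbs_create(void)
{
	struct sketch_breadcrumbs *b = calloc(1, sizeof(*b));

	if (b)
		atomic_init(&b->ref.count, 1);	/* creator holds one reference */
	return b;
}
```

The object is freed only by the put that drops the count to zero, so the creator and any additional holder can release in either order.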
@@ -7,10 +7,13 @@
 #define __INTEL_BREADCRUMBS_TYPES__
 
 #include <linux/irq_work.h>
+#include <linux/kref.h>
 #include <linux/list.h>
 #include <linux/spinlock.h>
 #include <linux/types.h>
 
+#include "intel_engine_types.h"
+
 /*
  * Rather than have every client wait upon all user interrupts,
  * with the herd waking after every interrupt and each doing the
@@ -29,6 +32,7 @@
  * the overhead of waking that client is much preferred.
  */
 struct intel_breadcrumbs {
+	struct kref ref;
 	atomic_t active;
 
 	spinlock_t signalers_lock; /* protects the list of signalers */
@@ -42,7 +46,10 @@ struct intel_breadcrumbs {
 	bool irq_armed;
 
 	/* Not all breadcrumbs are attached to physical HW */
+	intel_engine_mask_t engine_mask;
 	struct intel_engine_cs *irq_engine;
+	bool (*irq_enable)(struct intel_breadcrumbs *b);
+	void (*irq_disable)(struct intel_breadcrumbs *b);
 };
 
 #endif /* __INTEL_BREADCRUMBS_TYPES__ */

@@ -7,28 +7,26 @@
 #include "gem/i915_gem_pm.h"
 
 #include "i915_drv.h"
-#include "i915_globals.h"
+#include "i915_trace.h"
 
 #include "intel_context.h"
 #include "intel_engine.h"
 #include "intel_engine_pm.h"
 #include "intel_ring.h"
 
-static struct i915_global_context {
-	struct i915_global base;
-	struct kmem_cache *slab_ce;
-} global;
+static struct kmem_cache *slab_ce;
 
 static struct intel_context *intel_context_alloc(void)
 {
-	return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL);
+	return kmem_cache_zalloc(slab_ce, GFP_KERNEL);
 }
 
 static void rcu_context_free(struct rcu_head *rcu)
 {
 	struct intel_context *ce = container_of(rcu, typeof(*ce), rcu);
 
-	kmem_cache_free(global.slab_ce, ce);
+	trace_intel_context_free(ce);
+	kmem_cache_free(slab_ce, ce);
 }
 
 void intel_context_free(struct intel_context *ce)
@@ -46,6 +44,7 @@ intel_context_create(struct intel_engine_cs *engine)
 		return ERR_PTR(-ENOMEM);
 
 	intel_context_init(ce, engine);
+	trace_intel_context_create(ce);
 	return ce;
 }
 
@@ -80,7 +79,7 @@ static int intel_context_active_acquire(struct intel_context *ce)
 
 	__i915_active_acquire(&ce->active);
 
-	if (intel_context_is_barrier(ce))
+	if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine))
 		return 0;
 
 	/* Preallocate tracking nodes */
@@ -268,6 +267,8 @@ int __intel_context_do_pin_ww(struct intel_context *ce,
 
 	GEM_BUG_ON(!intel_context_is_pinned(ce)); /* no overflow! */
 
+	trace_intel_context_do_pin(ce);
+
 err_unlock:
 	mutex_unlock(&ce->pin_mutex);
 err_post_unpin:
@@ -306,9 +307,9 @@ retry:
 	return err;
 }
 
-void intel_context_unpin(struct intel_context *ce)
+void __intel_context_do_unpin(struct intel_context *ce, int sub)
 {
-	if (!atomic_dec_and_test(&ce->pin_count))
+	if (!atomic_sub_and_test(sub, &ce->pin_count))
 		return;
 
 	CE_TRACE(ce, "unpin\n");
@@ -323,6 +324,7 @@ void intel_context_unpin(struct intel_context *ce)
 	 */
 	intel_context_get(ce);
 	intel_context_active_release(ce);
+	trace_intel_context_do_unpin(ce);
 	intel_context_put(ce);
 }
 
@@ -360,6 +362,12 @@ static int __intel_context_active(struct i915_active *active)
 	return 0;
 }
 
+static int sw_fence_dummy_notify(struct i915_sw_fence *sf,
+				 enum i915_sw_fence_notify state)
+{
+	return NOTIFY_DONE;
+}
+
 void
 intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
 {
@@ -371,7 +379,8 @@ intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
 	ce->engine = engine;
 	ce->ops = engine->cops;
 	ce->sseu = engine->sseu;
-	ce->ring = __intel_context_ring_size(SZ_4K);
+	ce->ring = NULL;
+	ce->ring_size = SZ_4K;
 
 	ewma_runtime_init(&ce->runtime.avg);
 
@@ -383,6 +392,22 @@ intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
 
 	mutex_init(&ce->pin_mutex);
 
+	spin_lock_init(&ce->guc_state.lock);
+	INIT_LIST_HEAD(&ce->guc_state.fences);
+
+	spin_lock_init(&ce->guc_active.lock);
+	INIT_LIST_HEAD(&ce->guc_active.requests);
+
+	ce->guc_id = GUC_INVALID_LRC_ID;
+	INIT_LIST_HEAD(&ce->guc_id_link);
+
+	/*
+	 * Initialize fence to be complete as this is expected to be complete
+	 * unless there is a pending schedule disable outstanding.
+	 */
+	i915_sw_fence_init(&ce->guc_blocked, sw_fence_dummy_notify);
+	i915_sw_fence_commit(&ce->guc_blocked);
+
 	i915_active_init(&ce->active,
 			 __intel_context_active, __intel_context_retire, 0);
 }
@@ -397,28 +422,17 @@ void intel_context_fini(struct intel_context *ce)
 	i915_active_fini(&ce->active);
 }
 
-static void i915_global_context_shrink(void)
+void i915_context_module_exit(void)
 {
-	kmem_cache_shrink(global.slab_ce);
+	kmem_cache_destroy(slab_ce);
 }
 
-static void i915_global_context_exit(void)
+int __init i915_context_module_init(void)
 {
-	kmem_cache_destroy(global.slab_ce);
-}
-
-static struct i915_global_context global = { {
-	.shrink = i915_global_context_shrink,
-	.exit = i915_global_context_exit,
-} };
-
-int __init i915_global_context_init(void)
-{
-	global.slab_ce = KMEM_CACHE(intel_context, SLAB_HWCACHE_ALIGN);
-	if (!global.slab_ce)
+	slab_ce = KMEM_CACHE(intel_context, SLAB_HWCACHE_ALIGN);
+	if (!slab_ce)
 		return -ENOMEM;
 
-	i915_global_register(&global.base);
 	return 0;
 }
 
@@ -499,6 +513,26 @@ retry:
 	return rq;
 }
 
+struct i915_request *intel_context_find_active_request(struct intel_context *ce)
+{
+	struct i915_request *rq, *active = NULL;
+	unsigned long flags;
+
+	GEM_BUG_ON(!intel_engine_uses_guc(ce->engine));
+
+	spin_lock_irqsave(&ce->guc_active.lock, flags);
+	list_for_each_entry_reverse(rq, &ce->guc_active.requests,
+				    sched.link) {
+		if (i915_request_completed(rq))
+			break;
+
+		active = rq;
+	}
+	spin_unlock_irqrestore(&ce->guc_active.lock, flags);
+
+	return active;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_context.c"
 #endif

@@ -16,6 +16,7 @@
 #include "intel_engine_types.h"
 #include "intel_ring_types.h"
 #include "intel_timeline_types.h"
+#include "i915_trace.h"
 
 #define CE_TRACE(ce, fmt, ...) do { \
 	const struct intel_context *ce__ = (ce); \
@@ -30,6 +31,9 @@ void intel_context_init(struct intel_context *ce,
 			struct intel_engine_cs *engine);
 void intel_context_fini(struct intel_context *ce);
 
+void i915_context_module_exit(void);
+int i915_context_module_init(void);
+
 struct intel_context *
 intel_context_create(struct intel_engine_cs *engine);
 
@@ -69,6 +73,13 @@ intel_context_is_pinned(struct intel_context *ce)
 	return atomic_read(&ce->pin_count);
 }
 
+static inline void intel_context_cancel_request(struct intel_context *ce,
+						struct i915_request *rq)
+{
+	GEM_BUG_ON(!ce->ops->cancel_request);
+	return ce->ops->cancel_request(ce, rq);
+}
+
 /**
  * intel_context_unlock_pinned - Releases the earlier locking of 'pinned' status
  * @ce - the context
@@ -113,7 +124,32 @@ static inline void __intel_context_pin(struct intel_context *ce)
 	atomic_inc(&ce->pin_count);
 }
 
-void intel_context_unpin(struct intel_context *ce);
+void __intel_context_do_unpin(struct intel_context *ce, int sub);
+
+static inline void intel_context_sched_disable_unpin(struct intel_context *ce)
+{
+	__intel_context_do_unpin(ce, 2);
+}
+
+static inline void intel_context_unpin(struct intel_context *ce)
+{
+	if (!ce->ops->sched_disable) {
+		__intel_context_do_unpin(ce, 1);
+	} else {
+		/*
+		 * Move ownership of this pin to the scheduling disable which is
+		 * an async operation. When that operation completes the above
+		 * intel_context_sched_disable_unpin is called potentially
+		 * unpinning the context.
+		 */
+		while (!atomic_add_unless(&ce->pin_count, -1, 1)) {
+			if (atomic_cmpxchg(&ce->pin_count, 1, 2) == 1) {
+				ce->ops->sched_disable(ce);
+				break;
+			}
+		}
+	}
+}
 
 void intel_context_enter_engine(struct intel_context *ce);
 void intel_context_exit_engine(struct intel_context *ce);
@@ -175,10 +211,8 @@ int intel_context_prepare_remote_request(struct intel_context *ce,
 
 struct i915_request *intel_context_create_request(struct intel_context *ce);
 
-static inline struct intel_ring *__intel_context_ring_size(u64 sz)
-{
-	return u64_to_ptr(struct intel_ring, sz);
-}
+struct i915_request *
+intel_context_find_active_request(struct intel_context *ce);
 
 static inline bool intel_context_is_barrier(const struct intel_context *ce)
 {
@@ -220,6 +254,18 @@ static inline bool intel_context_set_banned(struct intel_context *ce)
 	return test_and_set_bit(CONTEXT_BANNED, &ce->flags);
 }
 
+static inline bool intel_context_ban(struct intel_context *ce,
+				     struct i915_request *rq)
+{
+	bool ret = intel_context_set_banned(ce);
+
+	trace_intel_context_ban(ce);
+	if (ce->ops->ban)
+		ce->ops->ban(ce, rq);
+
+	return ret;
+}
+
 static inline bool
 intel_context_force_single_submission(const struct intel_context *ce)
 {

@ -1,63 +0,0 @@
|
|||
// SPDX-License-Identifier: MIT
|
||||
/*
|
||||
* Copyright © 2019 Intel Corporation
|
||||
*/
|
||||
|
||||
#include "i915_active.h"
|
||||
#include "intel_context.h"
|
||||
#include "intel_context_param.h"
|
||||
#include "intel_ring.h"
|
||||
|
||||
int intel_context_set_ring_size(struct intel_context *ce, long sz)
|
||||
{
|
||||
int err;
|
||||
|
||||
if (intel_context_lock_pinned(ce))
|
||||
return -EINTR;
|
||||
|
||||
err = i915_active_wait(&ce->active);
|
||||
if (err < 0)
|
||||
goto unlock;
|
||||
|
||||
if (intel_context_is_pinned(ce)) {
|
||||
err = -EBUSY; /* In active use, come back later! */
|
||||
goto unlock;
|
||||
}
|
||||
|
||||
if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
|
||||
struct intel_ring *ring;
|
||||
|
||||
/* Replace the existing ringbuffer */
|
||||
ring = intel_engine_create_ring(ce->engine, sz);
|
||||
if (IS_ERR(ring)) {
|
||||
err = PTR_ERR(ring);
|
||||
goto unlock;
|
||||
}
|
||||
|
||||
intel_ring_put(ce->ring);
|
||||
ce->ring = ring;
|
||||
|
||||
/* Context image will be updated on next pin */
|
||||
} else {
|
||||
ce->ring = __intel_context_ring_size(sz);
|
||||
}
|
||||
|
||||
unlock:
|
||||
intel_context_unlock_pinned(ce);
|
||||
return err;
|
||||
}
|
||||
|
||||
long intel_context_get_ring_size(struct intel_context *ce)
|
||||
{
|
||||
long sz = (unsigned long)READ_ONCE(ce->ring);
|
||||
|
||||
if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
|
||||
if (intel_context_lock_pinned(ce))
|
||||
return -EINTR;
|
||||
|
||||
sz = ce->ring->size;
|
||||
intel_context_unlock_pinned(ce);
|
||||
}
|
||||
|
||||
return sz;
|
||||
}
|
|
@ -10,14 +10,10 @@
|
|||
|
||||
#include "intel_context.h"
|
||||
|
||||
int intel_context_set_ring_size(struct intel_context *ce, long sz);
|
||||
long intel_context_get_ring_size(struct intel_context *ce);
|
||||
|
||||
static inline int
|
||||
static inline void
|
||||
intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
|
||||
{
|
||||
ce->watchdog.timeout_us = timeout_us;
|
||||
return 0;
|
||||
}
|
||||
|
||||
#endif /* INTEL_CONTEXT_PARAM_H */
|
||||
|
|
|
@ -13,12 +13,14 @@
|
|||
#include <linux/types.h>
|
||||
|
||||
#include "i915_active_types.h"
|
||||
#include "i915_sw_fence.h"
|
||||
#include "i915_utils.h"
|
||||
#include "intel_engine_types.h"
|
||||
#include "intel_sseu.h"
|
||||
|
||||
#define CONTEXT_REDZONE POISON_INUSE
|
||||
#include "uc/intel_guc_fwif.h"
|
||||
|
||||
#define CONTEXT_REDZONE POISON_INUSE
|
||||
DECLARE_EWMA(runtime, 3, 8);
|
||||
|
||||
struct i915_gem_context;
|
||||
|
@ -35,16 +37,29 @@ struct intel_context_ops {
|
|||
|
||||
int (*alloc)(struct intel_context *ce);
|
||||
|
||||
void (*ban)(struct intel_context *ce, struct i915_request *rq);
|
||||
|
||||
int (*pre_pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void **vaddr);
|
||||
int (*pin)(struct intel_context *ce, void *vaddr);
|
||||
void (*unpin)(struct intel_context *ce);
|
||||
void (*post_unpin)(struct intel_context *ce);
|
||||
|
||||
void (*cancel_request)(struct intel_context *ce,
|
||||
struct i915_request *rq);
|
||||
|
||||
void (*enter)(struct intel_context *ce);
|
||||
void (*exit)(struct intel_context *ce);
|
||||
|
||||
void (*sched_disable)(struct intel_context *ce);
|
||||
|
||||
void (*reset)(struct intel_context *ce);
|
||||
void (*destroy)(struct kref *kref);
|
||||
|
||||
/* virtual engine/context interface */
|
||||
struct intel_context *(*create_virtual)(struct intel_engine_cs **engine,
|
||||
unsigned int count);
|
||||
struct intel_engine_cs *(*get_sibling)(struct intel_engine_cs *engine,
|
||||
unsigned int sibling);
|
||||
};
|
||||
|
||||
struct intel_context {
|
||||
|
@ -82,6 +97,7 @@ struct intel_context {
|
|||
spinlock_t signal_lock; /* protects signals, the list of requests */
|
||||
|
||||
struct i915_vma *state;
|
||||
u32 ring_size;
|
||||
struct intel_ring *ring;
|
||||
struct intel_timeline *timeline;
|
||||
|
||||
|
@ -95,6 +111,7 @@ struct intel_context {
|
|||
#define CONTEXT_BANNED 6
|
||||
#define CONTEXT_FORCE_SINGLE_SUBMISSION 7
|
||||
#define CONTEXT_NOPREEMPT 8
|
||||
#define CONTEXT_LRCA_DIRTY 9
|
||||
|
||||
struct {
|
||||
u64 timeout_us;
|
||||
|
@ -136,6 +153,51 @@ struct intel_context {
|
|||
struct intel_sseu sseu;
|
||||
|
||||
u8 wa_bb_page; /* if set, page num reserved for context workarounds */
|
||||
|
||||
struct {
|
||||
/** lock: protects everything in guc_state */
|
||||
spinlock_t lock;
|
||||
/**
|
||||
* sched_state: scheduling state of this context using GuC
|
||||
* submission
|
||||
*/
|
||||
u16 sched_state;
|
||||
/*
|
||||
* fences: maintains a list of requests that have a submit
|
||||
* fence related to GuC submission
|
||||
*/
|
||||
struct list_head fences;
|
||||
} guc_state;
|
||||
|
||||
struct {
|
||||
/** lock: protects everything in guc_active */
|
||||
spinlock_t lock;
|
||||
/** requests: active requests on this context */
|
||||
struct list_head requests;
|
||||
} guc_active;
|
||||
|
||||
/* GuC scheduling state flags that do not require a lock. */
|
||||
atomic_t guc_sched_state_no_lock;
|
||||
|
||||
/* GuC LRC descriptor ID */
|
||||
u16 guc_id;
|
||||
|
||||
/* GuC LRC descriptor reference count */
|
||||
atomic_t guc_id_ref;
|
||||
|
||||
/*
|
||||
* GuC ID link - in list when unpinned but guc_id still valid in GuC
|
||||
*/
|
||||
struct list_head guc_id_link;
|
||||
|
||||
/* GuC context blocked fence */
|
||||
struct i915_sw_fence guc_blocked;
|
||||
|
||||
/*
|
||||
* GuC priority management
|
||||
*/
|
||||
u8 guc_prio;
|
||||
u32 guc_prio_count[GUC_CLIENT_PRIORITY_NUM];
|
||||
};
|
||||
|
||||
#endif /* __INTEL_CONTEXT_TYPES__ */
|
||||
|
|
|
@ -19,7 +19,9 @@
|
|||
#include "intel_workarounds.h"
|
||||
|
||||
struct drm_printer;
|
||||
struct intel_context;
|
||||
struct intel_gt;
|
||||
struct lock_class_key;
|
||||
|
||||
/* Early gen2 devices have a cacheline of just 32 bytes, using 64 is overkill,
|
||||
* but keeps the logic simple. Indeed, the whole purpose of this macro is just
|
||||
|
@ -123,20 +125,6 @@ execlists_active(const struct intel_engine_execlists *execlists)
|
|||
return active;
|
||||
}
|
||||
|
||||
static inline void
|
||||
execlists_active_lock_bh(struct intel_engine_execlists *execlists)
|
||||
{
|
||||
local_bh_disable(); /* prevent local softirq and lock recursion */
|
||||
tasklet_lock(&execlists->tasklet);
|
||||
}
|
||||
|
||||
static inline void
|
||||
execlists_active_unlock_bh(struct intel_engine_execlists *execlists)
|
||||
{
|
||||
tasklet_unlock(&execlists->tasklet);
|
||||
local_bh_enable(); /* restore softirq, and kick ksoftirqd! */
|
||||
}
|
||||
|
||||
struct i915_request *
|
||||
execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists);
|
||||
|
||||
|
@ -186,11 +174,12 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
|
|||
#define I915_GEM_HWS_PREEMPT_ADDR (I915_GEM_HWS_PREEMPT * sizeof(u32))
|
||||
#define I915_GEM_HWS_SEQNO 0x40
|
||||
#define I915_GEM_HWS_SEQNO_ADDR (I915_GEM_HWS_SEQNO * sizeof(u32))
|
||||
#define I915_GEM_HWS_MIGRATE (0x42 * sizeof(u32))
|
||||
#define I915_GEM_HWS_SCRATCH 0x80
|
||||
|
||||
#define I915_HWS_CSB_BUF0_INDEX 0x10
|
||||
#define I915_HWS_CSB_WRITE_INDEX 0x1f
|
||||
#define CNL_HWS_CSB_WRITE_INDEX 0x2f
|
||||
#define ICL_HWS_CSB_WRITE_INDEX 0x2f
|
||||
|
||||
void intel_engine_stop(struct intel_engine_cs *engine);
|
||||
void intel_engine_cleanup(struct intel_engine_cs *engine);
|
||||
|
@ -223,6 +212,9 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
|
|||
|
||||
void intel_engine_init_execlists(struct intel_engine_cs *engine);
|
||||
|
||||
bool intel_engine_irq_enable(struct intel_engine_cs *engine);
|
||||
void intel_engine_irq_disable(struct intel_engine_cs *engine);
|
||||
|
||||
static inline void __intel_engine_reset(struct intel_engine_cs *engine,
|
||||
bool stalled)
|
||||
{
|
||||
|
@ -248,17 +240,27 @@ __printf(3, 4)
|
|||
void intel_engine_dump(struct intel_engine_cs *engine,
|
||||
struct drm_printer *m,
|
||||
const char *header, ...);
|
||||
void intel_engine_dump_active_requests(struct list_head *requests,
|
||||
struct i915_request *hung_rq,
|
||||
struct drm_printer *m);
|
||||
|
||||
ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine,
|
||||
ktime_t *now);
|
||||
|
||||
struct i915_request *
|
||||
intel_engine_find_active_request(struct intel_engine_cs *engine);
|
||||
intel_engine_execlist_find_hung_request(struct intel_engine_cs *engine);
|
||||
|
||||
u32 intel_engine_context_size(struct intel_gt *gt, u8 class);
|
||||
struct intel_context *
|
||||
intel_engine_create_pinned_context(struct intel_engine_cs *engine,
|
||||
struct i915_address_space *vm,
|
||||
unsigned int ring_size,
|
||||
unsigned int hwsp,
|
||||
struct lock_class_key *key,
|
||||
const char *name);
|
||||
|
||||
void intel_engine_destroy_pinned_context(struct intel_context *ce);
|
||||
|
||||
void intel_engine_init_active(struct intel_engine_cs *engine,
|
||||
unsigned int subclass);
|
||||
#define ENGINE_PHYSICAL 0
|
||||
#define ENGINE_MOCK 1
|
||||
#define ENGINE_VIRTUAL 2
|
||||
|
@ -277,13 +279,60 @@ intel_engine_has_preempt_reset(const struct intel_engine_cs *engine)
|
|||
return intel_engine_has_preemption(engine);
|
||||
}
|
||||
|
||||
struct intel_context *
|
||||
intel_engine_create_virtual(struct intel_engine_cs **siblings,
|
||||
unsigned int count);
|
||||
|
||||
static inline bool
|
||||
intel_virtual_engine_has_heartbeat(const struct intel_engine_cs *engine)
|
||||
{
|
||||
/*
|
||||
* For non-GuC submission we expect the back-end to look at the
|
||||
* heartbeat status of the actual physical engine that the work
|
||||
* has been (or is being) scheduled on, so we should only reach
|
||||
* here with GuC submission enabled.
|
||||
*/
|
||||
GEM_BUG_ON(!intel_engine_uses_guc(engine));
|
||||
|
||||
return intel_guc_virtual_engine_has_heartbeat(engine);
|
||||
}
|
||||
|
||||
static inline bool
|
||||
intel_engine_has_heartbeat(const struct intel_engine_cs *engine)
|
||||
{
|
||||
if (!IS_ACTIVE(CONFIG_DRM_I915_HEARTBEAT_INTERVAL))
|
||||
return false;
|
||||
|
||||
return READ_ONCE(engine->props.heartbeat_interval_ms);
|
||||
if (intel_engine_is_virtual(engine))
|
||||
return intel_virtual_engine_has_heartbeat(engine);
|
||||
else
|
||||
return READ_ONCE(engine->props.heartbeat_interval_ms);
|
||||
}
|
||||
|
||||
static inline struct intel_engine_cs *
|
||||
intel_engine_get_sibling(struct intel_engine_cs *engine, unsigned int sibling)
|
||||
{
|
||||
GEM_BUG_ON(!intel_engine_is_virtual(engine));
|
||||
return engine->cops->get_sibling(engine, sibling);
|
||||
}
|
||||
|
||||
static inline void
|
||||
intel_engine_set_hung_context(struct intel_engine_cs *engine,
|
||||
struct intel_context *ce)
|
||||
{
|
||||
engine->hung_ce = ce;
|
||||
}
|
||||
|
||||
static inline void
|
||||
intel_engine_clear_hung_context(struct intel_engine_cs *engine)
|
||||
{
|
||||
intel_engine_set_hung_context(engine, NULL);
|
||||
}
|
||||
|
||||
static inline struct intel_context *
|
||||
intel_engine_get_hung_context(struct intel_engine_cs *engine)
|
||||
{
|
||||
return engine->hung_ce;
|
||||
}
|
||||
|
||||
#endif /* _INTEL_RINGBUFFER_H_ */
|
||||
|
|
|
@ -35,14 +35,12 @@
|
|||
#define DEFAULT_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
|
||||
#define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
|
||||
#define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
|
||||
#define GEN10_LR_CONTEXT_RENDER_SIZE (18 * PAGE_SIZE)
|
||||
#define GEN11_LR_CONTEXT_RENDER_SIZE (14 * PAGE_SIZE)
|
||||
|
||||
#define GEN8_LR_CONTEXT_OTHER_SIZE ( 2 * PAGE_SIZE)
|
||||
|
||||
#define MAX_MMIO_BASES 3
|
||||
struct engine_info {
|
||||
unsigned int hw_id;
|
||||
u8 class;
|
||||
u8 instance;
|
||||
/* mmio bases table *must* be sorted in reverse graphics_ver order */
|
||||
|
@ -54,7 +52,6 @@ struct engine_info {
|
|||
|
||||
static const struct engine_info intel_engines[] = {
|
||||
[RCS0] = {
|
||||
.hw_id = RCS0_HW,
|
||||
.class = RENDER_CLASS,
|
||||
.instance = 0,
|
||||
.mmio_bases = {
|
||||
|
@ -62,7 +59,6 @@ static const struct engine_info intel_engines[] = {
|
|||
},
|
||||
},
|
||||
[BCS0] = {
|
||||
.hw_id = BCS0_HW,
|
||||
.class = COPY_ENGINE_CLASS,
|
||||
.instance = 0,
|
||||
.mmio_bases = {
|
||||
|
@ -70,7 +66,6 @@ static const struct engine_info intel_engines[] = {
|
|||
},
|
||||
},
|
||||
[VCS0] = {
|
||||
.hw_id = VCS0_HW,
|
||||
.class = VIDEO_DECODE_CLASS,
|
||||
.instance = 0,
|
||||
.mmio_bases = {
|
||||
|
@ -80,7 +75,6 @@ static const struct engine_info intel_engines[] = {
|
|||
},
|
||||
},
|
||||
[VCS1] = {
|
||||
.hw_id = VCS1_HW,
|
||||
.class = VIDEO_DECODE_CLASS,
|
||||
.instance = 1,
|
||||
.mmio_bases = {
|
||||
|
@ -89,7 +83,6 @@ static const struct engine_info intel_engines[] = {
|
|||
},
|
||||
},
|
||||
[VCS2] = {
|
||||
.hw_id = VCS2_HW,
|
||||
.class = VIDEO_DECODE_CLASS,
|
||||
.instance = 2,
|
||||
.mmio_bases = {
|
||||
|
@ -97,15 +90,41 @@ static const struct engine_info intel_engines[] = {
|
|||
},
|
||||
},
|
||||
[VCS3] = {
|
||||
.hw_id = VCS3_HW,
|
||||
.class = VIDEO_DECODE_CLASS,
|
||||
.instance = 3,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 11, .base = GEN11_BSD4_RING_BASE }
|
||||
},
|
||||
},
|
||||
[VCS4] = {
|
||||
.class = VIDEO_DECODE_CLASS,
|
||||
.instance = 4,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHP_BSD5_RING_BASE }
|
||||
},
|
||||
},
|
||||
[VCS5] = {
|
||||
.class = VIDEO_DECODE_CLASS,
|
||||
.instance = 5,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHP_BSD6_RING_BASE }
|
||||
},
|
||||
},
|
||||
[VCS6] = {
|
||||
.class = VIDEO_DECODE_CLASS,
|
||||
.instance = 6,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHP_BSD7_RING_BASE }
|
||||
},
|
||||
},
|
||||
[VCS7] = {
|
||||
.class = VIDEO_DECODE_CLASS,
|
||||
.instance = 7,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHP_BSD8_RING_BASE }
|
||||
},
|
||||
},
|
||||
[VECS0] = {
|
||||
.hw_id = VECS0_HW,
|
||||
.class = VIDEO_ENHANCEMENT_CLASS,
|
||||
.instance = 0,
|
||||
.mmio_bases = {
|
||||
|
@ -114,13 +133,26 @@ static const struct engine_info intel_engines[] = {
|
|||
},
|
||||
},
|
||||
[VECS1] = {
|
||||
.hw_id = VECS1_HW,
|
||||
.class = VIDEO_ENHANCEMENT_CLASS,
|
||||
.instance = 1,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 11, .base = GEN11_VEBOX2_RING_BASE }
|
||||
},
|
||||
},
|
||||
[VECS2] = {
|
||||
.class = VIDEO_ENHANCEMENT_CLASS,
|
||||
.instance = 2,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHP_VEBOX3_RING_BASE }
|
||||
},
|
||||
},
|
||||
[VECS3] = {
|
||||
.class = VIDEO_ENHANCEMENT_CLASS,
|
||||
.instance = 3,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHP_VEBOX4_RING_BASE }
|
||||
},
|
||||
},
|
||||
};
|
||||
|
||||
/**
|
||||
|
@ -153,8 +185,6 @@ u32 intel_engine_context_size(struct intel_gt *gt, u8 class)
|
|||
case 12:
|
||||
case 11:
|
||||
return GEN11_LR_CONTEXT_RENDER_SIZE;
|
||||
case 10:
|
||||
return GEN10_LR_CONTEXT_RENDER_SIZE;
|
||||
case 9:
|
||||
return GEN9_LR_CONTEXT_RENDER_SIZE;
|
||||
case 8:
|
||||
|
@ -269,6 +299,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
|
|||
|
||||
BUILD_BUG_ON(MAX_ENGINE_CLASS >= BIT(GEN11_ENGINE_CLASS_WIDTH));
|
||||
BUILD_BUG_ON(MAX_ENGINE_INSTANCE >= BIT(GEN11_ENGINE_INSTANCE_WIDTH));
|
||||
BUILD_BUG_ON(I915_MAX_VCS > (MAX_ENGINE_INSTANCE + 1));
|
||||
BUILD_BUG_ON(I915_MAX_VECS > (MAX_ENGINE_INSTANCE + 1));
|
||||
|
||||
if (GEM_DEBUG_WARN_ON(id >= ARRAY_SIZE(gt->engine)))
|
||||
return -EINVAL;
|
||||
|
@ -294,7 +326,6 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
|
|||
engine->i915 = i915;
|
||||
engine->gt = gt;
|
||||
engine->uncore = gt->uncore;
|
||||
engine->hw_id = info->hw_id;
|
||||
guc_class = engine_class_to_guc_class(info->class);
|
||||
engine->guc_id = MAKE_GUC_ID(guc_class, info->instance);
|
||||
engine->mmio_base = __engine_mmio_base(i915, info->mmio_bases);
|
||||
|
@ -328,9 +359,6 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
|
|||
if (engine->context_size)
|
||||
DRIVER_CAPS(i915)->has_logical_contexts = true;
|
||||
|
||||
/* Nothing to do here, execute in order of dependencies */
|
||||
engine->schedule = NULL;
|
||||
|
||||
ewma__engine_latency_init(&engine->latency);
|
||||
seqcount_init(&engine->stats.lock);
|
||||
|
||||
|
@ -445,6 +473,28 @@ void intel_engines_free(struct intel_gt *gt)
|
|||
}
|
||||
}
|
||||
|
||||
static
bool gen11_vdbox_has_sfc(struct drm_i915_private *i915,
			 unsigned int physical_vdbox,
			 unsigned int logical_vdbox, u16 vdbox_mask)
{
	/*
	 * In Gen11, only even numbered logical VDBOXes are hooked
	 * up to an SFC (Scaler & Format Converter) unit.
	 * In Gen12, even numbered physical instances are always connected
	 * to an SFC. Odd numbered physical instances have an SFC only if
	 * the previous even instance is fused off.
	 */
	if (GRAPHICS_VER(i915) == 12)
		return (physical_vdbox % 2 == 0) ||
			!(BIT(physical_vdbox - 1) & vdbox_mask);
	else if (GRAPHICS_VER(i915) == 11)
		return logical_vdbox % 2 == 0;

	MISSING_CASE(GRAPHICS_VER(i915));
	return false;
}
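
The SFC rule above can be checked against concrete fuse masks with a small standalone sketch. It is illustrative only: `toy_vdbox_has_sfc()` and `toy_sfc_access_mask()` are hypothetical names, the graphics version is passed as a plain integer instead of being read from the device, and the loop assumes every bit set in `vdbox_mask` is a present VDBOX, which simplifies what init_engine_mask() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same decision as gen11_vdbox_has_sfc(), with the version as a parameter. */
static bool toy_vdbox_has_sfc(int graphics_ver,
			      unsigned int physical_vdbox,
			      unsigned int logical_vdbox,
			      uint16_t vdbox_mask)
{
	assert(physical_vdbox < 16);

	if (graphics_ver == 12)
		/* Odd instances get an SFC only if the even one before is fused off. */
		return (physical_vdbox % 2 == 0) ||
			!((1u << (physical_vdbox - 1)) & vdbox_mask);
	if (graphics_ver == 11)
		return logical_vdbox % 2 == 0;

	return false; /* unknown version: report no SFC in this sketch */
}

/* Build a bitmask of physical VDBOX instances that can reach an SFC. */
static uint16_t toy_sfc_access_mask(int graphics_ver, uint16_t vdbox_mask)
{
	uint16_t access = 0;
	unsigned int logical = 0;

	for (unsigned int i = 0; i < 16; i++) {
		if (!(vdbox_mask & (1u << i)))
			continue; /* fused off: no logical index consumed */

		if (toy_vdbox_has_sfc(graphics_ver, i, logical, vdbox_mask))
			access |= (uint16_t)(1u << i);
		logical++;
	}
	return access;
}
```

For example, on version 12 with instances 0 to 3 present (mask 0x000F) only instances 0 and 2 get SFC access, while with instances 0 and 2 fused off (mask 0x000A) the odd instances 1 and 3 take over. On version 11 the logical index decides, so a mask of 0x000D gives access to physical instances 0 and 3.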
|
||||
|
||||
/*
|
||||
* Determine which engines are fused off in our particular hardware.
|
||||
* Note that we have a catch-22 situation where we need to be able to access
|
||||
|
@ -471,7 +521,14 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
|
|||
if (GRAPHICS_VER(i915) < 11)
|
||||
return info->engine_mask;
|
||||
|
||||
media_fuse = ~intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
|
||||
/*
|
||||
* On newer platforms the fusing register is called 'enable' and has
|
||||
* enable semantics, while on older platforms it is called 'disable'
|
||||
* and bits have disable semantics.
|
||||
*/
|
||||
media_fuse = intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
|
||||
if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50))
|
||||
media_fuse = ~media_fuse;
|
||||
|
||||
vdbox_mask = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
|
||||
vebox_mask = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
|
||||
|
@ -489,13 +546,9 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
|
|||
continue;
|
||||
}
|
||||
|
||||
/*
|
||||
* In Gen11, only even numbered logical VDBOXes are
|
||||
* hooked up to an SFC (Scaler & Format Converter) unit.
|
||||
* In TGL each VDBOX has access to an SFC.
|
||||
*/
|
||||
if (GRAPHICS_VER(i915) >= 12 || logical_vdbox++ % 2 == 0)
|
||||
if (gen11_vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask))
|
||||
gt->info.vdbox_sfc_access |= BIT(i);
|
||||
logical_vdbox++;
|
||||
}
|
||||
drm_dbg(&i915->drm, "vdbox enable: %04x, instances: %04lx\n",
|
||||
vdbox_mask, VDBOX_MASK(gt));
|
||||
|
@ -585,9 +638,6 @@ void intel_engine_init_execlists(struct intel_engine_cs *engine)
|
|||
memset(execlists->pending, 0, sizeof(execlists->pending));
|
||||
execlists->active =
|
||||
memset(execlists->inflight, 0, sizeof(execlists->inflight));
|
||||
|
||||
execlists->queue_priority_hint = INT_MIN;
|
||||
execlists->queue = RB_ROOT_CACHED;
|
||||
}
|
||||
|
||||
static void cleanup_status_page(struct intel_engine_cs *engine)
|
||||
|
@ -714,11 +764,17 @@ static int engine_setup_common(struct intel_engine_cs *engine)
|
|||
goto err_status;
|
||||
}
|
||||
|
||||
engine->sched_engine = i915_sched_engine_create(ENGINE_PHYSICAL);
|
||||
if (!engine->sched_engine) {
|
||||
err = -ENOMEM;
|
||||
goto err_sched_engine;
|
||||
}
|
||||
engine->sched_engine->private_data = engine;
|
||||
|
||||
err = intel_engine_init_cmd_parser(engine);
|
||||
if (err)
|
||||
goto err_cmd_parser;
|
||||
|
||||
intel_engine_init_active(engine, ENGINE_PHYSICAL);
|
||||
intel_engine_init_execlists(engine);
|
||||
intel_engine_init__pm(engine);
|
||||
intel_engine_init_retire(engine);
|
||||
|
@ -737,7 +793,9 @@ static int engine_setup_common(struct intel_engine_cs *engine)
|
|||
return 0;
|
||||
|
||||
err_cmd_parser:
|
||||
intel_breadcrumbs_free(engine->breadcrumbs);
|
||||
i915_sched_engine_put(engine->sched_engine);
|
||||
err_sched_engine:
|
||||
intel_breadcrumbs_put(engine->breadcrumbs);
|
||||
err_status:
|
||||
cleanup_status_page(engine);
|
||||
return err;
|
||||
|
@ -775,11 +833,11 @@ static int measure_breadcrumb_dw(struct intel_context *ce)
|
|||
frame->rq.ring = &frame->ring;
|
||||
|
||||
mutex_lock(&ce->timeline->mutex);
|
||||
spin_lock_irq(&engine->active.lock);
|
||||
spin_lock_irq(&engine->sched_engine->lock);
|
||||
|
||||
dw = engine->emit_fini_breadcrumb(&frame->rq, frame->cs) - frame->cs;
|
||||
|
||||
spin_unlock_irq(&engine->active.lock);
|
||||
spin_unlock_irq(&engine->sched_engine->lock);
|
||||
mutex_unlock(&ce->timeline->mutex);
|
||||
|
||||
GEM_BUG_ON(dw & 1); /* RING_TAIL must be qword aligned */
|
||||
|
@ -788,33 +846,13 @@ static int measure_breadcrumb_dw(struct intel_context *ce)
|
|||
return dw;
|
||||
}
|
||||
|
||||
void
|
||||
intel_engine_init_active(struct intel_engine_cs *engine, unsigned int subclass)
|
||||
{
|
||||
INIT_LIST_HEAD(&engine->active.requests);
|
||||
INIT_LIST_HEAD(&engine->active.hold);
|
||||
|
||||
spin_lock_init(&engine->active.lock);
|
||||
lockdep_set_subclass(&engine->active.lock, subclass);
|
||||
|
||||
/*
|
||||
* Due to an interesting quirk in lockdep's internal debug tracking,
|
||||
* after setting a subclass we must ensure the lock is used. Otherwise,
|
||||
* nr_unused_locks is incremented once too often.
|
||||
*/
|
||||
#ifdef CONFIG_DEBUG_LOCK_ALLOC
|
||||
local_irq_disable();
|
||||
lock_map_acquire(&engine->active.lock.dep_map);
|
||||
lock_map_release(&engine->active.lock.dep_map);
|
||||
local_irq_enable();
|
||||
#endif
|
||||
}
|
||||
|
||||
static struct intel_context *
|
||||
create_pinned_context(struct intel_engine_cs *engine,
|
||||
unsigned int hwsp,
|
||||
struct lock_class_key *key,
|
||||
const char *name)
|
||||
struct intel_context *
|
||||
intel_engine_create_pinned_context(struct intel_engine_cs *engine,
|
||||
struct i915_address_space *vm,
|
||||
unsigned int ring_size,
|
||||
unsigned int hwsp,
|
||||
struct lock_class_key *key,
|
||||
const char *name)
|
||||
{
|
||||
struct intel_context *ce;
|
||||
int err;
|
||||
|
@ -825,6 +863,11 @@ create_pinned_context(struct intel_engine_cs *engine,
|
|||
|
||||
__set_bit(CONTEXT_BARRIER_BIT, &ce->flags);
|
||||
ce->timeline = page_pack_bits(NULL, hwsp);
|
||||
ce->ring = NULL;
|
||||
ce->ring_size = ring_size;
|
||||
|
||||
i915_vm_put(ce->vm);
|
||||
ce->vm = i915_vm_get(vm);
|
||||
|
||||
err = intel_context_pin(ce); /* perma-pin so it is always available */
|
||||
if (err) {
|
||||
|
@ -843,7 +886,7 @@ create_pinned_context(struct intel_engine_cs *engine,
|
|||
return ce;
|
||||
}
|
||||
|
||||
static void destroy_pinned_context(struct intel_context *ce)
|
||||
void intel_engine_destroy_pinned_context(struct intel_context *ce)
|
||||
{
|
||||
struct intel_engine_cs *engine = ce->engine;
|
||||
struct i915_vma *hwsp = engine->status_page.vma;
|
||||
|
@ -863,8 +906,9 @@ create_kernel_context(struct intel_engine_cs *engine)
|
|||
{
|
||||
static struct lock_class_key kernel;
|
||||
|
||||
return create_pinned_context(engine, I915_GEM_HWS_SEQNO_ADDR,
|
||||
&kernel, "kernel_context");
|
||||
return intel_engine_create_pinned_context(engine, engine->gt->vm, SZ_4K,
|
||||
I915_GEM_HWS_SEQNO_ADDR,
|
||||
&kernel, "kernel_context");
|
||||
}
|
||||
|
||||
/**
|
||||
|
@ -907,7 +951,7 @@ static int engine_init_common(struct intel_engine_cs *engine)
|
|||
return 0;
|
||||
|
||||
err_context:
|
||||
destroy_pinned_context(ce);
|
||||
intel_engine_destroy_pinned_context(ce);
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
@ -957,10 +1001,10 @@ int intel_engines_init(struct intel_gt *gt)
|
|||
*/
|
||||
void intel_engine_cleanup_common(struct intel_engine_cs *engine)
|
||||
{
|
||||
GEM_BUG_ON(!list_empty(&engine->active.requests));
|
||||
tasklet_kill(&engine->execlists.tasklet); /* flush the callback */
|
||||
GEM_BUG_ON(!list_empty(&engine->sched_engine->requests));
|
||||
|
||||
intel_breadcrumbs_free(engine->breadcrumbs);
|
||||
i915_sched_engine_put(engine->sched_engine);
|
||||
intel_breadcrumbs_put(engine->breadcrumbs);
|
||||
|
||||
intel_engine_fini_retire(engine);
|
||||
intel_engine_cleanup_cmd_parser(engine);
|
||||
|
@ -969,7 +1013,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
|
|||
fput(engine->default_state);
|
||||
|
||||
if (engine->kernel_context)
|
||||
destroy_pinned_context(engine->kernel_context);
|
||||
intel_engine_destroy_pinned_context(engine->kernel_context);
|
||||
|
||||
GEM_BUG_ON(!llist_empty(&engine->barrier_tasks));
|
||||
cleanup_status_page(engine);
|
||||
|
@ -1105,45 +1149,8 @@ static u32
|
|||
read_subslice_reg(const struct intel_engine_cs *engine,
|
||||
int slice, int subslice, i915_reg_t reg)
|
||||
{
|
||||
struct drm_i915_private *i915 = engine->i915;
|
||||
struct intel_uncore *uncore = engine->uncore;
|
||||
u32 mcr_mask, mcr_ss, mcr, old_mcr, val;
|
||||
enum forcewake_domains fw_domains;
|
||||
|
||||
if (GRAPHICS_VER(i915) >= 11) {
|
||||
mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
|
||||
mcr_ss = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
|
||||
} else {
|
||||
mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
|
||||
mcr_ss = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
|
||||
}
|
||||
|
||||
fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
|
||||
FW_REG_READ);
|
||||
fw_domains |= intel_uncore_forcewake_for_reg(uncore,
|
||||
GEN8_MCR_SELECTOR,
|
||||
FW_REG_READ | FW_REG_WRITE);
|
||||
|
||||
spin_lock_irq(&uncore->lock);
|
||||
intel_uncore_forcewake_get__locked(uncore, fw_domains);
|
||||
|
||||
old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
|
||||
|
||||
mcr &= ~mcr_mask;
|
||||
mcr |= mcr_ss;
|
||||
intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
|
||||
|
||||
val = intel_uncore_read_fw(uncore, reg);
|
||||
|
||||
mcr &= ~mcr_mask;
|
||||
mcr |= old_mcr & mcr_mask;
|
||||
|
||||
intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
|
||||
|
||||
intel_uncore_forcewake_put__locked(uncore, fw_domains);
|
||||
spin_unlock_irq(&uncore->lock);
|
||||
|
||||
return val;
|
||||
return intel_uncore_read_with_mcr_steering(engine->uncore, reg,
|
||||
slice, subslice);
|
||||
}
|
||||
|
||||
/* NB: please notice the memset */
|
||||
|
@ -1243,7 +1250,7 @@ static bool ring_is_idle(struct intel_engine_cs *engine)
|
|||
|
||||
void __intel_engine_flush_submission(struct intel_engine_cs *engine, bool sync)
|
||||
{
|
||||
struct tasklet_struct *t = &engine->execlists.tasklet;
|
||||
struct tasklet_struct *t = &engine->sched_engine->tasklet;
|
||||
|
||||
if (!t->callback)
|
||||
return;
|
||||
|
@ -1283,7 +1290,7 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
|
|||
intel_engine_flush_submission(engine);
|
||||
|
||||
/* ELSP is empty, but there are ready requests? E.g. after reset */
|
||||
if (!RB_EMPTY_ROOT(&engine->execlists.queue.rb_root))
|
||||
if (!i915_sched_engine_is_empty(engine->sched_engine))
|
||||
return false;
|
||||
|
||||
/* Ring stopped? */
|
||||
|
@ -1314,6 +1321,30 @@ bool intel_engines_are_idle(struct intel_gt *gt)
|
|||
return true;
|
||||
}
|
||||
|
||||
bool intel_engine_irq_enable(struct intel_engine_cs *engine)
|
||||
{
|
||||
if (!engine->irq_enable)
|
||||
return false;
|
||||
|
||||
/* Caller disables interrupts */
|
||||
spin_lock(&engine->gt->irq_lock);
|
||||
engine->irq_enable(engine);
|
||||
spin_unlock(&engine->gt->irq_lock);
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
void intel_engine_irq_disable(struct intel_engine_cs *engine)
|
||||
{
|
||||
if (!engine->irq_disable)
|
||||
return;
|
||||
|
||||
/* Caller disables interrupts */
|
||||
spin_lock(&engine->gt->irq_lock);
|
||||
engine->irq_disable(engine);
|
||||
spin_unlock(&engine->gt->irq_lock);
|
||||
}
|
||||
|
||||
void intel_engines_reset_default_submission(struct intel_gt *gt)
|
||||
{
|
||||
struct intel_engine_cs *engine;
|
||||
|
@ -1349,7 +1380,7 @@ static struct intel_timeline *get_timeline(struct i915_request *rq)
|
|||
struct intel_timeline *tl;
|
||||
|
||||
/*
|
||||
* Even though we are holding the engine->active.lock here, there
|
||||
* Even though we are holding the engine->sched_engine->lock here, there
|
||||
* is no control over the submission queue per-se and we are
|
||||
* inspecting the active state at a random point in time, with an
|
||||
* unknown queue. Play safe and make sure the timeline remains valid.
|
||||
|
@ -1504,8 +1535,8 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
|
|||
|
||||
drm_printf(m, "\tExeclist tasklet queued? %s (%s), preempt? %s, timeslice? %s\n",
|
||||
yesno(test_bit(TASKLET_STATE_SCHED,
|
||||
&engine->execlists.tasklet.state)),
|
||||
enableddisabled(!atomic_read(&engine->execlists.tasklet.count)),
|
||||
&engine->sched_engine->tasklet.state)),
|
||||
enableddisabled(!atomic_read(&engine->sched_engine->tasklet.count)),
|
||||
repr_timer(&engine->execlists.preempt),
|
||||
repr_timer(&engine->execlists.timer));
|
||||
|
||||
|
@ -1529,7 +1560,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
|
|||
idx, hws[idx * 2], hws[idx * 2 + 1]);
|
||||
}
|
||||
|
||||
execlists_active_lock_bh(execlists);
|
||||
i915_sched_engine_active_lock_bh(engine->sched_engine);
|
||||
rcu_read_lock();
|
||||
for (port = execlists->active; (rq = *port); port++) {
|
||||
char hdr[160];
|
||||
|
@ -1560,7 +1591,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
|
|||
i915_request_show(m, rq, hdr, 0);
|
||||
}
|
||||
rcu_read_unlock();
|
||||
execlists_active_unlock_bh(execlists);
|
||||
i915_sched_engine_active_unlock_bh(engine->sched_engine);
|
||||
} else if (GRAPHICS_VER(dev_priv) > 6) {
|
||||
drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
|
||||
ENGINE_READ(engine, RING_PP_DIR_BASE));
|
||||
|
@ -1650,6 +1681,98 @@ static void print_properties(struct intel_engine_cs *engine,
|
|||
read_ul(&engine->defaults, p->offset));
|
||||
}
|
||||
|
||||
static void engine_dump_request(struct i915_request *rq, struct drm_printer *m, const char *msg)
|
||||
{
|
||||
struct intel_timeline *tl = get_timeline(rq);
|
||||
|
||||
i915_request_show(m, rq, msg, 0);
|
||||
|
||||
drm_printf(m, "\t\tring->start: 0x%08x\n",
|
||||
i915_ggtt_offset(rq->ring->vma));
|
||||
drm_printf(m, "\t\tring->head: 0x%08x\n",
|
||||
rq->ring->head);
|
||||
drm_printf(m, "\t\tring->tail: 0x%08x\n",
|
||||
rq->ring->tail);
|
||||
drm_printf(m, "\t\tring->emit: 0x%08x\n",
|
||||
rq->ring->emit);
|
||||
drm_printf(m, "\t\tring->space: 0x%08x\n",
|
||||
rq->ring->space);
|
||||
|
||||
if (tl) {
|
||||
drm_printf(m, "\t\tring->hwsp: 0x%08x\n",
|
||||
tl->hwsp_offset);
|
||||
intel_timeline_put(tl);
|
||||
}
|
||||
|
||||
print_request_ring(m, rq);
|
||||
|
||||
if (rq->context->lrc_reg_state) {
|
||||
drm_printf(m, "Logical Ring Context:\n");
|
||||
hexdump(m, rq->context->lrc_reg_state, PAGE_SIZE);
|
||||
}
|
||||
}
|
||||
|
||||
void intel_engine_dump_active_requests(struct list_head *requests,
				       struct i915_request *hung_rq,
				       struct drm_printer *m)
{
	struct i915_request *rq;
	const char *msg;
	enum i915_request_state state;

	list_for_each_entry(rq, requests, sched.link) {
		if (rq == hung_rq)
			continue;

		state = i915_test_request_state(rq);
		if (state < I915_REQUEST_QUEUED)
			continue;

		if (state == I915_REQUEST_ACTIVE)
			msg = "\t\tactive on engine";
		else
			msg = "\t\tactive in queue";

		engine_dump_request(rq, m, msg);
	}
}
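
The filtering done by intel_engine_dump_active_requests() can be sketched in isolation as follows. This is a hedged illustration: `toy_request`, `toy_request_state` and `toy_dump_label()` are hypothetical, and the enum ordering is an assumption chosen so that the `state < QUEUED` comparison from the hunk skips requests that are not yet queued or already complete.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed ordering: everything below QUEUED is skipped by the dump. */
enum toy_request_state {
	TOY_REQUEST_UNKNOWN = 0,
	TOY_REQUEST_COMPLETE,
	TOY_REQUEST_PENDING,
	TOY_REQUEST_QUEUED,
	TOY_REQUEST_ACTIVE,
};

struct toy_request {
	enum toy_request_state state;
};

/*
 * Return the label the dump would print for `rq`, or NULL if the request
 * is skipped (it is the hung request, which is printed separately, or it
 * has not reached the queued state).
 */
static const char *toy_dump_label(const struct toy_request *rq,
				  const struct toy_request *hung_rq)
{
	assert(rq != NULL);

	if (rq == hung_rq)
		return NULL;
	if (rq->state < TOY_REQUEST_QUEUED)
		return NULL;

	return rq->state == TOY_REQUEST_ACTIVE ?
		"active on engine" : "active in queue";
}
```

Skipping the hung request here avoids printing it twice, since the caller in the hunk dumps it first under the "hung" label.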
|
||||
|
||||
static void engine_dump_active_requests(struct intel_engine_cs *engine, struct drm_printer *m)
|
||||
{
|
||||
struct i915_request *hung_rq = NULL;
|
||||
struct intel_context *ce;
|
||||
bool guc;
|
||||
|
||||
/*
|
||||
* No need for an engine->irq_seqno_barrier() before the seqno reads.
|
||||
* The GPU is still running so requests are still executing and any
|
||||
* hardware reads will be out of date by the time they are reported.
|
||||
* But the intention here is just to report an instantaneous snapshot
|
||||
* so that's fine.
|
||||
*/
|
||||
lockdep_assert_held(&engine->sched_engine->lock);
|
||||
|
||||
drm_printf(m, "\tRequests:\n");
|
||||
|
||||
guc = intel_uc_uses_guc_submission(&engine->gt->uc);
|
||||
if (guc) {
|
||||
ce = intel_engine_get_hung_context(engine);
|
||||
if (ce)
|
||||
hung_rq = intel_context_find_active_request(ce);
|
||||
} else {
|
||||
hung_rq = intel_engine_execlist_find_hung_request(engine);
|
||||
}
|
||||
|
||||
if (hung_rq)
|
||||
engine_dump_request(hung_rq, m, "\t\thung");
|
||||
|
||||
if (guc)
|
||||
intel_guc_dump_active_requests(engine, hung_rq, m);
|
||||
else
|
||||
intel_engine_dump_active_requests(&engine->sched_engine->requests,
|
||||
hung_rq, m);
|
||||
}
|
||||
|
||||
void intel_engine_dump(struct intel_engine_cs *engine,
|
||||
struct drm_printer *m,
|
||||
const char *header, ...)
|
||||
|
@ -1694,41 +1817,12 @@ void intel_engine_dump(struct intel_engine_cs *engine,
|
|||
i915_reset_count(error));
|
||||
print_properties(engine, m);
|
||||
|
||||
drm_printf(m, "\tRequests:\n");
|
||||
spin_lock_irqsave(&engine->sched_engine->lock, flags);
|
||||
engine_dump_active_requests(engine, m);
|
||||
|
||||
spin_lock_irqsave(&engine->active.lock, flags);
|
||||
rq = intel_engine_find_active_request(engine);
|
||||
if (rq) {
|
||||
struct intel_timeline *tl = get_timeline(rq);
|
||||
|
||||
i915_request_show(m, rq, "\t\tactive ", 0);
|
||||
|
||||
drm_printf(m, "\t\tring->start: 0x%08x\n",
|
||||
i915_ggtt_offset(rq->ring->vma));
|
||||
drm_printf(m, "\t\tring->head: 0x%08x\n",
|
||||
rq->ring->head);
|
||||
drm_printf(m, "\t\tring->tail: 0x%08x\n",
|
||||
rq->ring->tail);
|
||||
drm_printf(m, "\t\tring->emit: 0x%08x\n",
|
||||
rq->ring->emit);
|
||||
drm_printf(m, "\t\tring->space: 0x%08x\n",
|
||||
rq->ring->space);
|
||||
|
||||
if (tl) {
|
||||
drm_printf(m, "\t\tring->hwsp: 0x%08x\n",
|
||||
tl->hwsp_offset);
|
||||
intel_timeline_put(tl);
|
||||
}
|
||||
|
||||
print_request_ring(m, rq);
|
||||
|
||||
if (rq->context->lrc_reg_state) {
|
||||
drm_printf(m, "Logical Ring Context:\n");
|
||||
hexdump(m, rq->context->lrc_reg_state, PAGE_SIZE);
|
||||
}
|
||||
}
|
||||
drm_printf(m, "\tOn hold?: %lu\n", list_count(&engine->active.hold));
|
||||
spin_unlock_irqrestore(&engine->active.lock, flags);
|
||||
drm_printf(m, "\tOn hold?: %lu\n",
|
||||
list_count(&engine->sched_engine->hold));
|
||||
spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
|
||||
|
||||
drm_printf(m, "\tMMIO base: 0x%08x\n", engine->mmio_base);
|
||||
wakeref = intel_runtime_pm_get_if_in_use(engine->uncore->rpm);
|
||||
|
@ -1785,18 +1879,32 @@ ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine, ktime_t *now)
|
|||
return total;
|
||||
}
|
||||
|
||||
static bool match_ring(struct i915_request *rq)
|
||||
struct intel_context *
|
||||
intel_engine_create_virtual(struct intel_engine_cs **siblings,
|
||||
unsigned int count)
|
||||
{
|
||||
u32 ring = ENGINE_READ(rq->engine, RING_START);
|
||||
if (count == 0)
|
||||
return ERR_PTR(-EINVAL);
|
||||
|
||||
return ring == i915_ggtt_offset(rq->ring->vma);
|
||||
if (count == 1)
|
||||
return intel_context_create(siblings[0]);
|
||||
|
||||
GEM_BUG_ON(!siblings[0]->cops->create_virtual);
|
||||
return siblings[0]->cops->create_virtual(siblings, count);
|
||||
}
|
||||
|
||||
struct i915_request *
|
||||
intel_engine_find_active_request(struct intel_engine_cs *engine)
|
||||
intel_engine_execlist_find_hung_request(struct intel_engine_cs *engine)
|
||||
{
|
||||
struct i915_request *request, *active = NULL;
|
||||
|
||||
/*
|
||||
* This search does not work in GuC submission mode. However, the GuC
|
||||
* will report the hanging context directly to the driver itself. So
|
||||
* the driver should never get here when in GuC mode.
|
||||
*/
|
||||
GEM_BUG_ON(intel_uc_uses_guc_submission(&engine->gt->uc));
|
||||
|
||||
/*
|
||||
* We are called by the error capture, reset and to dump engine
|
||||
* state at random points in time. In particular, note that neither is
|
||||
|
@ -1808,7 +1916,7 @@ intel_engine_find_active_request(struct intel_engine_cs *engine)
|
|||
* At all other times, we must assume the GPU is still running, but
|
||||
* we only care about the snapshot of this moment.
|
||||
*/
|
||||
lockdep_assert_held(&engine->active.lock);
|
||||
lockdep_assert_held(&engine->sched_engine->lock);
|
||||
|
||||
rcu_read_lock();
|
||||
request = execlists_active(&engine->execlists);
|
||||
|
@ -1826,15 +1934,9 @@ intel_engine_find_active_request(struct intel_engine_cs *engine)
|
|||
if (active)
|
||||
return active;
|
||||
|
||||
list_for_each_entry(request, &engine->active.requests, sched.link) {
|
||||
if (__i915_request_is_complete(request))
|
||||
continue;
|
||||
|
||||
if (!__i915_request_has_started(request))
|
||||
continue;
|
||||
|
||||
/* More than one preemptible request may match! */
|
||||
if (!match_ring(request))
|
||||
list_for_each_entry(request, &engine->sched_engine->requests,
|
||||
sched.link) {
|
||||
if (i915_test_request_state(request) != I915_REQUEST_ACTIVE)
|
||||
continue;
|
||||
|
||||
active = request;
|
||||
|
|
|
@ -70,12 +70,38 @@ static void show_heartbeat(const struct i915_request *rq,
|
|||
{
|
||||
struct drm_printer p = drm_debug_printer("heartbeat");
|
||||
|
||||
intel_engine_dump(engine, &p,
|
||||
"%s heartbeat {seqno:%llx:%lld, prio:%d} not ticking\n",
|
||||
engine->name,
|
||||
rq->fence.context,
|
||||
rq->fence.seqno,
|
||||
rq->sched.attr.priority);
|
||||
if (!rq) {
|
||||
intel_engine_dump(engine, &p,
|
||||
"%s heartbeat not ticking\n",
|
||||
engine->name);
|
||||
} else {
|
||||
intel_engine_dump(engine, &p,
|
||||
"%s heartbeat {seqno:%llx:%lld, prio:%d} not ticking\n",
|
||||
engine->name,
|
||||
rq->fence.context,
|
||||
rq->fence.seqno,
|
||||
rq->sched.attr.priority);
|
||||
}
|
||||
}
|
||||
|
||||
static void
|
||||
reset_engine(struct intel_engine_cs *engine, struct i915_request *rq)
|
||||
{
|
||||
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
|
||||
show_heartbeat(rq, engine);
|
||||
|
||||
if (intel_engine_uses_guc(engine))
|
||||
/*
|
||||
* GuC itself is toast or GuC's hang detection
|
||||
* is disabled. Either way, need to find the
|
||||
* hang culprit manually.
|
||||
*/
|
||||
intel_guc_find_hung_context(engine);
|
||||
|
||||
intel_gt_handle_error(engine->gt, engine->mask,
|
||||
I915_ERROR_CAPTURE,
|
||||
"stopped heartbeat on %s",
|
||||
engine->name);
|
||||
}
|
||||
|
||||
static void heartbeat(struct work_struct *wrk)
|
||||
|
@ -102,6 +128,11 @@ static void heartbeat(struct work_struct *wrk)
|
|||
if (intel_gt_is_wedged(engine->gt))
|
||||
goto out;
|
||||
|
||||
if (i915_sched_engine_disabled(engine->sched_engine)) {
|
||||
reset_engine(engine, engine->heartbeat.systole);
|
||||
goto out;
|
||||
}
|
||||
|
||||
if (engine->heartbeat.systole) {
|
||||
long delay = READ_ONCE(engine->props.heartbeat_interval_ms);
|
||||
|
||||
|
@ -121,7 +152,7 @@ static void heartbeat(struct work_struct *wrk)
|
|||
* but all other contexts, including the kernel
|
||||
* context are stuck waiting for the signal.
|
||||
*/
|
||||
} else if (engine->schedule &&
|
||||
} else if (engine->sched_engine->schedule &&
|
||||
rq->sched.attr.priority < I915_PRIORITY_BARRIER) {
|
||||
/*
|
||||
* Gradually raise the priority of the heartbeat to
|
||||
|
@ -136,16 +167,10 @@ static void heartbeat(struct work_struct *wrk)
|
|||
attr.priority = I915_PRIORITY_BARRIER;
|
||||
|
||||
local_bh_disable();
|
||||
engine->schedule(rq, &attr);
|
||||
engine->sched_engine->schedule(rq, &attr);
|
||||
local_bh_enable();
|
||||
} else {
|
||||
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
|
||||
show_heartbeat(rq, engine);
|
||||
|
||||
intel_gt_handle_error(engine->gt, engine->mask,
|
||||
I915_ERROR_CAPTURE,
|
||||
"stopped heartbeat on %s",
|
||||
engine->name);
|
||||
reset_engine(engine, rq);
|
||||
}
|
||||
|
||||
rq->emitted_jiffies = jiffies;
|
||||
|
@ -194,6 +219,25 @@ void intel_engine_park_heartbeat(struct intel_engine_cs *engine)
|
|||
i915_request_put(fetch_and_zero(&engine->heartbeat.systole));
|
||||
}
|
||||
|
||||
void intel_gt_unpark_heartbeats(struct intel_gt *gt)
|
||||
{
|
||||
struct intel_engine_cs *engine;
|
||||
enum intel_engine_id id;
|
||||
|
||||
for_each_engine(engine, gt, id)
|
||||
if (intel_engine_pm_is_awake(engine))
|
||||
intel_engine_unpark_heartbeat(engine);
|
||||
}
|
||||
|
||||
void intel_gt_park_heartbeats(struct intel_gt *gt)
|
||||
{
|
||||
struct intel_engine_cs *engine;
|
||||
enum intel_engine_id id;
|
||||
|
||||
for_each_engine(engine, gt, id)
|
||||
intel_engine_park_heartbeat(engine);
|
||||
}
|
||||
|
||||
void intel_engine_init_heartbeat(struct intel_engine_cs *engine)
|
||||
{
|
||||
INIT_DELAYED_WORK(&engine->heartbeat.work, heartbeat);
|
||||
|
|
|
@ -7,6 +7,7 @@
|
|||
#define INTEL_ENGINE_HEARTBEAT_H
|
||||
|
||||
struct intel_engine_cs;
|
||||
struct intel_gt;
|
||||
|
||||
void intel_engine_init_heartbeat(struct intel_engine_cs *engine);
|
||||
|
||||
|
@ -16,6 +17,9 @@ int intel_engine_set_heartbeat(struct intel_engine_cs *engine,
|
|||
void intel_engine_park_heartbeat(struct intel_engine_cs *engine);
|
||||
void intel_engine_unpark_heartbeat(struct intel_engine_cs *engine);
|
||||
|
||||
void intel_gt_park_heartbeats(struct intel_gt *gt);
|
||||
void intel_gt_unpark_heartbeats(struct intel_gt *gt);
|
||||
|
||||
int intel_engine_pulse(struct intel_engine_cs *engine);
|
||||
int intel_engine_flush_barriers(struct intel_engine_cs *engine);
|
||||
|
||||
|
|
|
@ -275,13 +275,11 @@ static int __engine_park(struct intel_wakeref *wf)
|
|||
intel_breadcrumbs_park(engine->breadcrumbs);
|
||||
|
||||
/* Must be reset upon idling, or we may miss the busy wakeup. */
|
||||
GEM_BUG_ON(engine->execlists.queue_priority_hint != INT_MIN);
|
||||
GEM_BUG_ON(engine->sched_engine->queue_priority_hint != INT_MIN);
|
||||
|
||||
if (engine->park)
|
||||
engine->park(engine);
|
||||
|
||||
engine->execlists.no_priolist = false;
|
||||
|
||||
/* While gt calls i915_vma_parked(), we have to break the lock cycle */
|
||||
intel_gt_pm_put_async(engine->gt);
|
||||
return 0;
|
||||
|
|
|
@ -21,32 +21,20 @@
|
|||
#include "i915_pmu.h"
|
||||
#include "i915_priolist_types.h"
|
||||
#include "i915_selftest.h"
|
||||
#include "intel_breadcrumbs_types.h"
|
||||
#include "intel_sseu.h"
|
||||
#include "intel_timeline_types.h"
|
||||
#include "intel_uncore.h"
|
||||
#include "intel_wakeref.h"
|
||||
#include "intel_workarounds_types.h"
|
||||
|
||||
/* Legacy HW Engine ID */
|
||||
|
||||
#define RCS0_HW 0
|
||||
#define VCS0_HW 1
|
||||
#define BCS0_HW 2
|
||||
#define VECS0_HW 3
|
||||
#define VCS1_HW 4
|
||||
#define VCS2_HW 6
|
||||
#define VCS3_HW 7
|
||||
#define VECS1_HW 12
|
||||
|
||||
/* Gen11+ HW Engine class + instance */
|
||||
/* HW Engine class + instance */
|
||||
#define RENDER_CLASS 0
|
||||
#define VIDEO_DECODE_CLASS 1
|
||||
#define VIDEO_ENHANCEMENT_CLASS 2
|
||||
#define COPY_ENGINE_CLASS 3
|
||||
#define OTHER_CLASS 4
|
||||
#define MAX_ENGINE_CLASS 4
|
||||
#define MAX_ENGINE_INSTANCE 3
|
||||
#define MAX_ENGINE_INSTANCE 7
|
||||
|
||||
#define I915_MAX_SLICES 3
|
||||
#define I915_MAX_SUBSLICES 8
|
||||
|
@ -59,11 +47,13 @@ struct drm_i915_reg_table;
|
|||
struct i915_gem_context;
|
||||
struct i915_request;
|
||||
struct i915_sched_attr;
|
||||
struct i915_sched_engine;
|
||||
struct intel_gt;
|
||||
struct intel_ring;
|
||||
struct intel_uncore;
|
||||
struct intel_breadcrumbs;
|
||||
|
||||
typedef u8 intel_engine_mask_t;
|
||||
typedef u32 intel_engine_mask_t;
|
||||
#define ALL_ENGINES ((intel_engine_mask_t)~0ul)
|
||||
|
||||
struct intel_hw_status_page {
|
||||
|
@ -100,8 +90,8 @@ struct i915_ctx_workarounds {
|
|||
struct i915_vma *vma;
|
||||
};
|
||||
|
||||
#define I915_MAX_VCS 4
|
||||
#define I915_MAX_VECS 2
|
||||
#define I915_MAX_VCS 8
|
||||
#define I915_MAX_VECS 4
|
||||
|
||||
/*
|
||||
* Engine IDs definitions.
|
||||
|
@ -114,9 +104,15 @@ enum intel_engine_id {
|
|||
VCS1,
|
||||
VCS2,
|
||||
VCS3,
|
||||
VCS4,
|
||||
VCS5,
|
||||
VCS6,
|
||||
VCS7,
|
||||
#define _VCS(n) (VCS0 + (n))
|
||||
VECS0,
|
||||
VECS1,
|
||||
VECS2,
|
||||
VECS3,
|
||||
#define _VECS(n) (VECS0 + (n))
|
||||
I915_NUM_ENGINES
|
||||
#define INVALID_ENGINE ((enum intel_engine_id)-1)
|
||||
|
@ -137,11 +133,6 @@ struct st_preempt_hang {
|
|||
* driver and the hardware state for execlist mode of submission.
|
||||
*/
|
||||
struct intel_engine_execlists {
|
||||
/**
|
||||
* @tasklet: softirq tasklet for bottom handler
|
||||
*/
|
||||
struct tasklet_struct tasklet;
|
||||
|
||||
/**
|
||||
* @timer: kick the current context if its timeslice expires
|
||||
*/
|
||||
|
@ -152,11 +143,6 @@ struct intel_engine_execlists {
|
|||
*/
|
||||
struct timer_list preempt;
|
||||
|
||||
/**
|
||||
* @default_priolist: priority list for I915_PRIORITY_NORMAL
|
||||
*/
|
||||
struct i915_priolist default_priolist;
|
||||
|
||||
/**
|
||||
* @ccid: identifier for contexts submitted to this engine
|
||||
*/
|
||||
|
@ -191,11 +177,6 @@ struct intel_engine_execlists {
|
|||
*/
|
||||
u32 reset_ccid;
|
||||
|
||||
/**
|
||||
* @no_priolist: priority lists disabled
|
||||
*/
|
||||
bool no_priolist;
|
||||
|
||||
/**
|
||||
* @submit_reg: gen-specific execlist submission register
|
||||
* set to the ExecList Submission Port (elsp) register pre-Gen11 and to
|
||||
|
@ -238,23 +219,10 @@ struct intel_engine_execlists {
|
|||
unsigned int port_mask;
|
||||
|
||||
/**
|
||||
* @queue_priority_hint: Highest pending priority.
|
||||
*
|
||||
* When we add requests into the queue, or adjust the priority of
|
||||
* executing requests, we compute the maximum priority of those
|
||||
* pending requests. We can then use this value to determine if
|
||||
* we need to preempt the executing requests to service the queue.
|
||||
* However, since we may have recorded the priority of an inflight
|
||||
* request we wanted to preempt but since completed, at the time of
|
||||
* dequeuing the priority hint may no longer match the highest
|
||||
* available request priority.
|
||||
* @virtual: Queue of requests on a virtual engine, sorted by priority.
|
||||
* Each RB entry is a struct i915_priolist containing a list of requests
|
||||
* of the same priority.
|
||||
*/
|
||||
int queue_priority_hint;
|
||||
|
||||
/**
|
||||
* @queue: queue of requests, in priority lists
|
||||
*/
|
||||
struct rb_root_cached queue;
|
||||
struct rb_root_cached virtual;
|
||||
|
||||
/**
|
||||
|
@ -295,7 +263,6 @@ struct intel_engine_cs {
|
|||
enum intel_engine_id id;
|
||||
enum intel_engine_id legacy_idx;
|
||||
|
||||
unsigned int hw_id;
|
||||
unsigned int guc_id;
|
||||
|
||||
intel_engine_mask_t mask;
|
||||
|
@ -326,15 +293,13 @@ struct intel_engine_cs {
|
|||
|
||||
struct intel_sseu sseu;
|
||||
|
||||
struct {
|
||||
spinlock_t lock;
|
||||
struct list_head requests;
|
||||
struct list_head hold; /* ready requests, but on hold */
|
||||
} active;
|
||||
struct i915_sched_engine *sched_engine;
|
||||
|
||||
/* keep a request in reserve for a [pm] barrier under oom */
|
||||
struct i915_request *request_pool;
|
||||
|
||||
struct intel_context *hung_ce;
|
||||
|
||||
struct llist_head barrier_tasks;
|
||||
|
||||
struct intel_context *kernel_context; /* pinned */
|
||||
|
@ -419,6 +384,8 @@ struct intel_engine_cs {
|
|||
void (*park)(struct intel_engine_cs *engine);
|
||||
void (*unpark)(struct intel_engine_cs *engine);
|
||||
|
||||
void (*bump_serial)(struct intel_engine_cs *engine);
|
||||
|
||||
void (*set_default_submission)(struct intel_engine_cs *engine);
|
||||
|
||||
const struct intel_context_ops *cops;
|
||||
|
@ -447,23 +414,14 @@ struct intel_engine_cs {
|
|||
*/
|
||||
void (*submit_request)(struct i915_request *rq);
|
||||
|
||||
/*
|
||||
* Called on signaling of a SUBMIT_FENCE, passing along the signaling
|
||||
* request down to the bonded pairs.
|
||||
*/
|
||||
void (*bond_execute)(struct i915_request *rq,
|
||||
struct dma_fence *signal);
|
||||
|
||||
/*
|
||||
* Call when the priority on a request has changed and it and its
|
||||
* dependencies may need rescheduling. Note the request itself may
|
||||
* not be ready to run!
|
||||
*/
|
||||
void (*schedule)(struct i915_request *request,
|
||||
const struct i915_sched_attr *attr);
|
||||
|
||||
void (*release)(struct intel_engine_cs *engine);
|
||||
|
||||
/*
|
||||
* Add / remove request from engine active tracking
|
||||
*/
|
||||
void (*add_active_request)(struct i915_request *rq);
|
||||
void (*remove_active_request)(struct i915_request *rq);
|
||||
|
||||
struct intel_engine_execlists execlists;
|
||||
|
||||
/*
|
||||
|
@ -485,6 +443,7 @@ struct intel_engine_cs {
|
|||
#define I915_ENGINE_IS_VIRTUAL BIT(5)
|
||||
#define I915_ENGINE_HAS_RELATIVE_MMIO BIT(6)
|
||||
#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(7)
|
||||
#define I915_ENGINE_WANT_FORCED_PREEMPTION BIT(8)
|
||||
unsigned int flags;
|
||||
|
||||
/*
|
||||
|
|
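The hunks above widen `intel_engine_mask_t` from `u8` to `u32` while adding VCS4..VCS7 and VECS2..VECS3 to `enum intel_engine_id`. The following is a minimal standalone sketch of why an 8-bit mask no longer suffices. It assumes the enum starts with `RCS0` and `BCS0` (those entries fall outside the visible hunk), and the names `engine_id`, `mask8_t`, `mask32_t`, `engine_bit8` and `engine_bit32` are hypothetical, not the driver's own.

```c
#include <stdint.h>

/* Hypothetical mirror of enum intel_engine_id; RCS0 and BCS0 are assumed. */
enum engine_id {
	RCS0, BCS0,
	VCS0, VCS1, VCS2, VCS3, VCS4, VCS5, VCS6, VCS7,
	VECS0, VECS1, VECS2, VECS3,
	NUM_ENGINES
};

typedef uint8_t mask8_t;	/* width before the change */
typedef uint32_t mask32_t;	/* width after the change */

/* One bit per engine in an 8-bit mask: bits 8 and above are lost. */
static mask8_t engine_bit8(enum engine_id id)
{
	return (mask8_t)(1u << id);
}

/* One bit per engine in a 32-bit mask: every engine ID fits. */
static mask32_t engine_bit32(enum engine_id id)
{
	return (mask32_t)1u << id;
}
```

Under these assumptions the enum holds 14 engines, so any engine at index 8 or higher would silently map to an empty mask with the old 8-bit type.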
|
@ -11,6 +11,7 @@
|
|||
#include "intel_engine.h"
|
||||
#include "intel_engine_user.h"
|
||||
#include "intel_gt.h"
|
||||
#include "uc/intel_guc_submission.h"
|
||||
|
||||
struct intel_engine_cs *
|
||||
intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
|
||||
|
@ -108,13 +109,16 @@ static void set_scheduler_caps(struct drm_i915_private *i915)
|
|||
for_each_uabi_engine(engine, i915) { /* all engines must agree! */
|
||||
int i;
|
||||
|
||||
if (engine->schedule)
|
||||
if (engine->sched_engine->schedule)
|
||||
enabled |= (I915_SCHEDULER_CAP_ENABLED |
|
||||
I915_SCHEDULER_CAP_PRIORITY);
|
||||
else
|
||||
disabled |= (I915_SCHEDULER_CAP_ENABLED |
|
||||
I915_SCHEDULER_CAP_PRIORITY);
|
||||
|
||||
if (intel_uc_uses_guc_submission(&i915->gt.uc))
|
||||
enabled |= I915_SCHEDULER_CAP_STATIC_PRIORITY_MAP;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(map); i++) {
|
||||
if (engine->flags & BIT(map[i].engine))
|
||||
enabled |= BIT(map[i].sched);
|
||||
|
|
File diff suppressed because it is too large
|
@ -32,15 +32,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
|
|||
int indent),
|
||||
unsigned int max);
|
||||
|
||||
struct intel_context *
|
||||
intel_execlists_create_virtual(struct intel_engine_cs **siblings,
|
||||
unsigned int count);
|
||||
|
||||
struct intel_context *
|
||||
intel_execlists_clone_virtual(struct intel_engine_cs *src);
|
||||
|
||||
int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
|
||||
const struct intel_engine_cs *master,
|
||||
const struct intel_engine_cs *sibling);
|
||||
bool
|
||||
intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
|
||||
|
||||
#endif /* __INTEL_EXECLISTS_SUBMISSION_H__ */
|
||||
|
|
|
@ -826,13 +826,13 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
|
|||
phys_addr = pci_resource_start(pdev, 0) + pci_resource_len(pdev, 0) / 2;
|
||||
|
||||
/*
|
||||
* On BXT+/CNL+ writes larger than 64 bit to the GTT pagetable range
|
||||
* On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range
|
||||
* will be dropped. For WC mappings in general we have 64 byte burst
|
||||
* writes when the WC buffer is flushed, so we can't use it, but have to
|
||||
* resort to an uncached mapping. The WC issue is easily caught by the
|
||||
* readback check when writing GTT PTE entries.
|
||||
*/
|
||||
if (IS_GEN9_LP(i915) || GRAPHICS_VER(i915) >= 10)
|
||||
if (IS_GEN9_LP(i915) || GRAPHICS_VER(i915) >= 11)
|
||||
ggtt->gsm = ioremap(phys_addr, size);
|
||||
else
|
||||
ggtt->gsm = ioremap_wc(phys_addr, size);
|
||||
|
@ -1494,7 +1494,7 @@ intel_partial_pages(const struct i915_ggtt_view *view,
|
|||
if (ret)
|
||||
goto err_sg_alloc;
|
||||
|
||||
iter = i915_gem_object_get_sg_dma(obj, view->partial.offset, &offset, true);
|
||||
iter = i915_gem_object_get_sg_dma(obj, view->partial.offset, &offset);
|
||||
GEM_BUG_ON(!iter);
|
||||
|
||||
sg = st->sgl;
|
||||
|
|
|
@ -123,8 +123,10 @@
|
|||
#define MI_SEMAPHORE_SAD_NEQ_SDD (5 << 12)
|
||||
#define MI_SEMAPHORE_TOKEN_MASK REG_GENMASK(9, 5)
|
||||
#define MI_SEMAPHORE_TOKEN_SHIFT 5
|
||||
#define MI_STORE_DATA_IMM MI_INSTR(0x20, 0)
|
||||
#define MI_STORE_DWORD_IMM MI_INSTR(0x20, 1)
|
||||
#define MI_STORE_DWORD_IMM_GEN4 MI_INSTR(0x20, 2)
|
||||
#define MI_STORE_QWORD_IMM_GEN8 (MI_INSTR(0x20, 3) | REG_BIT(21))
|
||||
#define MI_MEM_VIRTUAL (1 << 22) /* 945,g33,965 */
|
||||
#define MI_USE_GGTT (1 << 22) /* g4x+ */
|
||||
#define MI_STORE_DWORD_INDEX MI_INSTR(0x21, 1)
|
||||
|
|
|
@ -13,6 +13,7 @@
|
|||
#include "intel_gt_clock_utils.h"
|
||||
#include "intel_gt_pm.h"
|
||||
#include "intel_gt_requests.h"
|
||||
#include "intel_migrate.h"
|
||||
#include "intel_mocs.h"
|
||||
#include "intel_rc6.h"
|
||||
#include "intel_renderstate.h"
|
||||
|
@ -40,8 +41,8 @@ void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
|
|||
intel_gt_init_timelines(gt);
|
||||
intel_gt_pm_init_early(gt);
|
||||
|
||||
intel_rps_init_early(&gt->rps);
|
||||
intel_uc_init_early(&gt->uc);
|
||||
intel_rps_init_early(&gt->rps);
|
||||
}
|
||||
|
||||
int intel_gt_probe_lmem(struct intel_gt *gt)
|
||||
|
@ -83,13 +84,73 @@ void intel_gt_init_hw_early(struct intel_gt *gt, struct i915_ggtt *ggtt)
|
|||
gt->ggtt = ggtt;
|
||||
}
|
||||
|
||||
static const struct intel_mmio_range icl_l3bank_steering_table[] = {
|
||||
{ 0x00B100, 0x00B3FF },
|
||||
{},
|
||||
};
|
||||
|
||||
static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
|
||||
{ 0x004000, 0x004AFF },
|
||||
{ 0x00C800, 0x00CFFF },
|
||||
{ 0x00DD00, 0x00DDFF },
|
||||
{ 0x00E900, 0x00FFFF }, /* 0xEA00 - 0xEFFF is unused */
|
||||
{},
|
||||
};
|
||||
|
||||
static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
|
||||
{ 0x00B000, 0x00B0FF },
|
||||
{ 0x00D800, 0x00D8FF },
|
||||
{},
|
||||
};
|
||||
|
||||
static const struct intel_mmio_range dg2_lncf_steering_table[] = {
|
||||
{ 0x00B000, 0x00B0FF },
|
||||
{ 0x00D880, 0x00D8FF },
|
||||
{},
|
||||
};
|
||||
|
||||
static u16 slicemask(struct intel_gt *gt, int count)
|
||||
{
|
||||
u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
|
||||
|
||||
return intel_slicemask_from_dssmask(dss_mask, count);
|
||||
}
|
||||
|
||||
int intel_gt_init_mmio(struct intel_gt *gt)
|
||||
{
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
|
||||
intel_gt_init_clock_frequency(gt);
|
||||
|
||||
intel_uc_init_mmio(&gt->uc);
|
||||
intel_sseu_info_init(gt);
|
||||
|
||||
/*
|
||||
* An mslice is unavailable only if both the meml3 for the slice is
|
||||
* disabled *and* all of the DSS in the slice (quadrant) are disabled.
|
||||
*/
|
||||
if (HAS_MSLICES(i915))
|
||||
gt->info.mslice_mask =
|
||||
slicemask(gt, GEN_DSS_PER_MSLICE) |
|
||||
(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
|
||||
GEN12_MEML3_EN_MASK);
|
||||
|
||||
if (IS_DG2(i915)) {
|
||||
gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
|
||||
gt->steering_table[LNCF] = dg2_lncf_steering_table;
|
||||
} else if (IS_XEHPSDV(i915)) {
|
||||
gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
|
||||
gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
|
||||
} else if (GRAPHICS_VER(i915) >= 11 &&
|
||||
GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) {
|
||||
gt->steering_table[L3BANK] = icl_l3bank_steering_table;
|
||||
gt->info.l3bank_mask =
|
||||
~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
|
||||
GEN10_L3BANK_MASK;
|
||||
} else if (HAS_MSLICES(i915)) {
|
||||
MISSING_CASE(INTEL_INFO(i915)->platform);
|
||||
}
|
||||
|
||||
return intel_engines_init_mmio(gt);
|
||||
}
|
||||
|
||||
|
@ -192,7 +253,7 @@ static void clear_register(struct intel_uncore *uncore, i915_reg_t reg)
|
|||
intel_uncore_rmw(uncore, reg, 0, 0);
|
||||
}
|
||||
|
||||
static void gen8_clear_engine_error_register(struct intel_engine_cs *engine)
|
||||
static void gen6_clear_engine_error_register(struct intel_engine_cs *engine)
|
||||
{
|
||||
GEN6_RING_FAULT_REG_RMW(engine, RING_FAULT_VALID, 0);
|
||||
GEN6_RING_FAULT_REG_POSTING_READ(engine);
|
||||
|
@ -238,7 +299,7 @@ intel_gt_clear_error_registers(struct intel_gt *gt,
|
|||
enum intel_engine_id id;
|
||||
|
||||
for_each_engine_masked(engine, gt, engine_mask, id)
|
||||
gen8_clear_engine_error_register(engine);
|
||||
gen6_clear_engine_error_register(engine);
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -572,6 +633,25 @@ static void __intel_gt_disable(struct intel_gt *gt)
|
|||
GEM_BUG_ON(intel_gt_pm_is_awake(gt));
|
||||
}
|
||||
|
||||
int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout)
|
||||
{
|
||||
long remaining_timeout;
|
||||
|
||||
/* If the device is asleep, we have no requests outstanding */
|
||||
if (!intel_gt_pm_is_awake(gt))
|
||||
return 0;
|
||||
|
||||
while ((timeout = intel_gt_retire_requests_timeout(gt, timeout,
|
||||
&remaining_timeout)) > 0) {
|
||||
cond_resched();
|
||||
if (signal_pending(current))
|
||||
return -EINTR;
|
||||
}
|
||||
|
||||
return timeout ? timeout : intel_uc_wait_for_idle(&gt->uc,
|
||||
remaining_timeout);
|
||||
}
|
||||
|
||||
int intel_gt_init(struct intel_gt *gt)
|
||||
{
|
||||
int err;
|
||||
|
@ -622,10 +702,14 @@ int intel_gt_init(struct intel_gt *gt)
|
|||
if (err)
|
||||
goto err_gt;
|
||||
|
||||
intel_uc_init_late(&gt->uc);
|
||||
|
||||
err = i915_inject_probe_error(gt->i915, -EIO);
|
||||
if (err)
|
||||
goto err_gt;
|
||||
|
||||
intel_migrate_init(&gt->migrate, gt);
|
||||
|
||||
goto out_fw;
|
||||
err_gt:
|
||||
__intel_gt_disable(gt);
|
||||
|
@ -649,6 +733,7 @@ void intel_gt_driver_remove(struct intel_gt *gt)
|
|||
{
|
||||
__intel_gt_disable(gt);
|
||||
|
||||
intel_migrate_fini(&gt->migrate);
|
||||
intel_uc_driver_remove(&gt->uc);
|
||||
|
||||
intel_engines_release(gt);
|
||||
|
@ -697,6 +782,112 @@ void intel_gt_driver_late_release(struct intel_gt *gt)
|
|||
intel_engines_free(gt);
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_reg_needs_read_steering - determine whether a register read
|
||||
* requires explicit steering
|
||||
* @gt: GT structure
|
||||
* @reg: the register to check steering requirements for
|
||||
* @type: type of multicast steering to check
|
||||
*
|
||||
* Determines whether @reg needs explicit steering of a specific type for
|
||||
* reads.
|
||||
*
|
||||
* Returns false if @reg does not belong to a register range of the given
|
||||
* steering type, or if the default (subslice-based) steering IDs are suitable
|
||||
* for @type steering too.
|
||||
*/
|
||||
static bool intel_gt_reg_needs_read_steering(struct intel_gt *gt,
|
||||
i915_reg_t reg,
|
||||
enum intel_steering_type type)
|
||||
{
|
||||
const u32 offset = i915_mmio_reg_offset(reg);
|
||||
const struct intel_mmio_range *entry;
|
||||
|
||||
if (likely(!intel_gt_needs_read_steering(gt, type)))
|
||||
return false;
|
||||
|
||||
for (entry = gt->steering_table[type]; entry->end; entry++) {
|
||||
if (offset >= entry->start && offset <= entry->end)
|
||||
return true;
|
||||
}
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_get_valid_steering - determines valid IDs for a class of MCR steering
|
||||
* @gt: GT structure
|
||||
* @type: multicast register type
|
||||
* @sliceid: Slice ID returned
|
||||
* @subsliceid: Subslice ID returned
|
||||
*
|
||||
* Determines sliceid and subsliceid values that will steer reads
|
||||
* of a specific multicast register class to a valid value.
|
||||
*/
|
||||
static void intel_gt_get_valid_steering(struct intel_gt *gt,
|
||||
enum intel_steering_type type,
|
||||
u8 *sliceid, u8 *subsliceid)
|
||||
{
|
||||
switch (type) {
|
||||
case L3BANK:
|
||||
GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
|
||||
|
||||
*sliceid = 0; /* unused */
|
||||
*subsliceid = __ffs(gt->info.l3bank_mask);
|
||||
break;
|
||||
case MSLICE:
|
||||
GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
|
||||
|
||||
*sliceid = __ffs(gt->info.mslice_mask);
|
||||
*subsliceid = 0; /* unused */
|
||||
break;
|
||||
case LNCF:
|
||||
GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
|
||||
|
||||
/*
|
||||
* An LNCF is always present if its mslice is present, so we
|
||||
* can safely just steer to LNCF 0 in all cases.
|
||||
*/
|
||||
*sliceid = __ffs(gt->info.mslice_mask) << 1;
|
||||
*subsliceid = 0; /* unused */
|
||||
break;
|
||||
default:
|
||||
MISSING_CASE(type);
|
||||
*sliceid = 0;
|
||||
*subsliceid = 0;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_read_register_fw - reads a GT register with support for multicast
|
||||
* @gt: GT structure
|
||||
* @reg: register to read
|
||||
*
|
||||
* This function will read a GT register. If the register is a multicast
|
||||
* register, the read will be steered to a valid instance (i.e., one that
|
||||
* isn't fused off or powered down by power gating).
|
||||
*
|
||||
* Returns the value from a valid instance of @reg.
|
||||
*/
|
||||
u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg)
|
||||
{
|
||||
int type;
|
||||
u8 sliceid, subsliceid;
|
||||
|
||||
for (type = 0; type < NUM_STEERING_TYPES; type++) {
|
||||
if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
|
||||
intel_gt_get_valid_steering(gt, type, &sliceid,
|
||||
&subsliceid);
|
||||
return intel_uncore_read_with_mcr_steering_fw(gt->uncore,
|
||||
reg,
|
||||
sliceid,
|
||||
subsliceid);
|
||||
}
|
||||
}
|
||||
|
||||
return intel_uncore_read_fw(gt->uncore, reg);
|
||||
}
|
||||
|
||||
void intel_gt_info_print(const struct intel_gt_info *info,
|
||||
struct drm_printer *p)
|
||||
{
|
||||
|
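The steering lookup added to intel_gt.c above walks a range table that is terminated by an all-zero sentinel entry. A minimal standalone sketch of that pattern follows. `struct mmio_range` and `reg_in_table` are hypothetical stand-ins for the driver's `struct intel_mmio_range` and `intel_gt_reg_needs_read_steering()`, and the single table row is copied from `icl_l3bank_steering_table` in the hunk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's struct intel_mmio_range. */
struct mmio_range {
	uint32_t start;
	uint32_t end;
};

/* Range copied from icl_l3bank_steering_table in the diff above. */
static const struct mmio_range l3bank_table[] = {
	{ 0x00B100, 0x00B3FF },
	{ 0, 0 },	/* sentinel: end == 0 stops the walk */
};

/*
 * Return true if offset falls inside any [start, end] range of the table.
 * A NULL table means this steering type is not used on the platform.
 */
static bool reg_in_table(const struct mmio_range *table, uint32_t offset)
{
	const struct mmio_range *entry;

	if (!table)
		return false;

	for (entry = table; entry->end; entry++) {
		if (offset >= entry->start && offset <= entry->end)
			return true;
	}

	return false;
}
```

Both bounds are inclusive, matching the comparison in the diff, and the sentinel check relies on no real range ending at offset 0.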
|
|
@ -48,6 +48,8 @@ void intel_gt_driver_release(struct intel_gt *gt);
|
|||
|
||||
void intel_gt_driver_late_release(struct intel_gt *gt);
|
||||
|
||||
int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
|
||||
|
||||
void intel_gt_check_and_clear_faults(struct intel_gt *gt);
|
||||
void intel_gt_clear_error_registers(struct intel_gt *gt,
|
||||
intel_engine_mask_t engine_mask);
|
||||
|
@ -75,6 +77,14 @@ static inline bool intel_gt_is_wedged(const struct intel_gt *gt)
|
|||
return unlikely(test_bit(I915_WEDGED, &gt->reset.flags));
|
||||
}
|
||||
|
||||
static inline bool intel_gt_needs_read_steering(struct intel_gt *gt,
|
||||
enum intel_steering_type type)
|
||||
{
|
||||
return gt->steering_table[type];
|
||||
}
|
||||
|
||||
u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
|
||||
|
||||
void intel_gt_info_print(const struct intel_gt_info *info,
|
||||
struct drm_printer *p);
|
||||
|
||||
|
|
|
@ -24,8 +24,8 @@ static u32 read_reference_ts_freq(struct intel_uncore *uncore)
|
|||
return base_freq + frac_freq;
|
||||
}
|
||||
|
||||
static u32 gen10_get_crystal_clock_freq(struct intel_uncore *uncore,
|
||||
u32 rpm_config_reg)
|
||||
static u32 gen9_get_crystal_clock_freq(struct intel_uncore *uncore,
|
||||
u32 rpm_config_reg)
|
||||
{
|
||||
u32 f19_2_mhz = 19200000;
|
||||
u32 f24_mhz = 24000000;
|
||||
|
@ -128,10 +128,10 @@ static u32 read_clock_frequency(struct intel_uncore *uncore)
|
|||
} else {
|
||||
u32 c0 = intel_uncore_read(uncore, RPM_CONFIG0);
|
||||
|
||||
if (GRAPHICS_VER(uncore->i915) <= 10)
|
||||
freq = gen10_get_crystal_clock_freq(uncore, c0);
|
||||
else
|
||||
if (GRAPHICS_VER(uncore->i915) >= 11)
|
||||
freq = gen11_get_crystal_clock_freq(uncore, c0);
|
||||
else
|
||||
freq = gen9_get_crystal_clock_freq(uncore, c0);
|
||||
|
||||
/*
|
||||
* Now figure out how the command stream's timestamp
|
||||
|
|
|
@ -184,7 +184,13 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
|
|||
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~0);
|
||||
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~0);
|
||||
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~0);
|
||||
if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
|
||||
intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~0);
|
||||
if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
|
||||
intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~0);
|
||||
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~0);
|
||||
if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3))
|
||||
intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~0);
|
||||
|
||||
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0);
|
||||
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0);
|
||||
|
@@ -218,8 +224,13 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
|
|||
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask);
|
||||
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask);
|
||||
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask);
|
||||
if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
|
||||
intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~dmask);
|
||||
if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
|
||||
intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~dmask);
|
||||
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~dmask);
|
||||
|
||||
if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3))
|
||||
intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~dmask);
|
||||
/*
|
||||
* RPS interrupts will get enabled/disabled on demand when RPS itself
|
||||
* is enabled/disabled.
|
||||
|
|
|
@@ -6,7 +6,6 @@
|
|||
#include <linux/suspend.h>
|
||||
|
||||
#include "i915_drv.h"
|
||||
#include "i915_globals.h"
|
||||
#include "i915_params.h"
|
||||
#include "intel_context.h"
|
||||
#include "intel_engine_pm.h"
|
||||
|
@@ -67,8 +66,6 @@ static int __gt_unpark(struct intel_wakeref *wf)
|
|||
|
||||
GT_TRACE(gt, "\n");
|
||||
|
||||
i915_globals_unpark();
|
||||
|
||||
/*
|
||||
* It seems that the DMC likes to transition between the DC states a lot
|
||||
* when there are no connected displays (no active power domains) during
|
||||
|
@@ -116,8 +113,6 @@ static int __gt_park(struct intel_wakeref *wf)
|
|||
GEM_BUG_ON(!wakeref);
|
||||
intel_display_power_put_async(i915, POWER_DOMAIN_GT_IRQ, wakeref);
|
||||
|
||||
i915_globals_park();
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
@@ -174,8 +169,6 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
|
|||
if (intel_gt_is_wedged(gt))
|
||||
intel_gt_unset_wedged(gt);
|
||||
|
||||
intel_uc_sanitize(&gt->uc);
|
||||
|
||||
for_each_engine(engine, gt, id)
|
||||
if (engine->reset.prepare)
|
||||
engine->reset.prepare(engine);
|
||||
|
@@ -191,6 +184,8 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
|
|||
__intel_engine_reset(engine, false);
|
||||
}
|
||||
|
||||
intel_uc_reset(&gt->uc, false);
|
||||
|
||||
for_each_engine(engine, gt, id)
|
||||
if (engine->reset.finish)
|
||||
engine->reset.finish(engine);
|
||||
|
@@ -243,6 +238,8 @@ int intel_gt_resume(struct intel_gt *gt)
|
|||
goto err_wedged;
|
||||
}
|
||||
|
||||
intel_uc_reset_finish(&gt->uc);
|
||||
|
||||
intel_rps_enable(&gt->rps);
|
||||
intel_llc_enable(&gt->llc);
|
||||
|
||||
|
|
|
@@ -130,7 +130,8 @@ void intel_engine_fini_retire(struct intel_engine_cs *engine)
|
|||
GEM_BUG_ON(engine->retire);
|
||||
}
|
||||
|
||||
long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
|
||||
long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
|
||||
long *remaining_timeout)
|
||||
{
|
||||
struct intel_gt_timelines *timelines = &gt->timelines;
|
||||
struct intel_timeline *tl, *tn;
|
||||
|
@@ -195,24 +196,12 @@ out_active: spin_lock(&timelines->lock);
|
|||
if (flush_submission(gt, timeout)) /* Wait, there's more! */
|
||||
active_count++;
|
||||
|
||||
if (remaining_timeout)
|
||||
*remaining_timeout = timeout;
|
||||
|
||||
return active_count ? timeout : 0;
|
||||
}
|
||||
|
||||
int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout)
|
||||
{
|
||||
/* If the device is asleep, we have no requests outstanding */
|
||||
if (!intel_gt_pm_is_awake(gt))
|
||||
return 0;
|
||||
|
||||
while ((timeout = intel_gt_retire_requests_timeout(gt, timeout)) > 0) {
|
||||
cond_resched();
|
||||
if (signal_pending(current))
|
||||
return -EINTR;
|
||||
}
|
||||
|
||||
return timeout;
|
||||
}
|
||||
|
||||
static void retire_work_handler(struct work_struct *work)
|
||||
{
|
||||
struct intel_gt *gt =
|
||||
|
|
|
@@ -6,14 +6,17 @@
|
|||
#ifndef INTEL_GT_REQUESTS_H
|
||||
#define INTEL_GT_REQUESTS_H
|
||||
|
||||
#include <stddef.h>
|
||||
|
||||
struct intel_engine_cs;
|
||||
struct intel_gt;
|
||||
struct intel_timeline;
|
||||
|
||||
long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout);
|
||||
long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
|
||||
long *remaining_timeout);
|
||||
static inline void intel_gt_retire_requests(struct intel_gt *gt)
|
||||
{
|
||||
intel_gt_retire_requests_timeout(gt, 0);
|
||||
intel_gt_retire_requests_timeout(gt, 0, NULL);
|
||||
}
|
||||
|
||||
void intel_engine_init_retire(struct intel_engine_cs *engine);
|
||||
|
@@ -21,8 +24,6 @@ void intel_engine_add_retire(struct intel_engine_cs *engine,
|
|||
struct intel_timeline *tl);
|
||||
void intel_engine_fini_retire(struct intel_engine_cs *engine);
|
||||
|
||||
int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
|
||||
|
||||
void intel_gt_init_requests(struct intel_gt *gt);
|
||||
void intel_gt_park_requests(struct intel_gt *gt);
|
||||
void intel_gt_unpark_requests(struct intel_gt *gt);
|
||||
|
|
|
@@ -24,6 +24,7 @@
|
|||
#include "intel_reset_types.h"
|
||||
#include "intel_rc6_types.h"
|
||||
#include "intel_rps_types.h"
|
||||
#include "intel_migrate_types.h"
|
||||
#include "intel_wakeref.h"
|
||||
|
||||
struct drm_i915_private;
|
||||
|
@@ -31,6 +32,33 @@ struct i915_ggtt;
|
|||
struct intel_engine_cs;
|
||||
struct intel_uncore;
|
||||
|
||||
struct intel_mmio_range {
|
||||
u32 start;
|
||||
u32 end;
|
||||
};
|
||||
|
||||
/*
|
||||
* The hardware has multiple kinds of multicast register ranges that need
|
||||
* special register steering (and future platforms are expected to add
|
||||
* additional types).
|
||||
*
|
||||
* During driver startup, we initialize the steering control register to
|
||||
* direct reads to a slice/subslice that are valid for the 'subslice' class
|
||||
* of multicast registers. If another type of steering does not have any
|
||||
* overlap in valid steering targets with 'subslice' style registers, we will
|
||||
* need to explicitly re-steer reads of registers of the other type.
|
||||
*
|
||||
* Only the replication types that may need additional non-default steering
|
||||
* are listed here.
|
||||
*/
|
||||
enum intel_steering_type {
|
||||
L3BANK,
|
||||
MSLICE,
|
||||
LNCF,
|
||||
|
||||
NUM_STEERING_TYPES
|
||||
};
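The comment above describes per-type steering tables built from `struct intel_mmio_range`. As a rough sketch of how such a table might be consulted, the following assumes each table is an array of ranges terminated by an entry whose `end` is zero. The terminator convention, the helper name `reg_in_steering_table`, and the example offsets are assumptions made for illustration and are not taken from this hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct intel_mmio_range from the hunk above. */
struct mmio_range {
	uint32_t start;
	uint32_t end;
};

/* Hypothetical table; the offsets are made up for illustration only. */
static const struct mmio_range example_l3bank_table[] = {
	{ 0x00b100, 0x00b3ff },
	{ 0x00c800, 0x00cfff },
	{ 0, 0 }, /* assumed terminator: an entry with end == 0 */
};

/*
 * Return true if 'offset' lies inside any range of 'table'.
 * A NULL table stands for "this steering type is unused on the platform",
 * which matches the pointer test in intel_gt_needs_read_steering().
 */
static bool reg_in_steering_table(const struct mmio_range *table,
				  uint32_t offset)
{
	const struct mmio_range *entry;

	if (!table)
		return false;

	for (entry = table; entry->end; entry++) {
		if (offset >= entry->start && offset <= entry->end)
			return true;
	}

	return false;
}
```

A caller would first check whether the table pointer for a given steering type is set, and only then walk the ranges for the register being read.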
|
||||
|
||||
enum intel_submission_method {
|
||||
INTEL_SUBMISSION_RING,
|
||||
INTEL_SUBMISSION_ELSP,
|
||||
|
@@ -145,8 +173,15 @@ struct intel_gt {
|
|||
|
||||
struct i915_vma *scratch;
|
||||
|
||||
struct intel_migrate migrate;
|
||||
|
||||
const struct intel_mmio_range *steering_table[NUM_STEERING_TYPES];
|
||||
|
||||
struct intel_gt_info {
|
||||
intel_engine_mask_t engine_mask;
|
||||
|
||||
u32 l3bank_mask;
|
||||
|
||||
u8 num_engines;
|
||||
|
||||
/* Media engine access to SFC per instance */
|
||||
|
@@ -154,6 +189,8 @@ struct intel_gt {
|
|||
|
||||
/* Slice/subslice/EU info */
|
||||
struct sseu_dev_info sseu;
|
||||
|
||||
unsigned long mslice_mask;
|
||||
} info;
|
||||
};
|
||||
|
||||
|
|
|
@@ -16,7 +16,19 @@ struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz)
|
|||
{
|
||||
struct drm_i915_gem_object *obj;
|
||||
|
||||
obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
|
||||
/*
|
||||
* To avoid severe over-allocation when dealing with min_page_size
|
||||
* restrictions, we override that behaviour here by allowing an object
|
||||
* size and page layout which can be smaller. In practice this should be
|
||||
* totally fine, since GTT paging structures are not typically inserted
|
||||
* into the GTT.
|
||||
*
|
||||
* Note that we also hit this path for the scratch page, and for this
|
||||
* case it might need to be 64K, but that should work fine here since we
|
||||
* used the passed in size for the page size, which should ensure it
|
||||
* also has the same alignment.
|
||||
*/
|
||||
obj = __i915_gem_object_create_lmem_with_ps(vm->i915, sz, sz, 0);
|
||||
/*
|
||||
* Ensure all paging structures for this vm share the same dma-resv
|
||||
* object underneath, with the idea that one object_lock() will lock
|
||||
|
@@ -414,7 +426,7 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore)
|
|||
intel_uncore_write(uncore, GEN12_PAT_INDEX(7), GEN8_PPAT_WB);
|
||||
}
|
||||
|
||||
static void cnl_setup_private_ppat(struct intel_uncore *uncore)
|
||||
static void icl_setup_private_ppat(struct intel_uncore *uncore)
|
||||
{
|
||||
intel_uncore_write(uncore,
|
||||
GEN10_PAT_INDEX(0),
|
||||
|
@@ -514,8 +526,8 @@ void setup_private_pat(struct intel_uncore *uncore)
|
|||
|
||||
if (GRAPHICS_VER(i915) >= 12)
|
||||
tgl_setup_private_ppat(uncore);
|
||||
else if (GRAPHICS_VER(i915) >= 10)
|
||||
cnl_setup_private_ppat(uncore);
|
||||
else if (GRAPHICS_VER(i915) >= 11)
|
||||
icl_setup_private_ppat(uncore);
|
||||
else if (IS_CHERRYVIEW(i915) || IS_GEN9_LP(i915))
|
||||
chv_setup_private_ppat(uncore);
|
||||
else
|
||||
|
|
|
@@ -140,7 +140,6 @@ typedef u64 gen8_pte_t;
|
|||
|
||||
enum i915_cache_level;
|
||||
|
||||
struct drm_i915_file_private;
|
||||
struct drm_i915_gem_object;
|
||||
struct i915_fence_reg;
|
||||
struct i915_vma;
|
||||
|
@@ -220,16 +219,6 @@ struct i915_address_space {
|
|||
struct intel_gt *gt;
|
||||
struct drm_i915_private *i915;
|
||||
struct device *dma;
|
||||
/*
|
||||
* Every address space belongs to a struct file - except for the global
|
||||
* GTT that is owned by the driver (and so @file is set to NULL). In
|
||||
* principle, no information should leak from one context to another
|
||||
* (or between files/processes etc) unless explicitly shared by the
|
||||
* owner. Tracking the owner is important in order to free up per-file
|
||||
* objects along with the file, to aide resource tracking, and to
|
||||
* assign blame.
|
||||
*/
|
||||
struct drm_i915_file_private *file;
|
||||
u64 total; /* size addr space maps (ex. 2GB for ggtt) */
|
||||
u64 reserved; /* size addr space reserved */
|
||||
|
||||
|
@@ -296,6 +285,13 @@ struct i915_address_space {
|
|||
u32 flags);
|
||||
void (*cleanup)(struct i915_address_space *vm);
|
||||
|
||||
void (*foreach)(struct i915_address_space *vm,
|
||||
u64 start, u64 length,
|
||||
void (*fn)(struct i915_address_space *vm,
|
||||
struct i915_page_table *pt,
|
||||
void *data),
|
||||
void *data);
|
||||
|
||||
struct i915_vma_ops vma_ops;
|
||||
|
||||
I915_SELFTEST_DECLARE(struct fault_attr fault_attr);
|
||||
|
|
|
@@ -70,7 +70,7 @@ static void set_offsets(u32 *regs,
|
|||
if (close) {
|
||||
/* Close the batch; used mainly by live_lrc_layout() */
|
||||
*regs = MI_BATCH_BUFFER_END;
|
||||
if (GRAPHICS_VER(engine->i915) >= 10)
|
||||
if (GRAPHICS_VER(engine->i915) >= 11)
|
||||
*regs |= BIT(0);
|
||||
}
|
||||
}
|
||||
|
@@ -484,6 +484,47 @@ static const u8 gen12_rcs_offsets[] = {
|
|||
END
|
||||
};
|
||||
|
||||
static const u8 xehp_rcs_offsets[] = {
|
||||
NOP(1),
|
||||
LRI(13, POSTED),
|
||||
REG16(0x244),
|
||||
REG(0x034),
|
||||
REG(0x030),
|
||||
REG(0x038),
|
||||
REG(0x03c),
|
||||
REG(0x168),
|
||||
REG(0x140),
|
||||
REG(0x110),
|
||||
REG(0x1c0),
|
||||
REG(0x1c4),
|
||||
REG(0x1c8),
|
||||
REG(0x180),
|
||||
REG16(0x2b4),
|
||||
|
||||
NOP(5),
|
||||
LRI(9, POSTED),
|
||||
REG16(0x3a8),
|
||||
REG16(0x28c),
|
||||
REG16(0x288),
|
||||
REG16(0x284),
|
||||
REG16(0x280),
|
||||
REG16(0x27c),
|
||||
REG16(0x278),
|
||||
REG16(0x274),
|
||||
REG16(0x270),
|
||||
|
||||
LRI(3, POSTED),
|
||||
REG(0x1b0),
|
||||
REG16(0x5a8),
|
||||
REG16(0x5ac),
|
||||
|
||||
NOP(6),
|
||||
LRI(1, 0),
|
||||
REG(0x0c8),
|
||||
|
||||
END
|
||||
};
|
||||
|
||||
#undef END
|
||||
#undef REG16
|
||||
#undef REG
|
||||
|
@@ -502,7 +543,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine)
|
|||
!intel_engine_has_relative_mmio(engine));
|
||||
|
||||
if (engine->class == RENDER_CLASS) {
|
||||
if (GRAPHICS_VER(engine->i915) >= 12)
|
||||
if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
|
||||
return xehp_rcs_offsets;
|
||||
else if (GRAPHICS_VER(engine->i915) >= 12)
|
||||
return gen12_rcs_offsets;
|
||||
else if (GRAPHICS_VER(engine->i915) >= 11)
|
||||
return gen11_rcs_offsets;
|
||||
|
@@ -522,7 +565,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine)
|
|||
|
||||
static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
|
||||
{
|
||||
if (GRAPHICS_VER(engine->i915) >= 12)
|
||||
if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
|
||||
return 0x70;
|
||||
else if (GRAPHICS_VER(engine->i915) >= 12)
|
||||
return 0x60;
|
||||
else if (GRAPHICS_VER(engine->i915) >= 9)
|
||||
return 0x54;
|
||||
|
@@ -534,7 +579,9 @@ static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
|
|||
|
||||
static int lrc_ring_gpr0(const struct intel_engine_cs *engine)
|
||||
{
|
||||
if (GRAPHICS_VER(engine->i915) >= 12)
|
||||
if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
|
||||
return 0x84;
|
||||
else if (GRAPHICS_VER(engine->i915) >= 12)
|
||||
return 0x74;
|
||||
else if (GRAPHICS_VER(engine->i915) >= 9)
|
||||
return 0x68;
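The two hunks above put a `GRAPHICS_VER_FULL(...) >= IP_VER(12, 50)` test in front of the existing `GRAPHICS_VER(...) >= 12` test, so Xe_HP gets its own context-image offsets. The sketch below shows that ordering in isolation. It assumes `IP_VER(ver, rel)` packs the major version into the high byte and the release into the low byte; the function names are hypothetical, and the `-1` fallback is a placeholder because the rest of each function is not visible in these hunks.

```c
#include <assert.h>

/* Assumed encoding: major version in bits 15:8, release in bits 7:0. */
#define IP_VER(ver, rel) ((ver) << 8 | (rel))

/* Hypothetical stand-in for lrc_ring_mi_mode(), keyed on the full version. */
static int sketch_ring_mi_mode(int ver_full)
{
	int ver = ver_full >> 8; /* major version only */

	/* The more specific (12.50+) test must come first. */
	if (ver_full >= IP_VER(12, 50))
		return 0x70;
	else if (ver >= 12)
		return 0x60;
	else if (ver >= 9)
		return 0x54;

	return -1; /* placeholder: older platforms are not shown in the hunk */
}

/* Same ladder for lrc_ring_gpr0(), using the offsets from the hunk above. */
static int sketch_ring_gpr0(int ver_full)
{
	int ver = ver_full >> 8;

	if (ver_full >= IP_VER(12, 50))
		return 0x84;
	else if (ver >= 12)
		return 0x74;
	else if (ver >= 9)
		return 0x68;

	return -1; /* placeholder, as above */
}
```

If the two version tests were swapped, every 12.50+ device would match the plain `>= 12` branch and pick up the Gen12 offsets instead.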
|
||||
|
@@ -578,10 +625,16 @@ static int lrc_ring_indirect_offset(const struct intel_engine_cs *engine)
|
|||
|
||||
static int lrc_ring_cmd_buf_cctl(const struct intel_engine_cs *engine)
|
||||
{
|
||||
if (engine->class != RENDER_CLASS)
|
||||
return -1;
|
||||
|
||||
if (GRAPHICS_VER(engine->i915) >= 12)
|
||||
if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
|
||||
/*
|
||||
* Note that the CSFE context has a dummy slot for CMD_BUF_CCTL
|
||||
* simply to match the RCS context image layout.
|
||||
*/
|
||||
return 0xc6;
|
||||
else if (engine->class != RENDER_CLASS)
|
||||
return -1;
|
||||
else if (GRAPHICS_VER(engine->i915) >= 12)
|
||||
return 0xb6;
|
||||
else if (GRAPHICS_VER(engine->i915) >= 11)
|
||||
return 0xaa;
|
||||
|
@@ -600,8 +653,6 @@ lrc_ring_indirect_offset_default(const struct intel_engine_cs *engine)
|
|||
return GEN12_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
|
||||
case 11:
|
||||
return GEN11_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
|
||||
case 10:
|
||||
return GEN10_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
|
||||
case 9:
|
||||
return GEN9_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
|
||||
case 8:
|
||||
|
@@ -845,7 +896,7 @@ int lrc_alloc(struct intel_context *ce, struct intel_engine_cs *engine)
|
|||
if (IS_ERR(vma))
|
||||
return PTR_ERR(vma);
|
||||
|
||||
ring = intel_engine_create_ring(engine, (unsigned long)ce->ring);
|
||||
ring = intel_engine_create_ring(engine, ce->ring_size);
|
||||
if (IS_ERR(ring)) {
|
||||
err = PTR_ERR(ring);
|
||||
goto err_vma;
|
||||
|
@@ -1101,6 +1152,14 @@ setup_indirect_ctx_bb(const struct intel_context *ce,
|
|||
* bits 55-60: SW counter
|
||||
* bits 61-63: engine class
|
||||
*
|
||||
* On Xe_HP, the upper dword of the descriptor has a new format:
|
||||
*
|
||||
* bits 32-37: virtual function number
|
||||
* bit 38: mbz, reserved for use by hardware
|
||||
* bits 39-54: SW context ID
|
||||
* bits 55-57: reserved
|
||||
* bits 58-63: SW counter
|
||||
*
|
||||
* engine info, SW context ID and SW counter need to form a unique number
|
||||
* (Context ID) per lrc.
|
||||
*/
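The Xe_HP layout listed in the comment above (bits 32-37 virtual function number, bit 38 must be zero, bits 39-54 SW context ID, bits 55-57 reserved, bits 58-63 SW counter) can be illustrated with a small packing helper. This is only a sketch of the bit arithmetic: the function names are hypothetical and the driver's real descriptor code is not part of this hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Place the low 'width' bits of 'value' at bit position 'shift'. */
static uint64_t desc_field(uint64_t value, unsigned int shift,
			   unsigned int width)
{
	uint64_t mask = (UINT64_C(1) << width) - 1;

	return (value & mask) << shift;
}

/*
 * Hypothetical helper: build the upper dword fields of an Xe_HP context
 * descriptor from the layout in the comment above. Bit 38 and bits 55-57
 * are left as zero.
 */
static uint64_t xehp_desc_upper(uint32_t vf, uint32_t sw_ctx_id,
				uint32_t sw_counter)
{
	return desc_field(vf, 32, 6) |           /* bits 32-37 */
	       desc_field(sw_ctx_id, 39, 16) |   /* bits 39-54 */
	       desc_field(sw_counter, 58, 6);    /* bits 58-63 */
}

/* Extract the 16-bit SW context ID again. */
static uint32_t xehp_desc_sw_ctx_id(uint64_t desc)
{
	return (uint32_t)((desc >> 39) & 0xffff);
}
```

The 16-bit SW context ID field is consistent with `XEHP_MAX_CONTEXT_HW_ID` being defined as `0xFFFF` elsewhere in this merge.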
|
||||
|
@@ -1387,40 +1446,6 @@ static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
|
|||
return batch;
|
||||
}
|
||||
|
||||
static u32 *
|
||||
gen10_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
|
||||
{
|
||||
int i;
|
||||
|
||||
/*
|
||||
* WaPipeControlBefore3DStateSamplePattern: cnl
|
||||
*
|
||||
* Ensure the engine is idle prior to programming a
|
||||
* 3DSTATE_SAMPLE_PATTERN during a context restore.
|
||||
*/
|
||||
batch = gen8_emit_pipe_control(batch,
|
||||
PIPE_CONTROL_CS_STALL,
|
||||
0);
|
||||
/*
|
||||
* WaPipeControlBefore3DStateSamplePattern says we need 4 dwords for
|
||||
* the PIPE_CONTROL followed by 12 dwords of 0x0, so 16 dwords in
|
||||
* total. However, a PIPE_CONTROL is 6 dwords long, not 4, which is
|
||||
* confusing. Since gen8_emit_pipe_control() already advances the
|
||||
* batch by 6 dwords, we advance the other 10 here, completing a
|
||||
* cacheline. It's not clear if the workaround requires this padding
|
||||
* before other commands, or if it's just the regular padding we would
|
||||
* already have for the workaround bb, so leave it here for now.
|
||||
*/
|
||||
for (i = 0; i < 10; i++)
|
||||
*batch++ = MI_NOOP;
|
||||
|
||||
/* Pad to end of cacheline */
|
||||
while ((unsigned long)batch % CACHELINE_BYTES)
|
||||
*batch++ = MI_NOOP;
|
||||
|
||||
return batch;
|
||||
}
|
||||
|
||||
#define CTX_WA_BB_SIZE (PAGE_SIZE)
|
||||
|
||||
static int lrc_create_wa_ctx(struct intel_engine_cs *engine)
|
||||
|
@@ -1473,10 +1498,6 @@ void lrc_init_wa_ctx(struct intel_engine_cs *engine)
|
|||
case 12:
|
||||
case 11:
|
||||
return;
|
||||
case 10:
|
||||
wa_bb_fn[0] = gen10_init_indirectctx_bb;
|
||||
wa_bb_fn[1] = NULL;
|
||||
break;
|
||||
case 9:
|
||||
wa_bb_fn[0] = gen9_init_indirectctx_bb;
|
||||
wa_bb_fn[1] = NULL;
|
||||
|
|
|
@@ -87,9 +87,10 @@
|
|||
#define GEN11_CSB_WRITE_PTR_MASK (GEN11_CSB_PTR_MASK << 0)
|
||||
|
||||
#define MAX_CONTEXT_HW_ID (1 << 21) /* exclusive */
|
||||
#define MAX_GUC_CONTEXT_HW_ID (1 << 20) /* exclusive */
|
||||
#define GEN11_MAX_CONTEXT_HW_ID (1 << 11) /* exclusive */
|
||||
/* in Gen12 ID 0x7FF is reserved to indicate idle */
|
||||
#define GEN12_MAX_CONTEXT_HW_ID (GEN11_MAX_CONTEXT_HW_ID - 1)
|
||||
/* in Xe_HP ID 0xFFFF is reserved to indicate "invalid context" */
|
||||
#define XEHP_MAX_CONTEXT_HW_ID 0xFFFF
|
||||
|
||||
#endif /* _INTEL_LRC_REG_H_ */
|
||||
|
|
|
@@ -0,0 +1,688 @@
|
|||
// SPDX-License-Identifier: MIT
|
||||
/*
|
||||
* Copyright © 2020 Intel Corporation
|
||||
*/
|
||||
|
||||
#include "i915_drv.h"
|
||||
#include "intel_context.h"
|
||||
#include "intel_gpu_commands.h"
|
||||
#include "intel_gt.h"
|
||||
#include "intel_gtt.h"
|
||||
#include "intel_migrate.h"
|
||||
#include "intel_ring.h"
|
||||
|
||||
struct insert_pte_data {
|
||||
u64 offset;
|
||||
bool is_lmem;
|
||||
};
|
||||
|
||||
#define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */
|
||||
|
||||
static bool engine_supports_migration(struct intel_engine_cs *engine)
|
||||
{
|
||||
if (!engine)
|
||||
return false;
|
||||
|
||||
/*
|
||||
* We need the ability to prevent arbitration (MI_ARB_ON_OFF),
|
||||
* the ability to write PTE using inline data (MI_STORE_DATA)
|
||||
* and of course the ability to do the block transfer (blits).
|
||||
*/
|
||||
GEM_BUG_ON(engine->class != COPY_ENGINE_CLASS);
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
static void insert_pte(struct i915_address_space *vm,
|
||||
struct i915_page_table *pt,
|
||||
void *data)
|
||||
{
|
||||
struct insert_pte_data *d = data;
|
||||
|
||||
vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE,
|
||||
d->is_lmem ? PTE_LM : 0);
|
||||
d->offset += PAGE_SIZE;
|
||||
}
|
||||
|
||||
static struct i915_address_space *migrate_vm(struct intel_gt *gt)
|
||||
{
|
||||
struct i915_vm_pt_stash stash = {};
|
||||
struct i915_ppgtt *vm;
|
||||
int err;
|
||||
int i;
|
||||
|
||||
/*
|
||||
* We construct a very special VM for use by all migration contexts,
|
||||
* it is kept pinned so that it can be used at any time. As we need
|
||||
* to pre-allocate the page directories for the migration VM, this
|
||||
* limits us to only using a small number of prepared vma.
|
||||
*
|
||||
* To be able to pipeline and reschedule migration operations while
|
||||
* avoiding unnecessary contention on the vm itself, the PTE updates
|
||||
* are inline with the blits. All the blits use the same fixed
|
||||
* addresses, with the backing store redirection being updated on the
|
||||
* fly. Only 2 implicit vma are used for all migration operations.
|
||||
*
|
||||
* We lay the ppGTT out as:
|
||||
*
|
||||
* [0, CHUNK_SZ) -> first object
|
||||
* [CHUNK_SZ, 2 * CHUNK_SZ) -> second object
|
||||
* [2 * CHUNK_SZ, 2 * CHUNK_SZ + 2 * CHUNK_SZ >> 9] -> PTE
|
||||
*
|
||||
* By exposing the dma addresses of the page directories themselves
|
||||
* within the ppGTT, we are then able to rewrite the PTE prior to use.
|
||||
* But the PTE update and subsequent migration operation must be atomic,
|
||||
* i.e. within the same non-preemptible window so that we do not switch
|
||||
* to another migration context that overwrites the PTE.
|
||||
*
|
||||
* TODO: Add support for huge LMEM PTEs
|
||||
*/
|
||||
|
||||
vm = i915_ppgtt_create(gt);
|
||||
if (IS_ERR(vm))
|
||||
return ERR_CAST(vm);
|
||||
|
||||
if (!vm->vm.allocate_va_range || !vm->vm.foreach) {
|
||||
err = -ENODEV;
|
||||
goto err_vm;
|
||||
}
|
||||
|
||||
/*
|
||||
* Each engine instance is assigned its own chunk in the VM, so
|
||||
* that we can run multiple instances concurrently
|
||||
*/
|
||||
for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) {
|
||||
struct intel_engine_cs *engine;
|
||||
u64 base = (u64)i << 32;
|
||||
struct insert_pte_data d = {};
|
||||
struct i915_gem_ww_ctx ww;
|
||||
u64 sz;
|
||||
|
||||
engine = gt->engine_class[COPY_ENGINE_CLASS][i];
|
||||
if (!engine_supports_migration(engine))
|
||||
continue;
|
||||
|
||||
/*
|
||||
* We copy in 8MiB chunks. Each PDE covers 2MiB, so we need
|
||||
* 4x2 page directories for source/destination.
|
||||
*/
|
||||
sz = 2 * CHUNK_SZ;
|
||||
d.offset = base + sz;
|
||||
|
||||
/*
|
||||
* We need another page directory setup so that we can write
|
||||
* the 8x512 PTE in each chunk.
|
||||
*/
|
||||
sz += (sz >> 12) * sizeof(u64);
|
||||
|
||||
err = i915_vm_alloc_pt_stash(&vm->vm, &stash, sz);
|
||||
if (err)
|
||||
goto err_vm;
|
||||
|
||||
for_i915_gem_ww(&ww, err, true) {
|
||||
err = i915_vm_lock_objects(&vm->vm, &ww);
|
||||
if (err)
|
||||
continue;
|
||||
err = i915_vm_map_pt_stash(&vm->vm, &stash);
|
||||
if (err)
|
||||
continue;
|
||||
|
||||
vm->vm.allocate_va_range(&vm->vm, &stash, base, sz);
|
||||
}
|
||||
i915_vm_free_pt_stash(&vm->vm, &stash);
|
||||
if (err)
|
||||
goto err_vm;
|
||||
|
||||
/* Now allow the GPU to rewrite the PTE via its own ppGTT */
|
||||
d.is_lmem = i915_gem_object_is_lmem(vm->vm.scratch[0]);
|
||||
vm->vm.foreach(&vm->vm, base, base + sz, insert_pte, &d);
|
||||
}
|
||||
|
||||
return &vm->vm;
|
||||
|
||||
err_vm:
|
||||
i915_vm_put(&vm->vm);
|
||||
return ERR_PTR(err);
|
||||
}
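The layout comment in `migrate_vm()` and the `sz` arithmetic in its loop can be checked with a few lines of standalone C. The sketch below recomputes the per-instance window boundaries from the values in the patch (`CHUNK_SZ` of 8 MiB, 4 KiB pages, 8-byte PTEs, one 4 GiB slot per engine instance). The struct and function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_SZ (UINT64_C(8) << 20) /* SZ_8M, as in the patch */

/* Hypothetical view of one engine instance's slice of the migration VM. */
struct migrate_window {
	uint64_t src; /* [src, src + CHUNK_SZ): first object */
	uint64_t dst; /* [dst, dst + CHUNK_SZ): second object */
	uint64_t pte; /* start of the window exposing the PTEs */
	uint64_t end; /* end of the range passed to allocate_va_range() */
};

static struct migrate_window migrate_window_for(unsigned int instance)
{
	uint64_t base = (uint64_t)instance << 32; /* one 4 GiB slot each */
	uint64_t sz = 2 * CHUNK_SZ;               /* source + destination */
	struct migrate_window w;

	w.src = base;
	w.dst = base + CHUNK_SZ;
	w.pte = base + sz;

	/* One 8-byte PTE per 4 KiB page of the two chunks. */
	sz += (sz >> 12) * sizeof(uint64_t);
	w.end = base + sz;

	return w;
}
```

For instance 0 this gives a 16 MiB data region followed by a 32 KiB PTE window, which matches the `2 * CHUNK_SZ >> 9` term in the layout comment.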
|
||||
|
||||
static struct intel_engine_cs *first_copy_engine(struct intel_gt *gt)
|
||||
{
|
||||
struct intel_engine_cs *engine;
|
||||
int i;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) {
|
||||
engine = gt->engine_class[COPY_ENGINE_CLASS][i];
|
||||
if (engine_supports_migration(engine))
|
||||
return engine;
|
||||
}
|
||||
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static struct intel_context *pinned_context(struct intel_gt *gt)
|
||||
{
|
||||
static struct lock_class_key key;
|
||||
struct intel_engine_cs *engine;
|
||||
struct i915_address_space *vm;
|
||||
struct intel_context *ce;
|
||||
|
||||
engine = first_copy_engine(gt);
|
||||
if (!engine)
|
||||
return ERR_PTR(-ENODEV);
|
||||
|
||||
vm = migrate_vm(gt);
|
||||
if (IS_ERR(vm))
|
||||
return ERR_CAST(vm);
|
||||
|
||||
ce = intel_engine_create_pinned_context(engine, vm, SZ_512K,
|
||||
I915_GEM_HWS_MIGRATE,
|
||||
&key, "migrate");
|
||||
i915_vm_put(ce->vm);
|
||||
return ce;
|
||||
}
|
||||
|
||||
int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt)
|
||||
{
|
||||
struct intel_context *ce;
|
||||
|
||||
memset(m, 0, sizeof(*m));
|
||||
|
||||
ce = pinned_context(gt);
|
||||
if (IS_ERR(ce))
|
||||
return PTR_ERR(ce);
|
||||
|
||||
m->context = ce;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int random_index(unsigned int max)
|
||||
{
|
||||
return upper_32_bits(mul_u32_u32(get_random_u32(), max));
|
||||
}
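`random_index()` maps a 32-bit random value onto `[0, max)` by taking the upper 32 bits of a 64-bit product rather than using a modulo. A userspace sketch of the same arithmetic, with the random value passed in as a parameter so it can be tested deterministically (the name `scale_to_range` is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scale a uniformly distributed 32-bit value 'rnd' into [0, max).
 * Equivalent to upper_32_bits(mul_u32_u32(rnd, max)) in the patch.
 */
static uint32_t scale_to_range(uint32_t rnd, uint32_t max)
{
	return (uint32_t)(((uint64_t)rnd * max) >> 32);
}
```

Because `rnd` is always below 2^32, the result is always strictly below `max`, and a `max` of zero yields zero.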
|
||||
|
||||
static struct intel_context *__migrate_engines(struct intel_gt *gt)
|
||||
{
|
||||
struct intel_engine_cs *engines[MAX_ENGINE_INSTANCE];
|
||||
struct intel_engine_cs *engine;
|
||||
unsigned int count, i;
|
||||
|
||||
count = 0;
|
||||
for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) {
|
||||
engine = gt->engine_class[COPY_ENGINE_CLASS][i];
|
||||
if (engine_supports_migration(engine))
|
||||
engines[count++] = engine;
|
||||
}
|
||||
|
||||
return intel_context_create(engines[random_index(count)]);
|
||||
}
|
||||
|
||||
struct intel_context *intel_migrate_create_context(struct intel_migrate *m)
|
||||
{
|
||||
struct intel_context *ce;
|
||||
|
||||
/*
|
||||
* We randomly distribute contexts across the engines upon construction,
|
||||
* as they all share the same pinned vm, and so in order to allow
|
||||
* multiple blits to run in parallel, we must construct each blit
|
||||
* to use a different range of the vm for its GTT. This has to be
|
||||
* known at construction, so we can not use the late greedy load
|
||||
* balancing of the virtual-engine.
|
||||
*/
|
||||
ce = __migrate_engines(m->context->engine->gt);
|
||||
if (IS_ERR(ce))
|
||||
return ce;
|
||||
|
||||
ce->ring = NULL;
|
||||
ce->ring_size = SZ_256K;
|
||||
|
||||
i915_vm_put(ce->vm);
|
||||
ce->vm = i915_vm_get(m->context->vm);
|
||||
|
||||
return ce;
|
||||
}
|
||||
|
||||
static inline struct sgt_dma sg_sgt(struct scatterlist *sg)
|
||||
{
|
||||
dma_addr_t addr = sg_dma_address(sg);
|
||||
|
||||
return (struct sgt_dma){ sg, addr, addr + sg_dma_len(sg) };
|
||||
}
|
||||
|
||||
static int emit_no_arbitration(struct i915_request *rq)
|
||||
{
|
||||
u32 *cs;
|
||||
|
||||
cs = intel_ring_begin(rq, 2);
|
||||
if (IS_ERR(cs))
|
||||
return PTR_ERR(cs);
|
||||
|
||||
/* Explicitly disable preemption for this request. */
|
||||
*cs++ = MI_ARB_ON_OFF;
|
||||
*cs++ = MI_NOOP;
|
||||
intel_ring_advance(rq, cs);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int emit_pte(struct i915_request *rq,
|
||||
struct sgt_dma *it,
|
||||
enum i915_cache_level cache_level,
|
||||
bool is_lmem,
|
||||
u64 offset,
|
||||
int length)
|
||||
{
|
||||
const u64 encode = rq->context->vm->pte_encode(0, cache_level,
|
||||
is_lmem ? PTE_LM : 0);
|
||||
struct intel_ring *ring = rq->ring;
|
||||
int total = 0;
|
||||
u32 *hdr, *cs;
|
||||
int pkt;
|
||||
|
||||
GEM_BUG_ON(GRAPHICS_VER(rq->engine->i915) < 8);
|
||||
|
||||
/* Compute the page directory offset for the target address range */
|
||||
offset += (u64)rq->engine->instance << 32;
|
||||
offset >>= 12;
|
||||
offset *= sizeof(u64);
|
||||
offset += 2 * CHUNK_SZ;
|
||||
|
||||
cs = intel_ring_begin(rq, 6);
|
||||
if (IS_ERR(cs))
|
||||
return PTR_ERR(cs);
|
||||
|
||||
/* Pack as many PTE updates as possible into a single MI command */
|
||||
pkt = min_t(int, 0x400, ring->space / sizeof(u32) + 5);
|
||||
pkt = min_t(int, pkt, (ring->size - ring->emit) / sizeof(u32) + 5);
|
||||
|
||||
hdr = cs;
|
||||
*cs++ = MI_STORE_DATA_IMM | REG_BIT(21); /* as qword elements */
|
||||
*cs++ = lower_32_bits(offset);
|
||||
*cs++ = upper_32_bits(offset);
|
||||
|
||||
do {
|
||||
if (cs - hdr >= pkt) {
|
||||
*hdr += cs - hdr - 2;
|
||||
*cs++ = MI_NOOP;
|
||||
|
||||
ring->emit = (void *)cs - ring->vaddr;
|
||||
intel_ring_advance(rq, cs);
|
||||
intel_ring_update_space(ring);
|
||||
|
||||
cs = intel_ring_begin(rq, 6);
|
||||
if (IS_ERR(cs))
|
||||
return PTR_ERR(cs);
|
||||
|
||||
pkt = min_t(int, 0x400, ring->space / sizeof(u32) + 5);
|
||||
pkt = min_t(int, pkt, (ring->size - ring->emit) / sizeof(u32) + 5);
|
||||
|
||||
hdr = cs;
|
||||
*cs++ = MI_STORE_DATA_IMM | REG_BIT(21);
|
||||
*cs++ = lower_32_bits(offset);
|
||||
*cs++ = upper_32_bits(offset);
|
||||
}
|
||||
|
||||
*cs++ = lower_32_bits(encode | it->dma);
|
||||
*cs++ = upper_32_bits(encode | it->dma);
|
||||
|
||||
offset += 8;
|
||||
total += I915_GTT_PAGE_SIZE;
|
||||
|
||||
it->dma += I915_GTT_PAGE_SIZE;
|
||||
if (it->dma >= it->max) {
|
||||
it->sg = __sg_next(it->sg);
|
||||
if (!it->sg || sg_dma_len(it->sg) == 0)
|
||||
break;
|
||||
|
||||
it->dma = sg_dma_address(it->sg);
|
||||
it->max = it->dma + sg_dma_len(it->sg);
|
||||
}
|
||||
} while (total < length);
|
||||
|
||||
*hdr += cs - hdr - 2;
|
||||
*cs++ = MI_NOOP;
|
||||
|
||||
ring->emit = (void *)cs - ring->vaddr;
|
||||
intel_ring_advance(rq, cs);
|
||||
intel_ring_update_space(ring);
|
||||
|
||||
return total;
|
||||
}
|
||||
|
||||
static bool wa_1209644611_applies(int ver, u32 size)
|
||||
{
|
||||
u32 height = size >> PAGE_SHIFT;
|
||||
|
||||
if (ver != 11)
|
||||
return false;
|
||||
|
||||
return height % 4 == 3 && height <= 8;
|
||||
}
|
||||
|
||||
static int emit_copy(struct i915_request *rq, int size)
|
||||
{
|
||||
const int ver = GRAPHICS_VER(rq->engine->i915);
|
||||
u32 instance = rq->engine->instance;
|
||||
u32 *cs;
|
||||
|
||||
cs = intel_ring_begin(rq, ver >= 8 ? 10 : 6);
|
||||
if (IS_ERR(cs))
|
||||
return PTR_ERR(cs);
|
||||
|
||||
if (ver >= 9 && !wa_1209644611_applies(ver, size)) {
|
||||
*cs++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
|
||||
*cs++ = BLT_DEPTH_32 | PAGE_SIZE;
|
||||
*cs++ = 0;
|
||||
*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
|
||||
*cs++ = CHUNK_SZ; /* dst offset */
|
||||
*cs++ = instance;
|
||||
*cs++ = 0;
|
||||
*cs++ = PAGE_SIZE;
|
||||
*cs++ = 0; /* src offset */
|
||||
*cs++ = instance;
|
||||
} else if (ver >= 8) {
|
||||
*cs++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (10 - 2);
|
||||
*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
|
||||
*cs++ = 0;
|
||||
*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
|
||||
*cs++ = CHUNK_SZ; /* dst offset */
|
||||
*cs++ = instance;
|
||||
*cs++ = 0;
|
||||
*cs++ = PAGE_SIZE;
|
||||
*cs++ = 0; /* src offset */
|
||||
*cs++ = instance;
|
||||
} else {
|
||||
GEM_BUG_ON(instance);
|
||||
*cs++ = SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
|
||||
*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
|
||||
*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE;
|
||||
*cs++ = CHUNK_SZ; /* dst offset */
|
||||
*cs++ = PAGE_SIZE;
|
||||
*cs++ = 0; /* src offset */
|
||||
}
|
||||
|
||||
intel_ring_advance(rq, cs);
|
||||
return 0;
|
||||
}
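The branch structure of `emit_copy()` together with `wa_1209644611_applies()` decides which blit command is emitted for a given graphics version and chunk size. The sketch below reproduces only that decision and the height/width dword used by the two version 8+ branches. The enum, the function names other than `wa_1209644611_applies`, and the `SKETCH_` constants are hypothetical and assume 4 KiB pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

enum copy_cmd {
	COPY_SRC_COPY_BLT,      /* pre-version-8 path */
	COPY_XY_SRC_COPY_BLT,   /* version 8, or version 11 when the WA applies */
	COPY_XY_FAST_COPY_BLT,  /* version 9+ otherwise */
};

/* Same test as in the patch: only version 11, heights of 3 or 7 pages. */
static bool wa_1209644611_applies(int ver, uint32_t size)
{
	uint32_t height = size >> SKETCH_PAGE_SHIFT;

	if (ver != 11)
		return false;

	return height % 4 == 3 && height <= 8;
}

static enum copy_cmd pick_copy_cmd(int ver, uint32_t size)
{
	if (ver >= 9 && !wa_1209644611_applies(ver, size))
		return COPY_XY_FAST_COPY_BLT;
	if (ver >= 8)
		return COPY_XY_SRC_COPY_BLT;
	return COPY_SRC_COPY_BLT;
}

/* Height in pages in bits 31:16, width of PAGE_SIZE / 4 pixels in 15:0. */
static uint32_t blit_extent(uint32_t size)
{
	return (size >> SKETCH_PAGE_SHIFT) << 16 | SKETCH_PAGE_SIZE / 4;
}
```

With 32-bit pixels and a pitch of one page, each row of the blit covers exactly one 4 KiB page, so a full 8 MiB chunk becomes a 1024 x 2048 surface.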
|
||||
|
||||
int
|
||||
intel_context_migrate_copy(struct intel_context *ce,
|
||||
struct dma_fence *await,
|
||||
struct scatterlist *src,
|
||||
enum i915_cache_level src_cache_level,
|
||||
bool src_is_lmem,
|
||||
struct scatterlist *dst,
|
||||
enum i915_cache_level dst_cache_level,
|
||||
bool dst_is_lmem,
|
||||
struct i915_request **out)
|
||||
{
|
||||
struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst);
|
||||
struct i915_request *rq;
|
||||
int err;
|
||||
|
||||
GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm);
|
||||
*out = NULL;
|
||||
|
||||
GEM_BUG_ON(ce->ring->size < SZ_64K);
|
||||
|
||||
do {
|
||||
int len;
|
||||
|
||||
rq = i915_request_create(ce);
|
||||
if (IS_ERR(rq)) {
|
||||
err = PTR_ERR(rq);
|
||||
goto out_ce;
|
||||
}
|
||||
|
||||
if (await) {
|
||||
err = i915_request_await_dma_fence(rq, await);
|
||||
if (err)
|
||||
goto out_rq;
|
||||
|
||||
if (rq->engine->emit_init_breadcrumb) {
|
||||
err = rq->engine->emit_init_breadcrumb(rq);
|
||||
if (err)
|
||||
goto out_rq;
|
||||
}
|
||||
|
||||
await = NULL;
|
||||
}
|
||||
|
||||
/* The PTE updates + copy must not be interrupted. */
|
||||
err = emit_no_arbitration(rq);
|
||||
if (err)
|
||||
goto out_rq;
|
||||
|
||||
len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem, 0,
|
||||
CHUNK_SZ);
|
||||
if (len <= 0) {
|
||||
err = len;
|
||||
goto out_rq;
|
||||
}
|
||||
|
||||
err = emit_pte(rq, &it_dst, dst_cache_level, dst_is_lmem,
|
||||
CHUNK_SZ, len);
|
||||
if (err < 0)
|
||||
goto out_rq;
|
||||
if (err < len) {
|
||||
err = -EINVAL;
|
||||
goto out_rq;
|
||||
}
|
||||
|
||||
err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
|
||||
if (err)
|
||||
goto out_rq;
|
||||
|
||||
err = emit_copy(rq, len);
|
||||
|
||||
/* Arbitration is re-enabled between requests. */
|
||||
out_rq:
|
||||
if (*out)
|
||||
i915_request_put(*out);
|
||||
*out = i915_request_get(rq);
|
||||
i915_request_add(rq);
|
||||
if (err || !it_src.sg || !sg_dma_len(it_src.sg))
|
||||
break;
|
||||
|
||||
cond_resched();
|
||||
} while (1);
|
||||
|
||||
out_ce:
|
||||
return err;
|
||||
}
|
||||
|
||||
static int emit_clear(struct i915_request *rq, int size, u32 value)
|
||||
{
|
||||
const int ver = GRAPHICS_VER(rq->engine->i915);
|
||||
u32 instance = rq->engine->instance;
|
||||
u32 *cs;
|
||||
|
||||
GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
|
||||
|
||||
cs = intel_ring_begin(rq, ver >= 8 ? 8 : 6);
|
||||
if (IS_ERR(cs))
|
||||
return PTR_ERR(cs);
|
||||
|
||||
if (ver >= 8) {
|
||||
*cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7 - 2);
|
||||
*cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
|
||||
*cs++ = 0;
|
||||
*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
|
||||
*cs++ = 0; /* offset */
|
||||
*cs++ = instance;
|
||||
*cs++ = value;
|
||||
*cs++ = MI_NOOP;
|
||||
} else {
|
||||
GEM_BUG_ON(instance);
|
||||
*cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
|
||||
*cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
|
||||
*cs++ = 0;
|
||||
*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
|
||||
*cs++ = 0;
|
||||
*cs++ = value;
|
||||
}
|
||||
|
||||
intel_ring_advance(rq, cs);
|
||||
return 0;
|
||||
}
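
Note on the dimension dword in emit_clear() above: `size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4` parses as `((size >> PAGE_SHIFT) << 16) | (PAGE_SIZE / 4)`. As far as can be read from the diff, the upper half carries the number of page-sized rows and the lower half the row width in 32-bit pixels. The sketch below is a standalone illustration, assuming 4 KiB pages; `pack_clear_dims` and the `SKETCH_*` macros are hypothetical names, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12                      /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper mirroring `size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4`
 * with the operator precedence written out explicitly.
 */
static uint32_t pack_clear_dims(uint32_t size)
{
	uint32_t rows = size >> SKETCH_PAGE_SHIFT;    /* one row per page */
	uint32_t width = SKETCH_PAGE_SIZE / 4;        /* 32bpp pixels per row */

	/* Same bound as GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX) in the diff. */
	assert(rows <= INT16_MAX);

	return (rows << 16) | width;
}
```

With these assumptions a single page packs to 0x00010400 and an 8 MiB chunk to 0x08000400; the row-count bound is what keeps the upper half within a signed 16-bit field.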
|
||||
|
||||
int
|
||||
intel_context_migrate_clear(struct intel_context *ce,
|
||||
struct dma_fence *await,
|
||||
struct scatterlist *sg,
|
||||
enum i915_cache_level cache_level,
|
||||
bool is_lmem,
|
||||
u32 value,
|
||||
struct i915_request **out)
|
||||
{
|
||||
struct sgt_dma it = sg_sgt(sg);
|
||||
struct i915_request *rq;
|
||||
int err;
|
||||
|
||||
GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm);
|
||||
*out = NULL;
|
||||
|
||||
GEM_BUG_ON(ce->ring->size < SZ_64K);
|
||||
|
||||
do {
|
||||
int len;
|
||||
|
||||
rq = i915_request_create(ce);
|
||||
if (IS_ERR(rq)) {
|
||||
err = PTR_ERR(rq);
|
||||
goto out_ce;
|
||||
}
|
||||
|
||||
if (await) {
|
||||
err = i915_request_await_dma_fence(rq, await);
|
||||
if (err)
|
||||
goto out_rq;
|
||||
|
||||
if (rq->engine->emit_init_breadcrumb) {
|
||||
err = rq->engine->emit_init_breadcrumb(rq);
|
||||
if (err)
|
||||
goto out_rq;
|
||||
}
|
||||
|
||||
await = NULL;
|
||||
}
|
||||
|
||||
/* The PTE updates + clear must not be interrupted. */
|
||||
err = emit_no_arbitration(rq);
|
||||
if (err)
|
||||
goto out_rq;
|
||||
|
||||
len = emit_pte(rq, &it, cache_level, is_lmem, 0, CHUNK_SZ);
|
||||
if (len <= 0) {
|
||||
err = len;
|
||||
goto out_rq;
|
||||
}
|
||||
|
||||
err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
|
||||
if (err)
|
||||
goto out_rq;
|
||||
|
||||
err = emit_clear(rq, len, value);
|
||||
|
||||
/* Arbitration is re-enabled between requests. */
|
||||
out_rq:
|
||||
if (*out)
|
||||
i915_request_put(*out);
|
||||
*out = i915_request_get(rq);
|
||||
i915_request_add(rq);
|
||||
if (err || !it.sg || !sg_dma_len(it.sg))
|
||||
break;
|
||||
|
||||
cond_resched();
|
||||
} while (1);
|
||||
|
||||
out_ce:
|
||||
return err;
|
||||
}
|
||||
|
||||
int intel_migrate_copy(struct intel_migrate *m,
|
||||
struct i915_gem_ww_ctx *ww,
|
||||
struct dma_fence *await,
|
||||
struct scatterlist *src,
|
||||
enum i915_cache_level src_cache_level,
|
||||
bool src_is_lmem,
|
||||
struct scatterlist *dst,
|
||||
enum i915_cache_level dst_cache_level,
|
||||
bool dst_is_lmem,
|
||||
struct i915_request **out)
|
||||
{
|
||||
struct intel_context *ce;
|
||||
int err;
|
||||
|
||||
*out = NULL;
|
||||
if (!m->context)
|
||||
return -ENODEV;
|
||||
|
||||
ce = intel_migrate_create_context(m);
|
||||
if (IS_ERR(ce))
|
||||
ce = intel_context_get(m->context);
|
||||
GEM_BUG_ON(IS_ERR(ce));
|
||||
|
||||
err = intel_context_pin_ww(ce, ww);
|
||||
if (err)
|
||||
goto out;
|
||||
|
||||
err = intel_context_migrate_copy(ce, await,
|
||||
src, src_cache_level, src_is_lmem,
|
||||
dst, dst_cache_level, dst_is_lmem,
|
||||
out);
|
||||
|
||||
intel_context_unpin(ce);
|
||||
out:
|
||||
intel_context_put(ce);
|
||||
return err;
|
||||
}
|
||||
|
||||
int
|
||||
intel_migrate_clear(struct intel_migrate *m,
|
||||
struct i915_gem_ww_ctx *ww,
|
||||
struct dma_fence *await,
|
||||
struct scatterlist *sg,
|
||||
enum i915_cache_level cache_level,
|
||||
bool is_lmem,
|
||||
u32 value,
|
||||
struct i915_request **out)
|
||||
{
|
||||
struct intel_context *ce;
|
||||
int err;
|
||||
|
||||
*out = NULL;
|
||||
if (!m->context)
|
||||
return -ENODEV;
|
||||
|
||||
ce = intel_migrate_create_context(m);
|
||||
if (IS_ERR(ce))
|
||||
ce = intel_context_get(m->context);
|
||||
GEM_BUG_ON(IS_ERR(ce));
|
||||
|
||||
err = intel_context_pin_ww(ce, ww);
|
||||
if (err)
|
||||
goto out;
|
||||
|
||||
err = intel_context_migrate_clear(ce, await, sg, cache_level,
|
||||
is_lmem, value, out);
|
||||
|
||||
intel_context_unpin(ce);
|
||||
out:
|
||||
intel_context_put(ce);
|
||||
return err;
|
||||
}
|
||||
|
||||
void intel_migrate_fini(struct intel_migrate *m)
|
||||
{
|
||||
struct intel_context *ce;
|
||||
|
||||
ce = fetch_and_zero(&m->context);
|
||||
if (!ce)
|
||||
return;
|
||||
|
||||
intel_engine_destroy_pinned_context(ce);
|
||||
}
|
||||
|
||||
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
|
||||
#include "selftest_migrate.c"
|
||||
#endif
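
Note on the loop shape shared by intel_context_migrate_copy() and intel_context_migrate_clear() above: each pass creates one request, maps at most CHUNK_SZ bytes of the scatterlist through emit_pte(), and repeats until the iterator is exhausted. The sketch below is a simplified userspace model of that windowing only; `struct seg_iter`, `take_window` and `count_requests` are hypothetical stand-ins, and fences, PTE emission and error handling are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct seg_iter {                /* hypothetical stand-in for struct sgt_dma */
	const uint32_t *len;     /* segment lengths in bytes */
	size_t nsegs;            /* number of segments */
	size_t idx;              /* current segment */
	uint32_t consumed;       /* bytes already taken from len[idx] */
};

/* Take up to `window` bytes from the iterator; returns the bytes taken. */
static uint32_t take_window(struct seg_iter *it, uint32_t window)
{
	uint32_t total = 0;

	assert(window > 0);

	while (total < window && it->idx < it->nsegs) {
		uint32_t avail = it->len[it->idx] - it->consumed;
		uint32_t room = window - total;
		uint32_t n = avail < room ? avail : room;

		total += n;
		it->consumed += n;
		if (it->consumed == it->len[it->idx]) {
			/* Segment finished, advance to the next one. */
			it->idx++;
			it->consumed = 0;
		}
	}

	return total;
}

/* Count how many windowed passes (one "request" each) the list needs. */
static unsigned int count_requests(struct seg_iter *it, uint32_t window)
{
	unsigned int requests = 0;

	while (take_window(it, window) > 0)
		requests++;

	return requests;
}
```

In this model a 12 KiB segment followed by an 8 KiB segment, walked with an 8 KiB window, takes three passes; a window may span a segment boundary, which is why the iterator tracks a per-segment offset.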
|
|
@ -0,0 +1,65 @@
|
|||
/* SPDX-License-Identifier: MIT */
|
||||
/*
|
||||
* Copyright © 2020 Intel Corporation
|
||||
*/
|
||||
|
||||
#ifndef __INTEL_MIGRATE__
|
||||
#define __INTEL_MIGRATE__
|
||||
|
||||
#include <linux/types.h>
|
||||
|
||||
#include "intel_migrate_types.h"
|
||||
|
||||
struct dma_fence;
|
||||
struct i915_request;
|
||||
struct i915_gem_ww_ctx;
|
||||
struct intel_gt;
|
||||
struct scatterlist;
|
||||
enum i915_cache_level;
|
||||
|
||||
int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt);
|
||||
|
||||
struct intel_context *intel_migrate_create_context(struct intel_migrate *m);
|
||||
|
||||
int intel_migrate_copy(struct intel_migrate *m,
|
||||
struct i915_gem_ww_ctx *ww,
|
||||
struct dma_fence *await,
|
||||
struct scatterlist *src,
|
||||
enum i915_cache_level src_cache_level,
|
||||
bool src_is_lmem,
|
||||
struct scatterlist *dst,
|
||||
enum i915_cache_level dst_cache_level,
|
||||
bool dst_is_lmem,
|
||||
struct i915_request **out);
|
||||
|
||||
int intel_context_migrate_copy(struct intel_context *ce,
|
||||
struct dma_fence *await,
|
||||
struct scatterlist *src,
|
||||
enum i915_cache_level src_cache_level,
|
||||
bool src_is_lmem,
|
||||
struct scatterlist *dst,
|
||||
enum i915_cache_level dst_cache_level,
|
||||
bool dst_is_lmem,
|
||||
struct i915_request **out);
|
||||
|
||||
int
|
||||
intel_migrate_clear(struct intel_migrate *m,
|
||||
struct i915_gem_ww_ctx *ww,
|
||||
struct dma_fence *await,
|
||||
struct scatterlist *sg,
|
||||
enum i915_cache_level cache_level,
|
||||
bool is_lmem,
|
||||
u32 value,
|
||||
struct i915_request **out);
|
||||
int
|
||||
intel_context_migrate_clear(struct intel_context *ce,
|
||||
struct dma_fence *await,
|
||||
struct scatterlist *sg,
|
||||
enum i915_cache_level cache_level,
|
||||
bool is_lmem,
|
||||
u32 value,
|
||||
struct i915_request **out);
|
||||
|
||||
void intel_migrate_fini(struct intel_migrate *m);
|
||||
|
||||
#endif /* __INTEL_MIGRATE__ */
|
|
@ -0,0 +1,15 @@
|
|||
/* SPDX-License-Identifier: MIT */
|
||||
/*
|
||||
* Copyright © 2020 Intel Corporation
|
||||
*/
|
||||
|
||||
#ifndef __INTEL_MIGRATE_TYPES__
|
||||
#define __INTEL_MIGRATE_TYPES__
|
||||
|
||||
struct intel_context;
|
||||
|
||||
struct intel_migrate {
|
||||
struct intel_context *context;
|
||||
};
|
||||
|
||||
#endif /* __INTEL_MIGRATE_TYPES__ */
|
|
@ -352,7 +352,7 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
|
|||
table->size = ARRAY_SIZE(icl_mocs_table);
|
||||
table->table = icl_mocs_table;
|
||||
table->n_entries = GEN9_NUM_MOCS_ENTRIES;
|
||||
} else if (IS_GEN9_BC(i915) || IS_CANNONLAKE(i915)) {
|
||||
} else if (IS_GEN9_BC(i915)) {
|
||||
table->size = ARRAY_SIZE(skl_mocs_table);
|
||||
table->n_entries = GEN9_NUM_MOCS_ENTRIES;
|
||||
table->table = skl_mocs_table;
|
||||
|
|
|
@ -62,20 +62,25 @@ static void gen11_rc6_enable(struct intel_rc6 *rc6)
|
|||
u32 pg_enable;
|
||||
int i;
|
||||
|
||||
/* 2b: Program RC6 thresholds.*/
|
||||
set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16 | 85);
|
||||
set(uncore, GEN10_MEDIA_WAKE_RATE_LIMIT, 150);
|
||||
/*
|
||||
* With GuCRC, these parameters are set by GuC
|
||||
*/
|
||||
if (!intel_uc_uses_guc_rc(&gt->uc)) {
|
||||
/* 2b: Program RC6 thresholds.*/
|
||||
set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16 | 85);
|
||||
set(uncore, GEN10_MEDIA_WAKE_RATE_LIMIT, 150);
|
||||
|
||||
set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */
|
||||
set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */
|
||||
for_each_engine(engine, rc6_to_gt(rc6), id)
|
||||
set(uncore, RING_MAX_IDLE(engine->mmio_base), 10);
|
||||
set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */
|
||||
set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */
|
||||
for_each_engine(engine, rc6_to_gt(rc6), id)
|
||||
set(uncore, RING_MAX_IDLE(engine->mmio_base), 10);
|
||||
|
||||
set(uncore, GUC_MAX_IDLE_COUNT, 0xA);
|
||||
set(uncore, GUC_MAX_IDLE_COUNT, 0xA);
|
||||
|
||||
set(uncore, GEN6_RC_SLEEP, 0);
|
||||
set(uncore, GEN6_RC_SLEEP, 0);
|
||||
|
||||
set(uncore, GEN6_RC6_THRESHOLD, 50000); /* 50/125ms per EI */
|
||||
set(uncore, GEN6_RC6_THRESHOLD, 50000); /* 50/125ms per EI */
|
||||
}
|
||||
|
||||
/*
|
||||
* 2c: Program Coarse Power Gating Policies.
|
||||
|
@ -98,11 +103,19 @@ static void gen11_rc6_enable(struct intel_rc6 *rc6)
|
|||
set(uncore, GEN9_MEDIA_PG_IDLE_HYSTERESIS, 60);
|
||||
set(uncore, GEN9_RENDER_PG_IDLE_HYSTERESIS, 60);
|
||||
|
||||
/* 3a: Enable RC6 */
|
||||
rc6->ctl_enable =
|
||||
GEN6_RC_CTL_HW_ENABLE |
|
||||
GEN6_RC_CTL_RC6_ENABLE |
|
||||
GEN6_RC_CTL_EI_MODE(1);
|
||||
/* 3a: Enable RC6
|
||||
*
|
||||
* With GuCRC, we do not enable bit 31 of RC_CTL,
|
||||
* thus allowing GuC to control RC6 entry/exit fully instead.
|
||||
* We will not set the HW ENABLE and EI bits
|
||||
*/
|
||||
if (!intel_guc_rc_enable(&gt->uc.guc))
|
||||
rc6->ctl_enable = GEN6_RC_CTL_RC6_ENABLE;
|
||||
else
|
||||
rc6->ctl_enable =
|
||||
GEN6_RC_CTL_HW_ENABLE |
|
||||
GEN6_RC_CTL_RC6_ENABLE |
|
||||
GEN6_RC_CTL_EI_MODE(1);
|
||||
|
||||
pg_enable =
|
||||
GEN9_RENDER_PG_ENABLE |
|
||||
|
@ -126,7 +139,7 @@ static void gen9_rc6_enable(struct intel_rc6 *rc6)
|
|||
enum intel_engine_id id;
|
||||
|
||||
/* 2b: Program RC6 thresholds.*/
|
||||
if (GRAPHICS_VER(rc6_to_i915(rc6)) >= 10) {
|
||||
if (GRAPHICS_VER(rc6_to_i915(rc6)) >= 11) {
|
||||
set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16 | 85);
|
||||
set(uncore, GEN10_MEDIA_WAKE_RATE_LIMIT, 150);
|
||||
} else if (IS_SKYLAKE(rc6_to_i915(rc6))) {
|
||||
|
@ -513,6 +526,10 @@ static void __intel_rc6_disable(struct intel_rc6 *rc6)
|
|||
{
|
||||
struct drm_i915_private *i915 = rc6_to_i915(rc6);
|
||||
struct intel_uncore *uncore = rc6_to_uncore(rc6);
|
||||
struct intel_gt *gt = rc6_to_gt(rc6);
|
||||
|
||||
/* Take control of RC6 back from GuC */
|
||||
intel_guc_rc_disable(&gt->uc.guc);
|
||||
|
||||
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
|
||||
if (GRAPHICS_VER(i915) >= 9)
|
||||
|
|
|
@ -10,7 +10,7 @@
|
|||
#include "gem/i915_gem_lmem.h"
|
||||
#include "gem/i915_gem_region.h"
|
||||
#include "gem/i915_gem_ttm.h"
|
||||
#include "intel_region_lmem.h"
|
||||
#include "gt/intel_gt.h"
|
||||
|
||||
static int init_fake_lmem_bar(struct intel_memory_region *mem)
|
||||
{
|
||||
|
@ -158,7 +158,7 @@ intel_gt_setup_fake_lmem(struct intel_gt *gt)
|
|||
static bool get_legacy_lowmem_region(struct intel_uncore *uncore,
|
||||
u64 *start, u32 *size)
|
||||
{
|
||||
if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0))
|
||||
if (!IS_DG1_GT_STEP(uncore->i915, STEP_A0, STEP_C0))
|
||||
return false;
|
||||
|
||||
*start = 0;
|
||||
|
|
|
@ -8,6 +8,7 @@
|
|||
|
||||
#include <linux/types.h>
|
||||
#include "i915_gem.h"
|
||||
#include "i915_gem_ww.h"
|
||||
|
||||
struct i915_request;
|
||||
struct intel_context;
|
||||
|
|
|
@ -22,7 +22,6 @@
|
|||
#include "intel_reset.h"
|
||||
|
||||
#include "uc/intel_guc.h"
|
||||
#include "uc/intel_guc_submission.h"
|
||||
|
||||
#define RESET_MAX_RETRIES 3
|
||||
|
||||
|
@ -39,21 +38,6 @@ static void rmw_clear_fw(struct intel_uncore *uncore, i915_reg_t reg, u32 clr)
|
|||
intel_uncore_rmw_fw(uncore, reg, clr, 0);
|
||||
}
|
||||
|
||||
static void skip_context(struct i915_request *rq)
|
||||
{
|
||||
struct intel_context *hung_ctx = rq->context;
|
||||
|
||||
list_for_each_entry_from_rcu(rq, &hung_ctx->timeline->requests, link) {
|
||||
if (!i915_request_is_active(rq))
|
||||
return;
|
||||
|
||||
if (rq->context == hung_ctx) {
|
||||
i915_request_set_error_once(rq, -EIO);
|
||||
__i915_request_skip(rq);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
static void client_mark_guilty(struct i915_gem_context *ctx, bool banned)
|
||||
{
|
||||
struct drm_i915_file_private *file_priv = ctx->file_priv;
|
||||
|
@ -88,10 +72,8 @@ static bool mark_guilty(struct i915_request *rq)
|
|||
bool banned;
|
||||
int i;
|
||||
|
||||
if (intel_context_is_closed(rq->context)) {
|
||||
intel_context_set_banned(rq->context);
|
||||
if (intel_context_is_closed(rq->context))
|
||||
return true;
|
||||
}
|
||||
|
||||
rcu_read_lock();
|
||||
ctx = rcu_dereference(rq->context->gem_context);
|
||||
|
@ -123,11 +105,9 @@ static bool mark_guilty(struct i915_request *rq)
|
|||
banned = !i915_gem_context_is_recoverable(ctx);
|
||||
if (time_before(jiffies, prev_hang + CONTEXT_FAST_HANG_JIFFIES))
|
||||
banned = true;
|
||||
if (banned) {
|
||||
if (banned)
|
||||
drm_dbg(&ctx->i915->drm, "context %s: guilty %d, banned\n",
|
||||
ctx->name, atomic_read(&ctx->guilty_count));
|
||||
intel_context_set_banned(rq->context);
|
||||
}
|
||||
|
||||
client_mark_guilty(ctx, banned);
|
||||
|
||||
|
@ -149,6 +129,8 @@ static void mark_innocent(struct i915_request *rq)
|
|||
|
||||
void __i915_request_reset(struct i915_request *rq, bool guilty)
|
||||
{
|
||||
bool banned = false;
|
||||
|
||||
RQ_TRACE(rq, "guilty? %s\n", yesno(guilty));
|
||||
GEM_BUG_ON(__i915_request_is_complete(rq));
|
||||
|
||||
|
@ -156,13 +138,15 @@ void __i915_request_reset(struct i915_request *rq, bool guilty)
|
|||
if (guilty) {
|
||||
i915_request_set_error_once(rq, -EIO);
|
||||
__i915_request_skip(rq);
|
||||
if (mark_guilty(rq))
|
||||
skip_context(rq);
|
||||
banned = mark_guilty(rq);
|
||||
} else {
|
||||
i915_request_set_error_once(rq, -EAGAIN);
|
||||
mark_innocent(rq);
|
||||
}
|
||||
rcu_read_unlock();
|
||||
|
||||
if (banned)
|
||||
intel_context_ban(rq->context, rq);
|
||||
}
|
||||
|
||||
static bool i915_in_reset(struct pci_dev *pdev)
|
||||
|
@ -515,8 +499,14 @@ static int gen11_reset_engines(struct intel_gt *gt,
|
|||
[VCS1] = GEN11_GRDOM_MEDIA2,
|
||||
[VCS2] = GEN11_GRDOM_MEDIA3,
|
||||
[VCS3] = GEN11_GRDOM_MEDIA4,
|
||||
[VCS4] = GEN11_GRDOM_MEDIA5,
|
||||
[VCS5] = GEN11_GRDOM_MEDIA6,
|
||||
[VCS6] = GEN11_GRDOM_MEDIA7,
|
||||
[VCS7] = GEN11_GRDOM_MEDIA8,
|
||||
[VECS0] = GEN11_GRDOM_VECS,
|
||||
[VECS1] = GEN11_GRDOM_VECS2,
|
||||
[VECS2] = GEN11_GRDOM_VECS3,
|
||||
[VECS3] = GEN11_GRDOM_VECS4,
|
||||
};
|
||||
struct intel_engine_cs *engine;
|
||||
intel_engine_mask_t tmp;
|
||||
|
@ -826,6 +816,8 @@ static int gt_reset(struct intel_gt *gt, intel_engine_mask_t stalled_mask)
|
|||
__intel_engine_reset(engine, stalled_mask & engine->mask);
|
||||
local_bh_enable();
|
||||
|
||||
intel_uc_reset(&gt->uc, true);
|
||||
|
||||
intel_ggtt_restore_fences(gt->ggtt);
|
||||
|
||||
return err;
|
||||
|
@ -850,6 +842,8 @@ static void reset_finish(struct intel_gt *gt, intel_engine_mask_t awake)
|
|||
if (awake & engine->mask)
|
||||
intel_engine_pm_put(engine);
|
||||
}
|
||||
|
||||
intel_uc_reset_finish(&gt->uc);
|
||||
}
|
||||
|
||||
static void nop_submit_request(struct i915_request *request)
|
||||
|
@ -903,6 +897,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt)
|
|||
for_each_engine(engine, gt, id)
|
||||
if (engine->reset.cancel)
|
||||
engine->reset.cancel(engine);
|
||||
intel_uc_cancel_requests(&gt->uc);
|
||||
local_bh_enable();
|
||||
|
||||
reset_finish(gt, awake);
|
||||
|
@ -1191,6 +1186,9 @@ int __intel_engine_reset_bh(struct intel_engine_cs *engine, const char *msg)
|
|||
ENGINE_TRACE(engine, "flags=%lx\n", gt->reset.flags);
|
||||
GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &gt->reset.flags));
|
||||
|
||||
if (intel_engine_uses_guc(engine))
|
||||
return -ENODEV;
|
||||
|
||||
if (!intel_engine_pm_get_if_awake(engine))
|
||||
return 0;
|
||||
|
||||
|
@ -1201,13 +1199,10 @@ int __intel_engine_reset_bh(struct intel_engine_cs *engine, const char *msg)
|
|||
"Resetting %s for %s\n", engine->name, msg);
|
||||
atomic_inc(&engine->i915->gpu_error.reset_engine_count[engine->uabi_class]);
|
||||
|
||||
if (intel_engine_uses_guc(engine))
|
||||
ret = intel_guc_reset_engine(&engine->gt->uc.guc, engine);
|
||||
else
|
||||
ret = intel_gt_reset_engine(engine);
|
||||
ret = intel_gt_reset_engine(engine);
|
||||
if (ret) {
|
||||
/* If we fail here, we expect to fallback to a global reset */
|
||||
ENGINE_TRACE(engine, "Failed to reset, err: %d\n", ret);
|
||||
ENGINE_TRACE(engine, "Failed to reset %s, err: %d\n", engine->name, ret);
|
||||
goto out;
|
||||
}
|
||||
|
||||
|
@ -1341,7 +1336,8 @@ void intel_gt_handle_error(struct intel_gt *gt,
|
|||
* Try engine reset when available. We fall back to full reset if
|
||||
* single reset fails.
|
||||
*/
|
||||
if (intel_has_reset_engine(gt) && !intel_gt_is_wedged(gt)) {
|
||||
if (!intel_uc_uses_guc_submission(&gt->uc) &&
|
||||
intel_has_reset_engine(gt) && !intel_gt_is_wedged(gt)) {
|
||||
local_bh_disable();
|
||||
for_each_engine_masked(engine, gt, engine_mask, tmp) {
|
||||
BUILD_BUG_ON(I915_RESET_MODESET >= I915_RESET_ENGINE);
|
||||
|
|
|
@ -49,6 +49,7 @@ static inline void intel_ring_advance(struct i915_request *rq, u32 *cs)
|
|||
* intel_ring_begin()).
|
||||
*/
|
||||
GEM_BUG_ON((rq->ring->vaddr + rq->ring->emit) != cs);
|
||||
GEM_BUG_ON(!IS_ALIGNED(rq->ring->emit, 8)); /* RING_TAIL qword align */
|
||||
}
|
||||
|
||||
static inline u32 intel_ring_wrap(const struct intel_ring *ring, u32 pos)
|
||||
|
|
|
@ -16,6 +16,7 @@
|
|||
#include "intel_reset.h"
|
||||
#include "intel_ring.h"
|
||||
#include "shmem_utils.h"
|
||||
#include "intel_engine_heartbeat.h"
|
||||
|
||||
/* Rough estimate of the typical request size, performing a flush,
|
||||
* set-context and then emitting the batch.
|
||||
|
@ -342,9 +343,9 @@ static void reset_rewind(struct intel_engine_cs *engine, bool stalled)
|
|||
u32 head;
|
||||
|
||||
rq = NULL;
|
||||
spin_lock_irqsave(&engine->active.lock, flags);
|
||||
spin_lock_irqsave(&engine->sched_engine->lock, flags);
|
||||
rcu_read_lock();
|
||||
list_for_each_entry(pos, &engine->active.requests, sched.link) {
|
||||
list_for_each_entry(pos, &engine->sched_engine->requests, sched.link) {
|
||||
if (!__i915_request_is_complete(pos)) {
|
||||
rq = pos;
|
||||
break;
|
||||
|
@ -399,7 +400,7 @@ static void reset_rewind(struct intel_engine_cs *engine, bool stalled)
|
|||
}
|
||||
engine->legacy.ring->head = intel_ring_wrap(engine->legacy.ring, head);
|
||||
|
||||
spin_unlock_irqrestore(&engine->active.lock, flags);
|
||||
spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
|
||||
}
|
||||
|
||||
static void reset_finish(struct intel_engine_cs *engine)
|
||||
|
@ -411,16 +412,16 @@ static void reset_cancel(struct intel_engine_cs *engine)
|
|||
struct i915_request *request;
|
||||
unsigned long flags;
|
||||
|
||||
spin_lock_irqsave(&engine->active.lock, flags);
|
||||
spin_lock_irqsave(&engine->sched_engine->lock, flags);
|
||||
|
||||
/* Mark all submitted requests as skipped. */
|
||||
list_for_each_entry(request, &engine->active.requests, sched.link)
|
||||
list_for_each_entry(request, &engine->sched_engine->requests, sched.link)
|
||||
i915_request_put(i915_request_mark_eio(request));
|
||||
intel_engine_signal_breadcrumbs(engine);
|
||||
|
||||
/* Remaining _unready_ requests will be nop'ed when submitted */
|
||||
|
||||
spin_unlock_irqrestore(&engine->active.lock, flags);
|
||||
spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
|
||||
}
|
||||
|
||||
static void i9xx_submit_request(struct i915_request *request)
|
||||
|
@ -586,9 +587,44 @@ static void ring_context_reset(struct intel_context *ce)
|
|||
clear_bit(CONTEXT_VALID_BIT, &ce->flags);
|
||||
}
|
||||
|
||||
static void ring_context_ban(struct intel_context *ce,
|
||||
struct i915_request *rq)
|
||||
{
|
||||
struct intel_engine_cs *engine;
|
||||
|
||||
if (!rq || !i915_request_is_active(rq))
|
||||
return;
|
||||
|
||||
engine = rq->engine;
|
||||
lockdep_assert_held(&engine->sched_engine->lock);
|
||||
list_for_each_entry_continue(rq, &engine->sched_engine->requests,
|
||||
sched.link)
|
||||
if (rq->context == ce) {
|
||||
i915_request_set_error_once(rq, -EIO);
|
||||
__i915_request_skip(rq);
|
||||
}
|
||||
}
|
||||
|
||||
static void ring_context_cancel_request(struct intel_context *ce,
|
||||
struct i915_request *rq)
|
||||
{
|
||||
struct intel_engine_cs *engine = NULL;
|
||||
|
||||
i915_request_active_engine(rq, &engine);
|
||||
|
||||
if (engine && intel_engine_pulse(engine))
|
||||
intel_gt_handle_error(engine->gt, engine->mask, 0,
|
||||
"request cancellation by %s",
|
||||
current->comm);
|
||||
}
|
||||
|
||||
static const struct intel_context_ops ring_context_ops = {
|
||||
.alloc = ring_context_alloc,
|
||||
|
||||
.cancel_request = ring_context_cancel_request,
|
||||
|
||||
.ban = ring_context_ban,
|
||||
|
||||
.pre_pin = ring_context_pre_pin,
|
||||
.pin = ring_context_pin,
|
||||
.unpin = ring_context_unpin,
|
||||
|
@ -1047,6 +1083,25 @@ static void setup_irq(struct intel_engine_cs *engine)
|
|||
}
|
||||
}
|
||||
|
||||
static void add_to_engine(struct i915_request *rq)
|
||||
{
|
||||
lockdep_assert_held(&rq->engine->sched_engine->lock);
|
||||
list_move_tail(&rq->sched.link, &rq->engine->sched_engine->requests);
|
||||
}
|
||||
|
||||
static void remove_from_engine(struct i915_request *rq)
|
||||
{
|
||||
spin_lock_irq(&rq->engine->sched_engine->lock);
|
||||
list_del_init(&rq->sched.link);
|
||||
|
||||
/* Prevent further __await_execution() registering a cb, then flush */
|
||||
set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
|
||||
|
||||
spin_unlock_irq(&rq->engine->sched_engine->lock);
|
||||
|
||||
i915_request_notify_execute_cb_imm(rq);
|
||||
}
|
||||
|
||||
static void setup_common(struct intel_engine_cs *engine)
|
||||
{
|
||||
struct drm_i915_private *i915 = engine->i915;
|
||||
|
@ -1064,6 +1119,9 @@ static void setup_common(struct intel_engine_cs *engine)
|
|||
engine->reset.cancel = reset_cancel;
|
||||
engine->reset.finish = reset_finish;
|
||||
|
||||
engine->add_active_request = add_to_engine;
|
||||
engine->remove_active_request = remove_from_engine;
|
||||
|
||||
engine->cops = &ring_context_ops;
|
||||
engine->request_alloc = ring_request_alloc;
|
||||
|
||||
|
|
|
@ -37,6 +37,20 @@ static struct intel_uncore *rps_to_uncore(struct intel_rps *rps)
|
|||
return rps_to_gt(rps)->uncore;
|
||||
}
|
||||
|
||||
static struct intel_guc_slpc *rps_to_slpc(struct intel_rps *rps)
|
||||
{
|
||||
struct intel_gt *gt = rps_to_gt(rps);
|
||||
|
||||
return &gt->uc.guc.slpc;
|
||||
}
|
||||
|
||||
static bool rps_uses_slpc(struct intel_rps *rps)
|
||||
{
|
||||
struct intel_gt *gt = rps_to_gt(rps);
|
||||
|
||||
return intel_uc_uses_guc_slpc(&gt->uc);
|
||||
}
|
||||
|
||||
static u32 rps_pm_sanitize_mask(struct intel_rps *rps, u32 mask)
|
||||
{
|
||||
return mask & ~rps->pm_intrmsk_mbz;
|
||||
|
@ -167,6 +181,8 @@ static void rps_enable_interrupts(struct intel_rps *rps)
|
|||
{
|
||||
struct intel_gt *gt = rps_to_gt(rps);
|
||||
|
||||
GEM_BUG_ON(rps_uses_slpc(rps));
|
||||
|
||||
GT_TRACE(gt, "interrupts:on rps->pm_events: %x, rps_pm_mask:%x\n",
|
||||
rps->pm_events, rps_pm_mask(rps, rps->last_freq));
|
||||
|
||||
|
@ -771,6 +787,8 @@ static int gen6_rps_set(struct intel_rps *rps, u8 val)
|
|||
struct drm_i915_private *i915 = rps_to_i915(rps);
|
||||
u32 swreq;
|
||||
|
||||
GEM_BUG_ON(rps_uses_slpc(rps));
|
||||
|
||||
if (GRAPHICS_VER(i915) >= 9)
|
||||
swreq = GEN9_FREQUENCY(val);
|
||||
else if (IS_HASWELL(i915) || IS_BROADWELL(i915))
|
||||
|
@ -861,6 +879,9 @@ void intel_rps_park(struct intel_rps *rps)
|
|||
{
|
||||
int adj;
|
||||
|
||||
if (!intel_rps_is_enabled(rps))
|
||||
return;
|
||||
|
||||
GEM_BUG_ON(atomic_read(&rps->num_waiters));
|
||||
|
||||
if (!intel_rps_clear_active(rps))
|
||||
|
@ -999,7 +1020,7 @@ static void gen6_rps_init(struct intel_rps *rps)
|
|||
|
||||
rps->efficient_freq = rps->rp1_freq;
|
||||
if (IS_HASWELL(i915) || IS_BROADWELL(i915) ||
|
||||
IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 10) {
|
||||
IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 11) {
|
||||
u32 ddcc_status = 0;
|
||||
|
||||
if (sandybridge_pcode_read(i915,
|
||||
|
@ -1012,7 +1033,7 @@ static void gen6_rps_init(struct intel_rps *rps)
|
|||
rps->max_freq);
|
||||
}
|
||||
|
||||
if (IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 10) {
|
||||
if (IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 11) {
|
||||
/* Store the frequency values in 16.66 MHZ units, which is
|
||||
* the natural hardware unit for SKL
|
||||
*/
|
||||
|
@ -1356,6 +1377,9 @@ void intel_rps_enable(struct intel_rps *rps)
|
|||
if (!HAS_RPS(i915))
|
||||
return;
|
||||
|
||||
if (rps_uses_slpc(rps))
|
||||
return;
|
||||
|
||||
intel_gt_check_clock_frequency(rps_to_gt(rps));
|
||||
|
||||
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
|
||||
|
@ -1829,6 +1853,9 @@ void intel_rps_init(struct intel_rps *rps)
|
|||
{
|
||||
struct drm_i915_private *i915 = rps_to_i915(rps);
|
||||
|
||||
if (rps_uses_slpc(rps))
|
||||
return;
|
||||
|
||||
if (IS_CHERRYVIEW(i915))
|
||||
chv_rps_init(rps);
|
||||
else if (IS_VALLEYVIEW(i915))
|
||||
|
@ -1877,10 +1904,17 @@ void intel_rps_init(struct intel_rps *rps)
|
|||
|
||||
if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) < 11)
|
||||
rps->pm_intrmsk_mbz |= GEN8_PMINTR_DISABLE_REDIRECT_TO_GUC;
|
||||
|
||||
/* GuC needs ARAT expired interrupt unmasked */
|
||||
if (intel_uc_uses_guc_submission(&rps_to_gt(rps)->uc))
|
||||
rps->pm_intrmsk_mbz |= ARAT_EXPIRED_INTRMSK;
|
||||
}
|
||||
|
||||
void intel_rps_sanitize(struct intel_rps *rps)
|
||||
{
|
||||
if (rps_uses_slpc(rps))
|
||||
return;
|
||||
|
||||
if (GRAPHICS_VER(rps_to_i915(rps)) >= 6)
|
||||
rps_disable_interrupts(rps);
|
||||
}
|
||||
|
@ -1936,6 +1970,176 @@ u32 intel_rps_read_actual_frequency(struct intel_rps *rps)
|
|||
return freq;
|
||||
}
|
||||
|
||||
u32 intel_rps_read_punit_req(struct intel_rps *rps)
|
||||
{
|
||||
struct intel_uncore *uncore = rps_to_uncore(rps);
|
||||
|
||||
return intel_uncore_read(uncore, GEN6_RPNSWREQ);
|
||||
}
|
||||
|
||||
static u32 intel_rps_get_req(u32 pureq)
|
||||
{
|
||||
u32 req = pureq >> GEN9_SW_REQ_UNSLICE_RATIO_SHIFT;
|
||||
|
||||
return req;
|
||||
}
|
||||
|
||||
u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps)
|
||||
{
|
||||
u32 freq = intel_rps_get_req(intel_rps_read_punit_req(rps));
|
||||
|
||||
return intel_gpu_freq(rps, freq);
|
||||
}
|
||||
|
||||
u32 intel_rps_get_requested_frequency(struct intel_rps *rps)
|
||||
{
|
||||
if (rps_uses_slpc(rps))
|
||||
return intel_rps_read_punit_req_frequency(rps);
|
||||
else
|
||||
return intel_gpu_freq(rps, rps->cur_freq);
|
||||
}
|
||||
|
||||
u32 intel_rps_get_max_frequency(struct intel_rps *rps)
|
||||
{
|
||||
struct intel_guc_slpc *slpc = rps_to_slpc(rps);
|
||||
|
||||
if (rps_uses_slpc(rps))
|
||||
return slpc->max_freq_softlimit;
|
||||
else
|
||||
return intel_gpu_freq(rps, rps->max_freq_softlimit);
|
||||
}
|
||||
|
||||
u32 intel_rps_get_rp0_frequency(struct intel_rps *rps)
|
||||
{
|
||||
struct intel_guc_slpc *slpc = rps_to_slpc(rps);
|
||||
|
||||
if (rps_uses_slpc(rps))
|
||||
return slpc->rp0_freq;
|
||||
else
|
||||
return intel_gpu_freq(rps, rps->rp0_freq);
|
||||
}
|
||||
|
||||
u32 intel_rps_get_rp1_frequency(struct intel_rps *rps)
|
||||
{
|
||||
struct intel_guc_slpc *slpc = rps_to_slpc(rps);
|
||||
|
||||
if (rps_uses_slpc(rps))
|
||||
return slpc->rp1_freq;
|
||||
else
|
||||
return intel_gpu_freq(rps, rps->rp1_freq);
|
||||
}
|
||||
|
||||
u32 intel_rps_get_rpn_frequency(struct intel_rps *rps)
|
||||
{
|
||||
struct intel_guc_slpc *slpc = rps_to_slpc(rps);
|
||||
|
||||
if (rps_uses_slpc(rps))
|
||||
return slpc->min_freq;
|
||||
else
|
||||
return intel_gpu_freq(rps, rps->min_freq);
|
||||
}
|
||||
|
||||
static int set_max_freq(struct intel_rps *rps, u32 val)
|
||||
{
|
||||
struct drm_i915_private *i915 = rps_to_i915(rps);
|
||||
int ret = 0;
|
||||
|
||||
mutex_lock(&rps->lock);
|
||||
|
||||
val = intel_freq_opcode(rps, val);
|
||||
if (val < rps->min_freq ||
|
||||
val > rps->max_freq ||
|
||||
val < rps->min_freq_softlimit) {
|
||||
ret = -EINVAL;
|
||||
goto unlock;
|
||||
}
|
||||
|
||||
if (val > rps->rp0_freq)
|
||||
drm_dbg(&i915->drm, "User requested overclocking to %d\n",
|
||||
intel_gpu_freq(rps, val));
|
||||
|
||||
rps->max_freq_softlimit = val;
|
||||
|
||||
val = clamp_t(int, rps->cur_freq,
|
||||
rps->min_freq_softlimit,
|
||||
rps->max_freq_softlimit);
|
||||
|
||||
/*
|
||||
* We still need *_set_rps to process the new max_delay and
|
||||
* update the interrupt limits and PMINTRMSK even though
|
||||
* frequency request may be unchanged.
|
||||
*/
|
||||
intel_rps_set(rps, val);
|
||||
|
||||
unlock:
|
||||
mutex_unlock(&rps->lock);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
int intel_rps_set_max_frequency(struct intel_rps *rps, u32 val)
|
||||
{
|
||||
struct intel_guc_slpc *slpc = rps_to_slpc(rps);
|
||||
|
||||
if (rps_uses_slpc(rps))
|
||||
return intel_guc_slpc_set_max_freq(slpc, val);
|
||||
else
|
||||
return set_max_freq(rps, val);
|
||||
}
|
||||
|
||||
u32 intel_rps_get_min_frequency(struct intel_rps *rps)
|
||||
{
|
||||
struct intel_guc_slpc *slpc = rps_to_slpc(rps);
|
||||
|
||||
if (rps_uses_slpc(rps))
|
||||
return slpc->min_freq_softlimit;
|
||||
else
|
||||
return intel_gpu_freq(rps, rps->min_freq_softlimit);
|
||||
}
|
||||
|
||||
static int set_min_freq(struct intel_rps *rps, u32 val)
|
||||
{
|
||||
int ret = 0;
|
||||
|
||||
mutex_lock(&rps->lock);
|
||||
|
||||
val = intel_freq_opcode(rps, val);
|
||||
if (val < rps->min_freq ||
|
||||
val > rps->max_freq ||
|
||||
val > rps->max_freq_softlimit) {
|
||||
ret = -EINVAL;
|
||||
goto unlock;
|
||||
}
|
||||
|
||||
rps->min_freq_softlimit = val;
|
||||
|
||||
val = clamp_t(int, rps->cur_freq,
|
||||
rps->min_freq_softlimit,
|
||||
rps->max_freq_softlimit);
|
||||
|
||||
/*
|
||||
* We still need *_set_rps to process the new min_delay and
|
||||
* update the interrupt limits and PMINTRMSK even though
|
||||
* frequency request may be unchanged.
|
||||
*/
|
||||
intel_rps_set(rps, val);
|
||||
|
||||
unlock:
|
||||
mutex_unlock(&rps->lock);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
int intel_rps_set_min_frequency(struct intel_rps *rps, u32 val)
|
||||
{
|
||||
struct intel_guc_slpc *slpc = rps_to_slpc(rps);
|
||||
|
||||
if (rps_uses_slpc(rps))
|
||||
return intel_guc_slpc_set_min_freq(slpc, val);
|
||||
else
|
||||
return set_min_freq(rps, val);
|
||||
}
|
||||
|
||||
/* External interface for intel_ips.ko */
|
||||
|
||||
static struct drm_i915_private __rcu *ips_mchdev;
|
||||
|
@ -2129,4 +2333,5 @@ EXPORT_SYMBOL_GPL(i915_gpu_turbo_disable);
|
|||
|
||||
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
|
||||
#include "selftest_rps.c"
|
||||
#include "selftest_slpc.c"
|
||||
#endif
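
Note on the soft-limit checks in set_max_freq() and set_min_freq() above: a new limit is rejected with -EINVAL if it falls outside the hardware range or crosses the opposite soft limit, and the current frequency is then clamped into the updated soft range. The sketch below models only that validation, assuming plain integer frequencies; `struct freq_limits`, `clamp_int` and `sketch_set_max` are hypothetical names, and the real code additionally takes rps->lock, converts the value with intel_freq_opcode() and hands the clamped value to intel_rps_set() rather than storing it directly.

```c
#include <assert.h>
#include <errno.h>

struct freq_limits {                      /* hypothetical subset of struct intel_rps */
	int min_freq, max_freq;           /* hardware range */
	int min_softlimit, max_softlimit; /* user-adjustable range */
	int cur_freq;                     /* currently requested frequency */
};

static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Mirrors the range checks of set_max_freq(); locking and unit conversion omitted. */
static int sketch_set_max(struct freq_limits *l, int val)
{
	/* Invariant assumed by this sketch: the soft range is never inverted. */
	assert(l->min_softlimit <= l->max_softlimit);

	if (val < l->min_freq || val > l->max_freq || val < l->min_softlimit)
		return -EINVAL;

	l->max_softlimit = val;
	l->cur_freq = clamp_int(l->cur_freq, l->min_softlimit, l->max_softlimit);

	return 0;
}
```

The minimum-side setter is symmetric: it compares against max_softlimit instead of min_softlimit and updates min_softlimit on success.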
|
||||
|
|
|
@ -31,6 +31,16 @@ int intel_gpu_freq(struct intel_rps *rps, int val);
|
|||
int intel_freq_opcode(struct intel_rps *rps, int val);
|
||||
u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat1);
|
||||
u32 intel_rps_read_actual_frequency(struct intel_rps *rps);
|
||||
u32 intel_rps_get_requested_frequency(struct intel_rps *rps);
|
||||
u32 intel_rps_get_min_frequency(struct intel_rps *rps);
|
||||
int intel_rps_set_min_frequency(struct intel_rps *rps, u32 val);
|
||||
u32 intel_rps_get_max_frequency(struct intel_rps *rps);
|
||||
int intel_rps_set_max_frequency(struct intel_rps *rps, u32 val);
|
||||
u32 intel_rps_get_rp0_frequency(struct intel_rps *rps);
|
||||
u32 intel_rps_get_rp1_frequency(struct intel_rps *rps);
|
||||
u32 intel_rps_get_rpn_frequency(struct intel_rps *rps);
|
||||
u32 intel_rps_read_punit_req(struct intel_rps *rps);
|
||||
u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps);
|
||||
|
||||
void gen5_rps_irq_handler(struct intel_rps *rps);
|
||||
void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir);
|
||||
|
|
|
@ -139,17 +139,36 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
|
|||
* Gen12 has Dual-Subslices, which behave similarly to 2 gen11 SS.
|
||||
* Instead of splitting these, provide userspace with an array
|
||||
* of DSS to more closely represent the hardware resource.
|
||||
*
|
||||
* In addition, the concept of slice has been removed in Xe_HP.
|
||||
* To be compatible with prior generations, assume a single slice
|
||||
* across the entire device. Then calculate out the DSS for each
|
||||
* workload type within that software slice.
|
||||
*/
|
||||
intel_sseu_set_info(sseu, 1, 6, 16);
|
||||
if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915))
|
||||
intel_sseu_set_info(sseu, 1, 32, 16);
|
||||
else
|
||||
intel_sseu_set_info(sseu, 1, 6, 16);
|
||||
|
||||
s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
|
||||
GEN11_GT_S_ENA_MASK;
|
||||
/*
|
||||
* As mentioned above, Xe_HP does not have the concept of a slice.
|
||||
* Enable one for software backwards compatibility.
|
||||
*/
|
||||
if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
|
||||
s_en = 0x1;
|
||||
else
|
||||
s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
|
||||
GEN11_GT_S_ENA_MASK;
|
||||
|
||||
dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE);
|
||||
|
||||
/* one bit per pair of EUs */
|
||||
-eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
-	       GEN11_EU_DIS_MASK);
+if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+	eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK;
+else
+	eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
+		       GEN11_EU_DIS_MASK);
|
||||
|
||||
for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++)
|
||||
if (eu_en_fuse & BIT(eu))
|
||||
eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
|
||||
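The "one bit per pair of EUs" expansion in the loop above can be sketched on its own. This is an illustrative userspace version under the assumption that the fuse value has already been normalized to "1 means enabled"; `expand_eu_pairs()` is a hypothetical name, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Expand a fuse mask in which each bit covers a pair of EUs into a mask
 * with one bit per EU. Fuse bits at or above max_eus_per_subslice / 2 are
 * ignored, as in the loop above.
 */
static uint32_t expand_eu_pairs(uint32_t eu_en_fuse, int max_eus_per_subslice)
{
	uint32_t eu_en = 0;
	int eu;

	assert(max_eus_per_subslice > 0 && max_eus_per_subslice <= 32);

	for (eu = 0; eu < max_eus_per_subslice / 2; eu++)
		if (eu_en_fuse & (1u << eu))
			eu_en |= (1u << (eu * 2)) | (1u << (eu * 2 + 1));

	return eu_en;
}
```

With 16 EUs per subslice, a fuse value of `0x5` (pairs 0 and 2 enabled) expands to `0x33`.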
|
@ -188,83 +207,6 @@ static void gen11_sseu_info_init(struct intel_gt *gt)
|
|||
sseu->has_eu_pg = 1;
|
||||
}
|
||||
|
||||
static void gen10_sseu_info_init(struct intel_gt *gt)
|
||||
{
|
||||
struct intel_uncore *uncore = gt->uncore;
|
||||
struct sseu_dev_info *sseu = &gt->info.sseu;
|
||||
const u32 fuse2 = intel_uncore_read(uncore, GEN8_FUSE2);
|
||||
const int eu_mask = 0xff;
|
||||
u32 subslice_mask, eu_en;
|
||||
int s, ss;
|
||||
|
||||
intel_sseu_set_info(sseu, 6, 4, 8);
|
||||
|
||||
sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
|
||||
GEN10_F2_S_ENA_SHIFT;
|
||||
|
||||
/* Slice0 */
|
||||
eu_en = ~intel_uncore_read(uncore, GEN8_EU_DISABLE0);
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++)
|
||||
sseu_set_eus(sseu, 0, ss, (eu_en >> (8 * ss)) & eu_mask);
|
||||
/* Slice1 */
|
||||
sseu_set_eus(sseu, 1, 0, (eu_en >> 24) & eu_mask);
|
||||
eu_en = ~intel_uncore_read(uncore, GEN8_EU_DISABLE1);
|
||||
sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
|
||||
/* Slice2 */
|
||||
sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
|
||||
sseu_set_eus(sseu, 2, 1, (eu_en >> 16) & eu_mask);
|
||||
/* Slice3 */
|
||||
sseu_set_eus(sseu, 3, 0, (eu_en >> 24) & eu_mask);
|
||||
eu_en = ~intel_uncore_read(uncore, GEN8_EU_DISABLE2);
|
||||
sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
|
||||
/* Slice4 */
|
||||
sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
|
||||
sseu_set_eus(sseu, 4, 1, (eu_en >> 16) & eu_mask);
|
||||
/* Slice5 */
|
||||
sseu_set_eus(sseu, 5, 0, (eu_en >> 24) & eu_mask);
|
||||
eu_en = ~intel_uncore_read(uncore, GEN10_EU_DISABLE3);
|
||||
sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
|
||||
|
||||
subslice_mask = (1 << 4) - 1;
|
||||
subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
|
||||
GEN10_F2_SS_DIS_SHIFT);
|
||||
|
||||
for (s = 0; s < sseu->max_slices; s++) {
|
||||
u32 subslice_mask_with_eus = subslice_mask;
|
||||
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++) {
|
||||
if (sseu_get_eus(sseu, s, ss) == 0)
|
||||
subslice_mask_with_eus &= ~BIT(ss);
|
||||
}
|
||||
|
||||
/*
|
||||
* Slice0 can have up to 3 subslices, but there are only 2 in
|
||||
* slice1/2.
|
||||
*/
|
||||
intel_sseu_set_subslices(sseu, s, s == 0 ?
|
||||
subslice_mask_with_eus :
|
||||
subslice_mask_with_eus & 0x3);
|
||||
}
|
||||
|
||||
sseu->eu_total = compute_eu_total(sseu);
|
||||
|
||||
/*
|
||||
* CNL is expected to always have a uniform distribution
|
||||
* of EU across subslices with the exception that any one
|
||||
* EU in any one subslice may be fused off for die
|
||||
* recovery.
|
||||
*/
|
||||
sseu->eu_per_subslice =
|
||||
intel_sseu_subslice_total(sseu) ?
|
||||
DIV_ROUND_UP(sseu->eu_total, intel_sseu_subslice_total(sseu)) :
|
||||
0;
|
||||
|
||||
/* No restrictions on Power Gating */
|
||||
sseu->has_slice_pg = 1;
|
||||
sseu->has_subslice_pg = 1;
|
||||
sseu->has_eu_pg = 1;
|
||||
}
|
||||
|
||||
static void cherryview_sseu_info_init(struct intel_gt *gt)
|
||||
{
|
||||
struct sseu_dev_info *sseu = &gt->info.sseu;
|
||||
|
@ -592,8 +534,6 @@ void intel_sseu_info_init(struct intel_gt *gt)
|
|||
bdw_sseu_info_init(gt);
|
||||
else if (GRAPHICS_VER(i915) == 9)
|
||||
gen9_sseu_info_init(gt);
|
||||
else if (GRAPHICS_VER(i915) == 10)
|
||||
gen10_sseu_info_init(gt);
|
||||
else if (GRAPHICS_VER(i915) == 11)
|
||||
gen11_sseu_info_init(gt);
|
||||
else if (GRAPHICS_VER(i915) >= 12)
|
||||
|
@ -759,3 +699,21 @@ void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
|
|||
}
|
||||
}
|
||||
}
|
||||
|
||||
u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice)
{
	u16 slice_mask = 0;
	int i;

	WARN_ON(sizeof(dss_mask) * 8 / dss_per_slice > 8 * sizeof(slice_mask));

	for (i = 0; dss_mask; i++) {
		if (dss_mask & GENMASK(dss_per_slice - 1, 0))
			slice_mask |= BIT(i);

		dss_mask >>= dss_per_slice;
	}

	return slice_mask;
}
|
||||
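To see what `intel_slicemask_from_dssmask()` computes, here is a small userspace re-statement of the same loop. It is a sketch only: `slicemask_from_dssmask()` is a hypothetical name, the kernel `GENMASK()`/`BIT()` macros are replaced by plain shifts, and the `WARN_ON` is turned into an `assert` on the group size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fold a DSS mask into a slice mask: slice i is marked present if any of
 * the dss_per_slice bits belonging to group i is set.
 */
static uint16_t slicemask_from_dssmask(uint64_t dss_mask, int dss_per_slice)
{
	uint16_t slice_mask = 0;
	uint64_t group;
	int i;

	/* 64 / dss_per_slice groups must fit in the 16-bit result. */
	assert(dss_per_slice >= 4 && dss_per_slice < 64);
	group = (UINT64_C(1) << dss_per_slice) - 1;

	for (i = 0; dss_mask; i++) {
		if (dss_mask & group)
			slice_mask |= (uint16_t)(1u << i);

		dss_mask >>= dss_per_slice;
	}

	return slice_mask;
}
```

For example, with four DSS per slice, a DSS mask of `0x101` (DSS 0 and DSS 8) yields slice mask `0x5`, that is, slices 0 and 2.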
|
||||
|
|
|
@ -15,13 +15,17 @@ struct drm_i915_private;
|
|||
struct intel_gt;
|
||||
struct drm_printer;
|
||||
|
||||
-#define GEN_MAX_SLICES (6) /* CNL upper bound */
-#define GEN_MAX_SUBSLICES (8) /* ICL upper bound */
+#define GEN_MAX_SLICES (3) /* SKL upper bound */
+#define GEN_MAX_SUBSLICES (32) /* XEHPSDV upper bound */
|
||||
#define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
|
||||
#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
|
||||
#define GEN_MAX_EUS (16) /* TGL upper bound */
|
||||
#define GEN_MAX_EU_STRIDE GEN_SSEU_STRIDE(GEN_MAX_EUS)
|
||||
|
||||
#define GEN_DSS_PER_GSLICE 4
|
||||
#define GEN_DSS_PER_CSLICE 8
|
||||
#define GEN_DSS_PER_MSLICE 8
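The effect of the changed upper bounds on the mask storage can be worked out with the stride macro. The sketch below assumes that the `(6)`/`(8)` definitions are the removed pair and `(3)`/`(32)` the added pair; `SSEU_STRIDE` and `subslice_mask_bytes()` are hypothetical userspace names mirroring `GEN_SSEU_STRIDE` and the `subslice_mask[]` array size.

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_BYTE		8
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
/* Bytes needed to hold one bit per entry, rounded up. */
#define SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)

/* Size in bytes of a per-slice subslice mask array. */
static size_t subslice_mask_bytes(int max_slices, int max_subslices)
{
	assert(max_slices > 0 && max_subslices > 0);
	return (size_t)max_slices * SSEU_STRIDE(max_subslices);
}
```

Under that assumption the array grows from 6 slices x 1 byte to 3 slices x 4 bytes, while the EU stride for 16 EUs stays at 2 bytes.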
|
||||
|
||||
struct sseu_dev_info {
|
||||
u8 slice_mask;
|
||||
u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
|
||||
|
@ -104,4 +108,6 @@ void intel_sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p);
|
|||
void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
|
||||
struct drm_printer *p);
|
||||
|
||||
u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice);
|
||||
|
||||
#endif /* __INTEL_SSEU_H__ */
|
||||
|
|
|
@ -50,10 +50,10 @@ static void cherryview_sseu_device_status(struct intel_gt *gt,
|
|||
#undef SS_MAX
|
||||
}
|
||||
|
||||
static void gen10_sseu_device_status(struct intel_gt *gt,
|
||||
static void gen11_sseu_device_status(struct intel_gt *gt,
|
||||
struct sseu_dev_info *sseu)
|
||||
{
|
||||
#define SS_MAX 6
|
||||
#define SS_MAX 8
|
||||
struct intel_uncore *uncore = gt->uncore;
|
||||
const struct intel_gt_info *info = &gt->info;
|
||||
u32 s_reg[SS_MAX], eu_reg[2 * SS_MAX], eu_mask[2];
|
||||
|
@ -267,8 +267,8 @@ int intel_sseu_status(struct seq_file *m, struct intel_gt *gt)
|
|||
bdw_sseu_device_status(gt, &sseu);
|
||||
else if (GRAPHICS_VER(i915) == 9)
|
||||
gen9_sseu_device_status(gt, &sseu);
|
||||
else if (GRAPHICS_VER(i915) >= 10)
|
||||
gen10_sseu_device_status(gt, &sseu);
|
||||
else if (GRAPHICS_VER(i915) >= 11)
|
||||
gen11_sseu_device_status(gt, &sseu);
|
||||
}
|
||||
|
||||
i915_print_sseu_info(m, false, HAS_POOLED_EU(i915), &sseu);
|
||||
|
|
|
@ -150,13 +150,14 @@ static void _wa_add(struct i915_wa_list *wal, const struct i915_wa *wa)
|
|||
}
|
||||
|
||||
static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
|
||||
u32 clear, u32 set, u32 read_mask)
|
||||
u32 clear, u32 set, u32 read_mask, bool masked_reg)
|
||||
{
|
||||
struct i915_wa wa = {
|
||||
.reg = reg,
|
||||
.clr = clear,
|
||||
.set = set,
|
||||
.read = read_mask,
|
||||
.masked_reg = masked_reg,
|
||||
};
|
||||
|
||||
_wa_add(wal, &wa);
|
||||
|
@ -165,7 +166,7 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
|
|||
static void
|
||||
wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
|
||||
{
|
||||
wa_add(wal, reg, clear, set, clear);
|
||||
wa_add(wal, reg, clear, set, clear, false);
|
||||
}
|
||||
|
||||
static void
|
||||
|
@ -200,20 +201,20 @@ wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
|
|||
static void
|
||||
wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
|
||||
{
|
||||
wa_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val);
|
||||
wa_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
|
||||
}
|
||||
|
||||
static void
|
||||
wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
|
||||
{
|
||||
wa_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val);
|
||||
wa_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
|
||||
}
|
||||
|
||||
static void
|
||||
wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
|
||||
u32 mask, u32 val)
|
||||
{
|
||||
wa_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask);
|
||||
wa_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true);
|
||||
}
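The new `masked_reg` flag records that a workaround targets a masked register, where the upper 16 bits of a write select which of the lower 16 bits may change. The sketch below models that convention in userspace. It assumes the usual i915 definitions of `_MASKED_BIT_ENABLE`, `_MASKED_BIT_DISABLE` and `_MASKED_FIELD`; the lower-case helpers and `apply_masked_write()` are hypothetical names, and the last one is only a model of the hardware behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed equivalents of the i915 masked-write macros. */
static uint32_t masked_bit_enable(uint32_t bit)
{
	return (bit << 16) | bit;	/* select the bit and set it */
}

static uint32_t masked_bit_disable(uint32_t bit)
{
	return bit << 16;		/* select the bit and clear it */
}

static uint32_t masked_field(uint32_t mask, uint32_t val)
{
	return (mask << 16) | val;	/* select a field and load val */
}

/*
 * Model of a masked register: only bits selected in the upper half of the
 * written value are updated in the 16-bit payload.
 */
static uint32_t apply_masked_write(uint32_t cur, uint32_t write)
{
	uint32_t mask = write >> 16;

	assert(cur <= 0xffff);
	return (cur & ~mask) | (write & mask);
}
```

Because unselected bits are preserved by the write itself, no read-modify-write is needed, which is why the helpers above pass `clear = 0` and put the whole encoded value in `set`.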
|
||||
|
||||
static void gen6_ctx_workarounds_init(struct intel_engine_cs *engine,
|
||||
|
@ -514,53 +515,15 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine,
|
|||
GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
|
||||
}
|
||||
|
||||
static void cnl_ctx_workarounds_init(struct intel_engine_cs *engine,
|
||||
struct i915_wa_list *wal)
|
||||
{
|
||||
/* WaForceContextSaveRestoreNonCoherent:cnl */
|
||||
wa_masked_en(wal, CNL_HDC_CHICKEN0,
|
||||
HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
|
||||
|
||||
/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
|
||||
wa_masked_en(wal, COMMON_SLICE_CHICKEN2,
|
||||
GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
|
||||
|
||||
/* WaPushConstantDereferenceHoldDisable:cnl */
|
||||
wa_masked_en(wal, GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
|
||||
|
||||
/* FtrEnableFastAnisoL1BankingFix:cnl */
|
||||
wa_masked_en(wal, HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
|
||||
|
||||
/* WaDisable3DMidCmdPreemption:cnl */
|
||||
wa_masked_dis(wal, GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
|
||||
|
||||
/* WaDisableGPGPUMidCmdPreemption:cnl */
|
||||
wa_masked_field_set(wal, GEN8_CS_CHICKEN1,
|
||||
GEN9_PREEMPT_GPGPU_LEVEL_MASK,
|
||||
GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
|
||||
|
||||
/* WaDisableEarlyEOT:cnl */
|
||||
wa_masked_en(wal, GEN8_ROW_CHICKEN, DISABLE_EARLY_EOT);
|
||||
}
|
||||
|
||||
static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
|
||||
struct i915_wa_list *wal)
|
||||
{
|
||||
struct drm_i915_private *i915 = engine->i915;
|
||||
|
||||
/* WaDisableBankHangMode:icl */
|
||||
/* Wa_1406697149 (WaDisableBankHangMode:icl) */
|
||||
wa_write(wal,
|
||||
GEN8_L3CNTLREG,
|
||||
intel_uncore_read(engine->uncore, GEN8_L3CNTLREG) |
|
||||
GEN8_ERRDETBCTRL);
|
||||
|
||||
/* Wa_1604370585:icl (pre-prod)
|
||||
* Formerly known as WaPushConstantDereferenceHoldDisable
|
||||
*/
|
||||
if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
|
||||
wa_masked_en(wal, GEN7_ROW_CHICKEN2,
|
||||
PUSH_CONSTANT_DEREF_DISABLE);
|
||||
|
||||
/* WaForceEnableNonCoherent:icl
|
||||
* This is not the same workaround as in early Gen9 platforms, where
|
||||
* lacking this could cause system hangs, but coherency performance
|
||||
|
@ -570,23 +533,11 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
|
|||
*/
|
||||
wa_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT);
|
||||
|
||||
/* Wa_2006611047:icl (pre-prod)
|
||||
* Formerly known as WaDisableImprovedTdlClkGating
|
||||
*/
|
||||
if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
|
||||
wa_masked_en(wal, GEN7_ROW_CHICKEN2,
|
||||
GEN11_TDL_CLOCK_GATING_FIX_DISABLE);
|
||||
|
||||
/* Wa_2006665173:icl (pre-prod) */
|
||||
if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
|
||||
wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
|
||||
GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC);
|
||||
|
||||
/* WaEnableFloatBlendOptimization:icl */
|
||||
wa_write_clr_set(wal,
|
||||
GEN10_CACHE_MODE_SS,
|
||||
0, /* write-only, so skip validation */
|
||||
_MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE));
|
||||
wa_add(wal, GEN10_CACHE_MODE_SS, 0,
|
||||
_MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE),
|
||||
0 /* write-only, so skip validation */,
|
||||
true);
|
||||
|
||||
/* WaDisableGPGPUMidThreadPreemption:icl */
|
||||
wa_masked_field_set(wal, GEN8_CS_CHICKEN1,
|
||||
|
@ -631,7 +582,7 @@ static void gen12_ctx_gt_tuning_init(struct intel_engine_cs *engine,
|
|||
FF_MODE2,
|
||||
FF_MODE2_TDS_TIMER_MASK,
|
||||
FF_MODE2_TDS_TIMER_128,
|
||||
0);
|
||||
0, false);
|
||||
}
|
||||
|
||||
static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine,
|
||||
|
@ -640,15 +591,16 @@ static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine,
|
|||
gen12_ctx_gt_tuning_init(engine, wal);
|
||||
|
||||
/*
|
||||
* Wa_1409142259:tgl
|
||||
* Wa_1409347922:tgl
|
||||
* Wa_1409252684:tgl
|
||||
* Wa_1409217633:tgl
|
||||
* Wa_1409207793:tgl
|
||||
* Wa_1409178076:tgl
|
||||
* Wa_1408979724:tgl
|
||||
* Wa_14010443199:rkl
|
||||
* Wa_14010698770:rkl
|
||||
* Wa_1409142259:tgl,dg1,adl-p
|
||||
* Wa_1409347922:tgl,dg1,adl-p
|
||||
* Wa_1409252684:tgl,dg1,adl-p
|
||||
* Wa_1409217633:tgl,dg1,adl-p
|
||||
* Wa_1409207793:tgl,dg1,adl-p
|
||||
* Wa_1409178076:tgl,dg1,adl-p
|
||||
* Wa_1408979724:tgl,dg1,adl-p
|
||||
* Wa_14010443199:tgl,rkl,dg1,adl-p
|
||||
* Wa_14010698770:tgl,rkl,dg1,adl-s,adl-p
|
||||
* Wa_1409342910:tgl,rkl,dg1,adl-s,adl-p
|
||||
*/
|
||||
wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
|
||||
GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
|
||||
|
@ -668,7 +620,14 @@ static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine,
|
|||
FF_MODE2,
|
||||
FF_MODE2_GS_TIMER_MASK,
|
||||
FF_MODE2_GS_TIMER_224,
|
||||
0);
|
||||
0, false);
|
||||
|
||||
/*
|
||||
* Wa_14012131227:dg1
|
||||
* Wa_1508744258:tgl,rkl,dg1,adl-s,adl-p
|
||||
*/
|
||||
wa_masked_en(wal, GEN7_COMMON_SLICE_CHICKEN1,
|
||||
GEN9_RHWO_OPTIMIZATION_DISABLE);
|
||||
}
|
||||
|
||||
static void dg1_ctx_workarounds_init(struct intel_engine_cs *engine,
|
||||
|
@ -703,8 +662,6 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
|
|||
gen12_ctx_workarounds_init(engine, wal);
|
||||
else if (GRAPHICS_VER(i915) == 11)
|
||||
icl_ctx_workarounds_init(engine, wal);
|
||||
else if (IS_CANNONLAKE(i915))
|
||||
cnl_ctx_workarounds_init(engine, wal);
|
||||
else if (IS_COFFEELAKE(i915) || IS_COMETLAKE(i915))
|
||||
cfl_ctx_workarounds_init(engine, wal);
|
||||
else if (IS_GEMINILAKE(i915))
|
||||
|
@ -839,7 +796,7 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
|||
wa_add(wal,
|
||||
HSW_ROW_CHICKEN3, 0,
|
||||
_MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE),
|
||||
0 /* XXX does this reg exist? */);
|
||||
0 /* XXX does this reg exist? */, true);
|
||||
|
||||
/* WaVSRefCountFullforceMissDisable:hsw */
|
||||
wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME);
|
||||
|
@ -882,30 +839,19 @@ skl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
|||
GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
|
||||
|
||||
/* WaInPlaceDecompressionHang:skl */
|
||||
if (IS_SKL_REVID(i915, SKL_REVID_H0, REVID_FOREVER))
|
||||
if (IS_SKL_GT_STEP(i915, STEP_A0, STEP_H0))
|
||||
wa_write_or(wal,
|
||||
GEN9_GAMT_ECO_REG_RW_IA,
|
||||
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
|
||||
}
|
||||
|
||||
static void
|
||||
bxt_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
||||
{
|
||||
gen9_gt_workarounds_init(i915, wal);
|
||||
|
||||
/* WaInPlaceDecompressionHang:bxt */
|
||||
wa_write_or(wal,
|
||||
GEN9_GAMT_ECO_REG_RW_IA,
|
||||
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
|
||||
}
|
||||
|
||||
static void
|
||||
kbl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
||||
{
|
||||
gen9_gt_workarounds_init(i915, wal);
|
||||
|
||||
/* WaDisableDynamicCreditSharing:kbl */
|
||||
if (IS_KBL_GT_STEP(i915, 0, STEP_B0))
|
||||
if (IS_KBL_GT_STEP(i915, 0, STEP_C0))
|
||||
wa_write_or(wal,
|
||||
GAMT_CHKN_BIT_REG,
|
||||
GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING);
|
||||
|
@ -943,98 +889,144 @@ cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
|||
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
|
||||
}
|
||||
|
||||
static void
|
||||
wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
||||
static void __set_mcr_steering(struct i915_wa_list *wal,
|
||||
i915_reg_t steering_reg,
|
||||
unsigned int slice, unsigned int subslice)
|
||||
{
|
||||
const struct sseu_dev_info *sseu = &i915->gt.info.sseu;
|
||||
unsigned int slice, subslice;
|
||||
u32 l3_en, mcr, mcr_mask;
|
||||
u32 mcr, mcr_mask;
|
||||
|
||||
GEM_BUG_ON(GRAPHICS_VER(i915) < 10);
|
||||
mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
|
||||
mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
|
||||
|
||||
/*
|
||||
* WaProgramMgsrForL3BankSpecificMmioReads: cnl,icl
|
||||
* L3Banks could be fused off in single slice scenario. If that is
|
||||
* the case, we might need to program MCR select to a valid L3Bank
|
||||
* by default, to make sure we correctly read certain registers
|
||||
* later on (in the range 0xB100 - 0xB3FF).
|
||||
*
|
||||
* WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl,icl
|
||||
* Before any MMIO read into slice/subslice specific registers, MCR
|
||||
* packet control register needs to be programmed to point to any
|
||||
* enabled s/ss pair. Otherwise, incorrect values will be returned.
|
||||
* This means each subsequent MMIO read will be forwarded to an
|
||||
* specific s/ss combination, but this is OK since these registers
|
||||
* are consistent across s/ss in almost all cases. In the rare
|
||||
* occasions, such as INSTDONE, where this value is dependent
|
||||
* on s/ss combo, the read should be done with read_subslice_reg.
|
||||
*
|
||||
* Since GEN8_MCR_SELECTOR contains dual-purpose bits which select both
|
||||
* to which subslice, or to which L3 bank, the respective mmio reads
|
||||
* will go, we have to find a common index which works for both
|
||||
* accesses.
|
||||
*
|
||||
* Case where we cannot find a common index fortunately should not
|
||||
* happen in production hardware, so we only emit a warning instead of
|
||||
* implementing something more complex that requires checking the range
|
||||
* of every MMIO read.
|
||||
*/
|
||||
wa_write_clr_set(wal, steering_reg, mcr_mask, mcr);
|
||||
}
|
||||
|
||||
if (GRAPHICS_VER(i915) >= 10 && is_power_of_2(sseu->slice_mask)) {
|
||||
u32 l3_fuse =
|
||||
intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3) &
|
||||
GEN10_L3BANK_MASK;
|
||||
static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal,
|
||||
unsigned int slice, unsigned int subslice)
|
||||
{
|
||||
drm_dbg(&i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);
|
||||
|
||||
drm_dbg(&i915->drm, "L3 fuse = %x\n", l3_fuse);
|
||||
l3_en = ~(l3_fuse << GEN10_L3BANK_PAIR_COUNT | l3_fuse);
|
||||
} else {
|
||||
l3_en = ~0;
|
||||
}
|
||||
|
||||
slice = fls(sseu->slice_mask) - 1;
|
||||
subslice = fls(l3_en & intel_sseu_get_subslices(sseu, slice));
|
||||
if (!subslice) {
|
||||
drm_warn(&i915->drm,
|
||||
"No common index found between subslice mask %x and L3 bank mask %x!\n",
|
||||
intel_sseu_get_subslices(sseu, slice), l3_en);
|
||||
subslice = fls(l3_en);
|
||||
drm_WARN_ON(&i915->drm, !subslice);
|
||||
}
|
||||
subslice--;
|
||||
|
||||
if (GRAPHICS_VER(i915) >= 11) {
|
||||
mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
|
||||
mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
|
||||
} else {
|
||||
mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
|
||||
mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
|
||||
}
|
||||
|
||||
drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
|
||||
|
||||
wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
|
||||
__set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice);
|
||||
}
|
||||
|
||||
static void
|
||||
cnl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
||||
icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
||||
{
|
||||
wa_init_mcr(i915, wal);
|
||||
const struct sseu_dev_info *sseu = &i915->gt.info.sseu;
|
||||
unsigned int slice, subslice;
|
||||
|
||||
/* WaInPlaceDecompressionHang:cnl */
|
||||
wa_write_or(wal,
|
||||
GEN9_GAMT_ECO_REG_RW_IA,
|
||||
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
|
||||
GEM_BUG_ON(GRAPHICS_VER(i915) < 11);
|
||||
GEM_BUG_ON(hweight8(sseu->slice_mask) > 1);
|
||||
slice = 0;
|
||||
|
||||
/*
|
||||
* Although a platform may have subslices, we need to always steer
|
||||
* reads to the lowest instance that isn't fused off. When Render
|
||||
* Power Gating is enabled, grabbing forcewake will only power up a
|
||||
* single subslice (the "minconfig") if there isn't a real workload
|
||||
* that needs to be run; this means that if we steer register reads to
|
||||
* one of the higher subslices, we run the risk of reading back 0's or
|
||||
* random garbage.
|
||||
*/
|
||||
subslice = __ffs(intel_sseu_get_subslices(sseu, slice));
|
||||
|
||||
/*
|
||||
* If the subslice we picked above also steers us to a valid L3 bank,
|
||||
* then we can just rely on the default steering and won't need to
|
||||
* worry about explicitly re-steering L3BANK reads later.
|
||||
*/
|
||||
if (i915->gt.info.l3bank_mask & BIT(subslice))
|
||||
i915->gt.steering_table[L3BANK] = NULL;
|
||||
|
||||
__add_mcr_wa(i915, wal, slice, subslice);
|
||||
}
|
||||
|
||||
static void
|
||||
xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
|
||||
{
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
const struct sseu_dev_info *sseu = &gt->info.sseu;
|
||||
unsigned long slice, subslice = 0, slice_mask = 0;
|
||||
u64 dss_mask = 0;
|
||||
u32 lncf_mask = 0;
|
||||
int i;
|
||||
|
||||
	/*
	 * On Xe_HP the steering increases in complexity. There are now several
	 * more units that require steering and we're not guaranteed to be able
	 * to find a common setting for all of them. These are:
	 * - GSLICE (fusable)
	 * - DSS (sub-unit within gslice; fusable)
	 * - L3 Bank (fusable)
	 * - MSLICE (fusable)
	 * - LNCF (sub-unit within mslice; always present if mslice is present)
	 *
	 * We'll do our default/implicit steering based on GSLICE (in the
	 * sliceid field) and DSS (in the subsliceid field). If we can
	 * find overlap between the valid MSLICE and/or LNCF values with
	 * a suitable GSLICE, then we can just re-use the default value and
	 * skip and explicit steering at runtime.
	 *
	 * We only need to look for overlap between GSLICE/MSLICE/LNCF to find
	 * a valid sliceid value. DSS steering is the only type of steering
	 * that utilizes the 'subsliceid' bits.
	 *
	 * Also note that, even though the steering domain is called "GSlice"
	 * and it is encoded in the register using the gslice format, the spec
	 * says that the combined (geometry | compute) fuse should be used to
	 * select the steering.
	 */
|
||||
|
||||
/* Find the potential gslice candidates */
|
||||
dss_mask = intel_sseu_get_subslices(sseu, 0);
|
||||
slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
|
||||
|
||||
/*
|
||||
* Find the potential LNCF candidates. Either LNCF within a valid
|
||||
* mslice is fine.
|
||||
*/
|
||||
for_each_set_bit(i, &gt->info.mslice_mask, GEN12_MAX_MSLICES)
|
||||
lncf_mask |= (0x3 << (i * 2));
|
||||
|
||||
/*
|
||||
* Are there any sliceid values that work for both GSLICE and LNCF
|
||||
* steering?
|
||||
*/
|
||||
if (slice_mask & lncf_mask) {
|
||||
slice_mask &= lncf_mask;
|
||||
gt->steering_table[LNCF] = NULL;
|
||||
}
|
||||
|
||||
/* How about sliceid values that also work for MSLICE steering? */
|
||||
if (slice_mask & gt->info.mslice_mask) {
|
||||
slice_mask &= gt->info.mslice_mask;
|
||||
gt->steering_table[MSLICE] = NULL;
|
||||
}
|
||||
|
||||
slice = __ffs(slice_mask);
|
||||
subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
|
||||
WARN_ON(subslice > GEN_DSS_PER_GSLICE);
|
||||
WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
|
||||
|
||||
__add_mcr_wa(i915, wal, slice, subslice);
|
||||
|
||||
/*
|
||||
* SQIDI ranges are special because they use different steering
|
||||
* registers than everything else we work with. On XeHP SDV and
|
||||
* DG2-G10, any value in the steering registers will work fine since
|
||||
* all instances are present, but DG2-G11 only has SQIDI instances at
|
||||
* ID's 2 and 3, so we need to steer to one of those. For simplicity
|
||||
* we'll just steer to a hardcoded "2" since that value will work
|
||||
* everywhere.
|
||||
*/
|
||||
__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
|
||||
__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
|
||||
}
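The overlap search in `xehp_init_mcr()` can be tried out in isolation. The following userspace sketch follows the same steps (gslice candidates, LNCF candidates, then successive narrowing), but it is not the driver code: `struct mcr_choice`, `lowest_set_bit()` and `pick_default_steering()` are hypothetical names, and `MAX_MSLICES` is an assumed value for `GEN12_MAX_MSLICES`.

```c
#include <assert.h>
#include <stdint.h>

#define DSS_PER_GSLICE	4	/* mirrors GEN_DSS_PER_GSLICE */
#define MAX_MSLICES	4	/* assumed value of GEN12_MAX_MSLICES */

struct mcr_choice {
	unsigned int slice;
	unsigned int subslice;
	int lncf_implicit;	/* default steering is also valid for LNCF */
	int mslice_implicit;	/* default steering is also valid for MSLICE */
};

static unsigned int lowest_set_bit(uint64_t v)
{
	unsigned int n = 0;

	assert(v != 0);
	while (!(v & 1)) {
		v >>= 1;
		n++;
	}
	return n;
}

static struct mcr_choice pick_default_steering(uint64_t dss_mask,
					       uint32_t mslice_mask)
{
	struct mcr_choice c = { 0, 0, 0, 0 };
	uint32_t slice_mask = 0, lncf_mask = 0;
	uint64_t rest = dss_mask;
	unsigned int i;

	assert(dss_mask != 0);

	/* A gslice is a candidate if any DSS in its group of four is present. */
	for (i = 0; rest; i++, rest >>= DSS_PER_GSLICE)
		if (rest & ((1u << DSS_PER_GSLICE) - 1))
			slice_mask |= 1u << i;

	/* Each present mslice contributes two LNCF instances. */
	for (i = 0; i < MAX_MSLICES; i++)
		if (mslice_mask & (1u << i))
			lncf_mask |= 0x3u << (i * 2);

	/* Narrow the candidates only when an overlap exists. */
	if (slice_mask & lncf_mask) {
		slice_mask &= lncf_mask;
		c.lncf_implicit = 1;
	}
	if (slice_mask & mslice_mask) {
		slice_mask &= mslice_mask;
		c.mslice_implicit = 1;
	}

	c.slice = lowest_set_bit(slice_mask);
	c.subslice = lowest_set_bit(dss_mask >> (c.slice * DSS_PER_GSLICE));
	return c;
}
```

When a unit type has no overlap with the gslice candidates, its flag stays 0; in the driver that corresponds to leaving the `steering_table` entry in place so reads for that unit are explicitly re-steered at runtime.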
|
||||
|
||||
static void
|
||||
icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
||||
{
|
||||
wa_init_mcr(i915, wal);
|
||||
|
||||
/* WaInPlaceDecompressionHang:icl */
|
||||
wa_write_or(wal,
|
||||
GEN9_GAMT_ECO_REG_RW_IA,
|
||||
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
|
||||
icl_wa_init_mcr(i915, wal);
|
||||
|
||||
/* WaModifyGamTlbPartitioning:icl */
|
||||
wa_write_clr_set(wal,
|
||||
|
@ -1057,18 +1049,6 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
|||
GEN8_GAMW_ECO_DEV_RW_IA,
|
||||
GAMW_ECO_DEV_CTX_RELOAD_DISABLE);
|
||||
|
||||
/* Wa_1405779004:icl (pre-prod) */
|
||||
if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
|
||||
wa_write_or(wal,
|
||||
SLICE_UNIT_LEVEL_CLKGATE,
|
||||
MSCUNIT_CLKGATE_DIS);
|
||||
|
||||
/* Wa_1406838659:icl (pre-prod) */
|
||||
if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
|
||||
wa_write_or(wal,
|
||||
INF_UNIT_LEVEL_CLKGATE,
|
||||
CGPSF_CLKGATE_DIS);
|
||||
|
||||
/* Wa_1406463099:icl
|
||||
* Formerly known as WaGamTlbPendError
|
||||
*/
|
||||
|
@ -1078,10 +1058,16 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
|||
|
||||
/* Wa_1607087056:icl,ehl,jsl */
|
||||
if (IS_ICELAKE(i915) ||
|
||||
IS_JSL_EHL_REVID(i915, EHL_REVID_A0, EHL_REVID_A0))
|
||||
IS_JSL_EHL_GT_STEP(i915, STEP_A0, STEP_B0))
|
||||
wa_write_or(wal,
|
||||
SLICE_UNIT_LEVEL_CLKGATE,
|
||||
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
|
||||
|
||||
/*
|
||||
* This is not a documented workaround, but rather an optimization
|
||||
* to reduce sampler power.
|
||||
*/
|
||||
wa_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
|
||||
}
|
||||
|
||||
/*
|
||||
|
@ -1111,10 +1097,13 @@ static void
|
|||
gen12_gt_workarounds_init(struct drm_i915_private *i915,
|
||||
struct i915_wa_list *wal)
|
||||
{
|
||||
wa_init_mcr(i915, wal);
|
||||
icl_wa_init_mcr(i915, wal);
|
||||
|
||||
/* Wa_14011060649:tgl,rkl,dg1,adls */
|
||||
/* Wa_14011060649:tgl,rkl,dg1,adl-s,adl-p */
|
||||
wa_14011060649(i915, wal);
|
||||
|
||||
/* Wa_14011059788:tgl,rkl,adl-s,dg1,adl-p */
|
||||
wa_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
|
||||
}
|
||||
|
||||
static void
|
||||
|
@ -1123,19 +1112,19 @@ tgl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
|||
gen12_gt_workarounds_init(i915, wal);
|
||||
|
||||
/* Wa_1409420604:tgl */
|
||||
if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0))
|
||||
if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_B0))
|
||||
wa_write_or(wal,
|
||||
SUBSLICE_UNIT_LEVEL_CLKGATE2,
|
||||
CPSSUNIT_CLKGATE_DIS);
|
||||
|
||||
/* Wa_1607087056:tgl also know as BUG:1409180338 */
|
||||
if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0))
|
||||
if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_B0))
|
||||
wa_write_or(wal,
|
||||
SLICE_UNIT_LEVEL_CLKGATE,
|
||||
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
|
||||
|
||||
/* Wa_1408615072:tgl[a0] */
|
||||
if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0))
|
||||
if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_B0))
|
||||
wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE2,
|
||||
VSUNIT_CLKGATE_DIS_TGL);
|
||||
}
|
||||
|
@ -1146,7 +1135,7 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
|||
gen12_gt_workarounds_init(i915, wal);
|
||||
|
||||
/* Wa_1607087056:dg1 */
|
||||
if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0))
|
||||
if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_B0))
|
||||
wa_write_or(wal,
|
||||
SLICE_UNIT_LEVEL_CLKGATE,
|
||||
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
|
||||
|
@ -1164,10 +1153,18 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
|||
VSUNIT_CLKGATE_DIS_TGL);
|
||||
}
|
||||
|
||||
static void
|
||||
xehpsdv_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
||||
{
|
||||
xehp_init_mcr(&i915->gt, wal);
|
||||
}
|
||||
|
||||
static void
|
||||
gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
||||
{
|
||||
if (IS_DG1(i915))
|
||||
if (IS_XEHPSDV(i915))
|
||||
xehpsdv_gt_workarounds_init(i915, wal);
|
||||
else if (IS_DG1(i915))
|
||||
dg1_gt_workarounds_init(i915, wal);
|
||||
else if (IS_TIGERLAKE(i915))
|
||||
tgl_gt_workarounds_init(i915, wal);
|
||||
|
@ -1175,8 +1172,6 @@ gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
|||
gen12_gt_workarounds_init(i915, wal);
|
||||
else if (GRAPHICS_VER(i915) == 11)
|
||||
icl_gt_workarounds_init(i915, wal);
|
||||
else if (IS_CANNONLAKE(i915))
|
||||
cnl_gt_workarounds_init(i915, wal);
|
||||
else if (IS_COFFEELAKE(i915) || IS_COMETLAKE(i915))
|
||||
cfl_gt_workarounds_init(i915, wal);
|
||||
else if (IS_GEMINILAKE(i915))
|
||||
|
@ -1184,7 +1179,7 @@ gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
|||
else if (IS_KABYLAKE(i915))
|
||||
kbl_gt_workarounds_init(i915, wal);
|
||||
else if (IS_BROXTON(i915))
|
||||
bxt_gt_workarounds_init(i915, wal);
|
||||
gen9_gt_workarounds_init(i915, wal);
|
||||
else if (IS_SKYLAKE(i915))
|
||||
skl_gt_workarounds_init(i915, wal);
|
||||
else if (IS_HASWELL(i915))
|
||||
@@ -1247,8 +1242,9 @@ wa_verify(const struct i915_wa *wa, u32 cur, const char *name, const char *from)
 }
 
 static void
-wa_list_apply(struct intel_uncore *uncore, const struct i915_wa_list *wal)
+wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
 {
+	struct intel_uncore *uncore = gt->uncore;
 	enum forcewake_domains fw;
 	unsigned long flags;
 	struct i915_wa *wa;
@@ -1263,13 +1259,16 @@ wa_list_apply(struct intel_uncore *uncore, const struct i915_wa_list *wal)
 	intel_uncore_forcewake_get__locked(uncore, fw);
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
-		if (wa->clr)
-			intel_uncore_rmw_fw(uncore, wa->reg, wa->clr, wa->set);
-		else
-			intel_uncore_write_fw(uncore, wa->reg, wa->set);
+		u32 val, old = 0;
+
+		/* open-coded rmw due to steering */
+		old = wa->clr ? intel_gt_read_register_fw(gt, wa->reg) : 0;
+		val = (old & ~wa->clr) | wa->set;
+		if (val != old || !wa->clr)
+			intel_uncore_write_fw(uncore, wa->reg, val);
+
 		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
-			wa_verify(wa,
-				  intel_uncore_read_fw(uncore, wa->reg),
+			wa_verify(wa, intel_gt_read_register_fw(gt, wa->reg),
 				  wal->name, "application");
 	}
 
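Note: the open-coded read-modify-write introduced in the hunk above can be studied in isolation. The following is a minimal sketch, not the driver code: `struct wa_entry` and `wa_compute()` are hypothetical stand-ins for `struct i915_wa` and the loop body, and the register read is replaced by a plain `cur` argument so the logic runs in userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct i915_wa: only the two masks matter here. */
struct wa_entry {
	uint32_t clr;	/* bits to clear; 0 means "plain write, no read needed" */
	uint32_t set;	/* bits to set */
};

/*
 * Compute the value to write for one workaround and report whether a
 * write is needed. 'cur' is the current register value; it is ignored
 * when wa->clr == 0, mirroring "old = wa->clr ? read(...) : 0".
 */
static bool wa_compute(const struct wa_entry *wa, uint32_t cur, uint32_t *val)
{
	uint32_t old;

	assert(wa && val);

	old = wa->clr ? cur : 0;
	*val = (old & ~wa->clr) | wa->set;

	/* Write if the value changes, or unconditionally for plain writes. */
	return *val != old || !wa->clr;
}
```

In this sketch, an entry with a non-zero `clr` mask skips the write when the register already holds the desired value, while an entry with `clr == 0` is always written without a prior read.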
@@ -1279,28 +1278,39 @@ wa_list_apply(struct intel_uncore *uncore, const struct i915_wa_list *wal)
 
 void intel_gt_apply_workarounds(struct intel_gt *gt)
 {
-	wa_list_apply(gt->uncore, &gt->i915->gt_wa_list);
+	wa_list_apply(gt, &gt->i915->gt_wa_list);
 }
 
-static bool wa_list_verify(struct intel_uncore *uncore,
+static bool wa_list_verify(struct intel_gt *gt,
 			   const struct i915_wa_list *wal,
 			   const char *from)
 {
+	struct intel_uncore *uncore = gt->uncore;
 	struct i915_wa *wa;
+	enum forcewake_domains fw;
+	unsigned long flags;
 	unsigned int i;
 	bool ok = true;
 
+	fw = wal_get_fw_for_rmw(uncore, wal);
+
+	spin_lock_irqsave(&uncore->lock, flags);
+	intel_uncore_forcewake_get__locked(uncore, fw);
+
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
 		ok &= wa_verify(wa,
-				intel_uncore_read(uncore, wa->reg),
+				intel_gt_read_register_fw(gt, wa->reg),
 				wal->name, from);
 
+	intel_uncore_forcewake_put__locked(uncore, fw);
+	spin_unlock_irqrestore(&uncore->lock, flags);
+
 	return ok;
 }
 
 bool intel_gt_verify_workarounds(struct intel_gt *gt, const char *from)
 {
-	return wa_list_verify(gt->uncore, &gt->i915->gt_wa_list, from);
+	return wa_list_verify(gt, &gt->i915->gt_wa_list, from);
 }
 
 __maybe_unused
@@ -1438,17 +1448,6 @@ static void cml_whitelist_build(struct intel_engine_cs *engine)
 	cfl_whitelist_build(engine);
 }
 
-static void cnl_whitelist_build(struct intel_engine_cs *engine)
-{
-	struct i915_wa_list *w = &engine->whitelist;
-
-	if (engine->class != RENDER_CLASS)
-		return;
-
-	/* WaEnablePreemptionGranularityControlByUMD:cnl */
-	whitelist_reg(w, GEN8_CS_CHICKEN1);
-}
-
 static void icl_whitelist_build(struct intel_engine_cs *engine)
 {
 	struct i915_wa_list *w = &engine->whitelist;
@@ -1542,7 +1541,7 @@ static void dg1_whitelist_build(struct intel_engine_cs *engine)
 	tgl_whitelist_build(engine);
 
 	/* GEN:BUG:1409280441:dg1 */
-	if (IS_DG1_REVID(engine->i915, DG1_REVID_A0, DG1_REVID_A0) &&
+	if (IS_DG1_GT_STEP(engine->i915, STEP_A0, STEP_B0) &&
 	    (engine->class == RENDER_CLASS ||
 	     engine->class == COPY_ENGINE_CLASS))
 		whitelist_reg_ext(w, RING_ID(engine->mmio_base),
@@ -1562,8 +1561,6 @@ void intel_engine_init_whitelist(struct intel_engine_cs *engine)
 		tgl_whitelist_build(engine);
 	else if (GRAPHICS_VER(i915) == 11)
 		icl_whitelist_build(engine);
-	else if (IS_CANNONLAKE(i915))
-		cnl_whitelist_build(engine);
 	else if (IS_COMETLAKE(i915))
 		cml_whitelist_build(engine);
 	else if (IS_COFFEELAKE(i915))
@@ -1612,8 +1609,8 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 {
 	struct drm_i915_private *i915 = engine->i915;
 
-	if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
-	    IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0)) {
+	if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_B0) ||
+	    IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_B0)) {
 		/*
 		 * Wa_1607138336:tgl[a0],dg1[a0]
 		 * Wa_1607063988:tgl[a0],dg1[a0]
@@ -1623,7 +1620,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			    GEN12_DISABLE_POSH_BUSY_FF_DOP_CG);
 	}
 
-	if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0)) {
+	if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_B0)) {
 		/*
 		 * Wa_1606679103:tgl
 		 * (see also Wa_1606682166:icl)
@@ -1633,44 +1630,46 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			    GEN7_DISABLE_SAMPLER_PREFETCH);
 	}
 
-	if (IS_ALDERLAKE_S(i915) || IS_DG1(i915) ||
+	if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
-		/* Wa_1606931601:tgl,rkl,dg1,adl-s */
+		/* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */
 		wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
 
 		/*
 		 * Wa_1407928979:tgl A*
 		 * Wa_18011464164:tgl[B0+],dg1[B0+]
 		 * Wa_22010931296:tgl[B0+],dg1[B0+]
-		 * Wa_14010919138:rkl,dg1,adl-s
+		 * Wa_14010919138:rkl,dg1,adl-s,adl-p
 		 */
 		wa_write_or(wal, GEN7_FF_THREAD_MODE,
 			    GEN12_FF_TESSELATION_DOP_GATE_DISABLE);
 
 		/*
-		 * Wa_1606700617:tgl,dg1
-		 * Wa_22010271021:tgl,rkl,dg1, adl-s
+		 * Wa_1606700617:tgl,dg1,adl-p
+		 * Wa_22010271021:tgl,rkl,dg1,adl-s,adl-p
+		 * Wa_14010826681:tgl,dg1,rkl,adl-p
 		 */
 		wa_masked_en(wal,
 			     GEN9_CS_DEBUG_MODE1,
 			     FF_DOP_CLOCK_GATE_DISABLE);
 	}
 
-	if (IS_ALDERLAKE_S(i915) || IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+	if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) ||
+	    IS_DG1_GT_STEP(i915, STEP_A0, STEP_B0) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
-		/* Wa_1409804808:tgl,rkl,dg1[a0],adl-s */
+		/* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
 		wa_masked_en(wal, GEN7_ROW_CHICKEN2,
 			     GEN12_PUSH_CONST_DEREF_HOLD_DIS);
 
 		/*
 		 * Wa_1409085225:tgl
-		 * Wa_14010229206:tgl,rkl,dg1[a0],adl-s
+		 * Wa_14010229206:tgl,rkl,dg1[a0],adl-s,adl-p
 		 */
 		wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH);
 	}
 
 
-	if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+	if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_B0) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/*
 		 * Wa_1607030317:tgl
@@ -1688,8 +1687,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			     GEN8_RC_SEMA_IDLE_MSG_DISABLE);
 	}
 
-	if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
-		/* Wa_1406941453:tgl,rkl,dg1 */
+	if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915) ||
+	    IS_ALDERLAKE_S(i915) || IS_ALDERLAKE_P(i915)) {
+		/* Wa_1406941453:tgl,rkl,dg1,adl-s,adl-p */
 		wa_masked_en(wal,
 			     GEN10_SAMPLER_MODE,
 			     ENABLE_SMALLPL);
@@ -1701,11 +1701,6 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			     _3D_CHICKEN3,
 			     _3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE);
 
-		/* WaPipelineFlushCoherentLines:icl */
-		wa_write_or(wal,
-			    GEN8_L3SQCREG4,
-			    GEN8_LQSC_FLUSH_COHERENT_LINES);
-
 		/*
 		 * Wa_1405543622:icl
 		 * Formerly known as WaGAPZPriorityScheme
@@ -1735,19 +1730,6 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			    GEN8_L3SQCREG4,
 			    GEN11_LQSC_CLEAN_EVICT_DISABLE);
 
-		/* WaForwardProgressSoftReset:icl */
-		wa_write_or(wal,
-			    GEN10_SCRATCH_LNCF2,
-			    PMFLUSHDONE_LNICRSDROP |
-			    PMFLUSH_GAPL3UNBLOCK |
-			    PMFLUSHDONE_LNEBLK);
-
-		/* Wa_1406609255:icl (pre-prod) */
-		if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
-			wa_write_or(wal,
-				    GEN7_SARCHKMD,
-				    GEN7_DISABLE_DEMAND_PREFETCH);
-
 		/* Wa_1606682166:icl */
 		wa_write_or(wal,
 			    GEN7_SARCHKMD,
@@ -1947,10 +1929,10 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * disable bit, which we don't touch here, but it's good
 		 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
 		 */
-		wa_add(wal, GEN7_GT_MODE, 0,
-		       _MASKED_FIELD(GEN6_WIZ_HASHING_MASK,
-				     GEN6_WIZ_HASHING_16x4),
-		       GEN6_WIZ_HASHING_16x4);
+		wa_masked_field_set(wal,
+				    GEN7_GT_MODE,
+				    GEN6_WIZ_HASHING_MASK,
+				    GEN6_WIZ_HASHING_16x4);
 	}
 
 	if (IS_GRAPHICS_VER(i915, 6, 7))
@@ -2000,10 +1982,10 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * disable bit, which we don't touch here, but it's good
 		 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
 		 */
-		wa_add(wal,
-		       GEN6_GT_MODE, 0,
-		       _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4),
-		       GEN6_WIZ_HASHING_16x4);
+		wa_masked_field_set(wal,
+				    GEN6_GT_MODE,
+				    GEN6_WIZ_HASHING_MASK,
+				    GEN6_WIZ_HASHING_16x4);
 
 		/* WaDisable_RenderCache_OperationalFlush:snb */
 		wa_masked_dis(wal, CACHE_MODE_0, RC_OP_FLUSH_ENABLE);
@@ -2024,7 +2006,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		wa_add(wal, MI_MODE,
 		       0, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH),
 		       /* XXX bit doesn't stick on Broadwater */
-		       IS_I965G(i915) ? 0 : VS_TIMER_DISPATCH);
+		       IS_I965G(i915) ? 0 : VS_TIMER_DISPATCH, true);
 
 	if (GRAPHICS_VER(i915) == 4)
 		/*
@@ -2039,7 +2021,8 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 */
 		wa_add(wal, ECOSKPD,
 		       0, _MASKED_BIT_ENABLE(ECO_CONSTANT_BUFFER_SR_DISABLE),
-		       0 /* XXX bit doesn't stick on Broadwater */);
+		       0 /* XXX bit doesn't stick on Broadwater */,
+		       true);
 }
 
 static void
@@ -2048,7 +2031,7 @@ xcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	struct drm_i915_private *i915 = engine->i915;
 
 	/* WaKBLVECSSemaphoreWaitPoll:kbl */
-	if (IS_KBL_GT_STEP(i915, STEP_A0, STEP_E0)) {
+	if (IS_KBL_GT_STEP(i915, STEP_A0, STEP_F0)) {
 		wa_write(wal,
 			 RING_SEMA_WAIT_POLL(engine->mmio_base),
 			 1);
@@ -2081,7 +2064,7 @@ void intel_engine_init_workarounds(struct intel_engine_cs *engine)
 
 void intel_engine_apply_workarounds(struct intel_engine_cs *engine)
 {
-	wa_list_apply(engine->uncore, &engine->wa_list);
+	wa_list_apply(engine->gt, &engine->wa_list);
 }
 
 struct mcr_range {
@@ -2107,12 +2090,31 @@ static const struct mcr_range mcr_ranges_gen12[] = {
 	{},
 };
 
+static const struct mcr_range mcr_ranges_xehp[] = {
+	{ .start =  0x4000, .end =  0x4aff },
+	{ .start =  0x5200, .end =  0x52ff },
+	{ .start =  0x5400, .end =  0x7fff },
+	{ .start =  0x8140, .end =  0x815f },
+	{ .start =  0x8c80, .end =  0x8dff },
+	{ .start =  0x94d0, .end =  0x955f },
+	{ .start =  0x9680, .end =  0x96ff },
+	{ .start =  0xb000, .end =  0xb3ff },
+	{ .start =  0xc800, .end =  0xcfff },
+	{ .start =  0xd800, .end =  0xd8ff },
+	{ .start =  0xdc00, .end =  0xffff },
+	{ .start = 0x17000, .end = 0x17fff },
+	{ .start = 0x24a00, .end = 0x24a7f },
+	{},
+};
+
 static bool mcr_range(struct drm_i915_private *i915, u32 offset)
 {
 	const struct mcr_range *mcr_ranges;
 	int i;
 
-	if (GRAPHICS_VER(i915) >= 12)
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+		mcr_ranges = mcr_ranges_xehp;
+	else if (GRAPHICS_VER(i915) >= 12)
 		mcr_ranges = mcr_ranges_gen12;
 	else if (GRAPHICS_VER(i915) >= 8)
 		mcr_ranges = mcr_ranges_gen8;
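Note: the hunk above ends before the body of `mcr_range()` that walks the selected table. As a rough illustration of how a zero-terminated range table like `mcr_ranges_xehp` can be searched, here is a self-contained sketch. The names `mcr_range_sketch`, `xehp_ranges_sketch` and `offset_in_ranges()` are hypothetical, only the first three XeHP ranges are copied, and the loop is an assumption rather than the code from this diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct mcr_range: an inclusive offset window. */
struct mcr_range_sketch {
	uint32_t start;
	uint32_t end;
};

/* First three XeHP ranges from the hunk above, zero-terminated. */
static const struct mcr_range_sketch xehp_ranges_sketch[] = {
	{ .start = 0x4000, .end = 0x4aff },
	{ .start = 0x5200, .end = 0x52ff },
	{ .start = 0x5400, .end = 0x7fff },
	{ 0 },
};

/*
 * Return true if 'offset' falls inside any range of the table.
 * The walk stops at the first entry whose start is 0 (the sentinel).
 */
static bool offset_in_ranges(const struct mcr_range_sketch *r, uint32_t offset)
{
	assert(r != NULL);

	for (; r->start; r++)
		if (offset >= r->start && offset <= r->end)
			return true;

	return false;
}
```

A sentinel-terminated table keeps the lookup independent of the array length, so a per-platform table can be selected by pointer, as the version checks in the hunk do.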
Some files were not shown because too many files have changed in this diff.