drm/i915: Narration overview on GEM

Add a narration to i915.rst about Intel GEN GPU's: engines,
driver context and relocation. Also do minor reorder to improve
narration.

v5:
  More type fixes.
  Flow bullet list so lines are not too long.

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
[Joonas: correcting the patch title]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1523001957-6427-2-git-send-email-kevin.rogovin@intel.com
This commit is contained in:
Kevin Rogovin 2018-04-06 11:05:55 +03:00 committed by Joonas Lahtinen
parent 028666793a
commit fd5ff5f6f6
1 changed files with 97 additions and 23 deletions

View File

@ -249,6 +249,103 @@ Memory Management and Command Submission
This sections covers all things related to the GEM implementation in the
i915 driver.
Intel GPU Basics
----------------
An Intel GPU has multiple engines. There are several engine types.
- RCS engine is for rendering 3D and performing compute, this is named
`I915_EXEC_RENDER` in user space.
- BCS is a blitting (copy) engine, this is named `I915_EXEC_BLT` in user
space.
- VCS is a video encode and decode engine, this is named `I915_EXEC_BSD`
in user space
- VECS is video enhancement engine, this is named `I915_EXEC_VEBOX` in user
space.
- The enumeration `I915_EXEC_DEFAULT` does not refer to specific engine;
instead it is to be used by user space to specify a default rendering
engine (for 3D) that may or may not be the same as RCS.
The Intel GPU family is a family of integrated GPU's using Unified
Memory Access. For having the GPU "do work", user space will feed the
GPU batch buffers via one of the ioctls `DRM_IOCTL_I915_GEM_EXECBUFFER2`
or `DRM_IOCTL_I915_GEM_EXECBUFFER2_WR`. Most such batchbuffers will
instruct the GPU to perform work (for example rendering) and that work
needs memory from which to read and memory to which to write. All memory
is encapsulated within GEM buffer objects (usually created with the ioctl
`DRM_IOCTL_I915_GEM_CREATE`). An ioctl providing a batchbuffer for the GPU
to create will also list all GEM buffer objects that the batchbuffer reads
and/or writes. For implementation details of memory management see
`GEM BO Management Implementation Details`_.
The i915 driver allows user space to create a context via the ioctl
`DRM_IOCTL_I915_GEM_CONTEXT_CREATE` which is identified by a 32-bit
integer. Such a context should be viewed by user-space as -loosely-
analogous to the idea of a CPU process of an operating system. The i915
driver guarantees that commands issued to a fixed context are to be
executed so that writes of a previously issued command are seen by
reads of following commands. Actions issued between different contexts
(even if from the same file descriptor) are NOT given that guarantee
and the only way to synchronize across contexts (even from the same
file descriptor) is through the use of fences. At least as far back as
Gen4, also have that a context carries with it a GPU HW context;
the HW context is essentially (most of atleast) the state of a GPU.
In addition to the ordering guarantees, the kernel will restore GPU
state via HW context when commands are issued to a context, this saves
user space the need to restore (most of atleast) the GPU state at the
start of each batchbuffer. The non-deprecated ioctls to submit batchbuffer
work can pass that ID (in the lower bits of drm_i915_gem_execbuffer2::rsvd1)
to identify what context to use with the command.
The GPU has its own memory management and address space. The kernel
driver maintains the memory translation table for the GPU. For older
GPUs (i.e. those before Gen8), there is a single global such translation
table, a global Graphics Translation Table (GTT). For newer generation
GPUs each context has its own translation table, called Per-Process
Graphics Translation Table (PPGTT). Of important note, is that although
PPGTT is named per-process it is actually per context. When user space
submits a batchbuffer, the kernel walks the list of GEM buffer objects
used by the batchbuffer and guarantees that not only is the memory of
each such GEM buffer object resident but it is also present in the
(PP)GTT. If the GEM buffer object is not yet placed in the (PP)GTT,
then it is given an address. Two consequences of this are: the kernel
needs to edit the batchbuffer submitted to write the correct value of
the GPU address when a GEM BO is assigned a GPU address and the kernel
might evict a different GEM BO from the (PP)GTT to make address room
for another GEM BO. Consequently, the ioctls submitting a batchbuffer
for execution also include a list of all locations within buffers that
refer to GPU-addresses so that the kernel can edit the buffer correctly.
This process is dubbed relocation.
GEM BO Management Implementation Details
----------------------------------------
.. kernel-doc:: drivers/gpu/drm/i915/i915_vma.h
:doc: Virtual Memory Address
Buffer Object Eviction
----------------------
This section documents the interface functions for evicting buffer
objects to make space available in the virtual gpu address spaces. Note
that this is mostly orthogonal to shrinking buffer objects caches, which
has the goal to make main memory (shared with the gpu through the
unified memory architecture) available.
.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_evict.c
:internal:
Buffer Object Memory Shrinking
------------------------------
This section documents the interface function for shrinking memory usage
of buffer object caches. Shrinking is used to make main memory
available. Note that this is mostly orthogonal to evicting buffer
objects, which has the goal to make space in gpu virtual address spaces.
.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_shrinker.c
:internal:
Batchbuffer Parsing
-------------------
@ -312,29 +409,6 @@ Object Tiling IOCTLs
.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_tiling.c
:doc: buffer object tiling
Buffer Object Eviction
----------------------
This section documents the interface functions for evicting buffer
objects to make space available in the virtual gpu address spaces. Note
that this is mostly orthogonal to shrinking buffer objects caches, which
has the goal to make main memory (shared with the gpu through the
unified memory architecture) available.
.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_evict.c
:internal:
Buffer Object Memory Shrinking
------------------------------
This section documents the interface function for shrinking memory usage
of buffer object caches. Shrinking is used to make main memory
available. Note that this is mostly orthogonal to evicting buffer
objects, which has the goal to make space in gpu virtual address spaces.
.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_shrinker.c
:internal:
WOPCM
=====