This won't work on Ampere, and, it's questionable whether we should have
been using our FW's method of storing the golden context image with NV's
firmware to begin with.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This doesn't work on Ampere for some reason, switch to directly modifying
NV_PGRAPH_FECS_CTXSW_MAILBOX instead.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This was thought to be per-channel initially - it's not. The backing
pages for the VMM mappings are shared for all channels.
- switches to more straight-forward patch interfaces
- prepares for sub-context support
- this is saving a *sizeable* amount of vram
v2:
- whitespace
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This was thought to be per-channel initially - it's not. The backing
pages for the VMM mappings are shared for all channels.
- switches to more straight-forward patch interfaces
- prepares for sub-context support
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This was thought to be per-channel initially - it's not. The backing
pages for the VMM mappings are shared for all channels.
- switches to more straight-forward patch interfaces
- prepares for sub-context support
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Needed for GV100 (and only GV100 for some reason) for WFI_GOLDEN_SAVE.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Adds context binding and support for FWs with a bootloader to the code
that was added to load VPR scrubber HS binaries, and ports ACR over to
using all of it.
- gv100 split from gp108 to handle FW exit status differences
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Under memory load, instmem allocations could end up in the regions of
VRAM that are inaccessible right after boot, and be corrupted after a
suspend/resume cycle as a result of being restored before booting the
mem unlock firmware.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
- also executes pre-DEVINIT, so early boot is able to DMA sysmem
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Adds the start of common interfaces to load and boot the HS binaries
provided by NVIDIA that enable the usage of GR.
ACR already handles most of this, but it's very much tied into ACR's
init process, and there's other code that could benefit from reusing
a lot of this stuff too (ie. VBIOS DEVINIT/PreOS, VPR scrubber).
The VPR scrubber code is fairly independent, and a good first target.
- adds better debug output to fw loading process, to ease bring-up/debug
v2:
- whitespace, 0->false
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Mostly preparation to fit in Ampere changes, but should result in reset
sequences a lot closer to RM's, and perhaps help out with the issues we
sometimes see reported in this area.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reset regs won't be available on Ampere while SEC2 RTOS is running, and
we're apparently supposed to be doing this on earlier GPUs too.
v2:
- fixed some excessive indentation
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Cleanup before falcon changes.
- fixes (attempt at?) reset of pmu while rtos is running, on gm20b
v2:
- remove extra whitespace
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
- replaces the hacked-up version that existed solely to support TTM
v2. remove earlier hack preventing use of non-stall intr for fences
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
- replaces the hacked-up version that existed solely to support TTM
- noop until the next commit, adding proper support for ampere host
v2. fixup for ga103 early merge
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Exposes a bunch of the new features that became possible as a result
of the earlier commits. DRM will build on this in the future to add
support for features such as SCG ("async compute") and multi-device
rendering, as part of the work necessary to be able to write a half-
decent vulkan driver - finally.
For the moment, this just crudely ports DRM to the API changes.
- channel class interfaces now the same for all HW classes
- channel group class exposed (SCG)
- channel runqueue selector exposed (SCG)
- channel sub-device id control exposed (multi-device rendering)
- channel names in logging will reflect creating process, not fd owner
- explicit USERD allocation required by VOLTA_CHANNEL_GPFIFO_A and newer
- drm is smarter about determining the appropriate channel class to use
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Simplifies the GPU-specific code, completing the switch to newer HALs.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Builds on the context tracking that was added earlier.
- marks engine context PTEs as 'priv' where possible
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
- adds support for specifying SUBDEVICE_ID for channel
- rounds non-power-of-two GPFIFO sizes down, rather than up
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
And use it to cleanup multiple implementations of almost the same thing.
- prepares for non-polled / client-provided USERD
- only zeroes relevant "registers", rather than entire USERD
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Currently provided by {chan,dma,gpfifo}*.c, and those are going away.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
- less dependence on waiting for runlist updates, on GPUs that allow it
- supports runqueue selector in RAMRL entries
- completes switch to common runl/cgrp/chan topology info
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
That sure was fun to untangle.
- handled per-runlist, rather than globally
- more straight-forward process in general
- various potential SW/HW races have been fixed
- fixes lockdep issues that were present in >=gk104's prior implementation
- volta recovery now actually stands a chance of working
- volta/turing waiting for PBDMA idle before engine reset
- turing using hw-provided TSG info for CTXSW_TIMEOUT
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
A bunch of these can be handled in such a way that the channel can
continue, however, any of these are a pretty decent sign something
has gone horribly wrong, and the safest option is to disable the
channel.
This is a bit of a hack, we will want to handle these individually
and dump relevant debug info for each at some point.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
- nvkm_chan_error() built on top, stops channel and sends 'killed' event
- removes an odd double-bashing of channel enable regs on kepler and up
- pokes doorbell on turing and up, after enabling channel
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
- stops programming (non-existent) runl id field on bind(), from maxwell
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
- adds g8x/turing registers, which were missing before
- switches fermi to polled wait, like later hw (see: 4f2fc25c0f8bc...)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Channel groups have somewhat more complicated requirements than what we
currently support. An engine context is shared between all channels in
a channel group, VEID/subctx support (later) brings per-VEID components,
and we need to track an individual channel's engine context pointers.
This commit adds the structures and refcounting to support the above,
wrapping the prior implementation for the moment.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
- supports per-runlist CHIDs
- channel group lock held across reference, rather than global lock
v2:
- remove unnecessary parenthesis
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
After updating GF100 implementation from the GK104/TU102 ones, and using
the new runlist/engine topology info, all three handlers become (almost)
identical.
- there's a temporary kludge to call through to the HW-specific recovery
- engine fault mapping info determined at load time, not on every fault
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
- merges gf100/gk104- NV_PFIFO_INTR_0_PBDMA and NV_PPBDMA_INTR_0 code
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
- bumps pbdma timeout to value RM uses on newer HW
- bumps fb timeout to max from boot default
- one/both of these greatly improves stability on // piglit runs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
- removes a layer of indirection in the intr handling
- prevents non-stall ctrl racing with unknown intrs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>