These will be replaced in the near future, the code isn't yet stable enough
for this merge window however.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Been tested on each major revision that's relevant here, but I'm sure there
are still bugs waiting to be ironed out.
This is a *very* invasive change.
There's a couple of pieces left that I don't like much (eg. other engines
using fifo_priv for the channel count), but that's an artefact of there
being a master channel list still. This is changing, slowly.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
All the places this stuff is actually needed tends to be chipset-specific
anyway, so we're able to just inline the register bashing instead.
The parts of the common code that still directly touch PFIFO temporarily
have conditionals, these will be removed in subsequent commits that will
refactor the fifo modules into engine modules like graph/mpeg etc.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Now have a somewhat simpler semaphore sync implementation for nv17:nv84,
and a switched to using semaphores as fences on nv84+ and making use of
the hardware's >= acquire operation.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Just a cleanup more or less, and to remove the need for special handling of
software objects.
This removes a heap of documentation on dma/graph object formats. The info
is very out of date with our current understanding, and is far better
documented in rnndb in envytools git.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This adds prime->fd and fd->prime support to nouveau,
it passes the SG object to TTM, and then populates the
GART entries using it.
v2: add stubbed kmap + use new function to fill out pages array
for faulting + add reimport test.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This reverts commit a81f15499887d3f9f24ec70bb9b7e778942a6b7b.
Gah, we have a released userspace component using fixed subc assignment
that conflicts with this. To avoid breaking ABI this needs to be
reverted.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
All available subchannels are now available for userspace to do with as it
pleases on NVC0+.
On all earlier chipsets, the kernel still uses a software object on subc 0
to implement the page flip completion method. I hope to find some decent
way of addressing this too, but it's a tad tricker prior to fermi.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Fence lock needs to be initialized before any call to nouveau_channel_put
because it calls nouveau_channel_idle->nouveau_fence_update which uses
fence lock.
BUG: spinlock bad magic on CPU#0, test/24134
lock: ffff88019f90dba8, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
Pid: 24134, comm: test Not tainted 3.0.0-nv+ #800
Call Trace:
spin_bug+0x9c/0xa3
do_raw_spin_lock+0x29/0x13c
_raw_spin_lock+0x1e/0x22
nouveau_fence_update+0x2d/0xf1
nouveau_channel_idle+0x22/0xa0
nouveau_channel_put_unlocked+0x84/0x1bd
nouveau_channel_put+0x20/0x24
nouveau_channel_alloc+0x4ec/0x585
nouveau_ioctl_fifo_alloc+0x50/0x130
drm_ioctl+0x289/0x361
do_vfs_ioctl+0x4dd/0x52c
sys_ioctl+0x42/0x65
system_call_fastpath+0x16/0x1b
It's easily triggerable from userspace.
Additionally remove double initialization of chan->fence.pending.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The nouveau_wait_for_idle() call should hopefully not have been actually
necessary, we *do* wait for the channel to go idle already. If it's
an issue somehow, the chipset-specific hooks can wait for idle themselves
before taking the lock.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This needs a massive cleanup, but to catch bugs from the interface changes
vs the engine code cleanup, this will be done later.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
There's lots of more-or-less independant engines present on NVIDIA GPUs
these days, and we generally want to perform the same operations on them.
Implementing new ones requires hooking into lots of different places,
the aim of this work is to make this simpler and cleaner.
NV84:NV98 PCRYPT moved over as a test.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
'mappable' isn't really used at all, nor is it necessary anymore as the
bo code is capable of moving buffers to mappable vram as required.
'no_vm' isn't necessary anymore either, any places that don't want to be
mapped into a GPU address space should allocate the VRAM directly instead.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Current 3D driver expects this behaviour. While this could be changed,
there's no compelling reason to reserve more than one subchannel for the
DRM. If we ever need to use an object other then M2MF, we can just
re-bind subchannel 0 as required.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
When hacking the libdrm for improvements, I triggered a kernel crash
related to the fact that the NOUVEAU_NOTIFIEROBJ_ALLOC ioctl calls
nouveau_channel_get with an unchecked channel index.
The patch ensures that the channel index is an unsigned and validates
its value in nouveau_channel_get.
Signed-off-by: Michel Hermier <hermier@frugalware.org>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
As of this commit, it's guaranteed that if an object is in VRAM that its
GPU virtual address will be constant.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The regs belong to PFIFO, they're different for pretty much the same
generations we need different PFIFO control for, and NVC0 is going
to be even more different than the rest.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
There have been reports of PFIFO cache errors during context take down
(fdo bug 31637). They are caused by some GPU objects being taken out
while the channel is still potentially processing commands. Make sure
that all the previous rendering has landed before releasing a GPU
object.
Reported-by: Grzesiek Sójka <pld@pfu.pl>
Reported-by: Patrice Mandin <patmandin@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nv0x-nv4x should be mostly fine, nv50 doesn't work yet.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nouveau_fence_* functions are not type safe, which could lead to bugs.
Additionally every use of nouveau_fence_unref had to cast struct
nouveau_fence to void **.
Fix it by renaming old functions and creating static inline functions with
new prototypes. We still need old functions, because we pass function
pointers to ttm.
As we are wrapping functions, drop unused "void *arg" parameter where possible.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This really needs cleaning up somehow, and probably investigate what's
needed to do this on earlier generations. NVIDIA do something similar
there too.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nouveau_channel_ref() takes a "weak" channel reference that doesn't
prevent the hardware channel resources from being released, it just
keeps the channel data structure alive.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nouveau_channel_put() can be executed after the 'refcount == 0' check
in nouveau_channel_get() and before the channel reference count is
incremented. In that case CPU0 will take the context down while CPU1
thinks it owns the channel and 'refcount == 1'.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The destroy_context() engine hooks call gpuobj management functions to
release the channel resources, these functions use HARDIRQ-unsafe locks
whereas destroy_context() is called with the HARDIRQ-safe
context_switch_lock held, that's a lock ordering violation.
Push the engine-specific channel destruction logic into destroy_context()
and let the hardware-specific code lock and unlock when it's actually
needed. Change the engine destruction order to avoid a race in the small
gap between pgraph and pfifo context uninitialization.
Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This fixes a race condition between fbcon acceleration and TTM buffer
moves. To reproduce:
- start X
- switch to vt and "while (true); do dmesg; done"
- switch to another vt and "sleep 2 && cat /path/to/debugfs/dri/0/evict_vram"
- switch back to vt running dmesg
We don't make use of this on any other channel yet, they're currently
protected by drm_global_mutex. This will change in the near future.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>