Commit Graph

88 Commits

Author SHA1 Message Date
Ben Skeggs 097d56cdcd drm/nouveau/fifo: remove rd32/wr32 accessors from channels
No need for these, we always map USERD to the client.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-07-13 13:56:42 +10:00
Ben Skeggs 49b2dfc081 drm/nouveau/ga102-: support ttm buffer moves via copy engine
We don't currently have any kind of real acceleration on Ampere GPUs,
but the TTM memcpy() fallback paths aren't really designed to handle
copies between different devices, such as on Optimus systems, and
result in a kernel OOPS.

A few options were investigated to try and fix this, but didn't work
out, and likely would have resulted in a very unpleasant experience
for users anyway.

This commit adds just enough support for setting up a single channel
connected to a copy engine, which the kernel can use to accelerate
the buffer copies between devices.  Userspace has no access to this
incomplete channel support, but it's suitable for TTM's needs.

A more complete implementation of host(fifo) for Ampere GPUs is in
the works, but the required changes are far too invasive that they
would be unsuitable to backport to fix this issue on current kernels.

v2: fix GPFIFO length in RAMFC (reported by Karol)

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: <stable@vger.kernel.org> # v5.12+
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210916220406.666454-1-skeggsb@gmail.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-10-06 11:05:45 +02:00
Ben Skeggs 59f216cf04 drm/nouveau: rip out nvkm_client.super
No longer required now that userspace can't touch anything that might
need it, and should fix DRM MM operations racing with each other, and
the random hangs/crashes that come with that.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 19:00:15 +10:00
Christian König d3116756a7 drm/ttm: rename bo->mem and make it a pointer
When we want to decouble resource management from buffer management we need to
be able to handle resources separately.

Add a resource pointer and rename bo->mem so that all code needs to
change to access the pointer instead.

No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210430092508.60710-4-christian.koenig@amd.com
2021-06-02 11:07:25 +02:00
Ben Skeggs f8fabd31fa drm/nouveau/fifo/gk104-: remove use of subdev index in runlist topology info
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-02-11 11:49:58 +10:00
Frantisek Hrbata eaba3b2840 drm/nouveau: bail out of nouveau_channel_new if channel init fails
Unprivileged user can crash kernel by using DRM_IOCTL_NOUVEAU_CHANNEL_ALLOC
ioctl. This was reported by trinity[1] fuzzer.

[   71.073906] nouveau 0000:01:00.0: crashme[1329]: channel failed to initialise, -17
[   71.081730] BUG: kernel NULL pointer dereference, address: 00000000000000a0
[   71.088928] #PF: supervisor read access in kernel mode
[   71.094059] #PF: error_code(0x0000) - not-present page
[   71.099189] PGD 119590067 P4D 119590067 PUD 1054f5067 PMD 0
[   71.104842] Oops: 0000 [#1] SMP NOPTI
[   71.108498] CPU: 2 PID: 1329 Comm: crashme Not tainted 5.8.0-rc6+ #2
[   71.114993] Hardware name: AMD Pike/Pike, BIOS RPK1506A 09/03/2014
[   71.121213] RIP: 0010:nouveau_abi16_ioctl_channel_alloc+0x108/0x380 [nouveau]
[   71.128339] Code: 48 89 9d f0 00 00 00 41 8b 4c 24 04 41 8b 14 24 45 31 c0 4c 8d 4b 10 48 89 ee 4c 89 f7 e8 10 11 00 00 85 c0 75 78 48 8b 43 10 <8b> 90 a0 00 00 00 41 89 54 24 08 80 7d 3d 05 0f 86 bb 01 00 00 41
[   71.147074] RSP: 0018:ffffb4a1809cfd38 EFLAGS: 00010246
[   71.152526] RAX: 0000000000000000 RBX: ffff98cedbaa1d20 RCX: 00000000000003bf
[   71.159651] RDX: 00000000000003be RSI: 0000000000000000 RDI: 0000000000030160
[   71.166774] RBP: ffff98cee776de00 R08: ffffdc0144198a08 R09: ffff98ceeefd4000
[   71.173901] R10: ffff98cee7e81780 R11: 0000000000000001 R12: ffffb4a1809cfe08
[   71.181214] R13: ffff98cee776d000 R14: ffff98cec519e000 R15: ffff98cee776def0
[   71.188339] FS:  00007fd926250500(0000) GS:ffff98ceeac80000(0000) knlGS:0000000000000000
[   71.196418] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   71.202155] CR2: 00000000000000a0 CR3: 0000000106622000 CR4: 00000000000406e0
[   71.209297] Call Trace:
[   71.211777]  ? nouveau_abi16_ioctl_getparam+0x1f0/0x1f0 [nouveau]
[   71.218053]  drm_ioctl_kernel+0xac/0xf0 [drm]
[   71.222421]  drm_ioctl+0x211/0x3c0 [drm]
[   71.226379]  ? nouveau_abi16_ioctl_getparam+0x1f0/0x1f0 [nouveau]
[   71.232500]  nouveau_drm_ioctl+0x57/0xb0 [nouveau]
[   71.237285]  ksys_ioctl+0x86/0xc0
[   71.240595]  __x64_sys_ioctl+0x16/0x20
[   71.244340]  do_syscall_64+0x4c/0x90
[   71.248110]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   71.253162] RIP: 0033:0x7fd925d4b88b
[   71.256731] Code: Bad RIP value.
[   71.259955] RSP: 002b:00007ffc743592d8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[   71.267514] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd925d4b88b
[   71.274637] RDX: 0000000000601080 RSI: 00000000c0586442 RDI: 0000000000000003
[   71.281986] RBP: 00007ffc74359340 R08: 00007fd926016ce0 R09: 00007fd926016ce0
[   71.289111] R10: 0000000000000003 R11: 0000000000000206 R12: 0000000000400620
[   71.296235] R13: 00007ffc74359420 R14: 0000000000000000 R15: 0000000000000000
[   71.303361] Modules linked in: rfkill sunrpc snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core edac_mce_amd snd_hwdep kvm_amd snd_seq ccp snd_seq_device snd_pcm kvm snd_timer snd irqbypass soundcore sp5100_tco pcspkr crct10dif_pclmul crc32_pclmul ghash_clmulni_intel wmi_bmof joydev i2c_piix4 fam15h_power k10temp acpi_cpufreq ip_tables xfs libcrc32c sd_mod t10_pi sg nouveau video mxm_wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm broadcom bcm_phy_lib ata_generic ahci drm e1000 crc32c_intel libahci serio_raw tg3 libata firewire_ohci firewire_core wmi crc_itu_t dm_mirror dm_region_hash dm_log dm_mod
[   71.365269] CR2: 00000000000000a0

simplified reproducer
---------------------------------8<----------------------------------------
/*
 * gcc -o crashme crashme.c
 * ./crashme /dev/dri/renderD128
 */

struct drm_nouveau_channel_alloc {
	uint32_t     fb_ctxdma_handle;
	uint32_t     tt_ctxdma_handle;

	int          channel;
	uint32_t     pushbuf_domains;

	/* Notifier memory */
	uint32_t     notifier_handle;

	/* DRM-enforced subchannel assignments */
	struct {
		uint32_t handle;
		uint32_t grclass;
	} subchan[8];
	uint32_t nr_subchan;
};

static struct drm_nouveau_channel_alloc channel;

int main(int argc, char *argv[]) {
	int fd;
	int rv;

	if (argc != 2)
		die("usage: %s <dev>", 0, argv[0]);

	if ((fd = open(argv[1], O_RDONLY)) == -1)
		die("open %s", errno, argv[1]);

	if (ioctl(fd, DRM_IOCTL_NOUVEAU_CHANNEL_ALLOC, &channel) == -1 &&
			errno == EACCES)
		die("ioctl %s", errno, argv[1]);

	close(fd);

	printf("PASS\n");

	return 0;
}
---------------------------------8<----------------------------------------

[1] https://github.com/kernelslacker/trinity

Fixes: eeaf06ac1a ("drm/nouveau/svm: initial support for shared virtual memory")
Signed-off-by: Frantisek Hrbata <frantisek@hrbata.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2021-01-29 16:49:15 +10:00
Christian König 6797cea18d drm/nouveau: switch over to the new pin interface
Stop using TTM_PL_FLAG_NO_EVICT.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/391606/?series=81973&rev=1
2020-09-24 16:16:50 +02:00
Christian König 81b615798e drm/nouveau: stop using TTM placement flags
Those are going to be removed, stop using them here.

Instead use the GEM flags from the UAPI.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/389825/?series=81551&rev=1
2020-09-11 13:31:23 +02:00
Ben Skeggs cd346a89d2 drm/nouveau/chan: convert nvsw init to new push macros
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2020-07-24 18:50:58 +10:00
Ben Skeggs fdb06e2b2a drm/nouveau: interop with new push macros
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2020-07-24 18:50:56 +10:00
Ben Skeggs f7a7d22ad6 drm/nouveau/nvif: give every notify object a human-readable name
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2020-07-24 18:50:51 +10:00
Ben Skeggs 9ac596a4e8 drm/nouveau/nvif: give every object a human-readable identifier
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2020-07-24 18:50:50 +10:00
Nirmoy Das 0dc9b286b8 drm/nouveau: don't use ttm bo->offset v3
Store ttm bo->offset in struct nouveau_bo instead.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/372932/
Signed-off-by: Christian König <christian.koenig@amd.com>
2020-06-26 14:00:41 +02:00
Ben Skeggs ea13e5abf8 drm/nouveau: signal pending fences when channel has been killed
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-29 15:49:47 +10:00
Ben Skeggs eeaf06ac1a drm/nouveau/svm: initial support for shared virtual memory
This uses HMM to mirror a process' CPU page tables into a channel's page
tables, and keep them synchronised so that both the CPU and GPU are able
to access the same memory at the same virtual address.

While this code also supports Volta/Turing, it's only enabled for Pascal
GPUs currently due to channel recovery being unreliable right now on the
later GPUs.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-02-20 09:00:02 +10:00
Ben Skeggs bfe91afaca drm/nouveau: prepare for enabling svm with existing userspace interfaces
For a channel to make use of SVM features, it requires a different GPU MMU
configuration than we would normally use, which is not desirable to switch
to unless a client is actively going to use SVM.

In order to supporting SVM without more extensive changes to the userspace
interfaces, the SVM_INIT ioctl needs to replace the previous configuration
safely.

The only way we can currently do this safely, accounting for some unlikely
failure conditions, is to allocate the new VMM without destroying the last
one, and prioritising the SVM-enabled configuration in the code that cares.

This will get cleaned up again further down the track.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-02-20 09:00:01 +10:00
Ben Skeggs 641d0b3056 drm/nouveau/fifo/tu104: initial support
Various different bits and pieces vs GV100.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-12-11 15:37:55 +10:00
Ben Skeggs 9d24907ccf drm/nouveau/fifo/gv100: return work submission token in channel ctor args
The token will also contain runlist ID on Turing, so instead expose it as
an opaque value from NVKM so the client doesn't need to care.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-12-11 15:37:49 +10:00
Ben Skeggs 85532bd984 drm/nouveau/fifo/gk104-: support enabling privileged ce functions
Will be used by SVM code to allow direct (without going through MMU) memcpy
using the GPU copy engines.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-12-11 15:37:47 +10:00
Ben Skeggs 86b442d74c drm/nouveau/fifo/gk104-: return channel instance in ctor args
Will be used to match fault buffer entries with a channel.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-12-11 15:37:47 +10:00
Ben Skeggs 37e1c45a58 drm/nouveau/fifo/gv100: initial support
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-05-18 15:01:46 +10:00
Ben Skeggs 92b4eaaf9a drm/nouveau: no need to create ctxdma for push buffers on fermi and up
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-05-18 15:01:26 +10:00
Ben Skeggs a7cf01809b drm/nouveau/fifo/gk104-: require explicit runlist selection for channel allocation
We didn't used to be aware that runlist/engine IDs weren't the same thing,
or that there was such variability in configuration between GPUs.

By exposing this information to a client, and giving it explicit control
of which runlist it's allocating a channel on, we're able to make better
choices.

The immediate effect of this is that on GPUs where CE0 is the "GRCE", we
will now be allocating a copy engine running asynchronously to GR for BO
migrations - as intended.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-05-18 15:01:21 +10:00
Ben Skeggs eb47db4f3b drm/nouveau/fifo: support channel count query
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-05-18 15:01:21 +10:00
Ben Skeggs d7722134b8 drm/nouveau: switch over to new memory and vmm interfaces
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-11-02 13:32:33 +10:00
Ben Skeggs 832ca2ac3c drm/nouveau: pass handle of vmm object to channel allocation ioctls
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-11-02 13:32:33 +10:00
Ben Skeggs 3c5026395b drm/nouveau: switch to vmm limit
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-11-02 13:32:33 +10:00
Ben Skeggs 24e8375b1b drm/nouveau: separate constant-va tracking from nvkm vma structure
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-11-02 13:32:21 +10:00
Ben Skeggs 0132605039 drm/nouveau/core/object: allow arguments to be passed to map function
MMU will be needing this to specify kind info on BAR mappings.

We have no userspace currently using these interfaces, so break the ABI
instead of supporting both.  NVIF version bump so any future use can be
guarded.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-11-02 13:32:16 +10:00
Ben Skeggs 84cd0a5565 drm/nouveau: check for dead channel before trying to idle
This prevents *very* long waits while attempting to destroy channels
after a fault has occurred.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:09 +10:00
Ben Skeggs d8cc37d878 drm/nouveau: request notifications for channels that have been killed
These will be used to improve error recovery behaviour.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 17:38:09 +10:00
Ben Skeggs bab7cc18d3 drm/nouveau: pass nvif_client to nouveau_bo_new() instead of drm_device
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17 15:15:02 +10:00
Ben Skeggs e8ff979492 drm/nouveau/fifo/gp100: initial support
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-07-14 11:53:25 +10:00
Ben Skeggs 4dc28134a8 drm/nouveau: rename nouveau_drm.h to nouveau_drv.h
Fixes out-of-tree build issue where uapi/drm/nouveau_drm.h gets picked
up instead.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 63f8c9b7f6 drm/nouveau/fifo/gk110: expose KeplerChannelGpfifoB
This class supports a WFI method (0x0078) that's not present on the
KeplerChannelGpfifoA class.

The binary driver exposes both classes on these GPUs for some reason,
though there doesn't appear to be any difference in the setup that's
done for each (ie. even if you allocate GpfifoA, the WFI method will
still work).

We shall just expose GpfifoB, as I don't see a good reason to report
the presence of both.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:48 +10:00
Ben Skeggs 1f5ff7f52b drm/nouveau/fifo/gk104: make use of topology info during gpfifo construction
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-03-14 10:13:42 +10:00
Ben Skeggs 845f27253c drm/nouveau/nvif: split out ctxdma interface definitions
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-01-11 11:17:40 +10:00
Ben Skeggs 8ed1730ccd drm/nouveau/nvif: split out fifo interface definitions
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-01-11 11:17:40 +10:00
Ben Skeggs 08f7633c1d drm/nouveau/nvif: move internal class identifiers to class.h
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-01-11 11:17:40 +10:00
Ben Skeggs fcf3f91c34 drm/nouveau: remove unnecessary usage of object handles
No longer required in a lot of cases, as objects are identified over NVIF
via an alternate mechanism since the rework.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-11-03 15:02:18 +10:00
Ben Skeggs 340b0e7c50 drm/nouveau/pci: merge agp handling from nouveau drm
This commit reinstates the pre-DEVINIT AGP fiddling that was broken in
an earlier commit.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:49 +10:00
Ben Skeggs 7e8820fed7 drm/nouveau/device: cleaner abstraction for device resource functions
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:48 +10:00
Ben Skeggs fbd58ebda9 drm/nouveau/object: merge with handle
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:47 +10:00
Ben Skeggs 898a2b3213 drm/nouveau/sw: turn flip completion into an event
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:38 +10:00
Ben Skeggs 159045cdc4 drm/nouveau/nvif: replace pushbuf with vm in fermi/kepler gpfifo class args
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:33 +10:00
Ben Skeggs f58ddf9581 drm/nouveau/nvif: assign internal class identifiers to sw classes
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:32 +10:00
Ben Skeggs bf81df9be2 drm/nouveau/nvif: replace path-based object identification
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:32 +10:00
Ben Skeggs a01ca78c8f drm/nouveau/nvif: simplify and tidy library interfaces
A variety of tweaks to the NVIF library interfaces, mostly ripping out
things that turned out to be not so useful.

- Removed refcounting from nvif_object, callers are expected to not be
  stupid instead.
- nvif_client is directly reachable from anything derived from nvif_object,
  removing the need for heuristics to locate it
- _new() versions of interfaces, that allocate memory for the object
  they construct, have been removed.  The vast majority of callers used
  the embedded _init() interfaces.
- No longer storing constructor arguments (and the data returned from
  nvkm) inside nvif_object, it's more or less unused and just wastes
  memory.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:32 +10:00
Ben Skeggs 9ad97ede4b drm/nouveau: use dev_* for logging
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:26 +10:00
Ben Skeggs a1020afe88 drm/nouveau: add support for gm20x fifo channels
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-04-14 17:00:56 +10:00