Commit Graph

20 Commits

Author SHA1 Message Date
Matthew Auld 43d46f0b78 drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/
It covers more than just ttm_bo_type_sg usage, like with say dma-buf,
since one other user is userptr in amdgpu, and in the future we might
have some more. Hence EXTERNAL is likely a more suitable name.

v2(Christian):
  - Rename these to TTM_TT_FLAGS_*
  - Fix up all the holes in the flag values

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210929132629.353541-1-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-09-29 16:17:56 +02:00
Huang Rui 282abb5a1f drm/ttm: fix the type mismatch error on sparc64
__fls() on sparc64 return "int", but here it is expected as "unsigned
long" (x86). It will cause the build errors because the warning becomes
fatal while it is using sparc configuration. As suggested by Linus, it
can use min_t instead of min to force the type as "unsigned int".

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210907100302.3684453-1-ray.huang@amd.com
2021-09-15 10:18:27 +02:00
Christian König 450b2622bc drm/ttm: optimize the pool shrinker a bit v2
Switch back to using a spinlock again by moving the IOMMU unmap outside
of the locked region.

This avoids contention especially while freeing pages.

v2: Add a comment explaining why we need sync_shrinkers().

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210820120528.81114-3-christian.koenig@amd.com
2021-08-27 09:55:39 +02:00
Dave Airlie 51c3b916a4 drm-misc-next for 5.13:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - %p4cc printk format modifier
   - atomic: introduce drm_crtc_commit_wait, rework atomic plane state
     helpers to take the drm_commit_state structure
   - dma-buf: heaps rework to return a struct dma_buf
   - simple-kms: Add plate state helpers
   - ttm: debugfs support, removal of sysfs
 
 Driver Changes:
   - Convert drivers to shadow plane helpers
   - arc: Move to drm/tiny
   - ast: cursor plane reworks
   - gma500: Remove TTM and medfield support
   - mxsfb: imx8mm support
   - panfrost: MMU IRQ handling rework
   - qxl: rework to better handle resources deallocation, locking
   - sun4i: Add alpha properties for UI and VI layers
   - vc4: RPi4 CEC support
   - vmwgfx: doc cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYD9fUAAKCRDj7w1vZxhR
 xcRLAQDdWKgUNeHnkKCUNh3ewPGabxvc6KQtPqAxcFv0I3ZmWgEAlfTS0pRLdyzQ
 ITRBL0T0S7cIyqnDULZkwfqB6Q8D0ws=
 =hPCS
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-03-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.13:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - %p4cc printk format modifier
  - atomic: introduce drm_crtc_commit_wait, rework atomic plane state
    helpers to take the drm_commit_state structure
  - dma-buf: heaps rework to return a struct dma_buf
  - simple-kms: Add plate state helpers
  - ttm: debugfs support, removal of sysfs

Driver Changes:
  - Convert drivers to shadow plane helpers
  - arc: Move to drm/tiny
  - ast: cursor plane reworks
  - gma500: Remove TTM and medfield support
  - mxsfb: imx8mm support
  - panfrost: MMU IRQ handling rework
  - qxl: rework to better handle resources deallocation, locking
  - sun4i: Add alpha properties for UI and VI layers
  - vc4: RPi4 CEC support
  - vmwgfx: doc cleanup

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210303100600.dgnkadonzuvfnu22@gilmour
2021-03-16 17:08:46 +10:00
Anthony DeRossi ca63d76fd2 drm/ttm: Fix TTM page pool accounting
Freed pages are not subtracted from the allocated_pages counter in
ttm_pool_type_fini(), causing a leak in the count on device removal.
The next shrinker invocation loops forever trying to free pages that are
no longer in the pool:

  rcu: INFO: rcu_sched self-detected stall on CPU
  rcu:  3-....: (9998 ticks this GP) idle=54e/1/0x4000000000000000 softirq=434857/434857 fqs=2237
    (t=10001 jiffies g=2194533 q=49211)
  NMI backtrace for cpu 3
  CPU: 3 PID: 1034 Comm: kswapd0 Tainted: P           O      5.11.0-com #1
  Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 1405 11/19/2019
  Call Trace:
   <IRQ>
   ...
   </IRQ>
   sysvec_apic_timer_interrupt+0x77/0x80
   asm_sysvec_apic_timer_interrupt+0x12/0x20
  RIP: 0010:mutex_unlock+0x16/0x20
  Code: e7 48 8b 70 10 e8 7a 53 77 ff eb aa e8 43 6c ff ff 0f 1f 00 65 48 8b 14 25 00 6d 01 00 31 c9 48 89 d0 f0 48 0f b1 0f 48 39 c2 <74> 05 e9 e3 fe ff ff c3 66 90 48 8b 47 20 48 85 c0 74 0f 8b 50 10
  RSP: 0018:ffffbdb840797be8 EFLAGS: 00000246
  RAX: ffff9ff445a41c00 RBX: ffffffffc02a9ef8 RCX: 0000000000000000
  RDX: ffff9ff445a41c00 RSI: ffffbdb840797c78 RDI: ffffffffc02a9ac0
  RBP: 0000000000000080 R08: 0000000000000000 R09: ffffbdb840797c80
  R10: 0000000000000000 R11: fffffffffffffff5 R12: 0000000000000000
  R13: 0000000000000000 R14: 0000000000000084 R15: ffffffffc02a9a60
   ttm_pool_shrink+0x7d/0x90 [ttm]
   ttm_pool_shrinker_scan+0x5/0x20 [ttm]
   do_shrink_slab+0x13a/0x1a0
...

debugfs shows the incorrect total:

  $ cat /sys/kernel/debug/dri/0/ttm_page_pool
            --- 0--- --- 1--- --- 2--- --- 3--- --- 4--- --- 5--- --- 6--- --- 7--- --- 8--- --- 9--- ---10---
  wc      :        0        0        0        0        0        0        0        0        0        0        0
  uc      :        0        0        0        0        0        0        0        0        0        0        0
  wc 32   :        0        0        0        0        0        0        0        0        0        0        0
  uc 32   :        0        0        0        0        0        0        0        0        0        0        0
  DMA uc  :        0        0        0        0        0        0        0        0        0        0        0
  DMA wc  :        0        0        0        0        0        0        0        0        0        0        0
  DMA     :        0        0        0        0        0        0        0        0        0        0        0

  total   :     3029 of  8244261

Using ttm_pool_type_take() to remove pages from the pool before freeing
them correctly accounts for the freed pages.

Fixes: d099fc8f54 ("drm/ttm: new TT backend allocation pool v3")
Signed-off-by: Anthony DeRossi <ajderossi@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210303011723.22512-1-ajderossi@gmail.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-03-11 11:11:33 +01:00
Christian König 811ee9dff5 drm/ttm: make sure pool pages are cleared
The old implementation wasn't consistend on this.

But it looks like we depend on this so better bring it back.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-and-tested-by: Mike Galbraith <efault@gmx.de>
Fixes: d099fc8f54 ("drm/ttm: new TT backend allocation pool v3")
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210210160549.1462-1-christian.koenig@amd.com
2021-02-11 09:35:19 +01:00
Christian König f07069da6b drm/ttm: move memory accounting into vmwgfx v4
This is just another feature which is only used by VMWGFX, so move
it into the driver instead.

I've tried to add the accounting sysfs file to the kobject of the drm
minor, but I'm not 100% sure if this works as expected.

v2: fix typo in KFD and avoid 64bit divide
v3: fix init order in VMWGFX
v4: use pdev sysfs reference instead of drm

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Zack Rusin <zackr@vmware.com> (v3)
Tested-by: Nirmoy Das <nirmoy.das@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210208133226.36955-2-christian.koenig@amd.com
2021-02-09 17:27:33 +01:00
Michel Dänzer 2b1b3e544f drm/ttm: Use __GFP_NOWARN for huge pages in ttm_pool_alloc_page
Without __GFP_NOWARN, attempts at allocating huge pages can trigger
dmesg splats like below (which are essentially noise, since TTM falls
back to normal pages if it can't get a huge one).

[ 9556.710241] clinfo: page allocation failure: order:9, mode:0x194dc2(GFP_HIGHUSER|__GFP_RETRY_MAYFAIL|__GFP_NORETRY|__GFP_ZERO|__GFP_NOMEMALLOC), nodemask=(null),cpuset=user.slice,mems_allowed=0
[ 9556.710259] CPU: 1 PID: 470821 Comm: clinfo Tainted: G            E     5.10.10+ #4
[ 9556.710264] Hardware name: Micro-Star International Co., Ltd. MS-7A34/B350 TOMAHAWK (MS-7A34), BIOS 1.OR 11/29/2019
[ 9556.710268] Call Trace:
[ 9556.710281]  dump_stack+0x6b/0x83
[ 9556.710288]  warn_alloc.cold+0x7b/0xdf
[ 9556.710297]  ? __alloc_pages_direct_compact+0x137/0x150
[ 9556.710303]  __alloc_pages_slowpath.constprop.0+0xc1b/0xc50
[ 9556.710312]  __alloc_pages_nodemask+0x2ec/0x320
[ 9556.710325]  ttm_pool_alloc+0x2e4/0x5e0 [ttm]
[ 9556.710332]  ? kvmalloc_node+0x46/0x80
[ 9556.710341]  ttm_tt_populate+0x37/0xe0 [ttm]
[ 9556.710350]  ttm_bo_handle_move_mem+0x142/0x180 [ttm]
[ 9556.710359]  ttm_bo_validate+0x11d/0x190 [ttm]
[ 9556.710391]  ? drm_vma_offset_add+0x2f/0x60 [drm]
[ 9556.710399]  ttm_bo_init_reserved+0x2a7/0x320 [ttm]
[ 9556.710529]  amdgpu_bo_do_create+0x1b8/0x500 [amdgpu]
[ 9556.710657]  ? amdgpu_bo_subtract_pin_size+0x60/0x60 [amdgpu]
[ 9556.710663]  ? get_page_from_freelist+0x11f9/0x1450
[ 9556.710789]  amdgpu_bo_create+0x40/0x270 [amdgpu]
[ 9556.710797]  ? _raw_spin_unlock+0x16/0x30
[ 9556.710927]  amdgpu_gem_create_ioctl+0x123/0x310 [amdgpu]
[ 9556.711062]  ? amdgpu_gem_force_release+0x150/0x150 [amdgpu]
[ 9556.711098]  drm_ioctl_kernel+0xaa/0xf0 [drm]
[ 9556.711133]  drm_ioctl+0x20f/0x3a0 [drm]
[ 9556.711267]  ? amdgpu_gem_force_release+0x150/0x150 [amdgpu]
[ 9556.711276]  ? preempt_count_sub+0x9b/0xd0
[ 9556.711404]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 9556.711411]  __x64_sys_ioctl+0x83/0xb0
[ 9556.711417]  do_syscall_64+0x33/0x80
[ 9556.711421]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: bf9eee249a ("drm/ttm: stop using GFP_TRANSHUGE_LIGHT")
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/416353/
2021-01-28 13:01:52 +01:00
Christian König f987c9e0f5 drm/ttm: optimize ttm pool shrinker a bit
Only initialize the DMA coherent pools if they are used.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/414957/
2021-01-20 12:57:13 +01:00
Christian König 568517686f drm/ttm: add debugfs entry to test pool shrinker v2
Useful for testing.

v2: add fs_reclaim_acquire()/_release()

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/414955/
2021-01-20 12:57:03 +01:00
Christian König ba051901d1 drm/ttm: add a debugfs file for the global page pools
Instead of printing this on the per device pool.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/414956/
2021-01-20 12:56:53 +01:00
Christian König bf9eee249a drm/ttm: stop using GFP_TRANSHUGE_LIGHT
The only flag we really need is __GFP_NOMEMALLOC, highmem depends on
dma32 and moveable/compound should never be set in the first place.

Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/413812/
Link: https://patchwork.freedesktop.org/patch/413964/
Fixes: d099fc8f54 ("drm/ttm: new TT backend allocation pool v3")
Reported-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-01-18 09:26:21 +01:00
Christian König bb52cb0dec drm/ttm: make the pool shrinker lock a mutex
set_pages_wb() might sleep and so we can't do this in an atomic context.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Fixes: d099fc8f54 ("drm/ttm: new TT backend allocation pool v3")
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/413409/
2021-01-12 14:02:08 +01:00
Jeremy Cline 843010a815 drm/ttm: Fix address passed to dma_mapping_error() in ttm_pool_map()
check_unmap() is producing a warning about a missing map error check.
The return value from dma_map_page() should be checked for an error, not
the caller-provided dma_addr.

Fixes: d099fc8f54 ("drm/ttm: new TT backend allocation pool v3")
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/413432/
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-01-11 21:18:27 +01:00
Christian König a73858ef4d drm/ttm: unexport ttm_pool_init/fini
Drivers are not supposed to use this directly any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/412156/
2021-01-07 14:25:43 +01:00
Arnd Bergmann 846f151d03 drm/ttm: fix unused function warning
ttm_pool_type_count() is not used when debugfs is disabled:

drivers/gpu/drm/ttm/ttm_pool.c:243:21: error: unused function 'ttm_pool_type_count' [-Werror,-Wunused-function]
static unsigned int ttm_pool_type_count(struct ttm_pool_type *pt)

Move the definition into the #ifdef block.

Fixes: d099fc8f54 ("drm/ttm: new TT backend allocation pool v3")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/405695/
Signed-off-by: Christian König <christian.koenig@amd.com>
2020-12-16 16:24:25 +01:00
Christian König 3e3e59ef0c drm/ttm: fix DMA32 handling in the global page pool
When we have mixed DMA32 and non DMA32 device in one system
it could otherwise happen that the DMA32 device gets pages
it can't work with.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/401317/
2020-11-19 13:09:29 +01:00
Christian König e3e043992c drm/ttm: fix missing NULL check in the new page pool
The pool parameter can be NULL if we free through the shrinker.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Martin Peres <martin.peres@mupuf.org>
Acked-by: Martin Peres <martin.peres@mupuf.org>
Reported-by: Andy Lavr <andy.lavr@gmail.com>
Tested-by: Andy Lavr <andy.lavr@gmail.com>
Link: https://patchwork.freedesktop.org/patch/399365/
2020-11-13 17:01:48 +01:00
Christian König 586052b0a6 drm/ttm: rework no_retry handling v2
During eviction we do want to trigger the OOM killer.

Only while doing new allocations we should try to avoid that and
return -ENOMEM to the application.

v2: rename the flag to gfp_retry_mayfail.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/398685/
2020-11-04 11:22:46 +01:00
Christian König d099fc8f54 drm/ttm: new TT backend allocation pool v3
This replaces the spaghetti code in the two existing page pools.

First of all depending on the allocation size it is between 3 (1GiB) and
5 (1MiB) times faster than the old implementation.

It makes better use of buddy pages to allow for larger physical contiguous
allocations which should result in better TLB utilization at least for
amdgpu.

Instead of a completely braindead approach of filling the pool with one
CPU while another one is trying to shrink it we only give back freed
pages.

This also results in much less locking contention and a trylock free MM
shrinker callback, so we can guarantee that pages are given back to the
system when needed.

Downside of this is that it takes longer for many small allocations until
the pool is filled up. We could address this, but I couldn't find an use
case where this actually matters. We also don't bother freeing large
chunks of pages any more since the CPU overhead in that path isn't really
that important.

The sysfs files are replaced with a single module parameter, allowing
users to override how many pages should be globally pooled in TTM. This
unfortunately breaks the UAPI slightly, but as far as we know nobody ever
depended on this.

Zeroing memory coming from the pool was handled inconsistently. The
alloc_pages() based pool was zeroing it, the dma_alloc_attr() based one
wasn't. For now the new implementation isn't zeroing pages from the pool
either and only sets the __GFP_ZERO flag when necessary.

The implementation has only 768 lines of code compared to the over 2600
of the old one, and also allows for saving quite a bunch of code in the
drivers since we don't need specialized handling there any more based on
kernel config.

Additional to all of that there was a neat bug with IOMMU, coherent DMA
mappings and huge pages which is now fixed in the new code as well.

v2: make ttm_pool_apply_caching static as reported by the kernel bot, add
    some more checks
v3: fix some more checkpatch.pl warnings

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com>
Tested-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/397080/?series=83051&rev=1
2020-10-29 15:52:51 +01:00