Currently drivers get called to move a buffer, but if they have to
move it temporarily through another space (SYSTEM->VRAM via TT)
then they can end up with a lot of ttm->driver->ttm call stacks,
if the temprorary space moves requires eviction.
Instead of letting the driver do all the placement/space for the
temporary, allow it to report back (-EMULTIHOP) and a placement (hop)
to the move code, which will then do the temporary move, and the
correct placement move afterwards.
This removes a lot of code from drivers, at the expense of
adding some midlayering. I've some further ideas on how to turn
it inside out, but I think this is a good solution to the call
stack problems.
v2: separate out the driver patches, add WARN for getting
MULTHOP in paths we shouldn't (Daniel)
v3: use memset (Christian)
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: hristian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201109005432.861936-2-airlied@gmail.com
It should be able to handle all cases now.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com>
Tested-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/397083/?series=83051&rev=1
It makes no difference to kmalloc if the structure
is 48 or 64 bytes in size.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/396950/
The move notify callback is only used in one place, this should
be removed in the future, but for now just rename it to the use
case which is to notify the driver that the GPU memory is to be
deleted.
Drivers can be cleaned up after this separately.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021044031.1752624-2-airlied@gmail.com
This moves the to system move into the drivers, and moves all
the unbinds in the move path under driver control
Note: radeon/nouveau already wait so don't duplicate it.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020010319.1692445-4-airlied@gmail.com
The typical set of driver updates across the subsystem:
- Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma, hns,
usnic, qib, qedr, cxgb4, hns, bnxt_re
- Various rtrs fixes and updates
- Bug fix for mlx4 CM emulation for virtualization scenarios where MRA
wasn't working right
- Use tracepoints instead of pr_debug in the CM code
- Scrub the locking in ucma and cma to close more syzkaller bugs
- Use tasklet_setup in the subsystem
- Revert the idea that 'destroy' operations are not allowed to fail at
the driver level. This proved unworkable from a HW perspective.
- Revise how the umem API works so drivers make fewer mistakes using it
- XRC support for qedr
- Convert uverbs objects RWQ and MW to new the allocation scheme
- Large queue entry sizes for hns
- Use hmm_range_fault() for mlx5 On Demand Paging
- uverbs APIs to inspect the GID table instead of sysfs
- Move some of the RDMA code for building large page SGLs into
lib/scatterlist
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl+J37MACgkQOG33FX4g
mxrKfRAAnIecwdE8df0yvVU5k0Eg6qVjMy9MMHq4va9m7g6GpUcNNI0nIlOASxH2
l+9vnUQS3ebgsPeECaDYzEr0hh/u53+xw2g4WV5ts/hE8KkQ6erruXb9kasCe8yi
5QWJ9K36T3c03Cd3EeH6JVtytAxuH42ombfo9BkFLPVyfG/R2tsAzvm5pVi73lxk
46wtU1Bqi4tsLhyCbifn1huNFGbHp08OIBPAIKPUKCA+iBRPaWS+Dpi+93h3g3Bp
oJwDhL9CBCGcHM+rKWLzek3Dy87FnQn7R1wmTpUFwkK+4AH3U/XazivhX035w1vL
YJyhakVU0kosHlX9hJTNKDHJGkt0YEV2mS8dxAuqilFBtdnrVszb5/MirvlzC310
/b5xCPSEusv9UVZV0G4zbySVNA9knZ4YaRiR3VDVMLKl/pJgTOwEiHIIx+vs3ejk
p8GRWa1SjXw5LfZEQcq39J689ljt6xjCTonyuBSv7vSQq5v8pjBxvHxiAe2FIa2a
ZyZeSCYoSh0SwJQukO2VO7aprhHP3TcCJ/987+X03LQ8tV2VWPktHqm62YCaDcOl
fgiQuQdPivRjDDkJgMfDWDGKfZeHoWLKl5XsJhWByt0lablVrsvc+8ylUl1UI7gI
16hWB/Qtlhfwg10VdApn+aOFpIS+s5P4XIp8ik57MZO+VeJzpmE=
=LKpl
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"A usual cycle for RDMA with a typical mix of driver and core subsystem
updates:
- Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma,
hns, usnic, qib, qedr, cxgb4, hns, bnxt_re
- Various rtrs fixes and updates
- Bug fix for mlx4 CM emulation for virtualization scenarios where
MRA wasn't working right
- Use tracepoints instead of pr_debug in the CM code
- Scrub the locking in ucma and cma to close more syzkaller bugs
- Use tasklet_setup in the subsystem
- Revert the idea that 'destroy' operations are not allowed to fail
at the driver level. This proved unworkable from a HW perspective.
- Revise how the umem API works so drivers make fewer mistakes using
it
- XRC support for qedr
- Convert uverbs objects RWQ and MW to new the allocation scheme
- Large queue entry sizes for hns
- Use hmm_range_fault() for mlx5 On Demand Paging
- uverbs APIs to inspect the GID table instead of sysfs
- Move some of the RDMA code for building large page SGLs into
lib/scatterlist"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (191 commits)
RDMA/ucma: Fix use after free in destroy id flow
RDMA/rxe: Handle skb_clone() failure in rxe_recv.c
RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI
RDMA: Explicitly pass in the dma_device to ib_register_device
lib/scatterlist: Do not limit max_segment to PAGE_ALIGNED values
IB/mlx4: Convert rej_tmout radix-tree to XArray
RDMA/rxe: Fix bug rejecting all multicast packets
RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()
RDMA/rxe: Remove duplicate entries in struct rxe_mr
IB/hfi,rdmavt,qib,opa_vnic: Update MAINTAINERS
IB/rdmavt: Fix sizeof mismatch
MAINTAINERS: CISCO VIC LOW LATENCY NIC DRIVER
RDMA/bnxt_re: Fix sizeof mismatch for allocation of pbl_tbl.
RDMA/bnxt_re: Use rdma_umem_for_each_dma_block()
RDMA/umem: Move to allocate SG table from pages
lib/scatterlist: Add support in dynamic allocation of SG table from pages
tools/testing/scatterlist: Show errors in human readable form
tools/testing/scatterlist: Rejuvenate bit-rotten test
RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces
RDMA/uverbs: Expose the new GID query API to user space
...
Changing the caching on the fly never really worked
flawlessly.
So stop this completely and just let drivers specific the
desired caching in the tt or bus object.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/394256/
And implement setting it up correctly in the drivers.
This allows getting rid of the placement flags for this.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/394254/
All drivers can determine the tt caching state at creation time,
no need to do this on the fly during every validation.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/394253/
Extend __sg_alloc_table_from_pages to support dynamic allocation of
SG table from pages. It should be used by drivers that can't supply
all the pages at one time.
This function returns the last populated SGE in the table. Users should
pass it as an argument to the function from the second call and forward.
As before, nents will be equal to the number of populated SGEs (chunks).
With this new extension, drivers can benefit the optimization of merging
contiguous pages without a need to allocate all pages in advance and
hold them in a large buffer.
E.g. with the Infiniband driver that allocates a single page for hold the
pages. For 1TB memory registration, the temporary buffer would consume only
4KB, instead of 2GB.
Link: https://lore.kernel.org/r/20201004154340.1080481-2-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Implement in the driver instead since it is the only user of that function.
v2: fix usage of ttm_bo_init_reserved
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/391614/?series=81973&rev=1
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- dev: More devm_drm convertions and removal of drm_dev_init
Driver Changes:
- i915: selftests improvements
- panfrost: support for Amlogic SoC
- vc4: one fix
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCX2jGxQAKCRDj7w1vZxhR
xR3DAQCiZOnaxVcY49iG4343Z1aHHaIEShbnB0bDdaWstn7kiQD/UXBXUoOSFoFQ
FkTsW31JsdXNnWP5e6/eJd2Lb6waVAA=
=VlsU
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2020-09-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.10:
UAPI Changes:
Cross-subsystem Changes:
- virtio: Merged a PR for patches that will affect drm/virtio
Core Changes:
- dev: More devm_drm convertions and removal of drm_dev_init
- atomic: Split out drm_atomic_helper_calc_timestamping_constants of
drm_atomic_helper_update_legacy_modeset_state
- ttm: More rework
Driver Changes:
- i915: selftests improvements
- panfrost: support for Amlogic SoC
- vc4: one fix
- tree-wide: conversions to devm_drm_dev_alloc,
- ast: simplifications of the atomic modesetting code
- panfrost: multiple fixes
- vc4: multiple fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921152956.2gxnsdgxmwhvjyut@gilmour.lan
This moves the generic tracking into the drivers and protects
against reentrancy in the drivers. It fixes up radeon and agp
to be able to query the bound status as that is required.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-2-airlied@gmail.com
Please pull a set of fixes for various DRM drivers that finally resolve
incorrect usage of the scatterlists (struct sg_table nents and orig_nents
entries), what causes issues when IOMMU is used.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200910080505.24456-1-m.szyprowski@samsung.com
This adds 2 getters and 4 setters, however unbound and populated
are currently the same thing, this will change, it also drops
a BUG_ON that seems not that useful.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-2-airlied@gmail.com
It's not supported to specify more than one of those flags.
So it never made sense to make this a flag in the first place.
Nuke the flags and specify directly which memory type to use.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/389826/?series=81551&rev=1
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().
struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).
It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.
To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
This is used by TTM to communicate the physical address
which should be used with ioremap(), ioremap_wc(). We don't
need to separate the base and offset in any way here.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/389457/
Backmerging drm-next into drm-misc-next for nouveau and panel updates.
Resolves a conflict between ttm and nouveau, where struct ttm_mem_res got
renamed to struct ttm_resource.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
This name better reflects what the object does. I didn't rename
all the pointers it seemed too messy.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-60-airlied@gmail.com
It is a very strange concept to call a function which just
calls back the caller for the functions parameters.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/382085/
Instead just initialize the memory type parameters
before calling ttm_bo_init_mm.
v2: keep extra system domain handling
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/382082/
Instead use a boolean field in the memory manager structure.
Also invert the meaning of the field since the use of a TT
structure is the special case here.
v2: cleanup zero init.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/382079/
Instead of repeating that in each driver.
v2: keep the caching limitation for VMWGFX for now.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/382078/
The driver doesn't expose any not-mapable memory resources.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/378241/
The original intention was to avoid CPU page table unmaps
when BOs move between the GTT and SYSTEM domain.
The problem is that this never correctly handled changes
in the caching attributes or backing pages.
Just drop this for now and simply unmap the CPU page
tables in all cases.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/378240/
Implementing those is completely unnecessary.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com>
Link: https://patchwork.freedesktop.org/patch/378236/
Calculate GPU offset within vmwgfx driver itself without depending on
bo->offset.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/372933/
Signed-off-by: Christian König <christian.koenig@amd.com>
Start using the helpers that align buffer object user-space addresses and
buffer object vram addresses to huge page boundaries.
This is to improve the chances of allowing huge page-table entries.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Another completely unused feature.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/348265/
When building sg tables, honor the device sg list segment size limitation.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Use struct sg_dma_page_iter in favour struct of sg_page_iter, which fairly
recently was declared useless for obtaining dma addresses.
With a struct sg_dma_page_iter we can't call sg_page_iter_page() so
when the page is needed, use the same page lookup mechanism as for the
non-sg dma modes instead of calling sg_dma_page_iter.
Note, the fixes tag doesn't really point to a commit introducing a
failure / regression, but rather to a commit that implemented a simple
workaround for this problem.
Cc: Jason Gunthorpe <jgg@mellanox.com>
Fixes: d901b2760d ("lib/scatterlist: Provide a DMA page iterator")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Commit 2db76d7c3c ("lib/scatterlist: sg_page_iter: support sg lists w/o
backing pages") introduced the sg_page_iter_dma_address() function without
providing a way to use it in the general case. If the sg_dma_len() is not
equal to the sg length callers cannot safely use the
for_each_sg_page/sg_page_iter_dma_address combination.
Resolve this API mistake by providing a DMA specific iterator,
for_each_sg_dma_page(), that uses the right length so
sg_page_iter_dma_address() works as expected with all sglists.
A new iterator type is introduced to provide compile-time safety against
wrongly mixing accessors and iterators.
Acked-by: Christoph Hellwig <hch@lst.de> (for scatterlist)
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> (ipu3-cio2)
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>