/* SPDX-License-Identifier: GPL-2.0-only */
dma-buf: Introduce dma buffer sharing mechanism
This is the first step in defining a dma buffer sharing mechanism.
A new buffer object dma_buf is added, with operations and API to allow easy
sharing of this buffer object across devices.
The framework allows:
- creation of a buffer object, its association with a file pointer, and
associated allocator-defined operations on that buffer. This operation is
called the 'export' operation.
- different devices to 'attach' themselves to this exported buffer object, to
facilitate backing storage negotiation, using dma_buf_attach() API.
- the exported buffer object to be shared with the other entity by asking for
its 'file-descriptor (fd)', and sharing the fd across.
- a received fd to get the buffer object back, where it can be accessed using
the associated exporter-defined operations.
- the exporter and user to share the scatterlist associated with this buffer
object using map_dma_buf and unmap_dma_buf operations.
At least one 'attach()' call must be made before calling the
map_dma_buf() operation.
A couple of building blocks in map_dma_buf() are added to ease the introduction
of sync'ing across exporter and users, and late allocation by the exporter.
For this first version, this framework will work with certain conditions:
- *ONLY* exporter will be allowed to mmap to userspace (outside of this
framework - mmap is not a buffer object operation),
- currently, *ONLY* users that do not need CPU access to the buffer are
allowed.
More details are there in the documentation patch.
This is based on design suggestions from many people at the mini-summits[1],
most notably from Arnd Bergmann <arnd@arndb.de>, Rob Clark <rob@ti.com> and
Daniel Vetter <daniel@ffwll.ch>.
The implementation is inspired by the proof-of-concept patch set from
Tomasz Stanislawski <t.stanislaws@samsung.com>, who demonstrated buffer sharing
between two v4l2 devices. [2]
[1]: https://wiki.linaro.org/OfficeofCTO/MemoryManagement
[2]: http://lwn.net/Articles/454389
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@ti.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-and-Tested-by: Rob Clark <rob.clark@linaro.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-12-26 17:23:15 +08:00
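
The attach-before-map ordering described in the commit message above can be illustrated with a small userspace sketch. This is only a toy model under stated assumptions: `toy_buf`, `toy_attachment` and the `toy_*` functions are hypothetical stand-ins invented for this example and are not the kernel's dma_buf API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel types. */
struct toy_buf {
	int attach_count;	/* number of devices currently attached */
	int map_count;		/* number of live mappings */
};

struct toy_attachment {
	struct toy_buf *buf;
	int attached;
};

/* A device attaches itself to an already exported buffer. */
int toy_attach(struct toy_buf *buf, struct toy_attachment *att)
{
	att->buf = buf;
	att->attached = 1;
	buf->attach_count++;
	return 0;
}

/* Mapping is only legal after a successful attach. */
int toy_map(struct toy_attachment *att)
{
	if (!att->attached)
		return -EINVAL;
	att->buf->map_count++;
	return 0;
}

/* Drop one mapping of the attachment. */
void toy_unmap(struct toy_attachment *att)
{
	att->buf->map_count--;
}

/* Undo toy_attach(); the attachment can no longer be mapped. */
void toy_detach(struct toy_attachment *att)
{
	att->buf->attach_count--;
	att->attached = 0;
}
```

In this model a map attempt on an attachment that was never attached fails with -EINVAL, which mirrors the rule that at least one attach must precede map_dma_buf().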
/*
 * Header file for dma buffer sharing framework.
 *
 * Copyright(C) 2011 Linaro Limited. All rights reserved.
 * Author: Sumit Semwal <sumit.semwal@ti.com>
 *
 * Many thanks to linaro-mm-sig list, and specially
 * Arnd Bergmann <arnd@arndb.de>, Rob Clark <rob@ti.com> and
 * Daniel Vetter <daniel@ffwll.ch> for their support in creation and
 * refining of this idea.
 */
#ifndef __DMA_BUF_H__
#define __DMA_BUF_H__

#include <linux/iosys-map.h>
#include <linux/file.h>
#include <linux/err.h>
#include <linux/scatterlist.h>
#include <linux/list.h>
#include <linux/dma-mapping.h>
#include <linux/fs.h>
#include <linux/dma-fence.h>
#include <linux/wait.h>

struct device;
struct dma_buf;
struct dma_buf_attachment;

/**
 * struct dma_buf_ops - operations possible on struct dma_buf
 * @vmap: [optional] creates a virtual mapping for the buffer into kernel
 *	  address space. Same restrictions as for vmap and friends apply.
 * @vunmap: [optional] unmaps a vmap from the buffer
 */
struct dma_buf_ops {
	/**
	 * @cache_sgt_mapping:
	 *
	 * If true the framework will cache the first mapping made for each
	 * attachment. This avoids creating mappings for attachments multiple
	 * times.
	 */
	bool cache_sgt_mapping;

	/**
	 * @attach:
	 *
	 * This is called from dma_buf_attach() to make sure that a given
	 * &dma_buf_attachment.dev can access the provided &dma_buf. Exporters
	 * which support buffer objects in special locations like VRAM or
	 * device-specific carveout areas should check whether the buffer could
	 * be moved to system memory (or directly accessed by the provided
	 * device), and otherwise need to fail the attach operation.
	 *
	 * The exporter should also in general check whether the current
	 * allocation fulfills the DMA constraints of the new device. If this
	 * is not the case, and the allocation cannot be moved, it should also
	 * fail the attach operation.
	 *
	 * Any exporter-private housekeeping data can be stored in the
	 * &dma_buf_attachment.priv pointer.
	 *
	 * This callback is optional.
	 *
	 * Returns:
	 *
	 * 0 on success, negative error code on failure. It might return -EBUSY
	 * to signal that backing storage is already allocated and incompatible
	 * with the requirements of the requesting device.
	 */
	int (*attach)(struct dma_buf *, struct dma_buf_attachment *);

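
The decision logic that the @attach documentation asks of an exporter can be sketched in plain userspace C. This is a minimal illustration, assuming a single "highest reachable address" constraint per device; `toy_device`, `toy_buffer` and `toy_attach()` are hypothetical names for this example, not kernel structures or callbacks.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel structures. */
struct toy_device {
	uint64_t dma_mask;	/* highest bus address the device can reach */
};

struct toy_buffer {
	bool allocated;		/* backing storage already exists */
	bool movable;		/* exporter is able to migrate it */
	uint64_t last_addr;	/* last byte address of the backing storage */
};

int toy_attach(const struct toy_buffer *buf, const struct toy_device *dev)
{
	/* Nothing allocated yet: constraints can still be honoured later. */
	if (!buf->allocated)
		return 0;
	/* Current placement is already reachable by the new device. */
	if (buf->last_addr <= dev->dma_mask)
		return 0;
	/* Unreachable for now, but the exporter can move it before mapping. */
	if (buf->movable)
		return 0;
	/* Allocated, incompatible and not movable: refuse the attach. */
	return -EBUSY;
}
```

A real exporter would typically consult the device's DMA parameters rather than a single mask; the sketch only shows where the -EBUSY case from the comment arises.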
	/**
	 * @detach:
	 *
	 * This is called by dma_buf_detach() to release a &dma_buf_attachment.
	 * Provided so that exporters can clean up any housekeeping for an
	 * &dma_buf_attachment.
	 *
	 * This callback is optional.
	 */
	void (*detach)(struct dma_buf *, struct dma_buf_attachment *);

	/**
	 * @pin:
	 *
	 * This is called by dma_buf_pin() and lets the exporter know that the
	 * DMA-buf can't be moved any more. Ideally, the exporter should
	 * pin the buffer so that it is generally accessible by all
	 * devices.
	 *
	 * This is called with the &dmabuf.resv object locked and is mutually
	 * exclusive with @cache_sgt_mapping.
	 *
	 * This is called automatically for non-dynamic importers from
	 * dma_buf_attach().
	 *
	 * Note that similar to non-dynamic exporters in their @map_dma_buf
	 * callback the driver must guarantee that the memory is available for
	 * use and cleared of any old data by the time this function returns.
	 * Drivers which pipeline their buffer moves internally must wait for
	 * all moves and clears to complete.
	 *
	 * Returns:
	 *
	 * 0 on success, negative error code on failure.
	 */
	int (*pin)(struct dma_buf_attachment *attach);

	/**
	 * @unpin:
	 *
	 * This is called by dma_buf_unpin() and lets the exporter know that the
	 * DMA-buf can be moved again.
	 *
	 * This is called with the dmabuf->resv object locked and is mutually
	 * exclusive with @cache_sgt_mapping.
	 *
	 * This callback is optional.
	 */
	void (*unpin)(struct dma_buf_attachment *attach);

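
The contract between @pin and @unpin (no moves while pinned, moves allowed again afterwards) can be modelled with a simple pin counter. This is a userspace sketch only; `toy_buffer`, its `placement` ids and the `toy_*` functions are assumptions made for the example and do not correspond to kernel code.

```c
#include <errno.h>

/* Hypothetical exporter-side state; not a kernel structure. */
struct toy_buffer {
	int pin_count;	/* > 0 means the backing storage must stay put */
	int placement;	/* made-up location id: 0 = system memory, 1 = VRAM */
};

int toy_pin(struct toy_buffer *buf)
{
	/*
	 * A real exporter would migrate to a widely accessible placement
	 * here and wait for pending moves and clears before returning.
	 */
	buf->placement = 0;
	buf->pin_count++;
	return 0;
}

void toy_unpin(struct toy_buffer *buf)
{
	buf->pin_count--;
}

/* Eviction or migration path: refused while any pin is outstanding. */
int toy_move(struct toy_buffer *buf, int new_placement)
{
	if (buf->pin_count > 0)
		return -EBUSY;
	buf->placement = new_placement;
	return 0;
}
```

The sketch omits locking; in the kernel these callbacks run with the reservation object held, as the comments above state.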
	/**
	 * @map_dma_buf:
	 *
	 * This is called by dma_buf_map_attachment() and is used to map a
	 * shared &dma_buf into device address space, and it is mandatory. It
	 * can only be called if @attach has been called successfully.
	 *
	 * This call may sleep, e.g. when the backing storage first needs to be
	 * allocated, or moved to a location suitable for all currently attached
	 * devices.
	 *
	 * Note that any specific buffer attributes required for this function
	 * should get added to device_dma_parameters accessible via
	 * &device.dma_params from the &dma_buf_attachment. The @attach callback
	 * should also check these constraints.
	 *
	 * If this is being called for the first time, the exporter can now
	 * choose to scan through the list of attachments for this buffer,
	 * collate the requirements of the attached devices, and choose an
	 * appropriate backing storage for the buffer.
	 *
	 * Based on enum dma_data_direction, it might be possible to have
	 * multiple users accessing at the same time (for reading, maybe), or
	 * any other kind of sharing that the exporter might wish to make
	 * available to buffer-users.
	 *
dma-buf: change DMA-buf locking convention v3
This patch is a stripped down version of the locking changes
necessary to support dynamic DMA-buf handling.
It adds a dynamic flag for both importers as well as exporters
so that drivers can choose if they want the reservation object
locked or unlocked during mapping of attachments.
For compatibility between drivers we cache the DMA-buf mapping
during attaching an importer as soon as exporter/importer
disagree on the dynamic handling.
Issues and solutions we considered:
- We can't change all existing drivers, and existing importers have
strong opinions about which locks they're holding while calling
dma_buf_attachment_map/unmap. Exporters also have strong opinions about
which locks they can acquire in their ->map/unmap callbacks, leaving no
room for change. The solution to avoid this was to move the
actual map/unmap out from this call, into the attach/detach callbacks,
and cache the mapping. This works because drivers don't call
attach/detach from deep within their code callchains (like deep in
memory management code called from cs/execbuf ioctl), but directly from
the fd2handle implementation.
- The caching has some troubles on some soc drivers, which set other modes
than DMA_BIDIRECTIONAL. We can't have 2 incompatible mappings, and we
can't re-create the mapping at _map time due to the above locking fun.
We very carefully step around that by only caching at attach time if the
dynamic mode between importer/exporter mismatches.
- There's been quite some discussion on dma-buf mappings which need active
cache management, which would all break down when caching, plus we don't
have explicit flush operations on the attachment side. The solution to
this was to shrug and keep the current discrepancy between what the
dma-buf docs claim and what implementations do, with the hope that the
begin/end_cpu_access hooks are good enough and that all necessary
flushing to keep device mappings consistent will be done there.
v2: cleanup set_name merge, improve kerneldoc
v3: update commit message, kerneldoc and cleanup _debug_show()
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/336788/
2018-07-03 22:42:26 +08:00
	 * This is always called with the dmabuf->resv object locked when
	 * the dynamic_mapping flag is true.
	 *
	 * Note that for non-dynamic exporters the driver must guarantee that
	 * the memory is available for use and cleared of any old data by
	 * the time this function returns. Drivers which pipeline their buffer
	 * moves internally must wait for all moves and clears to complete.
	 * Dynamic exporters do not need to follow this rule: For non-dynamic
	 * importers the buffer is already pinned through @pin, which has the
	 * same requirements. Dynamic importers, on the other hand, are
	 * required to obey the dma_resv fences.
	 *
	 * Returns:
	 *
	 * A &sg_table scatter list of the backing storage of the DMA buffer,
	 * already mapped into the device address space of the &device attached
	 * with the provided &dma_buf_attachment. The addresses and lengths in
	 * the scatter list are PAGE_SIZE aligned.
	 *
	 * On failure, returns a negative error value wrapped into a pointer.
	 * May also return -EINTR when a signal was received while being
	 * blocked.
	 *
	 * Note that exporters should not try to cache the scatter list, or
	 * return the same one for multiple calls. Caching is done either by the
	 * DMA-BUF code (for non-dynamic importers) or the importer. Ownership
	 * of the scatter list is transferred to the caller, and returned by
	 * @unmap_dma_buf.
	 */
	struct sg_table * (*map_dma_buf)(struct dma_buf_attachment *,
					 enum dma_data_direction);
	/**
	 * @unmap_dma_buf:
	 *
	 * This is called by dma_buf_unmap_attachment() and should unmap and
	 * release the &sg_table allocated in @map_dma_buf, and it is mandatory.
	 * For static dma_buf handling this might also unpin the backing
	 * storage if this is the last mapping of the DMA buffer.
	 */
	void (*unmap_dma_buf)(struct dma_buf_attachment *,
			      struct sg_table *,
			      enum dma_data_direction);

	/* TODO: Add try_map_dma_buf version, to return immediately with -EBUSY
	 * if the call would block.
	 */

	/**
	 * @release:
	 *
	 * Called after the last dma_buf_put() to release the &dma_buf. This
	 * callback is mandatory.
	 */
	void (*release)(struct dma_buf *);

	/**
	 * @begin_cpu_access:
	 *
	 * This is called from dma_buf_begin_cpu_access() and allows the
	 * exporter to ensure that the memory is actually coherent for cpu
	 * access. The exporter also needs to ensure that cpu access is coherent
	 * for the access direction. The direction can be used by the exporter
	 * to optimize the cache flushing, i.e. access with a different
	 * direction (read instead of write) might return stale or even bogus
	 * data (e.g. when the exporter needs to copy the data to temporary
	 * storage).
	 *
	 * Note that this is both called through the DMA_BUF_IOCTL_SYNC IOCTL
	 * command for userspace mappings established through @mmap, and also
	 * for kernel mappings established with @vmap.
	 *
	 * This callback is optional.
	 *
	 * Returns:
	 *
	 * 0 on success or a negative error code on failure. This can for
	 * example fail when the backing storage can't be allocated. Can also
	 * return -ERESTARTSYS or -EINTR when the call has been interrupted and
	 * needs to be restarted.
	 */
	int (*begin_cpu_access)(struct dma_buf *, enum dma_data_direction);

	/**
	 * @end_cpu_access:
	 *
	 * This is called from dma_buf_end_cpu_access() when the importer is
	 * done accessing the buffer with the CPU. The exporter can use this to
	 * flush caches and undo anything else done in @begin_cpu_access.
	 *
	 * This callback is optional.
	 *
	 * Returns:
	 *
	 * 0 on success or a negative error code on failure. Can return
	 * -ERESTARTSYS or -EINTR when the call has been interrupted and needs
	 * to be restarted.
	 */
dma-buf, drm, ion: Propagate error code from dma_buf_start_cpu_access()
Drivers, especially i915.ko, can fail during the initial migration of a
dma-buf for CPU access. However, the error code from the driver was not
being propagated back to ioctl and so userspace was blissfully ignorant
of the failure. Rendering corruption ensues.
Whilst fixing the ioctl to return the error code from
dma_buf_start_cpu_access(), also do the same for
dma_buf_end_cpu_access(). For most drivers, dma_buf_end_cpu_access()
cannot fail. i915.ko however, as most drivers would, wants to avoid being
uninterruptible (as would be required to guarantee no failure when
flushing the buffer to the device). As userspace already has to handle
errors from the SYNC_IOCTL, take advantage of this to be able to restart
the syscall across signals.
This fixes a coherency issue for i915.ko as well as reducing the
uninterruptible hold upon its BKL, the struct_mutex.
Fixes commit c11e391da2a8fe973c3c2398452000bed505851e
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Feb 11 20:04:51 2016 -0200
dma-buf: Add ioctls to allow userspace to flush
Testcase: igt/gem_concurrent_blit/*dmabuf*interruptible
Testcase: igt/prime_mmap_coherency/ioctl-errors
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tiago Vignatti <tiago.vignatti@intel.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Daniel Vetter <daniel.vetter@intel.com>
CC: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: intel-gfx@lists.freedesktop.org
Cc: devel@driverdev.osuosl.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1458331359-2634-1-git-send-email-chris@chris-wilson.co.uk
2016-03-19 04:02:39 +08:00
	int (*end_cpu_access)(struct dma_buf *, enum dma_data_direction);

dma-buf: mmap support
Compared to Rob Clark's RFC I've ditched the prepare/finish hooks
and corresponding ioctls on the dma_buf file. The major reason for
that is that many people seem to be under the impression that this is
also for synchronization with outstanding asynchronous processing.
I'm pretty massively opposed to this because:
- It boils down to reinventing a new, rather general-purpose userspace
synchronization interface. If we look at things like futexes, this
is hard to get right.
- Furthermore a lot of kernel code has to interact with this
synchronization primitive. This smells a lot like the dri1 hw_lock,
a horror show I prefer not to reinvent.
- Even more fun is that multiple different subsystems would interact
here, so we have plenty of opportunities to create funny deadlock
scenarios.
I think synchronization is a wholesale different problem from data
sharing and should be tackled as an orthogonal problem.
Now we could demand that prepare/finish may only ensure cache
coherency (as Rob intended), but that runs up into the next problem:
We not only need mmap support to facilitate sw-only processing nodes
in a pipeline (without jumping through hoops by importing the dma_buf
into some sw-access only importer), which allows for a nicer
ION->dma-buf upgrade path for existing Android userspace. We also need
mmap support for existing importing subsystems to support existing
userspace libraries. And a lot of these subsystems are expected to
export coherent userspace mappings.
So prepare/finish can only ever be optional and the exporter /needs/
to support coherent mappings. Given that mmap access is always
somewhat fallback-y in nature I've decided to drop this optimization,
instead of just making it optional. If we demonstrate a clear need for
this, supported by benchmark results, we can always add it in again
later as an optional extension.
Other differences compared to Rob's RFC is the above mentioned support
for mapping a dma-buf through facilities provided by the importer.
Which results in mmap support no longer being optional.
Note that this dma-buf mmap patch does _not_ support every possible
insanity an existing subsystem could pull off with mmap: Because it
does not allow to intercept pagefaults and shoot down ptes importing
subsystems can't add some magic of their own at these points (e.g. to
automatically synchronize with outstanding rendering or set up some
special resources). I've done a cursory read through a few mmap
implementations of various subsystems and I'm hopeful that we can avoid
this (and the complexity it'd bring with it).
Additionally I've extended the documentation a bit to explain the hows
and whys of this mmap extension.
In case we ever want to add support for explicitly cache-managed
userspace mmap with a prepare/finish ioctl pair, we could specify that
userspace needs to mmap a different part of the dma_buf, e.g. the
range starting at dma_buf->size up to dma_buf->size*2. This works
because the size of a dma_buf is invariant over its lifetime. The
exporter would obviously need to fall back to coherent mappings for
both ranges if a legacy client maps the coherent range and the
architecture cannot support conflicting caching policies. Also, this
would obviously be optional and userspace needs to be able to fall
back to coherent mappings.
v2:
- Spelling fixes from Rob Clark.
- Compile fix for !DMA_BUF from Rob Clark.
- Extend commit message to explain how explicitly cache managed mmap
support could be added later.
- Extend the documentation with implementations notes for exporters
that need to manually fake coherency.
v3:
- dma_buf pointer initialization goof-up noticed by Rebecca Schultz
Zavin.
Cc: Rob Clark <rob.clark@linaro.org>
Cc: Rebecca Schultz Zavin <rebecca@android.com>
Acked-by: Rob Clark <rob.clark@linaro.org>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
2012-04-24 17:08:52 +08:00
	/**
	 * @mmap:
	 *
	 * This callback is used by the dma_buf_mmap() function.
	 *
	 * Note that the mapping needs to be incoherent; userspace is expected
	 * to bracket CPU access using the DMA_BUF_IOCTL_SYNC interface.
	 *
	 * Because dma-buf buffers have invariant size over their lifetime, the
	 * dma-buf core checks whether a vma is too large and rejects such
	 * mappings. The exporter hence does not need to duplicate this check.
	 *
	 * If an exporter needs to manually flush caches and hence needs to fake
	 * coherency for mmap support, it needs to be able to zap all the ptes
	 * pointing at the backing storage. To do that with
	 * unmap_mapping_range(), the Linux mm needs a struct address_space
	 * associated with the struct file stored in vma->vm_file. But the
	 * dma_buf framework only backs every dma_buf fd with the anon_file
	 * struct file, i.e. all dma_bufs share the same file.
	 *
	 * Hence exporters need to set up their own file (and address_space)
	 * association by setting vma->vm_file and adjusting vma->vm_pgoff in
	 * the dma_buf mmap callback. In the specific case of a gem driver the
	 * exporter could use the shmem file already provided by gem (and set
	 * vm_pgoff = 0). Exporters can then zap ptes by unmapping the
	 * corresponding range of the struct address_space associated with their
	 * own file.
	 *
	 * This callback is optional.
	 *
	 * Returns:
	 *
	 * 0 on success or a negative error code on failure.
	 */
	int (*mmap)(struct dma_buf *, struct vm_area_struct *vma);

	int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map);
	void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map);
};
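
/*
 * Illustrative sketch, not part of this header: a small userspace model of
 * how optional callbacks such as @begin_cpu_access and @end_cpu_access might
 * be dispatched. A missing callback is treated as success and a negative
 * error code from the exporter is passed straight back to the caller. Every
 * mock_* name below is hypothetical and only loosely mirrors the real
 * dma_buf_begin_cpu_access() and dma_buf_end_cpu_access() helpers, which do
 * additional work not modelled here.
 */
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for enum dma_data_direction and struct dma_buf. */
enum mock_direction { MOCK_TO_DEVICE, MOCK_FROM_DEVICE };

struct mock_buf;

struct mock_buf_ops {
	/* Both callbacks are optional and may be left NULL. */
	int (*begin_cpu_access)(struct mock_buf *, enum mock_direction);
	int (*end_cpu_access)(struct mock_buf *, enum mock_direction);
};

struct mock_buf {
	const struct mock_buf_ops *ops;
	int cpu_view_stale;	/* 1 while the CPU may see stale data */
};

/* A missing callback counts as success; otherwise its result is returned. */
static int mock_begin_cpu_access(struct mock_buf *buf, enum mock_direction dir)
{
	assert(buf != NULL && buf->ops != NULL);
	if (!buf->ops->begin_cpu_access)
		return 0;
	return buf->ops->begin_cpu_access(buf, dir);
}

static int mock_end_cpu_access(struct mock_buf *buf, enum mock_direction dir)
{
	assert(buf != NULL && buf->ops != NULL);
	if (!buf->ops->end_cpu_access)
		return 0;
	return buf->ops->end_cpu_access(buf, dir);
}

/* Bracket one CPU read; an error from the begin step aborts the access. */
static int mock_cpu_read(struct mock_buf *buf)
{
	int ret = mock_begin_cpu_access(buf, MOCK_FROM_DEVICE);

	if (ret)
		return ret;
	/* ... the CPU would read the buffer contents here ... */
	return mock_end_cpu_access(buf, MOCK_FROM_DEVICE);
}

/* Example exporter callback: pretend to invalidate CPU caches. */
static int mock_invalidate_begin(struct mock_buf *buf, enum mock_direction dir)
{
	(void)dir;
	buf->cpu_view_stale = 0;
	return 0;
}

/* Example exporter callback: report a call interrupted by a signal. */
static int mock_interrupted_begin(struct mock_buf *buf, enum mock_direction dir)
{
	(void)buf;
	(void)dir;
	return -EINTR;
}
```
/*
 * In this model a caller that sees -EINTR from mock_cpu_read() would simply
 * call it again, matching the "needs to be restarted" wording in the
 * callback documentation.
 */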
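
/*
 * Illustrative sketch, not part of this header: the @mmap documentation
 * expects userspace to bracket CPU access with DMA_BUF_IOCTL_SYNC. The
 * helpers below show one way that bracket might look from userspace, using
 * struct dma_buf_sync and the DMA_BUF_SYNC_* flags from the uapi header
 * <linux/dma-buf.h>. The names dmabuf_sync() and dmabuf_checksum() are
 * hypothetical, and retrying only on EINTR is an assumption of this sketch.
 */
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Issue one DMA_BUF_IOCTL_SYNC call, retrying if a signal interrupts it. */
static int dmabuf_sync(int dmabuf_fd, __u64 flags)
{
	struct dma_buf_sync sync = { .flags = flags };
	int ret;

	do {
		ret = ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
	} while (ret == -1 && errno == EINTR);

	return ret;
}

/*
 * Sum the bytes of an mmap()ed dma-buf. The read happens strictly between
 * the SYNC_START and SYNC_END calls; *sum is only written on success.
 */
static int dmabuf_checksum(int dmabuf_fd, const unsigned char *map,
			   size_t len, unsigned long *sum)
{
	unsigned long acc = 0;
	size_t i;

	assert(map != NULL && sum != NULL);

	if (dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ))
		return -1;

	for (i = 0; i < len; i++)
		acc += map[i];

	if (dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ))
		return -1;

	*sum = acc;
	return 0;
}
```
/*
 * A caller would pass the dma-buf file descriptor together with the pointer
 * and length obtained from mmap() on that same descriptor; for write access
 * DMA_BUF_SYNC_WRITE or DMA_BUF_SYNC_RW would replace DMA_BUF_SYNC_READ.
 */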
/**
 * struct dma_buf - shared buffer object
 *
 * This represents a shared buffer, created by calling dma_buf_export(). The
 * userspace representation is a normal file descriptor, which can be created by
 * calling dma_buf_fd().
 *
 * Shared dma buffers are reference counted using dma_buf_put() and
 * get_dma_buf().
 *
 * Device DMA access is handled by the separate &struct dma_buf_attachment.
 */
struct dma_buf {
	/**
	 * @size:
	 *
	 * Size of the buffer; invariant over the lifetime of the buffer.
	 */
	size_t size;

	/**
	 * @file:
	 *
	 * File pointer used for sharing buffers across, and for refcounting.
	 * See dma_buf_get() and dma_buf_put().
	 */
	struct file *file;

	/**
	 * @attachments:
	 *
	 * List of dma_buf_attachment that denotes all devices attached,
	 * protected by &dma_resv lock @resv.
	 */
struct list_head attachments;
/** @ops: dma_buf_ops associated with this buffer object. */
const struct dma_buf_ops *ops;
/**
 * @vmapping_counter:
 *
 * Used internally to refcnt the vmaps returned by dma_buf_vmap().
 * Protected by @lock.
 */
unsigned vmapping_counter;
/**
 * @vmap_ptr:
 * The current vmap ptr if @vmapping_counter > 0. Protected by @lock.
 */
struct iosys_map vmap_ptr;
/**
 * @exp_name:
 *
 * Name of the exporter; useful for debugging. See the
 * DMA_BUF_SET_NAME IOCTL.
 */
const char *exp_name;
/**
 * @name:
 *
 * Userspace-provided name; useful for accounting and debugging,
 * protected by dma_resv_lock() on @resv and @name_lock for read access.
 */
const char *name;
/** @name_lock: Spinlock to protect @name for read access. */
spinlock_t name_lock;
/**
 * @owner:
 *
 * Pointer to exporter module; used for refcounting when exporter is a
 * kernel module.
 */
struct module *owner;
/** @list_node: node for dma_buf accounting and debugging. */
struct list_head list_node;
/** @priv: exporter specific private data for this buffer object. */
void *priv;
/**
 * @resv:
 *
 * Reservation object linked to this dma-buf.
dma-buf: Document dma-buf implicit fencing/resv fencing rules
Docs for struct dma_resv are fairly clear:
"A reservation object can have attached one exclusive fence (normally
associated with write operations) or N shared fences (read
operations)."
https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects
Furthermore a review across all of upstream.
First of render drivers and how they set implicit fences:
- nouveau follows this contract, see in validate_fini_no_ticket()
nouveau_bo_fence(nvbo, fence, !!b->write_domains);
and that last boolean controls whether the exclusive or shared fence
slot is used.
- radeon follows this contract by setting
p->relocs[i].tv.num_shared = !r->write_domain;
in radeon_cs_parser_relocs(), which ensures that the call to
ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
right thing.
- vmwgfx seems to follow this contract with the shotgun approach of
always setting ttm_val_buf->num_shared = 0, which means
ttm_eu_fence_buffer_objects() will only use the exclusive slot.
- etnaviv follows this contract, as can be trivially seen by looking
at submit_attach_object_fences()
- i915 is a bit a convoluted maze with multiple paths leading to
i915_vma_move_to_active(). Which sets the exclusive flag if
EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
softpin mode, or through the write_domain when using relocations. It
follows this contract.
- lima follows this contract, see lima_gem_submit() which sets the
exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
bo
- msm follows this contract, see msm_gpu_submit() which sets the
exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer
- panfrost follows this contract with the shotgun approach of just
always setting the exclusive fence, see
panfrost_attach_object_fences(). Benefits of a single engine I guess
- v3d follows this contract with the same shotgun approach in
v3d_attach_fences_and_unlock_reservation(), but it has at least an
XXX comment that maybe this should be improved
- vc4 uses the same shotgun approach of always setting an exclusive
fence, see vc4_update_bo_seqnos()
- vgem also follows this contract, see vgem_fence_attach_ioctl() and
the VGEM_FENCE_WRITE. This is used in some igts to validate prime
sharing with i915.ko without the need of a 2nd gpu
- virtio follows this contract again with the shotgun approach of
always setting an exclusive fence, see virtio_gpu_array_add_fence()
This covers the setting of the exclusive fences when writing.
Synchronizing against the exclusive fence is a lot more tricky, and I
only spot checked a few:
- i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
implicit dependencies (which is used by vulkan)
- etnaviv does this. Implicit dependencies are collected in
submit_fence_sync(), again with an opt-out flag
ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
etnaviv_sched_dependency which is the
drm_sched_backend_ops->dependency callback.
- vc4 seems to not do much here, maybe gets away with it by not having
a scheduler and only a single engine. Since all newer broadcom chips than
the OG vc4 use v3d for rendering, which follows this contract, the
impact of this issue is fairly small.
- v3d does this using the drm_gem_fence_array_add_implicit() helper,
whose fences its drm_sched_backend_ops->dependency callback
v3d_job_dependency() then picks up.
- panfrost is nice here and tracks the implicit fences in
panfrost_job->implicit_fences, which again the
drm_sched_backend_ops->dependency callback panfrost_job_dependency()
picks up. It is mildly questionable though since it only picks up
exclusive fences in panfrost_acquire_object_fences(), but not buggy
in practice because it also always sets the exclusive fence. It
should pick up both sets of fences, just in case there's ever going
to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
pcie port and a real gpu, which might actually happen eventually. A
bug, but easy to fix. Should probably use the
drm_gem_fence_array_add_implicit() helper.
- lima is nice and easy, uses drm_gem_fence_array_add_implicit() and
the same schema as v3d.
- msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
but because it doesn't use the drm/scheduler it handles fences from
the wrong context with a synchronous dma_fence_wait. See
submit_fence_sync() leading to msm_gem_sync_object(). Investing into
a scheduler might be a good idea.
- all the remaining drivers are ttm based, where I hope they do
appropriately obey implicit fences already. I didn't do the full
audit there because a) not following the contract would confuse ttm
quite well and b) reading non-standard scheduler and submit code
which isn't based on drm/scheduler is a pain.
Onwards to the display side.
- Any driver using the drm_gem_plane_helper_prepare_fb() helper will
handle this correctly. Overwhelmingly most drivers get this right,
except a few that totally don't. I'll follow up with a patch to make this the default
and avoid a bunch of bugs.
- I didn't audit the ttm drivers, but given that dma_resv started
there I hope they get this right.
In conclusion this IS the contract, both as documented and
overwhelmingly implemented, specifically as implemented by all render
drivers except amdgpu.
Amdgpu tried to fix this already in
commit 049aca4363d8af87cab8d53de5401602db3b9999
Author: Christian König <christian.koenig@amd.com>
Date: Wed Sep 19 16:54:35 2018 +0200
drm/amdgpu: fix using shared fence for exported BOs v2
but this fix falls short on a number of areas:
- It's racy, by the time the buffer is shared it might be too late. To
make sure there's definitely never a problem we need to set the
fences correctly for any buffer that's potentially exportable.
- It's breaking uapi, dma-buf fds support poll() and differentiate
between, which was introduced in
commit 9b495a5887994a6d74d5c261d012083a92b94738
Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Date: Tue Jul 1 12:57:43 2014 +0200
dma-buf: add poll support, v3
- Christian König wants to nack new uapi building further on this
dma_resv contract because it breaks amdgpu, quoting
"Yeah, and that is exactly the reason why I will NAK this uAPI change.
"This doesn't works for amdgpu at all for the reasons outlined above."
https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/
Rejecting new development because your own driver is broken and
violates established cross driver contracts and uapi is really not
how upstream works.
Now this patch will have a severe performance impact on anything that
runs on multiple engines. So we can't just merge it outright, but need
a bit a plan:
- amdgpu needs a proper uapi for handling implicit fencing. The funny
thing is that to do it correctly, implicit fencing must be treated
as a very strange IPC mechanism for transporting fences, where both
setting the fence and dependency intercepts must be handled
explicitly. Current best practice is a per-bo flag to indicate
writes, and a per-bo flag to skip implicit fencing in the CS
ioctl as a new chunk.
- Since amdgpu has been shipping with broken behaviour we need an
opt-out flag from the butchered implicit fencing model to enable the
proper explicit implicit fencing model.
- for kernel memory fences due to bo moves at least the i915 idea is
to use ttm_bo->moving. amdgpu probably needs the same.
- since the current p2p dma-buf interface assumes the kernel memory
fence is in the exclusive dma_resv fence slot we need to add a new
fence slot for kernel fences, which must never be ignored. Since
currently only amdgpu supports this there's no real problem here
yet, until amdgpu gains a NO_IMPLICIT CS flag.
- New userspace needs to ship in enough desktop distros so that users
won't notice the perf impact. I think we can ignore LTS distros who
upgrade their kernels but not their mesa3d snapshot.
- Then when this is all in place we can merge this patch here.
What is not a solution to this problem here is trying to make the
dma_resv rules in the kernel more clever. The fundamental issue here
is that the amdgpu CS uapi is the least expressive one across all
drivers (only equalled by panfrost, which has an actual excuse) by not
allowing any userspace control over how implicit sync is conducted.
Until this is fixed it's completely pointless to make the kernel more
clever to improve amdgpu, because all we're doing is papering over
this uapi design issue. amdgpu needs to attain the status quo
established by other drivers first, once that's achieved we can tackle
the remaining issues in a consistent way across drivers.
v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
entirely missed.
This is great because it means the amdgpu specific piece for proper
implicit fence handling exists already, and that since a while. The
only thing that's now missing is
- fishing the implicit fences out of a shared object at the right time
- setting the exclusive implicit fence slot at the right time.
Jason has a patch series to fill that gap with a bunch of generic
ioctl on the dma-buf fd:
https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/
v3: Since Christian has fixed amdgpu now in
commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
Author: Christian König <christian.koenig@amd.com>
Date: Wed Jun 9 13:51:36 2021 +0200
drm/amdgpu: rework dma_resv handling v3
Use the audit covered in this commit message as the excuse to update
the dma-buf docs around dma_buf.resv usage across drivers.
Since dynamic importers have different rules also hammer these in
again while we're at it.
v4:
- Add the missing "through the device" in the dynamic section that I
overlooked.
- Fix a kerneldoc markup mistake, the link didn't connect
v5:
- A few s/should/must/ to make clear what must be done (if the driver
does implicit sync) and what's more a maybe (Daniel Stone)
- drop all the example api discussion, that needs to be expanded,
clarified and put into a new chapter in drm-uapi.rst (Daniel Stone)
Cc: Daniel Stone <daniel@fooishbar.org>
Acked-by: Daniel Stone <daniel@fooishbar.org>
Reviewed-by: Dave Airlie <airlied@redhat.com> (v4)
Reviewed-by: Christian König <christian.koenig@amd.com> (v3)
Cc: mesa-dev@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624125246.166721-1-daniel.vetter@ffwll.ch
 *
 * IMPLICIT SYNCHRONIZATION RULES:
 *
 * Drivers which support implicit synchronization of buffer access as
 * e.g. exposed in `Implicit Fence Poll Support`_ must follow the
 * below rules.
 *
 * - Drivers must add a read fence through dma_resv_add_fence() with the
 *   DMA_RESV_USAGE_READ flag for anything the userspace API considers a
 *   read access. This highly depends upon the API and window system.
 *
 * - Similarly drivers must add a write fence through
 *   dma_resv_add_fence() with the DMA_RESV_USAGE_WRITE flag for
 *   anything the userspace API considers write access.
|
dma-buf: Document dma-buf implicit fencing/resv fencing rules
Docs for struct dma_resv are fairly clear:
"A reservation object can have attached one exclusive fence (normally
associated with write operations) or N shared fences (read
operations)."
https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects
Furthermore a review across all of upstream.
First of render drivers and how they set implicit fences:
- nouveau follows this contract, see in validate_fini_no_ticket()
nouveau_bo_fence(nvbo, fence, !!b->write_domains);
and that last boolean controls whether the exclusive or shared fence
slot is used.
- radeon follows this contract by setting
p->relocs[i].tv.num_shared = !r->write_domain;
in radeon_cs_parser_relocs(), which ensures that the call to
ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
right thing.
- vmwgfx seems to follow this contract with the shotgun approach of
always setting ttm_val_buf->num_shared = 0, which means
ttm_eu_fence_buffer_objects() will only use the exclusive slot.
- etnaviv follows this contract, as can be trivially seen by looking
at submit_attach_object_fences()
- i915 is a bit a convoluted maze with multiple paths leading to
i915_vma_move_to_active(). Which sets the exclusive flag if
EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
softpin mode, or through the write_domain when using relocations. It
follows this contract.
- lima follows this contract, see lima_gem_submit() which sets the
exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
bo
- msm follows this contract, see msm_gpu_submit() which sets the
exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer
- panfrost follows this contract with the shotgun approach of just
always setting the exclusive fence, see
panfrost_attach_object_fences(). Benefits of a single engine I guess
- v3d follows this contract with the same shotgun approach in
v3d_attach_fences_and_unlock_reservation(), but it has at least an
XXX comment that maybe this should be improved
- vc4 uses the same shotgun approach of always setting an exclusive
fence, see vc4_update_bo_seqnos()
- vgem also follows this contract, see vgem_fence_attach_ioctl() and
the VGEM_FENCE_WRITE. This is used in some igts to validate prime
sharing with i915.ko without the need of a 2nd gpu
- virtio follows this contract again with the shotgun approach of
always setting an exclusive fence, see virtio_gpu_array_add_fence()
This covers the setting of the exclusive fences when writing.
Synchronizing against the exclusive fence is a lot more tricky, and I
only spot checked a few:
- i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
implicit dependencies (which is used by vulkan)
- etnaviv does this. Implicit dependencies are collected in
submit_fence_sync(), again with an opt-out flag
ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
etnaviv_sched_dependency which is the
drm_sched_backend_ops->dependency callback.
- vc4 seems to not do much here, maybe gets away with it by not having
a scheduler and only a single engine. Since all Broadcom chips newer than
the OG vc4 use v3d for rendering, which follows this contract, the
impact of this issue is fairly small.
- v3d does this using the drm_gem_fence_array_add_implicit() helper,
which its drm_sched_backend_ops->dependency callback
v3d_job_dependency() then picks up.
- panfrost is nice here and tracks the implicit fences in
panfrost_job->implicit_fences, which again the
drm_sched_backend_ops->dependency callback panfrost_job_dependency()
picks up. It is mildly questionable though since it only picks up
exclusive fences in panfrost_acquire_object_fences(), but not buggy
in practice because it also always sets the exclusive fence. It
should pick up both sets of fences, just in case there's ever going
to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
pcie port and a real gpu, which might actually happen eventually. A
bug, but easy to fix. Should probably use the
drm_gem_fence_array_add_implicit() helper.
- lima is nice and easy, uses drm_gem_fence_array_add_implicit() and
the same scheme as v3d.
- msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
but because it doesn't use the drm/scheduler it handles fences from
the wrong context with a synchronous dma_fence_wait. See
submit_fence_sync() leading to msm_gem_sync_object(). Investing into
a scheduler might be a good idea.
- all the remaining drivers are ttm based, where I hope they do
appropriately obey implicit fences already. I didn't do the full
audit there because a) not following the contract would confuse ttm
quite well and b) reading non-standard scheduler and submit code
which isn't based on drm/scheduler is a pain.
Onwards to the display side.
- Any driver using the drm_gem_plane_helper_prepare_fb() helper will
do this correctly. The overwhelming majority of drivers get this right,
but a few do not at all. I'll follow up with a patch to make this the
default and avoid a bunch of bugs.
- I didn't audit the ttm drivers, but given that dma_resv started
there I hope they get this right.
In conclusion this IS the contract, both as documented and
overwhelmingly implemented, specifically as implemented by all render
drivers except amdgpu.
Amdgpu tried to fix this already in
commit 049aca4363d8af87cab8d53de5401602db3b9999
Author: Christian König <christian.koenig@amd.com>
Date: Wed Sep 19 16:54:35 2018 +0200
drm/amdgpu: fix using shared fence for exported BOs v2
but this fix falls short in a number of areas:
- It's racy, by the time the buffer is shared it might be too late. To
make sure there's definitely never a problem we need to set the
fences correctly for any buffer that's potentially exportable.
- It's breaking uapi: dma-buf fds support poll() and differentiate
between read and write access, which was introduced in
commit 9b495a5887994a6d74d5c261d012083a92b94738
Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Date: Tue Jul 1 12:57:43 2014 +0200
dma-buf: add poll support, v3
- Christian König wants to nack new uapi building further on this
dma_resv contract because it breaks amdgpu, quoting
"Yeah, and that is exactly the reason why I will NAK this uAPI change.
"This doesn't works for amdgpu at all for the reasons outlined above."
https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/
Rejecting new development because your own driver is broken and
violates established cross driver contracts and uapi is really not
how upstream works.
Now this patch will have a severe performance impact on anything that
runs on multiple engines. So we can't just merge it outright, but need
a bit of a plan:
- amdgpu needs a proper uapi for handling implicit fencing. The funny
thing is that to do it correctly, implicit fencing must be treated
as a very strange IPC mechanism for transporting fences, where both
setting the fence and dependency intercepts must be handled
explicitly. Current best practice is a per-bo flag to indicate
writes, and a per-bo flag to skip implicit fencing in the CS
ioctl as a new chunk.
- Since amdgpu has been shipping with broken behaviour we need an
opt-out flag from the butchered implicit fencing model to enable the
proper explicit implicit fencing model.
- for kernel memory fences due to bo moves at least the i915 idea is
to use ttm_bo->moving. amdgpu probably needs the same.
- since the current p2p dma-buf interface assumes the kernel memory
fence is in the exclusive dma_resv fence slot we need to add a new
fence slot for kernel fences, which must never be ignored. Since
currently only amdgpu supports this there's no real problem here
yet, until amdgpu gains a NO_IMPLICIT CS flag.
- New userspace needs to ship in enough desktop distros so that users
won't notice the perf impact. I think we can ignore LTS distros that
upgrade their kernels but not their mesa3d snapshot.
- Then when this is all in place we can merge this patch here.
What is not a solution to this problem here is trying to make the
dma_resv rules in the kernel more clever. The fundamental issue here
is that the amdgpu CS uapi is the least expressive one across all
drivers (only equalled by panfrost, which has an actual excuse) by not
allowing any userspace control over how implicit sync is conducted.
Until this is fixed it's completely pointless to make the kernel more
clever to improve amdgpu, because all we're doing is papering over
this uapi design issue. amdgpu needs to attain the status quo
established by other drivers first, once that's achieved we can tackle
the remaining issues in a consistent way across drivers.
v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
entirely missed.
This is great because it means the amdgpu specific piece for proper
implicit fence handling exists already, and has for a while. The
only thing that's now missing is
- fishing the implicit fences out of a shared object at the right time
- setting the exclusive implicit fence slot at the right time.
Jason has a patch series to fill that gap with a bunch of generic
ioctls on the dma-buf fd:
https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/
v3: Since Christian has fixed amdgpu now in
commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
Author: Christian König <christian.koenig@amd.com>
Date: Wed Jun 9 13:51:36 2021 +0200
drm/amdgpu: rework dma_resv handling v3
Use the audit covered in this commit message as the excuse to update
the dma-buf docs around dma_buf.resv usage across drivers.
Since dynamic importers have different rules also hammer these in
again while we're at it.
v4:
- Add the missing "through the device" in the dynamic section that I
overlooked.
- Fix a kerneldoc markup mistake, the link didn't connect
v5:
- A few s/should/must/ to make clear what must be done (if the driver
does implicit sync) and what's more a maybe (Daniel Stone)
- drop all the example api discussion, that needs to be expanded,
clarified and put into a new chapter in drm-uapi.rst (Daniel Stone)
Cc: Daniel Stone <daniel@fooishbar.org>
Acked-by: Daniel Stone <daniel@fooishbar.org>
Reviewed-by: Dave Airlie <airlied@redhat.com> (v4)
Reviewed-by: Christian König <christian.koenig@amd.com> (v3)
Cc: mesa-dev@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624125246.166721-1-daniel.vetter@ffwll.ch
2021-06-24 20:52:46 +08:00
|
|
|
*
|
2021-11-09 18:08:18 +08:00
|
|
|
* - Drivers may just always add a write fence, since that only
|
2021-06-24 20:52:46 +08:00
|
|
|
* causes unnecessary synchronization, but no correctness issues.
|
|
|
|
*
|
|
|
|
* - Some drivers only expose a synchronous userspace API with no
|
|
|
|
* pipelining across drivers. These do not set any fences for their
|
|
|
|
* access. An example here is v4l.
|
|
|
|
*
|
2021-11-09 18:08:18 +08:00
|
|
|
* - Drivers should use dma_resv_usage_rw() when retrieving fences as
|
|
|
|
* dependency for implicit synchronization.
|
|
|
|
*
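The rule about dma_resv_usage_rw() can be illustrated with a small userspace model. This is only a sketch, not kernel code: the `model_` names are hypothetical, and the enum ordering is assumed to mirror the kernel's enum dma_resv_usage, where a query at a given usage level returns every fence whose usage is less than or equal to it. Under that assumption, a new write must query at the READ level (and so waits for earlier reads and writes), while a new read queries at the WRITE level (and so waits only for earlier writes and kernel fences).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; ordering assumed to match enum dma_resv_usage. */
enum model_usage {
	MODEL_USAGE_KERNEL,   /* kernel memory management, never ignored */
	MODEL_USAGE_WRITE,    /* implicit-sync write access */
	MODEL_USAGE_READ,     /* implicit-sync read access */
	MODEL_USAGE_BOOKKEEP, /* no implicit sync */
};

/*
 * Models dma_resv_usage_rw(): pick the query level for a new access.
 * It looks inverted, but a new write has to wait for existing readers
 * and writers, while a new read only has to wait for existing writers.
 */
static enum model_usage model_usage_rw(bool write)
{
	return write ? MODEL_USAGE_READ : MODEL_USAGE_WRITE;
}

/* A fence is a dependency if its usage is at or below the query level. */
static bool model_is_dependency(enum model_usage fence, enum model_usage query)
{
	return fence <= query;
}

/* Convenience wrapper: must a new access wait for an existing fence? */
static bool model_must_wait(bool new_access_writes, enum model_usage fence)
{
	return model_is_dependency(fence, model_usage_rw(new_access_writes));
}
```

With this model, model_must_wait(false, MODEL_USAGE_READ) is false (two readers do not serialize), while model_must_wait(true, MODEL_USAGE_READ) is true. It also shows why always adding a write fence is safe but slow: every later access, read or write, ends up waiting on it.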
|
2021-06-24 20:52:46 +08:00
|
|
|
* DYNAMIC IMPORTER RULES:
|
|
|
|
*
|
|
|
|
* Dynamic importers, see dma_buf_attachment_is_dynamic(), have
|
|
|
|
* additional constraints on how they set up fences:
|
|
|
|
*
|
2021-11-09 18:08:18 +08:00
|
|
|
* - Dynamic importers must obey the write fences and wait for them to
|
2021-06-24 20:52:46 +08:00
|
|
|
* signal before allowing access to the buffer's underlying storage
|
|
|
|
* through the device.
|
|
|
|
*
|
|
|
|
* - Dynamic importers should set fences for any access that they can't
|
|
|
|
* disable immediately from their &dma_buf_attach_ops.move_notify
|
|
|
|
* callback.
|
2021-08-05 18:47:05 +08:00
|
|
|
*
|
|
|
|
* IMPORTANT:
|
|
|
|
*
|
2021-11-09 18:08:18 +08:00
|
|
|
* All drivers and memory management related functions must obey the
|
|
|
|
* struct dma_resv rules, specifically the rules for updating and
|
|
|
|
* obeying fences. See enum dma_resv_usage for further descriptions.
|
2021-06-24 00:17:12 +08:00
|
|
|
*/
|
2019-08-11 16:06:32 +08:00
|
|
|
struct dma_resv *resv;
|
2014-07-01 18:57:43 +08:00
|
|
|
|
2021-06-24 00:17:12 +08:00
|
|
|
/** @poll: for userspace poll support */
|
2014-07-01 18:57:43 +08:00
|
|
|
wait_queue_head_t poll;
|
|
|
|
|
2021-10-21 14:55:24 +08:00
|
|
|
/** @cb_in: for userspace poll support */
|
|
|
|
/** @cb_out: for userspace poll support */
|
2014-07-01 18:57:43 +08:00
|
|
|
struct dma_buf_poll_cb_t {
|
2016-10-25 20:00:45 +08:00
|
|
|
struct dma_fence_cb cb;
|
2014-07-01 18:57:43 +08:00
|
|
|
wait_queue_head_t *poll;
|
|
|
|
|
2017-07-04 11:53:17 +08:00
|
|
|
__poll_t active;
|
2021-06-15 19:12:33 +08:00
|
|
|
} cb_in, cb_out;
|
2021-06-04 05:47:51 +08:00
|
|
|
#ifdef CONFIG_DMABUF_SYSFS_STATS
|
2021-06-24 00:17:12 +08:00
|
|
|
/**
|
|
|
|
* @sysfs_entry:
|
|
|
|
*
|
|
|
|
* For exposing information about this buffer in sysfs. See also
|
|
|
|
* `DMA-BUF statistics`_ for the uapi this enables.
|
|
|
|
*/
|
2021-06-04 05:47:51 +08:00
|
|
|
struct dma_buf_sysfs_entry {
|
|
|
|
struct kobject kobj;
|
|
|
|
struct dma_buf *dmabuf;
|
|
|
|
} *sysfs_entry;
|
|
|
|
#endif
|
2011-12-26 17:23:15 +08:00
|
|
|
};
|
|
|
|
|
2018-07-03 22:42:26 +08:00
|
|
|
/**
|
|
|
|
* struct dma_buf_attach_ops - importer operations for an attachment
|
|
|
|
*
|
|
|
|
* Attachment operations implemented by the importer.
|
|
|
|
*/
|
|
|
|
struct dma_buf_attach_ops {
|
2018-03-23 00:09:42 +08:00
|
|
|
/**
|
|
|
|
* @allow_peer2peer:
|
|
|
|
*
|
|
|
|
* If this is set to true the importer must be able to handle peer
|
|
|
|
* resources without struct pages.
|
|
|
|
*/
|
|
|
|
bool allow_peer2peer;
|
|
|
|
|
2018-07-03 22:42:26 +08:00
|
|
|
/**
|
2020-04-08 12:20:34 +08:00
|
|
|
* @move_notify: [optional] notification that the DMA-buf is moving
|
2018-07-03 22:42:26 +08:00
|
|
|
*
|
|
|
|
* If this callback is provided the framework can avoid pinning the
|
|
|
|
* backing store while mappings exist.
|
|
|
|
*
|
|
|
|
* This callback is called with the lock of the reservation object
|
|
|
|
* associated with the dma_buf held and the mapping function must be
|
|
|
|
* called with this lock held as well. This makes sure that no mapping
|
|
|
|
* is created concurrently with an ongoing move operation.
|
|
|
|
*
|
|
|
|
* Mappings stay valid and are not directly affected by this callback.
|
|
|
|
* But the DMA-buf can now be in a different physical location, so all
|
|
|
|
* mappings should be destroyed and re-created as soon as possible.
|
|
|
|
*
|
|
|
|
* New mappings can be created after this callback returns, and will
|
|
|
|
* point to the new location of the DMA-buf.
|
|
|
|
*/
|
|
|
|
void (*move_notify)(struct dma_buf_attachment *attach);
|
|
|
|
};
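The move_notify contract described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical and none of it is the kernel API. A plain `locked` flag stands in for the reservation lock, and an integer stands in for the physical placement of the backing store. It shows the importer marking its mapping stale from the callback, and re-creating the mapping under the lock before the next access.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; NOT the kernel's struct dma_buf / dma_buf_attachment. */
struct toy_attachment;

struct toy_attach_ops {
	/* Called by the "exporter" while it holds the buffer lock. */
	void (*move_notify)(struct toy_attachment *attach);
};

struct toy_buf {
	bool locked;	/* models the reservation lock */
	int location;	/* models the current backing-store placement */
};

struct toy_attachment {
	struct toy_buf *buf;
	const struct toy_attach_ops *ops;
	bool mapping_stale;	/* set by move_notify, cleared on remap */
	int mapped_location;	/* placement the current mapping points at */
};

/* Importer side: (re)create the mapping; the caller must hold the lock. */
void toy_map(struct toy_attachment *attach)
{
	assert(attach->buf->locked);
	attach->mapped_location = attach->buf->location;
	attach->mapping_stale = false;
}

/* Importer callback: only record that the mapping must be re-created. */
void toy_move_notify(struct toy_attachment *attach)
{
	assert(attach->buf->locked);
	attach->mapping_stale = true;
}

/* Exporter side: move the backing store and notify the importer. */
void toy_move(struct toy_buf *buf, struct toy_attachment *attach,
	      int new_location)
{
	buf->locked = true;
	buf->location = new_location;
	if (attach->ops && attach->ops->move_notify)
		attach->ops->move_notify(attach);
	buf->locked = false;
}

/* Importer side: remap lazily, under the lock, before the next access. */
int toy_access(struct toy_attachment *attach)
{
	if (attach->mapping_stale) {
		attach->buf->locked = true;
		toy_map(attach);
		attach->buf->locked = false;
	}
	return attach->mapped_location;
}

/* Map at location 1, move to location 2, then access again. */
int toy_demo(void)
{
	static const struct toy_attach_ops ops = {
		.move_notify = toy_move_notify,
	};
	struct toy_buf buf = { .locked = false, .location = 1 };
	struct toy_attachment attach = { .buf = &buf, .ops = &ops };

	buf.locked = true;
	toy_map(&attach);
	buf.locked = false;

	toy_move(&buf, &attach, 2);
	return toy_access(&attach);
}
```

In this model the old mapping is left in place after the move and is only replaced on the next access, mirroring the statement that mappings are not directly affected by the callback but should be re-created as soon as possible.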
|
|
|
|
|
2011-12-26 17:23:15 +08:00
|
|
|
/**
|
|
|
|
* struct dma_buf_attachment - holds device-buffer attachment data
|
|
|
|
* @dmabuf: buffer for this attachment.
|
|
|
|
* @dev: device attached to the buffer.
|
dma-buf: change DMA-buf locking convention v3
This patch is a stripped down version of the locking changes
necessary to support dynamic DMA-buf handling.
It adds a dynamic flag for both importers as well as exporters
so that drivers can choose if they want the reservation object
locked or unlocked during mapping of attachments.
For compatibility between drivers we cache the DMA-buf mapping
during attaching an importer as soon as exporter/importer
disagree on the dynamic handling.
Issues and solutions we considered:
- We can't change all existing drivers, and existing importers have
strong opinions about which locks they're holding while calling
dma_buf_attachment_map/unmap. Exporters also have strong opinions about
which locks they can acquire in their ->map/unmap callbacks, leaving no
room for change. The solution to avoid this was to move the
actual map/unmap out from this call, into the attach/detach callbacks,
and cache the mapping. This works because drivers don't call
attach/detach from deep within their code callchains (like deep in
memory management code called from cs/execbuf ioctl), but directly from
the fd2handle implementation.
- The caching has some troubles on some soc drivers, which set other modes
than DMA_BIDIRECTIONAL. We can't have 2 incompatible mappings, and we
can't re-create the mapping at _map time due to the above locking fun.
We very carefully step around that by only caching at attach time if the
dynamic mode between importer/exporter mismatches.
- There's been quite some discussion on dma-buf mappings which need active
cache management, which would all break down when caching, plus we don't
have explicit flush operations on the attachment side. The solution to
this was to shrug and keep the current discrepancy between what the
dma-buf docs claim and what implementations do, with the hope that the
begin/end_cpu_access hooks are good enough and that all necessary
flushing to keep device mappings consistent will be done there.
v2: cleanup set_name merge, improve kerneldoc
v3: update commit message, kerneldoc and cleanup _debug_show()
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/336788/
2018-07-03 22:42:26 +08:00
|
|
|
* @node: list of dma_buf_attachment, protected by dma_resv lock of the dmabuf.
|
2018-07-03 22:42:26 +08:00
|
|
|
* @sgt: cached mapping.
|
|
|
|
* @dir: direction of cached mapping.
|
2018-03-23 00:09:42 +08:00
|
|
|
* @peer2peer: true if the importer can handle peer resources without pages.
|
2011-12-26 17:23:15 +08:00
|
|
|
* @priv: exporter specific attachment data.
|
2018-07-03 22:42:26 +08:00
|
|
|
* @importer_ops: importer operations for this attachment, if provided
|
|
|
|
* dma_buf_map/unmap_attachment() must be called with the dma_resv lock held.
|
|
|
|
* @importer_priv: importer specific attachment data.
|
2011-12-26 17:23:15 +08:00
|
|
|
*
|
|
|
|
* This structure holds the attachment information between the dma_buf buffer
|
|
|
|
* and its user device(s). The list contains one attachment struct per device
|
|
|
|
* attached to the buffer.
|
2016-12-10 02:53:07 +08:00
|
|
|
*
|
|
|
|
* An attachment is created by calling dma_buf_attach(), and released again by
|
|
|
|
* calling dma_buf_detach(). The DMA mapping itself needed to initiate a
|
|
|
|
* transfer is created by dma_buf_map_attachment() and freed again by calling
|
|
|
|
* dma_buf_unmap_attachment().
|
2011-12-26 17:23:15 +08:00
|
|
|
*/
|
|
|
|
struct dma_buf_attachment {
|
|
|
|
struct dma_buf *dmabuf;
|
|
|
|
struct device *dev;
|
|
|
|
struct list_head node;
|
2018-07-03 22:42:26 +08:00
|
|
|
struct sg_table *sgt;
|
|
|
|
enum dma_data_direction dir;
|
2018-03-23 00:09:42 +08:00
|
|
|
bool peer2peer;
|
2018-07-03 22:42:26 +08:00
|
|
|
const struct dma_buf_attach_ops *importer_ops;
|
|
|
|
void *importer_priv;
|
2011-12-26 17:23:15 +08:00
|
|
|
void *priv;
|
|
|
|
};
|
|
|
|
|
2015-01-23 15:23:43 +08:00
|
|
|
/**
|
|
|
|
* struct dma_buf_export_info - holds information needed to export a dma_buf
|
2015-05-05 17:26:15 +08:00
|
|
|
* @exp_name: name of the exporter - useful for debugging.
|
|
|
|
* @owner: pointer to exporter module - used for refcounting kernel module
|
2015-01-23 15:23:43 +08:00
|
|
|
* @ops: Attach allocator-defined dma buf ops to the new buffer
|
2020-11-11 05:41:17 +08:00
|
|
|
* @size: Size of the buffer - invariant over the lifetime of the buffer
|
2015-01-23 15:23:43 +08:00
|
|
|
* @flags: mode flags for the file
|
|
|
|
* @resv: reservation-object, NULL to allocate default one
|
|
|
|
* @priv: Attach private data of allocator to this buffer
|
|
|
|
*
|
|
|
|
* This structure holds the information required to export the buffer. Used
|
|
|
|
* with dma_buf_export() only.
|
|
|
|
*/
|
|
|
|
struct dma_buf_export_info {
|
|
|
|
const char *exp_name;
|
2015-05-05 17:26:15 +08:00
|
|
|
struct module *owner;
|
2015-01-23 15:23:43 +08:00
|
|
|
const struct dma_buf_ops *ops;
|
|
|
|
size_t size;
|
|
|
|
int flags;
|
2019-08-11 16:06:32 +08:00
|
|
|
struct dma_resv *resv;
|
2015-01-23 15:23:43 +08:00
|
|
|
void *priv;
|
|
|
|
};
|
|
|
|
|
|
|
|
/**
|
2016-12-10 02:53:07 +08:00
|
|
|
* DEFINE_DMA_BUF_EXPORT_INFO - helper macro for exporters
|
2016-04-01 04:26:50 +08:00
|
|
|
* @name: export-info name
|
2016-12-10 02:53:07 +08:00
|
|
|
*
|
2016-12-30 04:48:24 +08:00
|
|
|
* DEFINE_DMA_BUF_EXPORT_INFO macro defines the &struct dma_buf_export_info,
|
2016-12-10 02:53:07 +08:00
|
|
|
* zeroes it out and pre-populates exp_name in it.
|
2015-01-23 15:23:43 +08:00
|
|
|
*/
|
2016-04-01 04:26:50 +08:00
|
|
|
#define DEFINE_DMA_BUF_EXPORT_INFO(name) \
|
|
|
|
struct dma_buf_export_info name = { .exp_name = KBUILD_MODNAME, \
|
2015-05-05 17:26:15 +08:00
|
|
|
.owner = THIS_MODULE }
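The macro above only fills in exp_name and owner; the exporter is expected to set the remaining fields before calling dma_buf_export(). The following is a minimal userspace sketch of that pattern, not kernel code: KBUILD_MODNAME, THIS_MODULE, struct module and struct dma_buf_ops are stand-ins defined locally so the example builds outside the kernel, and demo_prepare_export() and demo_ops are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for kernel definitions so the sketch builds in userspace. */
#define KBUILD_MODNAME "demo_exporter"	/* assumption: normally provided by kbuild */
struct module { int unused; };
static struct module demo_module;
#define THIS_MODULE (&demo_module)
struct dma_buf_ops { int unused; };
struct dma_resv;

/* Same fields as the struct shown above. */
struct dma_buf_export_info {
	const char *exp_name;
	struct module *owner;
	const struct dma_buf_ops *ops;
	size_t size;
	int flags;
	struct dma_resv *resv;
	void *priv;
};

#define DEFINE_DMA_BUF_EXPORT_INFO(name)	\
	struct dma_buf_export_info name = { .exp_name = KBUILD_MODNAME, \
					    .owner = THIS_MODULE }

/* Hypothetical, empty ops table standing in for the exporter callbacks. */
static const struct dma_buf_ops demo_ops = { 0 };

/* Hypothetical helper: set the fields the macro leaves zeroed. */
static struct dma_buf_export_info demo_prepare_export(size_t size, void *priv)
{
	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);

	exp_info.ops = &demo_ops;
	exp_info.size = size;
	exp_info.flags = O_RDWR;	/* mode flags for the file */
	exp_info.priv = priv;
	/* resv stays NULL, which per the kerneldoc requests a default one. */
	return exp_info;
}
```

In a real driver the filled-in structure would then be passed to dma_buf_export(); that call is left out here because it only exists inside the kernel.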
|
2015-01-23 15:23:43 +08:00
|
|
|
|
2012-03-17 00:04:41 +08:00
|
|
|
/**
|
|
|
|
* get_dma_buf - convenience wrapper for get_file.
|
|
|
|
* @dmabuf: [in] pointer to dma_buf
|
|
|
|
*
|
|
|
|
* Increments the reference count on the dma-buf, needed in case of drivers
|
|
|
|
* that need to create additional references to the dmabuf on the
|
|
|
|
* kernel side. For example, an exporter that needs to keep a dmabuf ptr
|
|
|
|
* so that subsequent exports don't create a new dmabuf.
|
|
|
|
*/
|
|
|
|
static inline void get_dma_buf(struct dma_buf *dmabuf)
|
|
|
|
{
|
|
|
|
get_file(dmabuf->file);
|
|
|
|
}
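The use case named in the comment above (an exporter that keeps a dmabuf pointer so repeated exports reuse it) can be sketched in userspace as follows. This is a simplified model under stated assumptions: struct file is reduced to a plain counter (the kernel uses an atomic count), get_file() and fput() are local stand-ins, dma_buf_put() is modelled as a plain fput() on the backing file, and struct demo_exporter and demo_export_again() are hypothetical names.

```c
#include <assert.h>

/* Userspace stand-ins; the real struct file uses an atomic counter. */
struct file { long f_count; };
struct dma_buf { struct file *file; };

static void get_file(struct file *f) { f->f_count++; }
static void fput(struct file *f) { f->f_count--; }

/* Same body as the inline helper above. */
static void get_dma_buf(struct dma_buf *dmabuf)
{
	get_file(dmabuf->file);
}

/* Modelled as dropping one reference on the backing file. */
static void dma_buf_put(struct dma_buf *dmabuf)
{
	fput(dmabuf->file);
}

/* Hypothetical exporter state that remembers an already exported buffer. */
struct demo_exporter { struct dma_buf *cached; };

/*
 * Hypothetical re-export path: hand out the cached dma_buf again, taking
 * an extra reference so the caller owns one it can later drop.
 */
static struct dma_buf *demo_export_again(struct demo_exporter *exp)
{
	get_dma_buf(exp->cached);
	return exp->cached;
}
```

Each reference taken this way has to be balanced by one dma_buf_put(), otherwise the buffer is never released.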
|
|
|
|
|
dma-buf: change DMA-buf locking convention v3
This patch is a stripped down version of the locking changes
necessary to support dynamic DMA-buf handling.
It adds a dynamic flag for both importers as well as exporters
so that drivers can choose if they want the reservation object
locked or unlocked during mapping of attachments.
For compatibility between drivers we cache the DMA-buf mapping
during attaching an importer as soon as exporter/importer
disagree on the dynamic handling.
Issues and solutions we considered:
- We can't change all existing drivers, and existing importers have
strong opinions about which locks they're holding while calling
dma_buf_attachment_map/unmap. Exporters also have strong opinions about
which locks they can acquire in their ->map/unmap callbacks, leaving no
room for change. The solution to avoid this was to move the
actual map/unmap out from this call, into the attach/detach callbacks,
and cache the mapping. This works because drivers don't call
attach/detach from deep within their code callchains (like deep in
memory management code called from cs/execbuf ioctl), but directly from
the fd2handle implementation.
- The caching has some troubles on some soc drivers, which set other modes
than DMA_BIDIRECTIONAL. We can't have 2 incompatible mappings, and we
can't re-create the mapping at _map time due to the above locking fun.
We very carefully step around that by only caching at attach time if the
dynamic mode between importer/exporter mismatches.
- There's been quite some discussion on dma-buf mappings which need active
cache management, which would all break down when caching, plus we don't
have explicit flush operations on the attachment side. The solution to
this was to shrug and keep the current discrepancy between what the
dma-buf docs claim and what implementations do, with the hope that the
begin/end_cpu_access hooks are good enough and that all necessary
flushing to keep device mappings consistent will be done there.
v2: cleanup set_name merge, improve kerneldoc
v3: update commit message, kerneldoc and cleanup _debug_show()
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/336788/
2018-07-03 22:42:26 +08:00
|
|
|
/**
|
|
|
|
* dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
|
|
|
|
* @dmabuf: the DMA-buf to check
|
|
|
|
*
|
|
|
|
* Returns true if a DMA-buf exporter wants to be called with the dma_resv
|
|
|
|
* locked for the map/unmap callbacks, false if it doesn't want to be called
|
|
|
|
* with the lock held.
|
|
|
|
*/
|
|
|
|
static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
|
|
|
|
{
|
2020-02-18 23:57:24 +08:00
|
|
|
return !!dmabuf->ops->pin;
|
2018-07-03 22:42:26 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* dma_buf_attachment_is_dynamic - check if a DMA-buf attachment uses dynamic
|
2021-08-09 20:22:46 +08:00
|
|
|
* mappings
|
2018-07-03 22:42:26 +08:00
|
|
|
* @attach: the DMA-buf attachment to check
|
|
|
|
*
|
|
|
|
* Returns true if a DMA-buf importer wants to call the map/unmap functions with
|
|
|
|
* the dma_resv lock held.
|
|
|
|
*/
|
|
|
|
static inline bool
|
|
|
|
dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
|
|
|
|
{
|
2018-07-03 22:42:26 +08:00
|
|
|
return !!attach->importer_ops;
|
2018-07-03 22:42:26 +08:00
|
|
|
}
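The locking-convention commit message states that the mapping is cached at attach time only when exporter and importer disagree on dynamic handling. A minimal userspace sketch of that check, built on the two predicates above, might look like this. The struct definitions are trimmed stand-ins holding only the fields the predicates read, and demo_needs_cached_mapping() and demo_pin() are hypothetical names; how a given kernel version actually implements the decision is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Trimmed stand-ins: only the fields the predicates look at. */
struct dma_buf_attachment;
struct dma_buf_attach_ops { int unused; };
struct dma_buf_ops {
	int (*pin)(struct dma_buf_attachment *attach);
};
struct dma_buf { const struct dma_buf_ops *ops; };
struct dma_buf_attachment {
	struct dma_buf *dmabuf;
	const struct dma_buf_attach_ops *importer_ops;
};

/* Exporter side: dynamic if it provides a pin callback. */
static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
{
	return !!dmabuf->ops->pin;
}

/* Importer side: dynamic if it passed importer_ops when attaching. */
static inline bool
dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
{
	return !!attach->importer_ops;
}

/* Hypothetical helper: true when the two sides disagree. */
static bool demo_needs_cached_mapping(struct dma_buf_attachment *attach)
{
	return dma_buf_attachment_is_dynamic(attach) !=
	       dma_buf_is_dynamic(attach->dmabuf);
}

/* Hypothetical pin callback, used only to mark an exporter as dynamic. */
static int demo_pin(struct dma_buf_attachment *attach)
{
	(void)attach;
	return 0;
}
```

When both sides are dynamic, or both are not, no cached mapping is needed and the mapping can be created at map time under the agreed locking rules.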
|
|
|
|
|
2011-12-26 17:23:15 +08:00
|
|
|
struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
|
2018-07-03 22:42:26 +08:00
|
|
|
struct device *dev);
|
|
|
|
struct dma_buf_attachment *
|
|
|
|
dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
|
2018-07-03 22:42:26 +08:00
|
|
|
const struct dma_buf_attach_ops *importer_ops,
|
|
|
|
void *importer_priv);
|
2011-12-26 17:23:15 +08:00
|
|
|
void dma_buf_detach(struct dma_buf *dmabuf,
|
2018-07-03 22:42:26 +08:00
|
|
|
struct dma_buf_attachment *attach);
|
2018-07-03 22:42:26 +08:00
|
|
|
int dma_buf_pin(struct dma_buf_attachment *attach);
|
|
|
|
void dma_buf_unpin(struct dma_buf_attachment *attach);
|
2013-03-22 20:52:16 +08:00
|
|
|
|
2015-01-23 15:23:43 +08:00
|
|
|
struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
|
2013-03-22 20:52:16 +08:00
|
|
|
|
2012-03-16 18:34:02 +08:00
|
|
|
int dma_buf_fd(struct dma_buf *dmabuf, int flags);
|
2011-12-26 17:23:15 +08:00
|
|
|
struct dma_buf *dma_buf_get(int fd);
|
|
|
|
void dma_buf_put(struct dma_buf *dmabuf);
|
|
|
|
|
|
|
|
struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
|
|
|
|
enum dma_data_direction);
|
2012-01-27 17:39:27 +08:00
|
|
|
void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
|
|
|
|
enum dma_data_direction);
|
2018-07-03 22:42:26 +08:00
|
|
|
void dma_buf_move_notify(struct dma_buf *dma_buf);
|
2015-12-23 05:36:45 +08:00
|
|
|
int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
|
2012-03-20 07:02:37 +08:00
|
|
|
enum dma_data_direction dir);
|
dma-buf, drm, ion: Propagate error code from dma_buf_start_cpu_access()
Drivers, especially i915.ko, can fail during the initial migration of a
dma-buf for CPU access. However, the error code from the driver was not
being propagated back to ioctl and so userspace was blissfully ignorant
of the failure. Rendering corruption ensues.
Whilst fixing the ioctl to return the error code from
dma_buf_start_cpu_access(), also do the same for
dma_buf_end_cpu_access(). For most drivers, dma_buf_end_cpu_access()
cannot fail. i915.ko however, as most drivers would, wants to avoid being
uninterruptible (as would be required to guarantee no failure when
flushing the buffer to the device). As userspace already has to handle
errors from the SYNC_IOCTL, take advantage of this to be able to restart
the syscall across signals.
This fixes a coherency issue for i915.ko as well as reducing the
uninterruptible hold upon its BKL, the struct_mutex.
Fixes commit c11e391da2a8fe973c3c2398452000bed505851e
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Feb 11 20:04:51 2016 -0200
dma-buf: Add ioctls to allow userspace to flush
Testcase: igt/gem_concurrent_blit/*dmabuf*interruptible
Testcase: igt/prime_mmap_coherency/ioctl-errors
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tiago Vignatti <tiago.vignatti@intel.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Daniel Vetter <daniel.vetter@intel.com>
CC: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: intel-gfx@lists.freedesktop.org
Cc: devel@driverdev.osuosl.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1458331359-2634-1-git-send-email-chris@chris-wilson.co.uk
2016-03-19 04:02:39 +08:00
|
|
|
int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
|
|
|
|
enum dma_data_direction dir);
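Because errors from begin/end CPU access are propagated to the sync ioctl, userspace has to check the result and may need to restart the call after a signal. The following is a small userspace sketch of such a wrapper, assuming a Linux system with the <linux/dma-buf.h> UAPI header installed; demo_dmabuf_sync() is a hypothetical name, and retrying on both EINTR and EAGAIN is a common convention rather than something this header mandates.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/*
 * Hypothetical wrapper around DMA_BUF_IOCTL_SYNC.
 * Restarts the ioctl when it was interrupted, and returns 0 on success
 * or a negative errno value on any other failure.
 */
static int demo_dmabuf_sync(int fd, __u64 flags)
{
	struct dma_buf_sync sync = { .flags = flags };
	int ret;

	do {
		ret = ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync);
	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));

	return ret == -1 ? -errno : 0;
}
```

A caller would typically bracket CPU access to an mmap'ed dma-buf with DMA_BUF_SYNC_START and DMA_BUF_SYNC_END, each combined with DMA_BUF_SYNC_READ, DMA_BUF_SYNC_WRITE or DMA_BUF_SYNC_RW, and treat a negative return as a failed access.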
|
2022-10-18 01:22:10 +08:00
|
|
|
struct sg_table *
|
|
|
|
dma_buf_map_attachment_unlocked(struct dma_buf_attachment *attach,
|
|
|
|
enum dma_data_direction direction);
|
|
|
|
void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach,
|
|
|
|
struct sg_table *sg_table,
|
|
|
|
enum dma_data_direction direction);
|
dma-buf: mmap support
Compared to Rob Clark's RFC I've ditched the prepare/finish hooks
and corresponding ioctls on the dma_buf file. The major reason for
that is that many people seem to be under the impression that this is
also for synchronization with outstanding asynchronous processing.
I'm pretty massively opposed to this because:
- It boils down to reinventing a new rather general-purpose userspace
synchronization interface. If we look at things like futexes, this
is hard to get right.
- Furthermore a lot of kernel code has to interact with this
synchronization primitive. This smells a lot like the dri1 hw_lock,
a horror show I prefer not to reinvent.
- Even more fun is that multiple different subsystems would interact
here, so we have plenty of opportunities to create funny deadlock
scenarios.
I think synchronization is a wholesale different problem from data
sharing and should be tackled as an orthogonal problem.
Now we could demand that prepare/finish may only ensure cache
coherency (as Rob intended), but that runs up into the next problem:
We not only need mmap support to facilitate sw-only processing nodes
in a pipeline (without jumping through hoops by importing the dma_buf
into some sw-access only importer), which allows for a nicer
ION->dma-buf upgrade path for existing Android userspace. We also need
mmap support for existing importing subsystems to support existing
userspace libraries. And a lot of these subsystems are expected to
export coherent userspace mappings.
So prepare/finish can only ever be optional and the exporter /needs/
to support coherent mappings. Given that mmap access is always
somewhat fallback-y in nature I've decided to drop this optimization,
instead of just making it optional. If we demonstrate a clear need for
this, supported by benchmark results, we can always add it in again
later as an optional extension.
The other difference compared to Rob's RFC is the above-mentioned support
for mapping a dma-buf through facilities provided by the importer,
which results in mmap support no longer being optional.
Note that this dma-buf mmap patch does _not_ support every possible
insanity an existing subsystem could pull off with mmap: because it
does not allow intercepting page faults and shooting down PTEs, importing
subsystems can't add some magic of their own at these points (e.g. to
automatically synchronize with outstanding rendering or set up some
special resources). I've done a cursory read through a few mmap
implementations of various subsystems and I'm hopeful that we can avoid
this (and the complexity it'd bring with it).
Additionally, I've extended the documentation a bit to explain the hows
and whys of this mmap extension.
In case we ever want to add support for explicitly cache-managed
userspace mmap with a prepare/finish ioctl pair, we could specify that
userspace needs to mmap a different part of the dma_buf, e.g. the
range starting at dma_buf->size up to dma_buf->size*2. This works
because the size of a dma_buf is invariant over its lifetime. The
exporter would obviously need to fall back to coherent mappings for
both ranges if a legacy client maps the coherent range and the
architecture cannot support conflicting caching policies. Also, this
would obviously be optional and userspace needs to be able to fall
back to coherent mappings.
v2:
- Spelling fixes from Rob Clark.
- Compile fix for !DMA_BUF from Rob Clark.
- Extend commit message to explain how explicitly cache managed mmap
support could be added later.
- Extend the documentation with implementation notes for exporters
that need to manually fake coherency.
v3:
- dma_buf pointer initialization goof-up noticed by Rebecca Schultz
Zavin.
Cc: Rob Clark <rob.clark@linaro.org>
Cc: Rebecca Schultz Zavin <rebecca@android.com>
Acked-by: Rob Clark <rob.clark@linaro.org>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
2012-04-24 17:08:52 +08:00
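The commit message above floats one possible way to add explicitly cache-managed mmap later: userspace would map the range starting at dma_buf->size up to dma_buf->size*2, which works because the size never changes. As a rough userspace sketch of how an offset could be classified under that scheme (every name here is hypothetical and none of it is part of the dma-buf API):

```c
#include <stddef.h>

/* Hypothetical mapping kinds for the two-window scheme. */
enum sketch_map_kind {
	SKETCH_MAP_COHERENT,      /* offset in [0, size) */
	SKETCH_MAP_CACHE_MANAGED, /* offset in [size, 2 * size) */
	SKETCH_MAP_INVALID,
};

/*
 * Classify an mmap request of len bytes at offset against a buffer of
 * size bytes. On success, *buf_offset receives the offset into the
 * underlying buffer. Assumption: a mapping may not straddle the
 * boundary between the two windows.
 */
enum sketch_map_kind sketch_classify(size_t size, size_t offset, size_t len,
				     size_t *buf_offset)
{
	enum sketch_map_kind kind;
	size_t base;

	if (size == 0 || len == 0)
		return SKETCH_MAP_INVALID;

	if (offset < size) {
		base = 0;
		kind = SKETCH_MAP_COHERENT;
	} else if (offset - size < size) {
		/* Written this way to avoid computing 2 * size. */
		base = size;
		kind = SKETCH_MAP_CACHE_MANAGED;
	} else {
		return SKETCH_MAP_INVALID;
	}

	/* The request must end inside the window it starts in. */
	if (len > size - (offset - base))
		return SKETCH_MAP_INVALID;

	*buf_offset = offset - base;
	return kind;
}
```

An exporter that cannot support conflicting caching policies would, as the message notes, still have to fall back to coherent mappings for both windows.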
|
|
|
|
|
|
|
int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *,
|
|
|
|
unsigned long);
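The trailing unsigned long argument of dma_buf_mmap() is a page offset into the buffer, so any mmap helper of this shape has to validate the requested range against the buffer size before handing the VMA to the exporter. The following is a userspace model of that kind of check, not the kernel implementation; the function name, the fixed 4 KiB page size, and the plain integer parameters are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Validate a mapping of map_pages pages starting at page offset pgoff
 * against a buffer of buf_size bytes. Returns 0 if the range fits,
 * or a negative errno value otherwise.
 */
int sketch_check_mmap_range(size_t buf_size, unsigned long pgoff,
			    unsigned long map_pages)
{
	/* Reject wrap-around of pgoff + map_pages. */
	if (pgoff + map_pages < pgoff)
		return -EOVERFLOW;

	/* Reject mappings that run past the end of the buffer. */
	if (pgoff + map_pages > (buf_size >> SKETCH_PAGE_SHIFT))
		return -EINVAL;

	return 0;
}
```

Checking for wrap-around first matters: without it, a huge pgoff could make the sum small enough to pass the size comparison.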
|
2022-02-05 01:05:41 +08:00
|
|
|
int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map);
|
|
|
|
void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map);
|
2022-10-18 01:22:09 +08:00
|
|
|
int dma_buf_vmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map);
|
|
|
|
void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map);
|
dma-buf: Introduce dma buffer sharing mechanism
This is the first step in defining a dma buffer sharing mechanism.
A new buffer object dma_buf is added, with operations and API to allow easy
sharing of this buffer object across devices.
The framework allows:
- creation of a buffer object, its association with a file pointer, and
associated allocator-defined operations on that buffer. This operation is
called the 'export' operation.
- different devices to 'attach' themselves to this exported buffer object, to
facilitate backing storage negotiation, using dma_buf_attach() API.
- the exported buffer object to be shared with the other entity by asking for
its 'file-descriptor (fd)', and sharing the fd across.
- a received fd to get the buffer object back, where it can be accessed using
the associated exporter-defined operations.
- the exporter and user to share the scatterlist associated with this buffer
object using map_dma_buf and unmap_dma_buf operations.
At least one 'attach()' call is required to be made prior to calling the
map_dma_buf() operation.
Couple of building blocks in map_dma_buf() are added to ease introduction
of sync'ing across exporter and users, and late allocation by the exporter.
For this first version, this framework will work with certain conditions:
- *ONLY* exporter will be allowed to mmap to userspace (outside of this
framework - mmap is not a buffer object operation),
- currently, *ONLY* users that do not need CPU access to the buffer are
allowed.
More details are there in the documentation patch.
This is based on design suggestions from many people at the mini-summits[1],
most notably from Arnd Bergmann <arnd@arndb.de>, Rob Clark <rob@ti.com> and
Daniel Vetter <daniel@ffwll.ch>.
The implementation is inspired from proof-of-concept patch-set from
Tomasz Stanislawski <t.stanislaws@samsung.com>, who demonstrated buffer sharing
between two v4l2 devices. [2]
[1]: https://wiki.linaro.org/OfficeofCTO/MemoryManagement
[2]: http://lwn.net/Articles/454389
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@ti.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-and-Tested-by: Rob Clark <rob.clark@linaro.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-12-26 17:23:15 +08:00
|
|
|
#endif /* __DMA_BUF_H__ */
|