Call reset using the wrapper function vdpa_reset() to make sure the
operation is serialized with cf_mutex.
This comes to protect from the following possible scenario:
vhost_vdpa_set_status() could call the reset op. Since the call is not
protected by cf_mutex, a netlink thread calling vdpa_dev_config_fill
could get passed the VIRTIO_CONFIG_S_FEATURES_OK check in
vdpa_dev_config_fill() and end up reporting wrong features.
Fixes: 5f6e85953d8f ("vdpa: Read device configuration only if FEATURES_OK")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220111183400.38418-3-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Avoid the wrapper holding cf_mutex since it is not protecting anything.
To avoid confusion and unnecessary overhead incurred by it, remove.
Fixes: f489f27bc0ab ("vdpa: Sync calls set/get config/status with cf_mutex")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220111183400.38418-2-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Add netlink support to configure the max virtqueue pairs for a device.
At least one pair is required. The maximum is dictated by the device.
Example:
$ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 max_vqp 4
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-6-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add wrappers to get/set status and protect these operations with
cf_mutex to serialize these operations with respect to get/set config
operations.
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-4-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Provide an interface to read the negotiated features. This is needed
when building the netlink message in vdpa_dev_net_config_fill().
Also fix the implementation of vdpa_dev_net_config_fill() to use the
negotiated features instead of the device features.
To make APIs clearer, make the following name changes to struct
vdpa_config_ops so they better describe their operations:
get_features -> get_device_features
set_features -> set_driver_features
Finally, add get_driver_features to return the negotiated features and
add implementation to all the upstream drivers.
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-2-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The return type of get_config_size is size_t so it makes
sense to change the type of the variable holding its result.
That said, this already got taken care of (differently, and arguably
not as well) by commit 3ed21c1451 ("vdpa: check that offsets are
within bounds").
The added 'c->off > size' test in that commit will be done as an
unsigned comparison on 32-bit (safe due to not being signed).
On a 64-bit platform, it will be done as a signed comparison, but in
that case the comparison will be done in 64-bit, and 'c->off' being an
u32 it will be valid thanks to the extended range (ie both values will
be positive in 64 bits).
So this was a real bug, but it was already addressed and marked for stable.
Signed-off-by: Laura Abbott <labbott@kernel.org>
Reported-by: Luo Likang <luolikang@nsfocus.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It has no sense to call get_status twice, since we already have a
variable for that.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Link: https://lore.kernel.org/r/20211104195833.2089796-1-eperezma@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
In this function "c->off" is a u32 and "size" is a long. On 64bit systems
if "c->off" is greater than "size" then "size - c->off" is a negative and
we always return -E2BIG. But on 32bit systems the subtraction is type
promoted to a high positive u32 value and basically any "c->len" is
accepted.
Fixes: 4c8cf31885 ("vhost: introduce vDPA-based backend")
Reported-by: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211208103337.GA4047@kili
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: stable@vger.kernel.org
Vdpa devices should be reset after unseting irqs of virtqueues, or we
will get errors when killing qemu process:
>> pi_update_irte: failed to update PI IRTE
>> irq bypass consumer (token 0000000065102a43) unregistration fails: -22
Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com>
Link: https://lore.kernel.org/r/a2cb60cf73be9da5c4e6399242117d8818f975ae.1636946171.git.wuzongyong@linux.alibaba.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
We can increment `total_len` directly and remove `len` since it
is no longer used for vhost_add_used().
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211122163525.294024-3-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
The "used length" reported by calling vhost_add_used() must be the
number of bytes written by the device (using "in" buffers).
In vhost_vsock_handle_tx_kick() the device only reads the guest
buffers (they are all "out" buffers), without writing anything,
so we must pass 0 as "used length" to comply virtio spec.
Fixes: 433fc58e6b ("VSOCK: Introduce vhost_vsock.ko")
Cc: stable@vger.kernel.org
Reported-by: Halil Pasic <pasic@linux.ibm.com>
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211122163525.294024-2-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Subsequent patches enable get and set configuration either
via management device or via vdpa device' config ops.
This requires synchronization between multiple callers to get and set
config callbacks. Features setting also influence the layout of the
configuration fields endianness.
To avoid exposing synchronization primitives to callers, introduce
helper for setting the configuration and use it.
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211026175519.87795-2-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes up some issues in rc5.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmFm1OcPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpp0AIAL5lvwd/Dzj0trDEOm93yTiKigkllJ9if8pF
G3V6IgyyeNujmEJLD2Fz5GPUVXg5G2+Yh2xgf8OP4syS/pZoCGkGcLIoo7lJRSvz
D3KpNYrZ2O1kFw2XWP3p7O/H8pxWAMPjykRaVoniCd+rIIpRzWdYBDDWConUGqC1
IlB5BS6cv3vRIoJ4Ac3YVOEUM4WOw/2fwzxejVjxQdNjgbWL0JBY1IBiJrfp7iEo
L3KRmN25JWCwE0x+Ehy6/uSalVzPgjESBYzrBFihnFXDS/LIIIXuRb9Q3HzEFFHT
UhccuExc8Nq8JDKWrZkYP2s950Gnv249bC9tqlHHTnoE+pxG3Ms=
=i68W
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"Fixes up some issues in rc5"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost-vdpa: Fix the wrong input in config_cb
VDUSE: fix documentation underline warning
Revert "virtio-blk: Add validation for block size in config space"
vhost_vdpa: unset vq irq before freeing irq
virtio: write back F_VERSION_1 before validate
Fix the wrong input in for config_cb. In function vhost_vdpa_config_cb,
the input cb.private was used as struct vhost_vdpa, so the input was
wrong here, fix this issue
Fixes: 776f395004 ("vhost_vdpa: Support config interrupt in vdpa")
Signed-off-by: Cindy Lu <lulu@redhat.com>
Link: https://lore.kernel.org/r/20210929090933.20465-1-lulu@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently we unset vq irq after freeing irq and that will result in
error messages:
pi_update_irte: failed to update PI IRTE
irq bypass consumer (token 000000005a07a12b) unregistration fails: -22
This patch solves this.
Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com>
Link: https://lore.kernel.org/r/02637d38dcf4e4b836c5b3a65055fe92bf812b3b.1631687872.git.wuzongyong@linux.alibaba.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Fixes up some issues in rc1.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmFSRZgPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpSnYIAMNstllnJgyDR0GUGO850AKv0x2acncf66wc
J5vjWFWh5rtmdZSMhvA5mo3J8h/7s6Mn67fCwKt0Ii6fi1f6eIl4OBDYBfV8wXkN
+e8eQAUboi3HLqsiuFSNpJTHnD70xbU4inxiTjaBndXaxk20nkWJsd1Mvfmxh+mE
uMRnumAwrdL3c0n0Vrcq8j+zxLhlDXSFjICd6l+xRwPigsX/5gY+V5tPQg4lhnk+
VuC2Q0eJHmCEjrVi4Tx7dkoDu9U4Go5CnVF0MF9AzVf7JYBPJPks3r17unMe+ZAI
Nvwa/39no2APa9wZPQnGk5V9rOPtYFa6XXsCufN4BbtlqVX5AlQ=
=2oo6
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vdpa fixes from Michael Tsirkin:
"Fixes up some issues in rc1"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpa: potential uninitialized return in vhost_vdpa_va_map()
vdpa/mlx5: Avoid executing set_vq_ready() if device is reset
vdpa/mlx5: Clear ready indication for control VQ
vduse: Cleanup the old kernel states after reset failure
vduse: missing error code in vduse_init()
virtio: don't fail on !of_device_is_compatible
The concern here is that "ret" can be uninitialized if we hit the
"goto next" condition on every iteration through the loop.
Fixes: 41ba1b5f9d4b ("vdpa: Support transferring virtual addressing during DMA mapping")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20210907073253.GB18254@kili
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
vduse driver supporting blk
virtio-vsock support for end of record with SEQPACKET
vdpa: mac and mq support for ifcvf and mlx5
vdpa: management netlink for ifcvf
virtio-i2c, gpio dt bindings
misc fixes, cleanups
NB: when merging this with
b542e383d8 ("eventfd: Make signal recursion protection a task bit")
from Linus' tree, replace eventfd_signal_count with
eventfd_signal_allowed, and drop the export of eventfd_wake_count from
("eventfd: Export eventfd_wake_count to modules").
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmE1+awPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpt6EIAJy0qrc62lktNA0IiIVJSLbUbTMmFj8MzkGR
8UxZdhpjWqBPJPyaOuNeksAqTGm/UAPEYx3C2c95Jhej7anFpy7dbCtIXcPHLJME
DjcJg+EDrlNCj8m0FcsHpHWsFzPMERJpyEZNxgB5WazQbv+yWhGrg2FN5DCnF0Ro
ZFYeKSVty148pQ0nHl8X0JM2XMtqit+O+LvKN2HQZ+fubh7BCzMxzkHY0QLHIzUS
UeZqd3Qm8YcbqnlX38P5D6k+NPiTEgknmxaBLkPxg6H3XxDAmaIRFb8Ldd1rsgy1
zTLGDiSGpVDIpawRnuEAzqJThV3Y5/MVJ1WD+mDYQ96tmhfp+KY=
=DBH/
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
- vduse driver ("vDPA Device in Userspace") supporting emulated virtio
block devices
- virtio-vsock support for end of record with SEQPACKET
- vdpa: mac and mq support for ifcvf and mlx5
- vdpa: management netlink for ifcvf
- virtio-i2c, gpio dt bindings
- misc fixes and cleanups
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (39 commits)
Documentation: Add documentation for VDUSE
vduse: Introduce VDUSE - vDPA Device in Userspace
vduse: Implement an MMU-based software IOTLB
vdpa: Support transferring virtual addressing during DMA mapping
vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap()
vdpa: Add an opaque pointer for vdpa_config_ops.dma_map()
vhost-iotlb: Add an opaque pointer for vhost IOTLB
vhost-vdpa: Handle the failure of vdpa_reset()
vdpa: Add reset callback in vdpa_config_ops
vdpa: Fix some coding style issues
file: Export receive_fd() to modules
eventfd: Export eventfd_wake_count to modules
iova: Export alloc_iova_fast() and free_iova_fast()
virtio-blk: remove unneeded "likely" statements
virtio-balloon: Use virtio_find_vqs() helper
vdpa: Make use of PFN_PHYS/PFN_UP/PFN_DOWN helper macro
vsock_test: update message bounds test for MSG_EOR
af_vsock: rename variables in receive loop
virtio/vsock: support MSG_EOR bit processing
vhost/vsock: support MSG_EOR bit processing
...
If the sendmsg() call in vhost_tx_batch() fails, both the 'batched_xdp'
and 'done_idx' indexes are left unchanged. If such failure happens
when batched_xdp == VHOST_NET_BATCH, the next call to
vhost_net_build_xdp() will access and write memory outside the xdp
buffers area.
Since sendmsg() can only error with EBADFD, this change addresses the
issue explicitly freeing the XDP buffers batch on error.
Fixes: 0a0be13b8f ("vhost_net: batch submitting XDP buffers to underlayer sockets")
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces an attribute for vDPA device to indicate
whether virtual address can be used. If vDPA device driver set
it, vhost-vdpa bus driver will not pin user page and transfer
userspace virtual address instead of physical address during
DMA mapping. And corresponding vma->vm_file and offset will be
also passed as an opaque pointer.
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-11-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The upcoming patch is going to support VA mapping/unmapping.
So let's factor out the logic of PA mapping/unmapping firstly
to make the code more readable.
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-10-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add an opaque pointer for DMA mapping.
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-9-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add an opaque pointer for vhost IOTLB. And introduce
vhost_iotlb_add_range_ctx() to accept it.
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-8-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The vdpa_reset() may fail now. This adds check to its return
value and fail the vhost_vdpa_open().
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-7-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds a new callback to support device specific reset
behavior. The vdpa bus driver will call the reset function
instead of setting status to zero during resetting.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210831103634.33-6-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it's a nice refactor to make use of
PFN_PHYS/PFN_UP/PFN_DOWN helper macro
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Link: https://lore.kernel.org/r/20210802013717.851-1-caihuoqing@baidu.com
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
'MSG_EOR' handling has similar logic as 'MSG_EOM' - if bit present
in packet's header, reset it to 0. Then restore it back if packet
processing wasn't completed. Instead of bool variable for each
flag, bit mask variable was added: it has logical OR of 'MSG_EOR'
and 'MSG_EOM' if needed, to restore flags, this variable is ORed
with flags field of packet.
Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
Link: https://lore.kernel.org/r/20210903123238.3273526-1-arseny.krasnov@kaspersky.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
This current implemented bit is used to mark end of messages
('EOM' - end of message), not records('EOR' - end of record).
Also rename 'record' to 'message' in implementation as it is
different things.
Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210903123109.3273053-1-arseny.krasnov@kaspersky.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
use SPDX-License-Identifier instead of a verbose license text
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Link: https://lore.kernel.org/r/20210821123320.734-1-caihuoqing@baidu.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Both SKB_FRAG_PAGE_ORDER are defined to the same value in
net/core/sock.c and drivers/vhost/net.c.
Move the SKB_FRAG_PAGE_ORDER definition to net/core/sock.h,
as both net/core/sock.c and drivers/vhost/net.c include it,
and it seems a reasonable file to put the macro.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As __vringh_iov() traverses a descriptor chain, it populates
each descriptor entry into either read or write vring iov
and increments that iov's ->used member. So, as we iterate
over a descriptor chain, at any point, (riov/wriov)->used
value gives the number of descriptor enteries available,
which are to be read or written by the device. As all read
iovs must precede the write iovs, wiov->used should be zero
when we are traversing a read descriptor. Current code checks
for wiov->i, to figure out whether any previous entry in the
current descriptor chain was a write descriptor. However,
iov->i is only incremented, when these vring iovs are consumed,
at a later point, and remain 0 in __vringh_iov(). So, correct
the check for read and write descriptor order, to use
wiov->used.
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Link: https://lore.kernel.org/r/1624591502-4827-1-git-send-email-neeraju@codeaurora.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This fixes the incorrect calculation for integer overflow
when the last address of iova range is 0xffffffff.
Fixes: ec33d031a1 ("vhost: detect 32 bit integer wrap around")
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210728130756.97-2-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The "msg->iova + msg->size" addition can have an integer overflow
if the iotlb message is from a malicious user space application.
So let's fix it.
Fixes: 1b48dc03e5 ("vhost: vdpa: report iova range")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210728130756.97-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch extends the vdpa_vq_state to support packed virtqueue
state which is basically the device/driver ring wrap counters and the
avail and used index. This will be used for the virito-vdpa support
for the packed virtqueue and the future vhost/vhost-vdpa support for
the packed virtqueue.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210602021536.39525-2-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
We use 3 coding styles in this struct. Switch to just tabs.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210525174733.6212-5-michael.christie@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_scsi_flush will flush everything, so we can clear the backends then
flush, then destroy. We don't need to flush before each vq destruction
because after the flush we will have made sure there can be no new cmds
started and there are no running cmds.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20210525174733.6212-4-michael.christie@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The vhost work flush function was flushing the entire work queue, so
there is no need for the double vhost_work_dev_flush calls in
vhost_scsi_flush.
And we do not need to call vhost_poll_flush for each poller because
that call also ends up flushing the same work queue thread the
vhost_work_dev_flush call flushed.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210525174733.6212-3-michael.christie@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_work_flush doesn't do anything with the work arg. This patch drops
it and then renames vhost_work_flush to vhost_work_dev_flush to reflect
that the function flushes all the works in the dev and not just a
specific queue or work item.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210525174733.6212-2-michael.christie@oracle.com
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Function 'vhost_vring_ioctl' is declared twice, remove the repeated
declaration.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1621857884-19964-1-git-send-email-zhangshaokun@hisilicon.com
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Trivial change for the vhost_iotlb_del_range() documentation,
fixing the function name in the comment block.
Discovered with `make C=2 M=drivers/vhost`:
../drivers/vhost/iotlb.c:92: warning: expecting prototype for vring_iotlb_del_range(). Prototype was for vhost_iotlb_del_range() instead
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210504135444.158716-1-sgarzare@redhat.com
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces a function wrapper to call the sk_error_report
callback. That will prepare to add additional handling whenever
sk_error_report is called, for example to trace socket errors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When received packet is copied to guests's rx queue, data buffers
of rx queue could be smaller that data buffer of input packet, so
data of input packet is copied to each rx buffer, thus each rx
buffer will be a packet with dynamically created header. Fields
of such header are initialized from header of input packet(except
length field which value is depends on number of bytes copied to
rx buffer). But in SEQPACKET case, we also need to take care of
record delimeter bit: if input packet has this bit set, we don't
copy it to header of packet in rx buffer, except case when such
rx buffer is last part of input packet. Otherwise, we will get
sequence of packets with delimeter bit set, thus braking record
bounds.
Also remove ignore of non-stream type of packets, handle SEQPACKET
feature bit.
Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the xdp_{init,prepare}_buff() helpers instead of
an open-coded version.
Also, the field xdp->rxq was never set, so pass NULL to xdp_init_buff()
to clear it.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A bunch of new drivers including vdpa support for block
and virtio-vdpa. Beginning of vq kick (aka doorbell) mapping support.
Misc fixes.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmCRBBEPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpiCIH/iNNTeyl4hZJ8IOTlqTagjZgUBYslpda66pU
XfGKmXWpCGHYSw0XgbfHDyTZTCmdyq/b4FrxPgYrrEsQqztLIaGHyapHPcXEAThb
+pHtcxqsQ8DGucJZpNU44M3kB13u07gauR540HyXzEqLXd5vEhG7dkClBjm67TWN
SbJoEP3eNJMUezYuGsmUAGoi/M9NyCx+RiLd7roIlTxhIDW17PFNY0sIgG/sX6/s
1MXng0l00EjawIu4OnWfjg6kZoa6se41Rpcwd7XluTZncYKnMTJGoxDwv0xoJl4I
pI5OS+Ea6ENuuygmYMEl294I5E0QeaMGFpEYyO9sm764K5bLjVw=
=x0Ot
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
"A bunch of new drivers including vdpa support for block and
virtio-vdpa.
Beginning of vq kick (aka doorbell) mapping support.
Misc fixes"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (40 commits)
virtio_pci_modern: correct sparse tags for notify
virtio_pci_modern: __force cast the notify mapping
vDPA/ifcvf: get_config_size should return dev specific config size
vDPA/ifcvf: enable Intel C5000X-PL virtio-block for vDPA
vDPA/ifcvf: deduce VIRTIO device ID when probe
vdpa_sim_blk: add support for vdpa management tool
vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID
vdpa_sim_blk: implement ramdisk behaviour
vdpa: add vdpa simulator for block device
vhost/vdpa: Remove the restriction that only supports virtio-net devices
vhost/vdpa: use get_config_size callback in vhost_vdpa_config_validate()
vdpa: add get_config_size callback in vdpa_config_ops
vdpa_sim: cleanup kiovs in vdpasim_free()
vringh: add vringh_kiov_length() helper
vringh: implement vringh_kiov_advance()
vringh: explain more about cleaning riov and wiov
vringh: reset kiov 'consumed' field in __vringh_iov()
vringh: add 'iotlb_lock' to synchronize iotlb accesses
vdpa_sim: use iova module to allocate IOVA addresses
vDPA/ifcvf: deduce VIRTIO device ID from pdev ids
...
Since the config checks are done by the vDPA drivers, we can remove the
virtio-net restriction and we should be able to support all kinds of
virtio devices.
<linux/virtio_net.h> is not needed anymore, but we need to include
<linux/slab.h> to avoid compilation failures.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210315163450.254396-11-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Let's use the new 'get_config_size()' callback available instead of
using the 'virtio_id' to get the size of the device config space.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210315163450.254396-10-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
In some cases, it may be useful to provide a way to skip a number
of bytes in a vringh_kiov.
Let's implement vringh_kiov_advance() for this purpose, reusing the
code from vringh_iov_xfer().
We replace that code calling the new vringh_kiov_advance().
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210315163450.254396-6-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>