OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Angus Chen	db5db1a00d	vdpa/ifcvf: fix the calculation of queuepair The q_pair_id to address a queue pair in the lm bar should be calculated by queue_id / 2 rather than queue_id / nr_vring. Fixes: `2ddae773c9` ("vDPA/ifcvf: detect and use the onboard number of queues directly") Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20220923091013.191-1-angus.chen@jaguarmicro.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-09-27 18:30:48 -04:00
Jakub Kicinski	9c5d03d362	genetlink: start to validate reserved header bytes We had historically not checked that genlmsghdr.reserved is 0 on input which prevents us from using those precious bytes in the future. One use case would be to extend the cmd field, which is currently just 8 bits wide and 256 is not a lot of commands for some core families. To make sure that new families do the right thing by default put the onus of opting out of validation on existing families. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel) Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-29 12:47:15 +01:00
Eli Cohen	93e530d2a1	vdpa/mlx5: Fix possible uninitialized return value Initialize err local variable to return -EAGAIN if the asid cannot be found thus avoiding returning uninitialized value. Fixes: `8fcd20c307` ("vdpa/mlx5: Support different address spaces for control and data") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220811134010.952291-1-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 10:00:36 -04:00
Stefano Garzarella	4a44a5eda2	vdpa_sim_blk: add support for discard and write-zeroes Expose VIRTIO_BLK_F_DISCARD and VIRTIO_BLK_F_WRITE_ZEROES features to the drivers and handle VIRTIO_BLK_T_DISCARD and VIRTIO_BLK_T_WRITE_ZEROES requests checking ranges and flags. The simulator behaves like a ramdisk, so for VIRTIO_BLK_F_DISCARD does nothing, while for VIRTIO_BLK_T_WRITE_ZEROES sets to 0 the specified region. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220811083632.77525-5-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:44:58 -04:00
Stefano Garzarella	518083d2f5	vdpa_sim_blk: add support for VIRTIO_BLK_T_FLUSH The simulator behaves like a ramdisk, so we don't have to do anything when a VIRTIO_BLK_T_FLUSH request is received, but it could be useful to test driver behavior. Let's expose the VIRTIO_BLK_F_FLUSH feature to inform the driver that we support the flush command. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220811083632.77525-4-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:44:58 -04:00
Stefano Garzarella	ac926e1b46	vdpa_sim_blk: make vdpasim_blk_check_range usable by other requests Next patches will add handling of other requests, where will be useful to reuse vdpasim_blk_check_range(). So let's make it more generic by adding the `max_sectors` parameter, since different requests allow different numbers of maximum sectors. Let's also print the messages directly in vdpasim_blk_check_range() to avoid duplicate prints. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220811083632.77525-3-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:44:57 -04:00
Stefano Garzarella	b91cf6e95b	vdpa_sim_blk: check if sector is 0 for commands other than read or write VIRTIO spec states: "The sector number indicates the offset (multiplied by 512) where the read or write is to occur. This field is unused and set to 0 for commands other than read or write." Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220811083632.77525-2-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:44:57 -04:00
Eugenio Pérez	0c89e2a3a9	vdpa_sim: Implement suspend vdpa op Implement suspend operation for vdpa_sim devices, so vhost-vdpa will offer that backend feature and userspace can effectively suspend the device. This is a must before get virtqueue indexes (base) for live migration, since the device could modify them after userland gets them. There are individual ways to perform that action for some devices (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no way to perform it for any vhost device (and, in particular, vhost-vdpa). Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220810171512.2343333-5-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:31:15 -04:00
Zhu Lingshan	79e0034cb3	vDPA: fix 'cast to restricted le16' warnings in vdpa.c This commit fixes spars warnings: cast to restricted __le16 in function vdpa_dev_net_config_fill() and vdpa_fill_stats_rec() Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Message-Id: <20220722115309.82746-7-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:08 -04:00
Zhu Lingshan	a34bed37fc	vDPA: !FEATURES_OK should not block querying device config space Users may want to query the config space of a vDPA device, to choose a appropriate one for a certain guest. This means the users need to read the config space before FEATURES_OK, and the existence of config space contents does not depend on FEATURES_OK. The spec says: The device MUST allow reading of any device-specific configuration field before FEATURES_OK is set by the driver. This includes fields which are conditional on feature bits, as long as those feature bits are offered by the device. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20220722115309.82746-5-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:08 -04:00
Zhu Lingshan	378b2e9568	vDPA/ifcvf: support userspace to query features and MQ of a management device Adapting to current netlink interfaces, this commit allows userspace to query feature bits and MQ capability of a management device. Currently both the vDPA device and the management device are the VF itself, thus this ifcvf should initialize the virtio capabilities in probe() before setting up the struct vdpa_mgmt_dev. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20220722115309.82746-3-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:08 -04:00
Zhu Lingshan	0d6e5e8d16	vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Drivers must not access a BAR outside the capability length, and for a virtio device, ifcvf driver should not report any non-standard capability contents to the upper layers. Function ifcvf_get_config_size() is introduced here to return a safe value of the device config capability size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20220722115309.82746-2-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:08 -04:00
Eli Cohen	8fcd20c307	vdpa/mlx5: Support different address spaces for control and data Partition virtqueues to two different address spaces: one for control virtqueue which is implemented in software, and one for data virtqueues. Based-on: <20220526124338.36247-1-eperezma@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220714113927.85729-3-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:08 -04:00
Eli Cohen	cae15c2ed8	vdpa/mlx5: Implement susupend virtqueue callback Implement the suspend callback allowing to suspend the virtqueues so they stop processing descriptors. This is required to allow to query a consistent state of the virtqueue while live migration is taking place. Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220714113927.85729-2-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:08 -04:00
Xie Yongji	ad146355bf	vduse: Support querying information of IOVA regions This introduces a new ioctl: VDUSE_IOTLB_GET_INFO to support querying some information of IOVA regions. Now it can be used to query whether the IOVA region supports userspace memory registration. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20220803045523.23851-6-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-08-11 04:26:08 -04:00
Xie Yongji	79a463be9e	vduse: Support registering userspace memory for IOVA regions Introduce two ioctls: VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support registering and de-registering userspace memory for IOVA regions. Now it only supports registering userspace memory for bounce buffer region in virtio-vdpa case. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220803045523.23851-5-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:08 -04:00
Xie Yongji	6c77ed2288	vduse: Support using userspace pages as bounce buffer Introduce two APIs: vduse_domain_add_user_bounce_pages() and vduse_domain_remove_user_bounce_pages() to support adding and removing userspace pages for bounce buffers. During adding and removing, the DMA data would be copied from the kernel bounce pages to the userspace bounce pages and back. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220803045523.23851-4-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:08 -04:00
Xie Yongji	82eb46f94a	vduse: Use memcpy_{to,from}_page() in do_bounce() kmap_atomic() is being deprecated in favor of kmap_local_page(). The use of kmap_atomic() in do_bounce() is all thread local therefore kmap_local_page() is a sufficient replacement. Convert to kmap_local_page() but, instead of open coding it, use the helpers memcpy_to_page() and memcpy_from_page(). Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Message-Id: <20220803045523.23851-3-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:08 -04:00
Xie Yongji	c32ee69312	vduse: Remove unnecessary spin lock protection Now we use domain->iotlb_lock to protect two different variables: domain->bounce_maps->bounce_page and domain->iotlb. But for domain->bounce_maps->bounce_page, we actually don't need any synchronization between vduse_domain_get_bounce_page() and vduse_domain_free_bounce_pages() since vduse_domain_get_bounce_page() will only be called in page fault handler and vduse_domain_free_bounce_pages() will be called during file release. So let's remove the unnecessary spin lock protection in vduse_domain_get_bounce_page(). Then the usage of domain->iotlb_lock could be more clear: the lock will be only used to protect the domain->iotlb. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220803045523.23851-2-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:08 -04:00
Colin Ian King	994cea5318	vDPA/ifcvf: remove duplicated assignment to pointer cfg The assignment to pointer cfg is duplicated, the second assignment is redundant and can be removed. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Message-Id: <20220704190456.593464-1-colin.i.king@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-08-11 04:26:07 -04:00
Zhang Jiaming	6e33fa094f	vdpa: ifcvf: Fix spelling mistake in comments There is a typo(does't) in comments. It maybe 'doesn't' instead of 'does't'. Signed-off-by: Zhang Jiaming <jiaming@nfschina.com> Message-Id: <20220704024104.15535-1-jiaming@nfschina.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-08-11 04:26:07 -04:00
Xu Qiang	71aa95a69c	vdpa/mlx5: Use eth_broadcast_addr() to assign broadcast address Using eth_broadcast_addr() to assign broadcast address instead of memset(). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xu Qiang <xuqiang36@huawei.com> Message-Id: <20220704021405.64545-1-xuqiang36@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-08-11 04:26:07 -04:00
Stefano Garzarella	67f8f10c0b	vdpa_sim: use max_iotlb_entries as a limit in vhost_iotlb_init Commit `bda324fd03` ("vdpasim: control virtqueue support") changed the allocation of iotlbs calling vhost_iotlb_init() for each address space, instead of vhost_iotlb_alloc(). With this change we forgot to use the limit we had introduced with the `max_iotlb_entries` module parameter. Fixes: `bda324fd03` ("vdpasim: control virtqueue support") Cc: gautam.dawar@xilinx.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220621151208.189959-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>	2022-08-11 04:26:07 -04:00
Stefano Garzarella	19cd4a5471	vdpa_sim_blk: set number of address spaces and virtqueue groups Commit `bda324fd03` ("vdpasim: control virtqueue support") added two new fields (nas, ngroups) to vdpasim_dev_attr, but we forgot to initialize them for vdpa_sim_blk. When creating a new vdpa_sim_blk device this causes the kernel to panic in this way: $ vdpa dev add mgmtdev vdpasim_blk name blk0 BUG: kernel NULL pointer dereference, address: 0000000000000030 ... RIP: 0010:vhost_iotlb_add_range_ctx+0x41/0x220 [vhost_iotlb] ... Call Trace: <TASK> vhost_iotlb_add_range+0x11/0x800 [vhost_iotlb] vdpasim_map_range+0x91/0xd0 [vdpa_sim] vdpasim_alloc_coherent+0x56/0x90 [vdpa_sim] ... This happens because vdpasim->iommu[0] is not initialized when dev_attr.nas is 0. Let's fix this issue by initializing both (nas, ngroups) to 1 for vdpa_sim_blk. Fixes: `bda324fd03` ("vdpasim: control virtqueue support") Cc: gautam.dawar@xilinx.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220621151323.190431-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>	2022-08-11 04:26:07 -04:00
Stefano Garzarella	8472019eec	vdpa_sim_blk: call vringh_complete_iotlb() also in the error path Call vringh_complete_iotlb() even when we encounter a serious error that prevents us from writing the status in the "in" header (e.g. the header length is incorrect, etc.). The guest is misbehaving, so maybe the ring is in a bad state, but let's avoid making things worse. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220630153221.83371-4-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:07 -04:00
Stefano Garzarella	9c4df0900f	vdpa_sim_blk: limit the number of request handled per batch Limit the number of requests (4 per queue as for vdpa_sim_net) handled in a batch to prevent the worker from using the CPU for too long. Suggested-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220630153221.83371-3-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-08-11 04:26:07 -04:00
Stefano Garzarella	42a84c09ef	vdpa_sim_blk: use dev_dbg() to print errors Use dev_dbg() instead of dev_err()/dev_warn() to avoid flooding the host with prints, when the guest driver is misbehaving. In this way, prints can be dynamically enabled when the vDPA block simulator is used to validate a driver. Suggested-by: Jason Wang <jasowang@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220630153221.83371-2-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:26:07 -04:00
Parav Pandit	0e0348ac3f	vduse: Tie vduse mgmtdev and its device vduse devices are not backed by any real devices such as PCI. Hence it doesn't have any parent device linked to it. Kernel driver model in [1] suggests to avoid an empty device release callback. Hence tie the mgmtdevice object's life cycle to an allocate dummy struct device instead of static one. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/core-api/kobject.rst?h=v5.18-rc7#n284 Signed-off-by: Parav Pandit <parav@nvidia.com> Message-Id: <20220613195223.473966-1-parav@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-06-24 02:49:48 -04:00
Eli Cohen	ace9252446	vdpa/mlx5: Initialize CVQ vringh only once Currently, CVQ vringh is initialized inside setup_virtqueues() which is called every time a memory update is done. This is undesirable since it resets all the context of the vring, including the available and used indices. Move the initialization to mlx5_vdpa_set_status() when VIRTIO_CONFIG_S_DRIVER_OK is set. Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220613075958.511064-2-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>	2022-06-24 02:49:47 -04:00
Eli Cohen	40f2f3e941	vdpa/mlx5: Update Control VQ callback information The control VQ specific information is stored in the dedicated struct mlx5_control_vq. When the callback is updated through mlx5_vdpa_set_vq_cb(), make sure to update the control VQ struct. Fixes: `5262912ef3` ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220613075958.511064-1-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com)	2022-06-24 02:49:47 -04:00
Xie Yongji	b27ee76c74	vduse: Fix NULL pointer dereference on sysfs access The control device has no drvdata. So we will get a NULL pointer dereference when accessing control device's msg_timeout attribute via sysfs: [ 132.841881][ T3644] BUG: kernel NULL pointer dereference, address: 00000000000000f8 [ 132.850619][ T3644] RIP: 0010:msg_timeout_show (drivers/vdpa/vdpa_user/vduse_dev.c:1271) [ 132.869447][ T3644] dev_attr_show (drivers/base/core.c:2094) [ 132.870215][ T3644] sysfs_kf_seq_show (fs/sysfs/file.c:59) [ 132.871164][ T3644] ? device_remove_bin_file (drivers/base/core.c:2088) [ 132.872082][ T3644] kernfs_seq_show (fs/kernfs/file.c:164) [ 132.872838][ T3644] seq_read_iter (fs/seq_file.c:230) [ 132.873578][ T3644] ? __vmalloc_area_node (mm/vmalloc.c:3041) [ 132.874532][ T3644] kernfs_fop_read_iter (fs/kernfs/file.c:238) [ 132.875513][ T3644] __kernel_read (fs/read_write.c:440 (discriminator 1)) [ 132.876319][ T3644] kernel_read (fs/read_write.c:459) [ 132.877129][ T3644] kernel_read_file (fs/kernel_read_file.c:94) [ 132.877978][ T3644] kernel_read_file_from_fd (include/linux/file.h:45 fs/kernel_read_file.c:186) [ 132.879019][ T3644] __do_sys_finit_module (kernel/module.c:4207) [ 132.879930][ T3644] __ia32_sys_finit_module (kernel/module.c:4189) [ 132.880930][ T3644] do_int80_syscall_32 (arch/x86/entry/common.c:112 arch/x86/entry/common.c:132) [ 132.881847][ T3644] entry_INT80_compat (arch/x86/entry/entry_64_compat.S:419) To fix it, don't create the unneeded attribute for control device anymore. Fixes: `c8a6153b6c` ("vduse: Introduce VDUSE - vDPA Device in Userspace") Reported-by: kernel test robot <oliver.sang@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20220426073656.229-1-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-06-08 08:56:03 -04:00
Dan Carpenter	f38b3c6a78	vdpa/mlx5: clean up indenting in handle_ctrl_vlan() These lines were supposed to be indented. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <Yp71IYMP+QfuCJ8t@kili> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eli Cohen <elic@nvidia.com> Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>	2022-06-08 08:56:03 -04:00
Dan Carpenter	f766c409fc	vdpa/mlx5: fix error code for deleting vlan Return success if we were able to delete a vlan. The current code always returns failure. Fixes: `baf2ad3f6a` ("vdpa/mlx5: Add RX MAC VLAN filter support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <Yp709f1g9NcMBCHg@kili> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eli Cohen <elic@nvidia.com> Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>	2022-06-08 08:56:03 -04:00
Xiang wangx	2f72b2262d	vdpa/mlx5: Fix syntax errors in comments Delete the redundant word 'is'. Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com> Message-Id: <20220604143858.16073-1-wangxiang@cdjrlc.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-06-08 08:56:03 -04:00
Linus Torvalds	500a434fc5	Driver core changes for 5.19-rc1 Here is the set of driver core changes for 5.19-rc1. Note, I'm not really happy with this pull request as-is, see below for details, but overall this is all good for everything but a small set of systems, which we have a fix for already. Lots of tiny driver core changes and cleanups happened this cycle, but the two major things were: - firmware_loader reorganization and additions including the ability to have XZ compressed firmware images and the ability for userspace to initiate the firmware load when it needs to, instead of being always initiated by the kernel. FPGA devices specifically want this ability to have their firmware changed over the lifetime of the system boot, and this allows them to work without having to come up with yet-another-custom-uapi interface for loading firmware for them. - physical location support added to sysfs so that devices that know this information, can tell userspace where they are located in a common way. Some ACPI devices already support this today, and more bus types should support this in the future. Smaller changes included: - driver_override api cleanups and fixes - error path cleanups and fixes - get_abi script fixes - deferred probe timeout changes. It's that last change that I'm the most worried about. It has been reported to cause boot problems for a number of systems, and I have a tested patch series that resolves this issue. But I didn't get it merged into my tree before 5.18-final came out, so it has not gotten any linux-next testing. I'll send the fixup patches (there are 2) as a follow-on series to this pull request if you want to take them directly, _OR_ I can just revert the probe timeout changes and they can wait for the next -rc1 merge cycle. Given that the fixes are tested, and pretty simple, I'm leaning toward that choice. Sorry this all came at the end of the merge window, I should have resolved this all 2 weeks ago, that's my fault as it was in the middle of some travel for me. All have been tested in linux-next for weeks, with no reported issues other than the above-mentioned boot time outs. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnv/A8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+yk/fACgvmenbo5HipqyHnOmTQlT50xQ9EYAn2eTq6ai GkjLXBGNWOPBa5cU52qf =yEi/ -----END PGP SIGNATURE----- Merge tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core changes for 5.19-rc1. Lots of tiny driver core changes and cleanups happened this cycle, but the two major things are: - firmware_loader reorganization and additions including the ability to have XZ compressed firmware images and the ability for userspace to initiate the firmware load when it needs to, instead of being always initiated by the kernel. FPGA devices specifically want this ability to have their firmware changed over the lifetime of the system boot, and this allows them to work without having to come up with yet-another-custom-uapi interface for loading firmware for them. - physical location support added to sysfs so that devices that know this information, can tell userspace where they are located in a common way. Some ACPI devices already support this today, and more bus types should support this in the future. Smaller changes include: - driver_override api cleanups and fixes - error path cleanups and fixes - get_abi script fixes - deferred probe timeout changes. It's that last change that I'm the most worried about. It has been reported to cause boot problems for a number of systems, and I have a tested patch series that resolves this issue. But I didn't get it merged into my tree before 5.18-final came out, so it has not gotten any linux-next testing. I'll send the fixup patches (there are 2) as a follow-on series to this pull request. All have been tested in linux-next for weeks, with no reported issues other than the above-mentioned boot time-outs" * tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) driver core: fix deadlock in __device_attach kernfs: Separate kernfs_pr_cont_buf and rename_lock. topology: Remove unused cpu_cluster_mask() driver core: Extend deferred probe timeout on driver registration MAINTAINERS: add Russ Weight as a firmware loader maintainer driver: base: fix UAF when driver_attach failed test_firmware: fix end of loop test in upload_read_show() driver core: location: Add "back" as a possible output for panel driver core: location: Free struct acpi_pld_info pld driver core: Add "" wildcard support to driver_async_probe cmdline param driver core: location: Check for allocations failure arch_topology: Trace the update thermal pressure kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file. export: fix string handling of namespace in EXPORT_SYMBOL_NS rpmsg: use local 'dev' variable rpmsg: Fix calling device_lock() on non-initialized device firmware_loader: describe 'module' parameter of firmware_upload_register() firmware_loader: Move definitions from sysfs_upload.h to sysfs.h firmware_loader: Fix configs for sysfs split selftests: firmware: Add firmware upload selftests ...	2022-06-03 11:48:47 -07:00
Jason Wang	bd8bb9aed5	vdpa: ifcvf: set pci driver data in probe We should set the pci driver data in probe instead of the vdpa device adding callback. Otherwise if no vDPA device is created we will lose the pointer to the management device. Fixes: `6b5df347c6` ("vDPA/ifcvf: implement management netlink framework for ifcvf") Tested-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220524055557.1938-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-06-01 02:16:38 -04:00
Eli Cohen	baf2ad3f6a	vdpa/mlx5: Add RX MAC VLAN filter support Support HW offloaded filtering of MAC/VLAN packets. To allow that, we add a handler to handle VLAN configurations coming through the control VQ. Two operations are supported. 1. Adding VLAN - in this case, an entry will be added to the RX flow table that will allow the combination of the MAC/VLAN to be forwarded to the TIR. 2. Removing VLAN - will remove the entry from the flow table, effectively blocking such packets from going through. Currently the control VQ does not propagate changes to the MAC of the VLAN device so we always use the MAC of the parent device. Examples: 1. Create vlan device: $ ip link add link ens1 name ens1.8 type vlan id 8 Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220411122942.225717-4-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-06-01 02:16:38 -04:00
Eli Cohen	7becdd13b6	vdpa/mlx5: Remove flow counter from steering The flow counter has been introduced in early versions of the driver to aid in debugging. It is no longer needed and can harm performance. Remove it. Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220411122942.225717-2-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-06-01 02:16:38 -04:00
Dan Carpenter	1f97b97850	vdpasim: Off by one in vdpasim_set_group_asid() The > comparison needs to be >= to prevent an out of bounds access of the vdpasim->iommu[] array. The vdpasim->iommu[] is allocated in vdpasim_create() and it has vdpasim->dev_attr.nas elements. Fixes: 87e5afeac247 ("vdpasim: control virtqueue support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <YotGQU1q224RKZR8@kili> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-05-31 12:45:10 -04:00
Eugenio Pérez	2424369738	vdpasim: allow to enable a vq repeatedly Code must be resilient to enable a queue many times. At the moment the queue is resetting so it's definitely not the expected behavior. v2: set vq->ready = 0 at disable. Fixes: `2c53d0f64c` ("vdpasim: vDPA device simulator") Cc: stable@vger.kernel.org Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220519145919.772896-1-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2022-05-31 12:45:09 -04:00
Zhu Lingshan	ac33f84ba5	vDPA/ifcvf: fix uninitialized config_vector warning Static checkers are not informed that config_vector is controlled by vf->msix_vector_status, which can only be MSIX_VECTOR_SHARED_VQ_AND_CONFIG, MSIX_VECTOR_SHARED_VQ_AND_CONFIG and MSIX_VECTOR_DEV_SHARED. This commit uses an "if...elseif...else" code block to tell the checkers that it is a complete set, and config_vector can be initialized anyway Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <20220424072806.1083189-1-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-05-31 12:45:09 -04:00
Cindy Lu	ffbda8e9df	vdpa/vp_vdpa : add vdpa tool support in vp_vdpa this patch is to add the support for vdpa tool in vp_vdpa here is the example steps modprobe vp_vdpa modprobe vhost_vdpa echo 0000:00:06.0>/sys/bus/pci/drivers/virtio-pci/unbind echo 1af4 1041 > /sys/bus/pci/drivers/vp-vdpa/new_id vdpa dev add name vdpa1 mgmtdev pci/0000:00:06.0 Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20220429091030.547434-1-lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-05-31 12:45:09 -04:00
Gautam Dawar	bda324fd03	vdpasim: control virtqueue support This patch introduces the control virtqueue support for vDPA simulator. This is a requirement for supporting advanced features like multiqueue. A requirement for control virtqueue is to isolate its memory access from the rx/tx virtqueues. This is because when using vDPA device for VM, the control virqueue is not directly assigned to VM. Userspace (Qemu) will present a shadow control virtqueue to control for recording the device states. The isolation is done via the virtqueue groups and ASID support in vDPA through vhost-vdpa. The simulator is extended to have: 1) three virtqueues: RXVQ, TXVQ and CVQ (control virtqueue) 2) two virtqueue groups: group 0 contains RXVQ and TXVQ; group 1 contains CVQ 3) two address spaces and the simulator simply implements the address spaces by mapping it 1:1 to IOTLB. For the VM use cases, userspace(Qemu) may set AS 0 to group 0 and AS 1 to group 1. So we have: 1) The IOTLB for virtqueue group 0 contains the mappings of guest, so RX and TX can be assigned to guest directly. 2) The IOTLB for virtqueue group 1 contains the mappings of CVQ which is the buffers that allocated and managed by VMM only. So CVQ of vhost-vdpa is visible to VMM only. And Guest can not access the CVQ of vhost-vdpa. For the other use cases, since AS 0 is associated to all virtqueue groups by default. All virtqueues share the same mapping by default. To demonstrate the function, VIRITO_NET_F_CTRL_MACADDR is implemented in the simulator for the driver to set mac address. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-20-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:45:08 -04:00
Gautam Dawar	cfe2268929	vdpa_sim: filter destination mac address This patch implements a simple unicast filter for vDPA simulator. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-19-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:45:08 -04:00
Gautam Dawar	ec103d983b	vdpa_sim: factor out buffer completion logic Wrap up common buffer completion logic in to vdpasim_net_complete Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-18-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:44:33 -04:00
Gautam Dawar	05b6976212	vdpa_sim: advertise VIRTIO_NET_F_MTU We've already reported maximum mtu via config space, so let's advertise the feature. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-17-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:44:32 -04:00
Gautam Dawar	db9adcbf42	vdpa: multiple address spaces support This patches introduces the multiple address spaces support for vDPA device. This idea is to identify a specific address space via an dedicated identifier - ASID. During vDPA device allocation, vDPA device driver needs to report the number of address spaces supported by the device then the DMA mapping ops of the vDPA device needs to be extended to support ASID. This helps to isolate the environments for the virtqueue that will not be assigned directly. E.g in the case of virtio-net, the control virtqueue will not be assigned directly to guest. As a start, simply claim 1 virtqueue groups and 1 address spaces for all vDPA devices. And vhost-vDPA will simply reject the device with more than 1 virtqueue groups or address spaces. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-7-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:44:27 -04:00
Gautam Dawar	d4821902e4	vdpa: introduce virtqueue groups This patch introduces virtqueue groups to vDPA device. The virtqueue group is the minimal set of virtqueues that must share an address space. And the address space identifier could only be attached to a specific virtqueue group. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-6-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:44:27 -04:00
Eli Cohen	759ae7f9bf	vdpa/mlx5: Use readers/writers semaphore instead of mutex Reading statistics could be done intensively and by several processes concurrently. Reader's lock is sufficient in this case. Change reslock from mutex to a rwsem. Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-7-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:44:23 -04:00
Eli Cohen	1892a3d425	vdpa/mlx5: Add support for reading descriptor statistics Implement the get_vq_stats calback of vdpa_config_ops to return the statistics for a virtqueue. The statistics are provided as vendor specific statistics where the driver provides a pair of attribute name and attribute value. Currently supported are received descriptors and completed descriptors. Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-6-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:44:22 -04:00
Eli Cohen	a6a51adc6e	net/vdpa: Use readers/writers semaphore instead of cf_mutex Replace cf_mutex with rw_semaphore to reflect the fact that some calls could be called concurrently but can suffice with read lock. Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-5-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:44:22 -04:00
Eli Cohen	0078ad905d	net/vdpa: Use readers/writers semaphore instead of vdpa_dev_mutex Use rw_semaphore instead of mutex to control access to vdpa devices. This can be especially beneficial in case processes poll on statistics information. Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-4-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:44:21 -04:00
Eli Cohen	13b00b1356	vdpa: Add support for querying vendor statistics Allows to read vendor statistics of a vdpa device. The specific statistics data are received from the upstream driver in the form of an (attribute name, attribute value) pairs. An example of statistics for mlx5_vdpa device are: received_desc - number of descriptors received by the virtqueue completed_desc - number of descriptors completed by the virtqueue A descriptor using indirect buffers is still counted as 1. In addition, N chained descriptors are counted correctly N times as one would expect. A new callback was added to vdpa_config_ops which provides the means for the vdpa driver to return statistics results. The interface allows for reading all the supported virtqueues, including the control virtqueue if it exists. Below are some examples taken from mlx5_vdpa which are introduced in the following patch: 1. Read statistics for the virtqueue at index 1 $ vdpa dev vstats show vdpa-a qidx 1 vdpa-a: queue_type tx queue_index 1 received_desc 3844836 completed_desc 3844836 2. Read statistics for the virtqueue at index 32 $ vdpa dev vstats show vdpa-a qidx 32 vdpa-a: queue_type control_vq queue_index 32 received_desc 62 completed_desc 62 3. Read statisitics for the virtqueue at index 0 with json output $ vdpa -j dev vstats show vdpa-a qidx 0 {"vstats":{"vdpa-a":{ "queue_type":"rx","queue_index":0,"name":"received_desc","value":417776,\ "name":"completed_desc","value":417548}}} 4. Read statistics for the virtqueue at index 0 with preety json output $ vdpa -jp dev vstats show vdpa-a qidx 0 { "vstats": { "vdpa-a": { "queue_type": "rx", "queue_index": 0, "name": "received_desc", "value": 417776, "name": "completed_desc", "value": 417548 } } } Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-3-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:44:20 -04:00
Eli Cohen	7a6691f1f8	vdpa: Fix error logic in vdpa_nl_cmd_dev_get_doit In vdpa_nl_cmd_dev_get_doit(), if the call to genlmsg_reply() fails we must not call nlmsg_free() since this is done inside genlmsg_reply(). Fix it. Fixes: `bc0d90ee02` ("vdpa: Enable user to query vdpa device info") Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-2-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-31 12:44:20 -04:00
Eli Cohen	acde392949	vdpa/mlx5: Use consistent RQT size The current code evaluates RQT size based on the configured number of virtqueues. This can raise an issue in the following scenario: Assume MQ was negotiated. 1. mlx5_vdpa_set_map() gets called. 2. handle_ctrl_mq() is called setting cur_num_vqs to some value, lower than the configured max VQs. 3. A second set_map gets called, but now a smaller number of VQs is used to evaluate the size of the RQT. 4. handle_ctrl_mq() is called with a value larger than what the RQT can hold. This will emit errors and the driver state is compromised. To fix this, we use a new field in struct mlx5_vdpa_net to hold the required number of entries in the RQT. This value is evaluated in mlx5_vdpa_set_driver_features() where we have the negotiated features all set up. In addition to that, we take into consideration the max capability of RQT entries early when the device is added so we don't need to take consider it when creating the RQT. Last, we remove the use of mlx5_vdpa_max_qps() which just returns the max_vas / 2 and make the code clearer. Fixes: `52893733f2` ("vdpa/mlx5: Add multiqueue support") Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-18 12:31:31 -04:00
Krzysztof Kozlowski	240bf4e665	vdpa: Use helper for safer setting of driver_override Use a helper to set driver_override to the reduce amount of duplicated code. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20220419113435.246203-9-krzysztof.kozlowski@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-22 17:13:54 +02:00
Linus Torvalds	3e732ebf73	virtio: fixes, cleanups A couple of mlx5 fixes related to cvq A couple of reverts dropping useless code (code that used it got reverted earlier) Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmJKyK4PHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpE40IAIvsq1WDsXfg/649lp0wMO5MPEzAvCfgQpub hCti2lKY4K9Ykqh+uTipDTxpRmg3wt/4QlQ6NYZVmn8jQ1JsNAIFCkc7ptG1xwQI 0RypofGsFQwbar2vKh8jEzYUsIbwLKeLNlib113fH9zyDRw7KjRppUgC2cjMoYv6 JhlOFZAApzFknXC/I8fmJyzVYpX+v2GB8RX2wWiNXHQZ9hkdKd1IvJqRiuRuSY3L haubMBFG8hrdW8qQ40ngvt+GnrxKUustmPM6x9DCJYaMkKRkXC6NxueUcIk6qWiw UJz5IOXURdJ3i1qxmkMGSnJiYbFpmwVGlqRekN3qNVpXNovfbg0= =l15S -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio fixes from Michael Tsirkin: "Fixes and cleanups: - A couple of mlx5 fixes related to cvq - A couple of reverts dropping useless code (code that used it got reverted earlier)" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa: mlx5: synchronize driver status with CVQ vdpa: mlx5: prevent cvq work from hogging CPU Revert "virtio_config: introduce a new .enable_cbs method" Revert "virtio: use virtio_device_ready() in virtio_device_restore()"	2022-04-05 10:40:52 -07:00
Linus Torvalds	f4f5d7cfb2	virtio: features, fixes vdpa generic device type support More virtio hardening for broken devices On the same theme, revert some virtio hotplug hardening patches - they were misusing some interrupt flags, will have to be reverted. RSS support in virtio-net max device MTU support in mlx5 vdpa akcipher support in virtio-crypto shared IRQ support in ifcvf vdpa a minor performance improvement in vhost Enable virtio mem for ARM64 beginnings of advance dma support Cleanups, fixes all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmJEEk8PHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpcpUH+wRIXrzveirsN4MYH0aAeF+SLYaA5pgtO4U7 da22HYtwlMrDRMxwjepKBOTSu89uP5LEK7IKWPj9VRZg+GLz/Cdfc6BZl/fND3qt 0yFpwG1ZLsBK1+WHbysWQneEbPjXqQdbh9eVkKVGcNkRuLJJwXbmF95dyQEJwzeh dPHssDcEC2tRgHAMrLyjLPKwMCRwcgtdPoB1ZC+lqTs3G6lktAfREEvqVfJOVe1b mQcgdAJ+aRM0J/w/PYTmxFOZPYAmQ6hmAQ8Hf7nkjfRWQ4EM91W0cKAoZPc/+7KN ZfFKVL28GEZLJqnx+3xijwCR2gwVHsRYZHaTjfGgQUWZPoB3Vrc= =ynRx -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: - vdpa generic device type support - more virtio hardening for broken devices (but on the same theme, revert some virtio hotplug hardening patches - they were misusing some interrupt flags and had to be reverted) - RSS support in virtio-net - max device MTU support in mlx5 vdpa - akcipher support in virtio-crypto - shared IRQ support in ifcvf vdpa - a minor performance improvement in vhost - enable virtio mem for ARM64 - beginnings of advance dma support - cleanups, fixes all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (33 commits) vdpa/mlx5: Avoid processing works if workqueue was destroyed vhost: handle error while adding split ranges to iotlb vdpa: support exposing the count of vqs to userspace vdpa: change the type of nvqs to u32 vdpa: support exposing the config size to userspace vdpa/mlx5: re-create forwarding rules after mac modified virtio: pci: check bar values read from virtio config space Revert "virtio_pci: harden MSI-X interrupts" Revert "virtio-pci: harden INTX interrupts" drivers/net/virtio_net: Added RSS hash report control. drivers/net/virtio_net: Added RSS hash report. drivers/net/virtio_net: Added basic RSS support. drivers/net/virtio_net: Fixed padded vheader to use v1 with hash. virtio: use virtio_device_ready() in virtio_device_restore() tools/virtio: compile with -pthread tools/virtio: fix after premapped buf support virtio_ring: remove flags check for unmap packed indirect desc virtio_ring: remove flags check for unmap split indirect desc virtio_ring: rename vring_unmap_state_packed() to vring_unmap_extra_packed() net/mlx5: Add support for configuring max device MTU ...	2022-03-31 13:57:15 -07:00
Jason Wang	1c80cf031e	vdpa: mlx5: synchronize driver status with CVQ Currently, CVQ doesn't have any synchronization with the driver status. Then CVQ emulation code run in the middle of: 1) device reset 2) device status changed 3) map updating The will lead several unexpected issue like trying to execute CVQ command after the driver has been teared down. Fixing this by using reslock to synchronize CVQ emulation code with the driver status changing: - protect the whole device reset, status changing and set_map() updating with reslock - protect the CVQ handler with the reslock and check VIRTIO_CONFIG_S_DRIVER_OK in the CVQ handler This will guarantee that: 1) CVQ handler won't work if VIRTIO_CONFIG_S_DRIVER_OK is not set 2) CVQ handler will see a consistent state of the driver instead of the partial one when it is running in the middle of the teardown_driver() or setup_driver(). Cc: `5262912ef3` ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20220329042109.4029-2-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eli Cohen <elic@nvidia.com>	2022-03-30 04:18:14 -04:00
Jason Wang	55ebf0d60e	vdpa: mlx5: prevent cvq work from hogging CPU A userspace triggerable infinite loop could happen in mlx5_cvq_kick_handler() if userspace keeps sending a huge amount of cvq requests. Fixing this by introducing a quota and re-queue the work if we're out of the budget (currently the implicit budget is one) . While at it, using a per device work struct to avoid on demand memory allocation for cvq. Fixes: `5262912ef3` ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20220329042109.4029-1-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eli Cohen <elic@nvidia.com>	2022-03-30 04:18:14 -04:00
Eli Cohen	ad6dc1daaf	vdpa/mlx5: Avoid processing works if workqueue was destroyed If mlx5_vdpa gets unloaded while a VM is running, the workqueue will be destroyed. However, vhost might still have reference to the kick function and might attempt to push new works. This could lead to null pointer dereference. To fix this, set mvdev->wq to NULL just before destroying and verify that the workqueue is not NULL in mlx5_vdpa_kick_vq before attempting to push a new work. Fixes: `5262912ef3` ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220321141303.9586-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-28 16:54:30 -04:00
Longpeng	81d46d6931	vdpa: change the type of nvqs to u32 Change vdpa_device.nvqs and vhost_vdpa.nvqs to use u32 Signed-off-by: Longpeng <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220315032553.455-3-longpeng2@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Longpeng <<a href="mailto:longpeng2@huawei.com" target="_blank">longpeng2@huawei.com</a>><br></blockquote><div><br></div><div>Acked-by: Jason Wang <<a href="mailto:jasowang@redhat.com">jasowang@redhat.com</a>></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2022-03-28 16:53:00 -04:00
Michael Qiu	f1781bedea	vdpa/mlx5: re-create forwarding rules after mac modified When MAC Address has been modified in guest, we only re-add the Mac to mpfs, it is not enough, because the guest network will not work correctly: the reply package from outside will go straight away to the host VF net interface. This patch recreate the flow rules, and make it work correctly. Signed-off-by: Michael Qiu <qiudayu@archeros.com> Link: https://lore.kernel.org/r/1648446492-17614-1-git-send-email-08005325@163.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eli Cohen <elic@nvidia.com>	2022-03-28 16:52:59 -04:00
Eli Cohen	1e00e821e4	net/mlx5: Add support for configuring max device MTU Allow an admin creating a vdpa device to specify the max MTU for the net device. For example, to create a device with max MTU of 1000, the following command can be used: $ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 mtu 1000 This configuration mechanism assumes that vdpa is the sole real user of the function. mlx5_core could theoretically change the mtu of the function using the ip command on the mlx5_core net device but this should not be done. Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220221121927.194728-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-03-28 16:52:58 -04:00
Zhu Lingshan	6f84622db3	vDPA/ifcvf: cacheline alignment for ifcvf_hw This commit introduces a new cacheline aligned layout for ifcvf_hw. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20220222115428.998334-6-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-28 16:52:57 -04:00
Zhu Lingshan	9b3e814834	vDPA/ifcvf: implement shared IRQ feature On some platforms/devices, there may not be enough MSI vectors allocated for the virtqueues and config changes. In such a case, the interrupt sources(virtqueues, config changes) must share an IRQ/vector, to avoid initialization failures, keep the device functional. This commit handles three cases: (1) number of the allocated vectors == the number of virtqueues + 1 (config changes), every virtqueue and the config interrupt has a separated vector/IRQ, the best and the most likely case. (2) number of the allocated vectors is less than the best case, but greater than 1. In this case, all virtqueues share a vector/IRQ, the config interrupt has a separated vector/IRQ (3) only one vector is allocated, in this case, the virtqueues and the config interrupt share a vector/IRQ. The worst and most unlikely case. Otherwise, it needs to fail. This commit introduces some helper functions: ifcvf_set_vq_vector() and ifcvf_set_config_vector() sets virtqueue vector and config vector in the device config space, so that the device can send interrupt DMA. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20220222115428.998334-5-lingshan.zhu@intel.com Signed-off-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20220315124130.1710030-1-trix@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-28 16:52:57 -04:00
Zhu Lingshan	ad5c5690de	vDPA/ifcvf: implement device MSIX vector allocator This commit implements a MSIX vector allocation helper for vqs and config interrupts. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20220222115428.998334-4-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-28 16:52:57 -04:00
Zhu Lingshan	8897d6d0fc	vDPA/ifcvf: make use of virtio pci modern IO helpers in ifcvf This commit discards ifcvf_ioreadX()/writeX(), use virtio pci modern IO helpers instead Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20220222115428.998334-2-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-28 16:52:57 -04:00
Linus Torvalds	34af78c4e6	IOMMU Updates for Linux v5.18 Including: - IOMMU Core changes: - Removal of aux domain related code as it is basically dead and will be replaced by iommu-fd framework - Split of iommu_ops to carry domain-specific call-backs separatly - Cleanup to remove useless ops->capable implementations - Improve 32-bit free space estimate in iova allocator - Intel VT-d updates: - Various cleanups of the driver - Support for ATS of SoC-integrated devices listed in ACPI/SATC table - ARM SMMU updates: - Fix SMMUv3 soft lockup during continuous stream of events - Fix error path for Qualcomm SMMU probe() - Rework SMMU IRQ setup to prepare the ground for PMU support - Minor cleanups and refactoring - AMD IOMMU driver: - Some minor cleanups and error-handling fixes - Rockchip IOMMU driver: - Use standard driver registration - MSM IOMMU driver: - Minor cleanup and change to standard driver registration - Mediatek IOMMU driver: - Fixes for IOTLB flushing logic -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmI8OUkACgkQK/BELZcB GuNz9xAAvlgEg3byMx1y6LY9IctVGyLsegweVGM4+m6XR7qvT5Llc1E2Yw4Gooe4 EAceOihDKW2T9VnMlz9g/cG7Modrx60chcB22KKfxDXPl6yF3R89EMd7DE43T6n/ KPrP9+EsBnI8QSXyYu9ZowioX4CYwWhWD0dKHKAwDvw0BWHHUJ4hTaoHqEoIqLdP vubeHziIok/g1sylSpJjTzV7r/Na8Q3TGcb/Mi5qC8uiyiyx40vtaduMGNW+/ToN EqOKszxPmHfHv/xf0IHo0eUZ2L/JAe0mAlZzOb09f5F2sXJrbC05jlmRaDmSjT+u iEc1r2By/0xo6iOuQC3wD6kTvwwO/ecpNYGhXYXdTbtLquYfL5PSXjRHEU9gf2BO i/llPDsnytPvm/hnmbi26ChNR6yrDPz5bkoCUl5mnB1jZcaZtIURN7cRlEPPZUWo 62VDNdqWDB6AvALc1/SwYdJX/i5eaBf+niS7/BJ/IkLp2oJxFzrGsU8SRJFHNYsa zdFIUUoTw647Ul6derSpGzHow169/RwVKYPiXMsaA8/viPNjpBOtfg56abn1WLW6 N4CtwNu6tt+sPfftFdFx2cDEMW2zpWg5doMddBfEx9FAk0HJ4WLZiTpaO2PxcLyd kCAsGHj+ViAZHINVKFV4nQN/V9yQtcIc4UPmSGJBtKCK3KUYujw= =bcqr -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - IOMMU Core changes: - Removal of aux domain related code as it is basically dead and will be replaced by iommu-fd framework - Split of iommu_ops to carry domain-specific call-backs separatly - Cleanup to remove useless ops->capable implementations - Improve 32-bit free space estimate in iova allocator - Intel VT-d updates: - Various cleanups of the driver - Support for ATS of SoC-integrated devices listed in ACPI/SATC table - ARM SMMU updates: - Fix SMMUv3 soft lockup during continuous stream of events - Fix error path for Qualcomm SMMU probe() - Rework SMMU IRQ setup to prepare the ground for PMU support - Minor cleanups and refactoring - AMD IOMMU driver: - Some minor cleanups and error-handling fixes - Rockchip IOMMU driver: - Use standard driver registration - MSM IOMMU driver: - Minor cleanup and change to standard driver registration - Mediatek IOMMU driver: - Fixes for IOTLB flushing logic * tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (47 commits) iommu/amd: Improve amd_iommu_v2_exit() iommu/amd: Remove unused struct fault.devid iommu/amd: Clean up function declarations iommu/amd: Call memunmap in error path iommu/arm-smmu: Account for PMU interrupts iommu/vt-d: Enable ATS for the devices in SATC table iommu/vt-d: Remove unused function intel_svm_capable() iommu/vt-d: Add missing "__init" for rmrr_sanity_check() iommu/vt-d: Move intel_iommu_ops to header file iommu/vt-d: Fix indentation of goto labels iommu/vt-d: Remove unnecessary prototypes iommu/vt-d: Remove unnecessary includes iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO iommu/vt-d: Remove domain and devinfo mempool iommu/vt-d: Remove iova_cache_get/put() iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info() iommu/vt-d: Remove intel_iommu::domains iommu/mediatek: Always tlb_flush_all when each PM resume iommu/mediatek: Add tlb_lock in tlb_flush_all iommu/mediatek: Remove the power status checking in tlb flush all ...	2022-03-24 19:48:57 -07:00
Zhang Min	eb057b44db	vdpa: fix use-after-free on vp_vdpa_remove When vp_vdpa driver is unbind, vp_vdpa is freed in vdpa_unregister_device and then vp_vdpa->mdev.pci_dev is dereferenced in vp_modern_remove, triggering use-after-free. Call Trace of unbinding driver free vp_vdpa : do_syscall_64 vfs_write kernfs_fop_write_iter device_release_driver_internal pci_device_remove vp_vdpa_remove vdpa_unregister_device kobject_release device_release kfree Call Trace of dereference vp_vdpa->mdev.pci_dev: vp_modern_remove pci_release_selected_regions pci_release_region pci_resource_len pci_resource_end (dev)->resource[(bar)].end Signed-off-by: Zhang Min <zhang.min9@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Link: https://lore.kernel.org/r/20220301091059.46869-1-wang.yi59@zte.com.cn Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: `64b9f64f80` ("vdpa: introduce virtio pci driver") Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2022-03-06 06:06:50 -05:00
Xie Yongji	b9d102dafe	vduse: Fix returning wrong type in vduse_domain_alloc_iova() This fixes the following smatch warnings: drivers/vdpa/vdpa_user/iova_domain.c:305 vduse_domain_alloc_iova() warn: should 'iova_pfn << shift' be a 64 bit type? Fixes: `8c773d53fb` ("vduse: Implement an MMU-based software IOTLB") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20220121083940.102-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-03-04 11:56:34 -05:00
Si-Wei Liu	ed0f849fc3	vdpa/mlx5: add validation for VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command When control vq receives a VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command request from the driver, presently there is no validation against the number of queue pairs to configure, or even if multiqueue had been negotiated or not is unverified. This may lead to kernel panic due to uninitialized resource for the queues were there any bogus request sent down by untrusted driver. Tie up the loose ends there. Fixes: `52893733f2` ("vdpa/mlx5: Add multiqueue support") Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Link: https://lore.kernel.org/r/1642206481-30721-4-git-send-email-si-wei.liu@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-03-04 11:56:34 -05:00
Si-Wei Liu	30c22f3816	vdpa/mlx5: should verify CTRL_VQ feature exists for MQ Per VIRTIO v1.1 specification, section 5.1.3.1 Feature bit requirements: "VIRTIO_NET_F_MQ Requires VIRTIO_NET_F_CTRL_VQ". There's assumption in the mlx5_vdpa multiqueue code that MQ must come together with CTRL_VQ. However, there's nowhere in the upper layer to guarantee this assumption would hold. Were there an untrusted driver sending down MQ without CTRL_VQ, it would compromise various spots for e.g. is_index_valid() and is_ctrl_vq_idx(). Although this doesn't end up with immediate panic or security loophole as of today's code, the chance for this to be taken advantage of due to future code change is not zero. Harden the crispy assumption by failing the set_driver_features() call when seeing (MQ && !CTRL_VQ). For that end, verify_min_features() is renamed to verify_driver_features() to reflect the fact that it now does more than just validate the minimum features. verify_driver_features() is now used to accommodate various checks against the driver features for set_driver_features(). Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Link: https://lore.kernel.org/r/1642206481-30721-3-git-send-email-si-wei.liu@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-03-04 11:56:33 -05:00
Si-Wei Liu	e0077cc13b	vdpa: factor out vdpa_set_features_unlocked for vdpa internal use No functional change introduced. vdpa bus driver such as virtio_vdpa or vhost_vdpa is not supposed to take care of the locking for core by its own. The locked API vdpa_set_features should suffice the bus driver's need. Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/1642206481-30721-2-git-send-email-si-wei.liu@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-03-04 11:56:33 -05:00
John Garry	32e92d9f6f	iommu/iova: Separate out rcache init Currently the rcache structures are allocated for all IOVA domains, even if they do not use "fast" alloc+free interface. This is wasteful of memory. In addition, fails in init_iova_rcaches() are not handled safely, which is less than ideal. Make "fast" users call a separate rcache init explicitly, which includes error checking. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/1643882360-241739-1-git-send-email-john.garry@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-02-14 15:43:15 +01:00
Linus Torvalds	3bf6a9e36e	virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes partial support for < MAX_ORDER - 1 granularity for virtio-mem driver_override for vdpa sysfs ABI documentation for vdpa multiqueue config support for mlx5 vdpa Misc fixes, cleanups. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmHiDHkPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpVT4H/3Veixt3uYPOmuLU2tSx+8X+sFTtik81hyiE okz5fRJrxxA8SqS76FnmO10FS4hlPOGNk0Z5WVhr0yihwFvPLvpCM/xi2Lmrz9I7 pB0sXOIocEL1xApsxukR9K1Twpb2hfYsflbJYUVlRfhS5G0izKJNZp5I7OPrzd80 vVNNDWKW2iLDlfqsavumI4Kvm4nsFuCHG03jzMtcIa7YTXYV3DORD4ZGFFVUOIQN t5F74TznwHOeYgJeg7TzjFjfPWmXjLetvx10QX1A1uOvwppWW/QY6My0UafTXNXj VB3gOwJPf+gxXAXl/4bafq4NzM0xys6cpcPpjvhmU+erY4UuyAU= =Y1eO -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: "virtio,vdpa,qemu_fw_cfg: features, cleanups, and fixes. - partial support for < MAX_ORDER - 1 granularity for virtio-mem - driver_override for vdpa - sysfs ABI documentation for vdpa - multiqueue config support for mlx5 vdpa - and misc fixes, cleanups" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (42 commits) vdpa/mlx5: Fix tracking of current number of VQs vdpa/mlx5: Fix is_index_valid() to refer to features vdpa: Protect vdpa reset with cf_mutex vdpa: Avoid taking cf_mutex lock on get status vdpa/vdpa_sim_net: Report max device capabilities vdpa: Use BIT_ULL for bit operations vdpa/vdpa_sim: Configure max supported virtqueues vdpa/mlx5: Report max device capabilities vdpa: Support reporting max device capabilities vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps() vdpa: Add support for returning device configuration information vdpa/mlx5: Support configuring max data virtqueue vdpa/mlx5: Fix config_attr_mask assignment vdpa: Allow to configure max data virtqueues vdpa: Read device configuration only if FEATURES_OK vdpa: Sync calls set/get config/status with cf_mutex vdpa/mlx5: Distribute RX virtqueues in RQT object vdpa: Provide interface to read driver features vdpa: clean up get_config_size ret value handling virtio_ring: mark ring unused on error ...	2022-01-18 10:05:48 +02:00
Eli Cohen	b03fc43e73	vdpa/mlx5: Fix tracking of current number of VQs Modify the code such that ndev->cur_num_vqs better reflects the actual number of data virtqueues. The value can be accurately realized after features have been negotiated. This is to prevent possible failures when modifying the RQT object if the cur_num_vqs bears invalid value. No issue was actually encountered but this also makes the code more readable. Fixes: c5a5cd3d3217 ("vdpa/mlx5: Support configuring max data virtqueue") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220111183400.38418-5-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-01-14 18:50:54 -05:00
Eli Cohen	f8ae3a489b	vdpa/mlx5: Fix is_index_valid() to refer to features Make sure the decision whether an index received through a callback is valid or not consults the negotiated features. The motivation for this was due to a case encountered where I shut down the VM. After the reset operation was called features were already clear, I got get_vq_state() call which caused out array bounds access since is_index_valid() reported the index value. So this is more of not hit a bug since the call shouldn't have been made first place. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220111183400.38418-4-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-01-14 18:50:54 -05:00
Eli Cohen	f6d955d808	vdpa: Avoid taking cf_mutex lock on get status Avoid the wrapper holding cf_mutex since it is not protecting anything. To avoid confusion and unnecessary overhead incurred by it, remove. Fixes: f489f27bc0ab ("vdpa: Sync calls set/get config/status with cf_mutex") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220111183400.38418-2-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-01-14 18:50:54 -05:00
Eli Cohen	b2ce6197c9	vdpa/vdpa_sim_net: Report max device capabilities Configure max supported virtqueues features on the management device. This info can be retrieved using: $ vdpa mgmtdev show vdpasim_net: supported_classes net max_supported_vqs 2 dev_features MAC ANY_LAYOUT VERSION_1 ACCESS_PLATFORM Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-15-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-01-14 18:50:54 -05:00
Eli Cohen	47a1401ac9	vdpa: Use BIT_ULL for bit operations All masks in this file are 64 bits. Change BIT to BIT_ULL. Other occurences use (1 << val) which yields a 32 bit value. Change them to use BIT_ULL too. Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-14-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-01-14 18:50:54 -05:00
Eli Cohen	cbe777e98b	vdpa/vdpa_sim: Configure max supported virtqueues Configure max supported virtqueues on the management device. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-13-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-01-14 18:50:54 -05:00
Eli Cohen	79de65edf8	vdpa/mlx5: Report max device capabilities Configure max supported virtqueues and features on the management device. This info can be retrieved using: $ vdpa mgmtdev show auxiliary/mlx5_core.sf.1: supported_classes net max_supported_vqs 257 dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-12-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>	2022-01-14 18:50:54 -05:00
Eli Cohen	cd2629f6df	vdpa: Support reporting max device capabilities Add max_supported_vqs and supported_features fields to struct vdpa_mgmt_dev. Upstream drivers need to feel these values according to the device capabilities. These values are reported back in a netlink message when showing management devices. Examples: $ auxiliary/mlx5_core.sf.1: supported_classes net max_supported_vqs 257 dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM $ vdpa -j mgmtdev show {"mgmtdev":{"auxiliary/mlx5_core.sf.1":{"supported_classes":["net"], \ "max_supported_vqs":257,"dev_features":["CSUM","GUEST_CSUM","MTU", \ "HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR", \ "VERSION_1","ACCESS_PLATFORM"]}}} $ vdpa -jp mgmtdev show { "mgmtdev": { "auxiliary/mlx5_core.sf.1": { "supported_classes": [ "net" ], "max_supported_vqs": 257, "dev_features": ["CSUM","GUEST_CSUM","MTU","HOST_TSO4", \ "HOST_TSO6","STATUS","CTRL_VQ","MQ", \ "CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM"] } } } Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-11-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>	2022-01-14 18:50:54 -05:00
Eli Cohen	37e07e7058	vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps() Restore ndev->cur_num_vqs to the original value in case change_num_qps() fails. Fixes: `52893733f2` ("vdpa/mlx5: Add multiqueue support") Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-10-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-14 18:50:54 -05:00
Eli Cohen	612f330ec5	vdpa: Add support for returning device configuration information Add netlink attribute to store the negotiated features. This can be used by userspace to get the current state of the vdpa instance. Examples: $ vdpa dev config show vdpa-a vdpa-a: mac 00:00:00:00:88:88 link up link_announce false max_vq_pairs 16 mtu 1500 negotiated_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS \ CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM $ vdpa -j dev config show vdpa-a {"config":{"vdpa-a":{"mac":"00:00:00:00:88:88","link ":"up","link_announce":false, \ "max_vq_pairs":16,"mtu":1500,"negotiated_features":["CSUM","GUEST_CSUM","MTU","MAC", \ "HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR","VERSION_1", \ "ACCESS_PLATFORM"]}}} $ vdpa -jp dev config show vdpa-a { "config": { "vdpa-a": { "mac": "00:00:00:00:88:88", "link ": "up", "link_announce ": false, "max_vq_pairs": 16, "mtu": 1500, "negotiated_features": [ "CSUM","GUEST_CSUM","MTU","MAC","HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ", \ "CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM" ] } } } Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-9-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-01-14 18:50:53 -05:00
Eli Cohen	75560522ea	vdpa/mlx5: Support configuring max data virtqueue Check whether the max number of data virtqueue pairs was provided when a adding a new device and verify the new value does not exceed device capabilities. In addition, change the arrays holding virtqueue and callback contexts to be dynamically allocated. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-8-elic@nvidia.com Includes fixup: vdpa/mlx5: fix error handling in mlx5_vdpa_dev_add() Clang build fails with mlx5_vnet.c:2574:6: error: variable 'mvdev' is used uninitialized whenever 'if' condition is true if (!ndev->vqs \|\| !ndev->event_cbs) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ mlx5_vnet.c:2660:14: note: uninitialized use occurs here put_device(&mvdev->vdev.dev); ^~~~~ This because mvdev is set after trying to allocate ndev->vqs,event_cbs. So move the allocation to after mvdev is set but before the arrays are used in init_mvqs() Signed-off-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20220107211352.3940570-1-trix@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Includes fixup: vdpa/mlx5: fix endian-ness for max vqs sparse warnings: (new ones prefixed by >>) >> drivers/vdpa/mlx5/net/mlx5_vnet.c:1247:23: sparse: sparse: cast to restricted __le16 >> drivers/vdpa/mlx5/net/mlx5_vnet.c:1247:23: sparse: sparse: cast from restricted __virtio16 > 1247 num = le16_to_cpu(ndev->config.max_virtqueue_pairs); Address this using the appropriate wrapper. Cc: "Eli Cohen" <elic@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eli Cohen <elic@nvidia.com>	2022-01-14 18:50:53 -05:00
Eli Cohen	e3137056e6	vdpa/mlx5: Fix config_attr_mask assignment Fix VDPA_ATTR_DEV_NET_CFG_MACADDR assignment to be explicit 64 bit assignment. No issue was seen since the value is well below 64 bit max value. Nevertheless it needs to be fixed. Fixes: `a007d94004` ("vdpa/mlx5: Support configuration of MAC") Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-7-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-14 18:50:53 -05:00
Eli Cohen	aba21aff77	vdpa: Allow to configure max data virtqueues Add netlink support to configure the max virtqueue pairs for a device. At least one pair is required. The maximum is dictated by the device. Example: $ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 max_vqp 4 Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-6-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-14 18:50:53 -05:00
Eli Cohen	30ef7a8ac8	vdpa: Read device configuration only if FEATURES_OK Avoid reading device configuration during feature negotiation. Read device status and verify that VIRTIO_CONFIG_S_FEATURES_OK is set. Protect the entire operation, including configuration read with cf_mutex to ensure integrity of the results. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-5-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-01-14 18:50:53 -05:00
Eli Cohen	73bc0dbb59	vdpa: Sync calls set/get config/status with cf_mutex Add wrappers to get/set status and protect these operations with cf_mutex to serialize these operations with respect to get/set config operations. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-4-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-14 18:50:53 -05:00
Eli Cohen	a7f46ba424	vdpa/mlx5: Distribute RX virtqueues in RQT object Distribute the available rx virtqueues amongst the available RQT entries. RQTs require to have a power of two entries. When creating or modifying the RQT, use the lowest number of power of two entries that is not less than the number of rx virtqueues. Distribute them in the available entries such that some virtqueus may be referenced twice. This allows to configure any number of virtqueue pairs when multiqueue is used. Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-3-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-14 18:50:53 -05:00
Eli Cohen	a64917bc2e	vdpa: Provide interface to read driver features Provide an interface to read the negotiated features. This is needed when building the netlink message in vdpa_dev_net_config_fill(). Also fix the implementation of vdpa_dev_net_config_fill() to use the negotiated features instead of the device features. To make APIs clearer, make the following name changes to struct vdpa_config_ops so they better describe their operations: get_features -> get_device_features set_features -> set_driver_features Finally, add get_driver_features to return the negotiated features and add implementation to all the upstream drivers. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-2-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-14 18:50:53 -05:00
Eli Cohen	97143b70aa	vdpa/mlx5: Fix wrong configuration of virtio_version_1_0 Remove overriding of virtio_version_1_0 which forced the virtqueue object to version 1. Fixes: `1a86b377aa` ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20211230142024.142979-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>	2022-01-14 18:50:53 -05:00
Christophe JAILLET	10aa250b2f	eni_vdpa: Simplify 'eni_vdpa_probe()' When 'pcim_enable_device()' is used, some resources become automagically managed. There is no need to call 'pci_free_irq_vectors()' when the driver is removed. The same will already be done by 'pcim_release()'. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/02045bdcbbb25f79bae4827f66029cfcddc90381.1636301587.git.christophe.jaillet@wanadoo.fr Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-01-14 18:50:52 -05:00
Eli Cohen	60af39c1f4	net/mlx5_vdpa: Offer VIRTIO_NET_F_MTU when setting MTU Make sure to offer VIRTIO_NET_F_MTU since we configure the MTU based on what was queried from the device. This allows the virtio driver to allocate large enough buffers based on the reported MTU. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20211124170949.51725-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>	2022-01-14 18:50:52 -05:00
Stefano Garzarella	539fec78ed	vdpa: add driver_override support `driver_override` allows to control which of the vDPA bus drivers binds to a vDPA device. If `driver_override` is not set, the previous behaviour is followed: devices use the first vDPA bus driver loaded (unless auto binding is disabled). Tested on Fedora 34 with driverctl(8): $ modprobe virtio-vdpa $ modprobe vhost-vdpa $ modprobe vdpa-sim-net $ vdpa dev add mgmtdev vdpasim_net name dev1 # dev1 is attached to the first vDPA bus driver loaded $ driverctl -b vdpa list-devices dev1 virtio_vdpa $ driverctl -b vdpa set-override dev1 vhost_vdpa $ driverctl -b vdpa list-devices dev1 vhost_vdpa [*] Note: driverctl(8) integrates with udev so the binding is preserved. Suggested-by: Jason Wang <jasowang@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20211126164753.181829-3-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-14 18:50:52 -05:00
Zhu Lingshan	0f420c383a	ifcvf/vDPA: fix misuse virtio-net device config size for blk dev This commit fixes a misuse of virtio-net device config size issue for virtio-block devices. A new member config_size in struct ifcvf_hw is introduced and would be initialized through vdpa_dev_add() to record correct device config size. To be more generic, rename ifcvf_hw.net_config to ifcvf_hw.dev_config, the helpers ifcvf_read/write_net_config() to ifcvf_read/write_dev_config() Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Reported-and-suggested-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Fixes: `6ad31d162a` ("vDPA/ifcvf: enable Intel C5000X-PL virtio-block for vDPA") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211201081255.60187-1-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-14 18:50:52 -05:00
Guanjun	b4d80c8dda	vduse: moving kvfree into caller This free action should be moved into caller 'vduse_ioctl' in concert with the allocation. No functional change. Signed-off-by: Guanjun <guanjun@linux.alibaba.com> Link: https://lore.kernel.org/r/1638780498-55571-1-git-send-email-guanjun@linux.alibaba.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-14 18:50:52 -05:00
Linus Torvalds	13eaa5bda0	IOMMU Updates for Linux v5.17 Including: - Identity domain support for virtio-iommu - Move flush queue code into iommu-dma - Some fixes for AMD IOMMU suspend/resume support when x2apic is used - Arm SMMU Updates from Will Deacon: - Revert evtq and priq back to their former sizes - Return early on short-descriptor page-table allocation failure - Fix page fault reporting for Adreno GPU on SMMUv2 - Make SMMUv3 MMU notifier ops 'const' - Numerous new compatible strings for Qualcomm SMMUv2 implementations - Various smaller fixes and cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmHfBvsACgkQK/BELZcB GuOURRAAqHBTUUgT/f2dJbVsWMOxSx0dhDnKuLRhAREMbmcn88tR8dxPRPYB1/6q yyFmkh13UzEr1gosEd34P5G8GS6fMD/G4yz8p9tK6Mqu7UeU+DoiDIi3s194WhYa iNPBlGgQ3XznTrbBnFV+n/LjvkxSguNfjj809Jiw7Ew3i50K4GFnzRA+IZVAT4+F zdNmwY12Y0v4SCfHKiVR1ZB9kcn2/Dx78dlBQ6ALFuejeZGa2OVr94byKnfz+AJh OssrgRcgUKclWbMk4Tljf4/FIdwkLMKD6gieBFb3sNKTUpzkG8elYmETO5d+hM+l 27CKOCz1OOqVRzKe3r5n3c0wf9UAmi0Q91zW+UVZp2i0GjxpfIeIS9/NwRy+IXHS U9zybU47q10WF6cVO0n6wWHPRbjPii2OZpjqhSTq57qsnniCPLwkrry9H2fP71zz NDAZv5qvHCvRF7QoZfkBvCCJ12ZhNnhZqTfZR2wGGITMIk6dokG4NCsU93rSVKvZ 4xQDPm45rECmunibdc9c1vrifKC7BIWCSU5DH3AEDBU/i9QfYpVPXfJlGdz3enIV /FA+kcvYrh21sokly/TqiZXGSaOFBqFEN13KJReXOgbENNq6kT/4lSNjK5Q1WSWp qDq12EQyv0RtTEcKVDpRVunI+/G5MquO8gbIrVmsRV0SU1Z0yoM= =Q8LV -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - Identity domain support for virtio-iommu - Move flush queue code into iommu-dma - Some fixes for AMD IOMMU suspend/resume support when x2apic is used - Arm SMMU Updates from Will Deacon: - Revert evtq and priq back to their former sizes - Return early on short-descriptor page-table allocation failure - Fix page fault reporting for Adreno GPU on SMMUv2 - Make SMMUv3 MMU notifier ops 'const' - Numerous new compatible strings for Qualcomm SMMUv2 implementations - Various smaller fixes and cleanups * tag 'iommu-updates-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (38 commits) iommu/iova: Temporarily include dma-mapping.h from iova.h iommu: Move flush queue data into iommu_dma_cookie iommu/iova: Move flush queue code to iommu-dma iommu/iova: Consolidate flush queue code iommu/vt-d: Use put_pages_list iommu/amd: Use put_pages_list iommu/amd: Simplify pagetable freeing iommu/iova: Squash flush_cb abstraction iommu/iova: Squash entry_dtor abstraction iommu/iova: Fix race between FQ timeout and teardown iommu/amd: Fix typo in glues … together in comment iommu/vt-d: Remove unused dma_to_mm_pfn function iommu/vt-d: Drop duplicate check in dma_pte_free_pagetable() iommu/vt-d: Use bitmap_zalloc() when applicable iommu/amd: Remove useless irq affinity notifier iommu/amd: X2apic mode: mask/unmask interrupts on suspend/resume iommu/amd: X2apic mode: setup the INTX registers on mask/unmask iommu/amd: X2apic mode: re-enable after resume iommu/amd: Restore GA log/tail pointer on host resume iommu/iova: Move fast alloc size roundup into alloc_iova_fast() ...	2022-01-12 16:15:51 -08:00
Linus Torvalds	6dc69d3d0d	driver core changes for 5.17-rc1 Here is the set of changes for the driver core for 5.17-rc1. Lots of little things here, including: - kobj_type cleanups - auxiliary_bus documentation updates - auxiliary_device conversions for some drivers (relevant subsystems all have provided acks for these) - kernfs lock contention reduction for some workloads - other tiny cleanups and changes. All of these have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYd7deA8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ym8ngCgw0ANwrRPE5b1dthEmfU2f8Knk5kAn0pHQv6R VRZJypgNfU/Pt0ykstZD =CO9J -----END PGP SIGNATURE----- Merge tag 'driver-core-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of changes for the driver core for 5.17-rc1. Lots of little things here, including: - kobj_type cleanups - auxiliary_bus documentation updates - auxiliary_device conversions for some drivers (relevant subsystems all have provided acks for these) - kernfs lock contention reduction for some workloads - other tiny cleanups and changes. All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (43 commits) kobject documentation: remove default_attrs information drivers/firmware: Add missing platform_device_put() in sysfb_create_simplefb debugfs: lockdown: Allow reading debugfs files that are not world readable driver core: Make bus notifiers in right order in really_probe() driver core: Move driver_sysfs_remove() after driver_sysfs_add() firmware: edd: remove empty default_attrs array firmware: dmi-sysfs: use default_groups in kobj_type qemu_fw_cfg: use default_groups in kobj_type firmware: memmap: use default_groups in kobj_type sh: sq: use default_groups in kobj_type headers/uninline: Uninline single-use function: kobject_has_children() devtmpfs: mount with noexec and nosuid driver core: Simplify async probe test code by using ktime_ms_delta() nilfs2: use default_groups in kobj_type kobject: remove kset from struct kset_uevent_ops callbacks driver core: make kobj_type constant. driver core: platform: document registration-failure requirement vdpa/mlx5: Use auxiliary_device driver data helpers net/mlx5e: Use auxiliary_device driver data helpers soundwire: intel: Use auxiliary_device driver data helpers ...	2022-01-12 11:11:34 -08:00
Joerg Roedel	66dc1b791c	Merge branches 'arm/smmu', 'virtio', 'x86/amd', 'x86/vt-d' and 'core' into next	2022-01-04 10:33:45 +01:00
David E. Box	45e3a27984	vdpa/mlx5: Use auxiliary_device driver data helpers Use auxiliary_get_drvdata and auxiliary_set_drvdata helpers. Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20211221235852.323752-5-david.e.box@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-22 13:59:01 +01:00
John Garry via iommu	972bf252f8	iommu/iova: Move fast alloc size roundup into alloc_iova_fast() It really is a property of the IOVA rcache code that we need to alloc a power-of-2 size, so relocate the functionality to resize into alloc_iova_fast(), rather than the callsites. Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/1638875846-23993-1-git-send-email-john.garry@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-12-17 09:10:40 +01:00
Parav Pandit	bb47620be3	vdpa: Consider device id larger than 31 virtio device id value can be more than 31. Hence, use BIT_ULL in assignment. Fixes: `33b347503f` ("vdpa: Define vdpa mgmt device, ops and a netlink interface") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Parav Pandit <parav@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211130042949.88958-1-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-12-08 15:41:50 -05:00
Dan Carpenter	dc1db0060c	vduse: check that offset is within bounds in get_config() This condition checks "len" but it does not check "offset" and that could result in an out of bounds read if "offset > dev->config_size". The problem is that since both variables are unsigned the "dev->config_size - offset" subtraction would result in a very high unsigned value. I think these checks might not be necessary because "len" and "offset" are supposed to already have been validated using the vhost_vdpa_config_validate() function. But I do not know the code perfectly, and I like to be safe. Fixes: `c8a6153b6c` ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211208150956.GA29160@kili Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: stable@vger.kernel.org	2021-12-08 14:53:15 -05:00
Dan Carpenter	ff9f9c6e74	vduse: fix memory corruption in vduse_dev_ioctl() The "config.offset" comes from the user. There needs to a check to prevent it being out of bounds. The "config.offset" and "dev->config_size" variables are both type u32. So if the offset if out of bounds then the "dev->config_size - config.offset" subtraction results in a very high u32 value. The out of bounds offset can result in memory corruption. Fixes: `c8a6153b6c` ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211208103307.GA3778@kili Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: stable@vger.kernel.org	2021-12-08 14:53:15 -05:00
Longpeng	bb93ce4b15	vdpa_sim: avoid putting an uninitialized iova_domain The system will crash if we put an uninitialized iova_domain, this could happen when an error occurs before initializing the iova_domain in vdpasim_create(). BUG: kernel NULL pointer dereference, address: 0000000000000000 ... RIP: 0010:__cpuhp_state_remove_instance+0x96/0x1c0 ... Call Trace: <TASK> put_iova_domain+0x29/0x220 vdpasim_free+0xd1/0x120 [vdpa_sim] vdpa_release_dev+0x21/0x40 [vdpa] device_release+0x33/0x90 kobject_release+0x63/0x160 vdpasim_create+0x127/0x2a0 [vdpa_sim] vdpasim_net_dev_add+0x7d/0xfe [vdpa_sim_net] vdpa_nl_cmd_dev_add_set_doit+0xe1/0x1a0 [vdpa] genl_family_rcv_msg_doit+0x112/0x140 genl_rcv_msg+0xdf/0x1d0 ... So we must make sure the iova_domain is already initialized before put it. In addition, we may get the following warning in this case: WARNING: ... drivers/iommu/iova.c:344 iova_cache_put+0x58/0x70 So we must make sure the iova_cache_put() is invoked only if the iova_cache_get() is already invoked. Let's fix it together. Cc: stable@vger.kernel.org Fixes: `4080fc1067` ("vdpa_sim: use iova module to allocate IOVA addresses") Signed-off-by: Longpeng <longpeng2@huawei.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20211124015215.119-1-longpeng2@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-24 19:00:29 -05:00
Linus Torvalds	43e1b12927	vhost,virtio,vhost: fixes,features Hardening work by Jason vdpa driver for Alibaba ENI Performance tweaks for virtio blk virtio rng rework using an internal buffer mac/mtu programming for mlx5 vdpa Misc fixes, cleanups Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmF/su8PHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpo+sIAJjBTvbET+d0KuIMt8YBGsU+NWgHaJxW06hm GHZzinueZYWaS20MPMYMPfcHUgigWj0kk7F0gMljG5rtDFzR85JkOi7+e5V6RVJW 8SQc7JjA1Krde6EiPJdlv3mVkLz/5VbrGUgjAW9di/O04Xc/jAc9kmJ41wTYr0mL E5IsUi9QBmgHtKQ+2ofb9QEWuebJclGK+NVd3Kp08BtAQOrdn5UrIzY30/sjkZwE ul2ll6v/k7T3dCQbHD/wMgfFycPvoZVfxaVj+FWQnwdWkqZwzMRmpKPRes4KJfww Lfx1qox10mJzH7XJCs7xmOM8f7HgNNWi0gD5FxNlfkiz1AwHYOk= =DTXY -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: "vhost and virtio fixes and features: - Hardening work by Jason - vdpa driver for Alibaba ENI - Performance tweaks for virtio blk - virtio rng rework using an internal buffer - mac/mtu programming for mlx5 vdpa - Misc fixes, cleanups" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (45 commits) vdpa/mlx5: Forward only packets with allowed MAC address vdpa/mlx5: Support configuration of MAC vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit vdpa_sim_net: Enable user to set mac address and mtu vdpa: Enable user to set mac and mtu of vdpa device vdpa: Use kernel coding style for structure comments vdpa: Introduce query of device config layout vdpa: Introduce and use vdpa device get, set config helpers virtio-scsi: don't let virtio core to validate used buffer length virtio-blk: don't let virtio core to validate used length virtio-net: don't let virtio core to validate used length virtio_ring: validate used buffer length virtio_blk: correct types for status handling virtio_blk: allow 0 as num_request_queues i2c: virtio: Add support for zero-length requests virtio-blk: fixup coccinelle warnings virtio_ring: fix typos in vring_desc_extra virtio-pci: harden INTX interrupts virtio_pci: harden MSI-X interrupts virtio_config: introduce a new .enable_cbs method ...	2021-11-03 15:00:39 -07:00
Eli Cohen	540061ac79	vdpa/mlx5: Forward only packets with allowed MAC address Add rules to forward packets to the net device's TIR only if the destination MAC is equal to the configured MAC. This is required to prevent the netdevice from receiving traffic not destined to its configured MAC. Signed-off-by: Eli Cohen <elic@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Link: https://lore.kernel.org/r/20211026175519.87795-9-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-11-01 05:26:49 -04:00
Eli Cohen	a007d94004	vdpa/mlx5: Support configuration of MAC Add code to accept MAC configuration through vdpa tool. The MAC is written into the config struct and later can be retrieved through get_config(). Examples: 1. Configure MAC while adding a device: $ vdpa dev add mgmtdev pci/0000:06:00.2 name vdpa0 mac 00:11:22:33:44:55 2. Show configured params: $ vdpa dev config show vdpa0: mac 00:11:22:33:44:55 link down link_announce false max_vq_pairs 8 mtu 1500 Signed-off-by: Eli Cohen <elic@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211026175519.87795-8-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 05:26:49 -04:00
Parav Pandit	ef76eb83a1	vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit Cited patch in the fixes tag clears the features bit during reset. mlx5 vdpa device feature bits are static decided by device capabilities. These feature bits (including VIRTIO_NET_F_MAC) are initialized during device addition time. Clearing features bit in reset callback cleared the VIRTIO_NET_F_MAC. Due to this, MAC address provided by the device is not honored. Fix it by not clearing the static feature bits during reset. Fixes: `0686082dbf` ("vdpa: Add reset callback in vdpa_config_ops") Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20211026175519.87795-7-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-11-01 05:26:49 -04:00
Parav Pandit	1138b9818e	vdpa_sim_net: Enable user to set mac address and mtu Enable user to set the mac address and mtu so that each vdpa device can have its own user specified mac address and mtu. Now that user is enabled to set the mac address, remove the module parameter for same. And example of setting mac addr and mtu and view the configuration: $ vdpa mgmtdev show vdpasim_net: supported_classes net $ vdpa dev add name bar mgmtdev vdpasim_net mac 00:11:22:33:44:55 mtu 9000 $ vdpa dev config show bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000 Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20211026175519.87795-6-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 05:26:49 -04:00
Parav Pandit	d8ca2fa5be	vdpa: Enable user to set mac and mtu of vdpa device $ vdpa dev add name bar mgmtdev vdpasim_net mac 00:11:22:33:44:55 mtu 9000 $ vdpa dev config show bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000 $ vdpa dev config show -jp { "config": { "bar": { "mac": "00:11:22:33:44:55", "link ": "up", "link_announce ": false, "mtu": 9000, } } } Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211026175519.87795-5-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2021-11-01 05:26:49 -04:00
Parav Pandit	ad69dd0bf2	vdpa: Introduce query of device config layout Introduce a command to query a device config layout. An example query of network vdpa device: $ vdpa dev add name bar mgmtdev vdpasim_net $ vdpa dev config show bar: mac 00:35:09:19:48:05 link up link_announce false mtu 1500 $ vdpa dev config show -jp { "config": { "bar": { "mac": "00:35:09:19:48:05", "link ": "up", "link_announce ": false, "mtu": 1500, } } } Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211026175519.87795-3-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 05:26:49 -04:00
Parav Pandit	6dbb1f1687	vdpa: Introduce and use vdpa device get, set config helpers Subsequent patches enable get and set configuration either via management device or via vdpa device' config ops. This requires synchronization between multiple callers to get and set config callbacks. Features setting also influence the layout of the configuration fields endianness. To avoid exposing synchronization primitives to callers, introduce helper for setting the configuration and use it. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20211026175519.87795-2-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 05:26:49 -04:00
Eli Cohen	edf747affc	vdpa/mlx5: Propagate link status from device to vdpa driver Add code to register to hardware asynchronous events. Use this mechanism to track link status events coming from the device and update the config struct. After doing link status change, call the vdpa callback to notify of the link status change. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210909123635.30884-4-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-11-01 05:26:47 -04:00
Eli Cohen	218bdd20e5	vdpa/mlx5: Rename control VQ workqueue to vdpa wq A subesequent patch will use the same workqueue for executing other work not related to control VQ. Rename the workqueue and the work queue entry used to convey information to the workqueue. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210909123635.30884-3-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-11-01 05:26:47 -04:00
Eli Cohen	246fd1caf0	vdpa/mlx5: Remove mtu field from vdpa net device No need to save the mtu int the net device struct. We can save it in the config struct which cannot be modified. Moreover, move the initialization to. mlx5_vdpa_set_features() callback is not the right place to put it. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210909123635.30884-2-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 05:26:47 -04:00
Wu Zongyong	e85087beed	eni_vdpa: add vDPA driver for Alibaba ENI This patch adds a new vDPA driver for Alibaba ENI(Elastic Network Interface) which is build upon virtio 0.9.5 specification. And this driver is only enabled on X86 host currently. Link: https://lore.kernel.org/r/6a9f32c00609af16bbb2ea32e633b3beb1cbf84b.1635493219.git.wuzongyong@linux.alibaba.com Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com> Link: https://lore.kernel.org/r/20211026083214.3375383-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> # fix Kconfig typo Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 05:23:41 -04:00
Wu Zongyong	e47be840e8	vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE This attribute advertises the min value of virtqueue size. The value is 1 by default. Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com> Link: https://lore.kernel.org/r/2bbc417355c4d22298050b1ba887cecfbde3e85d.1635493219.git.wuzongyong@linux.alibaba.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 04:30:35 -04:00
Wu Zongyong	c53e5d1b5e	vdpa: min vq num of vdpa device cannot be greater than max vq num Just failed to probe the vdpa device if the min virtqueue num returned by get_vq_num_min is greater than the max virtqueue num returned by get_vq_num_max. Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/21199b62cc10b2a9f2cf90eeb63ad080645d881f.1635493219.git.wuzongyong@linux.alibaba.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 04:30:35 -04:00
Wu Zongyong	5bbfea1eac	vp_vdpa: add vq irq offloading support This patch implements the get_vq_irq() callback for virtio pci devices to allow irq offloading. Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/bb091e5505db704dd620f8854a7aebc921d2a752.1635493219.git.wuzongyong@linux.alibaba.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 04:30:34 -04:00
Jakub Kicinski	7df621a3ee	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net include/net/sock.h `7b50ecfcc6` ("net: Rename ->stream_memory_read to ->sock_is_readable") `4c1e34c0db` ("vsock: Enable y2038 safe timeval for timeout") drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c `0daa55d033` ("octeontx2-af: cn10k: debugfs for dumping LMTST map table") `e77bcdd1f6` ("octeontx2-af: Display all enabled PF VF rsrc_alloc entries.") Adjacent code addition in both cases, keep both. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-28 10:43:58 -07:00
Xie Yongji	0943aacf5a	vduse: Fix race condition between resetting and irq injecting The interrupt might be triggered after a reset since there is no synchronization between resetting and irq injecting. And it might break something if the interrupt is delayed until a new round of device initialization. Fixes: `c8a6153b6c` ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210929083050.88-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-10-22 06:49:14 -04:00
Xie Yongji	1394103fd7	vduse: Disallow injecting interrupt before DRIVER_OK is set The interrupt callback should not be triggered before DRIVER_OK is set. Otherwise, it might break the virtio device driver. So let's add a check to avoid the unexpected behavior. Fixes: `c8a6153b6c` ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210923075722.98-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-10-22 06:49:14 -04:00
Aharon Landau	83fec3f12a	RDMA/mlx5: Replace struct mlx5_core_mkey by u32 key In mlx5_core and vdpa there is no use of mlx5_core_mkey members except for the key itself. As preparation for moving mlx5_core_mkey to mlx5_ib, the occurrences of struct mlx5_core_mkey in all modules except for mlx5_ib are replaced by a u32 key. Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2021-10-19 14:34:12 +03:00
Aharon Landau	c64674168b	RDMA/mlx5: Remove pd from struct mlx5_core_mkey There is no read of mkey->pd, only write. Remove it. Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2021-10-19 14:33:58 +03:00
Aharon Landau	062fd731e5	RDMA/mlx5: Remove size from struct mlx5_core_mkey mkey->size is already stored in ibmr->length, no need to store it here. Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2021-10-19 14:33:44 +03:00
Aharon Landau	cf6a8b1b24	RDMA/mlx5: Remove iova from struct mlx5_core_mkey iova is already stored in ibmr->iova, no need to store it here. Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2021-10-19 14:33:20 +03:00
Eli Cohen	759be8993b	vdpa/mlx5: Avoid executing set_vq_ready() if device is reset Avoid executing set_vq_ready() if the device has been reset. In such case, the features are cleared and cannot be used in conditional statements. Such reference happens is the function ctrl_vq_idx(). Fixes: `52893733f2` ("vdpa/mlx5: Add multiqueue support") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210909063738.46970-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-09-14 18:10:43 -04:00
Eli Cohen	ef12e4bf42	vdpa/mlx5: Clear ready indication for control VQ When clearing VQs ready indication for the data VQs, do the same for the control VQ. Fixes: `5262912ef3` ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210909063652.46880-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-09-14 18:10:43 -04:00
Xie Yongji	7bb5fb2073	vduse: Cleanup the old kernel states after reset failure We should cleanup the old kernel states e.g. interrupt callback no matter whether the userspace handle the reset correctly or not since virtio-vdpa can't handle the reset failure now. Otherwise, the old state might be used after reset which might break something, e.g. the old interrupt callback might be triggered by userspace after reset, which can break the virtio device driver. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210906142158.181-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-09-14 18:10:43 -04:00
Dan Carpenter	6243e3c78a	vduse: missing error code in vduse_init() This should return -ENOMEM if alloc_workqueue() fails. Currently it returns success. Fixes: b66219796563 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20210907073223.GA18254@kili Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-09-14 18:10:42 -04:00
Linus Torvalds	78e709522d	virtio,vdpa,vhost: features, fixes vduse driver supporting blk virtio-vsock support for end of record with SEQPACKET vdpa: mac and mq support for ifcvf and mlx5 vdpa: management netlink for ifcvf virtio-i2c, gpio dt bindings misc fixes, cleanups NB: when merging this with `b542e383d8` ("eventfd: Make signal recursion protection a task bit") from Linus' tree, replace eventfd_signal_count with eventfd_signal_allowed, and drop the export of eventfd_wake_count from ("eventfd: Export eventfd_wake_count to modules"). Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmE1+awPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpt6EIAJy0qrc62lktNA0IiIVJSLbUbTMmFj8MzkGR 8UxZdhpjWqBPJPyaOuNeksAqTGm/UAPEYx3C2c95Jhej7anFpy7dbCtIXcPHLJME DjcJg+EDrlNCj8m0FcsHpHWsFzPMERJpyEZNxgB5WazQbv+yWhGrg2FN5DCnF0Ro ZFYeKSVty148pQ0nHl8X0JM2XMtqit+O+LvKN2HQZ+fubh7BCzMxzkHY0QLHIzUS UeZqd3Qm8YcbqnlX38P5D6k+NPiTEgknmxaBLkPxg6H3XxDAmaIRFb8Ldd1rsgy1 zTLGDiSGpVDIpawRnuEAzqJThV3Y5/MVJ1WD+mDYQ96tmhfp+KY= =DBH/ -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: - vduse driver ("vDPA Device in Userspace") supporting emulated virtio block devices - virtio-vsock support for end of record with SEQPACKET - vdpa: mac and mq support for ifcvf and mlx5 - vdpa: management netlink for ifcvf - virtio-i2c, gpio dt bindings - misc fixes and cleanups * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (39 commits) Documentation: Add documentation for VDUSE vduse: Introduce VDUSE - vDPA Device in Userspace vduse: Implement an MMU-based software IOTLB vdpa: Support transferring virtual addressing during DMA mapping vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap() vdpa: Add an opaque pointer for vdpa_config_ops.dma_map() vhost-iotlb: Add an opaque pointer for vhost IOTLB vhost-vdpa: Handle the failure of vdpa_reset() vdpa: Add reset callback in vdpa_config_ops vdpa: Fix some coding style issues file: Export receive_fd() to modules eventfd: Export eventfd_wake_count to modules iova: Export alloc_iova_fast() and free_iova_fast() virtio-blk: remove unneeded "likely" statements virtio-balloon: Use virtio_find_vqs() helper vdpa: Make use of PFN_PHYS/PFN_UP/PFN_DOWN helper macro vsock_test: update message bounds test for MSG_EOR af_vsock: rename variables in receive loop virtio/vsock: support MSG_EOR bit processing vhost/vsock: support MSG_EOR bit processing ...	2021-09-11 14:48:42 -07:00
Xie Yongji	7bc7f61897	Documentation: Add documentation for VDUSE VDUSE (vDPA Device in Userspace) is a framework to support implementing software-emulated vDPA devices in userspace. This document is intended to clarify the VDUSE design and usage. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210831103634.33-14-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-06 07:20:58 -04:00
Xie Yongji	c8a6153b6c	vduse: Introduce VDUSE - vDPA Device in Userspace This VDUSE driver enables implementing software-emulated vDPA devices in userspace. The vDPA device is created by ioctl(VDUSE_CREATE_DEV) on /dev/vduse/control. Then a char device interface (/dev/vduse/$NAME) is exported to userspace for device emulation. In order to make the device emulation more secure, the device's control path is handled in kernel. A message mechnism is introduced to forward some dataplane related control messages to userspace. And in the data path, the DMA buffer will be mapped into userspace address space through different ways depending on the vDPA bus to which the vDPA device is attached. In virtio-vdpa case, the MMU-based software IOTLB is used to achieve that. And in vhost-vdpa case, the DMA buffer is reside in a userspace memory region which can be shared to the VDUSE userspace processs via transferring the shmfd. For more details on VDUSE design and usage, please see the follow-on Documentation commit. NB(mst): when merging this with `b542e383d8` ("eventfd: Make signal recursion protection a task bit") replace eventfd_signal_count with eventfd_signal_allowed, and drop the previous ("eventfd: Export eventfd_wake_count to modules"). Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210831103634.33-13-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-06 07:20:58 -04:00
Xie Yongji	8c773d53fb	vduse: Implement an MMU-based software IOTLB This implements an MMU-based software IOTLB to support mapping kernel dma buffer into userspace dynamically. The basic idea behind it is treating MMU (VA->PA) as IOMMU (IOVA->PA). The software IOTLB will set up MMU mapping instead of IOMMU mapping for the DMA transfer so that the userspace process is able to use its virtual address to access the dma buffer in kernel. To avoid security issue, a bounce-buffering mechanism is introduced to prevent userspace accessing the original buffer directly which may contain other kernel data. During the mapping, unmapping, the software IOTLB will copy the data from the original buffer to the bounce buffer and back, depending on the direction of the transfer. And the bounce-buffer addresses will be mapped into the user address space instead of the original one. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210831103634.33-12-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-06 07:20:58 -04:00
Xie Yongji	d8945ec411	vdpa: Support transferring virtual addressing during DMA mapping This patch introduces an attribute for vDPA device to indicate whether virtual address can be used. If vDPA device driver set it, vhost-vdpa bus driver will not pin user page and transfer userspace virtual address instead of physical address during DMA mapping. And corresponding vma->vm_file and offset will be also passed as an opaque pointer. Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210831103634.33-11-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-06 07:20:57 -04:00
Xie Yongji	c10fb9454a	vdpa: Add an opaque pointer for vdpa_config_ops.dma_map() Add an opaque pointer for DMA mapping. Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210831103634.33-9-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-06 07:20:57 -04:00
Xie Yongji	0686082dbf	vdpa: Add reset callback in vdpa_config_ops This adds a new callback to support device specific reset behavior. The vdpa bus driver will call the reset function instead of setting status to zero during resetting. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210831103634.33-6-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-06 07:20:57 -04:00
Xie Yongji	0d8c9e7d4b	vdpa_sim: Use iova_shift() for the size passed to alloc_iova() The size passed to alloc_iova() should be the size of page frames. Now we use byte granularity for the iova domain, so it's safe to pass the size in bytes to alloc_iova(). But it would be better to use iova_shift() for the size to avoid future bugs if we change granularity. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210809100923.38-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-09-05 16:23:08 -04:00
Eli Cohen	52893733f2	vdpa/mlx5: Add multiqueue support Multiqueue support requires additional virtio_net_q objects to be added or removed per the configured number of queue pairs. In addition the RQ tables needs to be modified to match the number of configured receive queues so the packets are dispatched to the right virtqueue according to the hash result. Note: qemu v6.0.0 is broken when the device requests more than two data queues; no net device will be created for the vdpa device. To avoid this, one should specify mq=off to qemu. In this case it will end up with a single queue. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210823052123.14909-7-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-05 16:23:08 -04:00
Eli Cohen	5262912ef3	vdpa/mlx5: Add support for control VQ and MAC setting Add support to handle control virtqueue configurations per virtio specification. The control virtqueue is implemented in software and no hardware offloading is involved. Control VQ configuration need task context, therefore all configurations are handled in a workqueue created for the purpose. Modifications are made to the memory registration code to allow for saving a copy of itolb to be used by the control VQ to access the vring. The max number of data virtqueus supported by the driver has been updated to 2 since multiqueue is not supported at this stage and we need to ensure consistency of VQ indices mapping to either data or control VQ. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210823052123.14909-6-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-05 16:23:08 -04:00
Eli Cohen	e4fc66508c	vdpa/mlx5: Ensure valid indices are provided Following patches add control virtuqeue and multiqueue support. We want to verify that the index value to callbacks referencing a virtqueue is valid. The logic defining valid indices is as follows: CVQ clear: 0 and 1. CVQ set, MQ clear: 0, 1 and 2 CVQ set, MQ set: 0..nvq where nvq is whatever provided to _vdpa_register_device() Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210823052123.14909-5-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-05 16:23:07 -04:00
Eli Cohen	db296d252d	vdpa/mlx5: Decouple virtqueue callback from struct mlx5_vdpa_virtqueue Instead, define an array of struct vdpa_callback on struct mlx5_vdpa_net and use it to store callbacks for any virtqueue provided. This is required due to the fact that callback configurations arrive before feature negotiation. With control VQ and multiqueue introduced next we want to save the information until after feature negotiation where we know the CVQ index. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210823052123.14909-4-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-05 16:23:07 -04:00
Eli Cohen	ae0428debf	vdpa/mlx5: function prototype modifications in preparation to control VQ Use struct mlx5_vdpa_dev as an argument to setup_driver() and a few others in preparation to control virtqueue support in a subsequent patch. The control virtqueue is part of struct mlx5_vdpa_dev so this is required. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210823052123.14909-3-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-05 16:23:07 -04:00
Eli Cohen	4e57a9f622	vdpa/mlx5: Remove redundant header file inclusion linux/if_vlan.h is not required. Remove it. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210823052123.14909-2-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-05 16:23:07 -04:00
Zhu Lingshan	90d1936681	vDPA/ifcvf: enable multiqueue and control vq This commit enbales multi-queue and control vq features for ifcvf Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20210818095714.3220-3-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-09-05 16:23:07 -04:00
Zhu Lingshan	2ddae773c9	vDPA/ifcvf: detect and use the onboard number of queues directly To enable this multi-queue feature for ifcvf, this commit intends to detect and use the onboard number of queues directly than IFCVF_MAX_QUEUE_PAIRS = 1 (removed) Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20210818095714.3220-2-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-09-05 16:23:07 -04:00

1 2 3 4 5 ...

401 Commits