Commit Graph

1104701 Commits

Author SHA1 Message Date
Benjamin Coddington c3ed222745 NFSv4: Fix free of uninitialized nfs4_label on referral lookup.
Send along the already-allocated fattr along with nfs4_fs_locations, and
drop the memcpy of fattr.  We end up growing two more allocations, but this
fixes up a crash as:

PID: 790    TASK: ffff88811b43c000  CPU: 0   COMMAND: "ls"
 #0 [ffffc90000857920] panic at ffffffff81b9bfde
 #1 [ffffc900008579c0] do_trap at ffffffff81023a9b
 #2 [ffffc90000857a10] do_error_trap at ffffffff81023b78
 #3 [ffffc90000857a58] exc_stack_segment at ffffffff81be1f45
 #4 [ffffc90000857a80] asm_exc_stack_segment at ffffffff81c009de
 #5 [ffffc90000857b08] nfs_lookup at ffffffffa0302322 [nfs]
 #6 [ffffc90000857b70] __lookup_slow at ffffffff813a4a5f
 #7 [ffffc90000857c60] walk_component at ffffffff813a86c4
 #8 [ffffc90000857cb8] path_lookupat at ffffffff813a9553
 #9 [ffffc90000857cf0] filename_lookup at ffffffff813ab86b

Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Fixes: 9558a007db ("NFS: Remove the label from the nfs4_lookup_res struct")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-31 17:09:24 -04:00
Saeed Mahameed 1c5de097be net/mlx5: Fix mlx5_get_next_dev() peer device matching
In some use-cases, mlx5 instances will need to search for their peer
device (the other port on the same HCA). For that, mlx5 device matching
mechanism relied on auxiliary_find_device() to search, and used a bad matching
callback function.

This approach has two issues:

1) next_phys_dev() the matching function, assumed all devices are
   of the type mlx5_adev (mlx5 auxiliary device) which is wrong and
   could lead to crashes, this worked for a while, since only lately
   other drivers started registering auxiliary devices.

2) using the auxiliary class bus (auxiliary_find_device) to search for
   mlx5_core_dev devices, who are actually PCIe device instances, is wrong.
   This works since mlx5_core always has at least one mlx5_adev instance
   hanging around in the aux bus.

As suggested by others we can fix 1. by comparing device names prefixes
if they have the string "mlx5_core" in them, which is not a best practice !
but even with that fixed, still 2. needs fixing, we are trying to
match pcie device peers so we should look in the right bus (pci bus),
hence this fix.

The fix:
1) search the pci bus for mlx5 peer devices, instead of the aux bus
2) to validated devices are the same type "mlx5_core_dev" compare if
   they have the same driver, which is bulletproof.

   This wouldn't have worked with the aux bus since the various mlx5 aux
   device types don't share the same driver, even if they share the same device
   wrapper struct (mlx5_adev) "which helped to find the parent device"

Fixes: a925b5e309 ("net/mlx5: Register mlx5 devices to auxiliary virtual bus")
Reported-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reported-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maher Sanalla <msanalla@nvidia.com>
2022-05-31 13:40:55 -07:00
Maxim Mikityanskiy f6279f113a net/mlx5e: Update netdev features after changing XDP state
Some features (LRO, HW GRO) conflict with XDP. If there is an attempt to
enable such features while XDP is active, they will be set to `off
[requested on]`. In order to activate these features after XDP is turned
off, the driver needs to call netdev_update_features(). This commit adds
this missing call after XDP state changes.

Fixes: cf6e34c8c2 ("net/mlx5e: Properly block LRO when XDP is enabled")
Fixes: b0617e7b35 ("net/mlx5e: Properly block HW GRO when XDP is enabled")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-31 13:40:55 -07:00
Changcheng Liu 3fc2a9e89b net/mlx5: correct ECE offset in query qp output
ECE field should be after opt_param_mask in query qp output.

Fixes: 6b646a7e4a ("net/mlx5: Add ability to read and write ECE options")
Signed-off-by: Changcheng Liu <jerrliu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-31 13:40:54 -07:00
Maxim Mikityanskiy 2e642afb61 net/mlx5e: Disable softirq in mlx5e_activate_rq to avoid race condition
When the driver activates the channels, it assumes NAPI isn't running
yet. mlx5e_activate_rq posts a NOP WQE to ICOSQ to trigger a hardware
interrupt and start NAPI, which will run mlx5e_alloc_rx_mpwqe and post
UMR WQEs to ICOSQ to be able to receive packets with striding RQ.

Unfortunately, a race condition is possible if NAPI is triggered by
something else (for example, TX) at a bad timing, before
mlx5e_activate_rq finishes. In this case, mlx5e_alloc_rx_mpwqe may post
UMR WQEs to ICOSQ, and with the bad timing, the wqe_info of the first
UMR may be overwritten by the wqe_info of the NOP posted by
mlx5e_activate_rq.

The consequence is that icosq->db.wqe_info[0].num_wqebbs will be changed
from MLX5E_UMR_WQEBBS to 1, disrupting the integrity of the array-based
linked list in wqe_info[]. mlx5e_poll_ico_cq will hang in an infinite
loop after processing wqe_info[0], because after the corruption, the
next item to be processed will be wqe_info[1], which is filled with
zeros, and `sqcc += wi->num_wqebbs` will never move further.

This commit fixes this race condition by using async_icosq to post the
NOP and trigger the interrupt. async_icosq is always protected with a
spinlock, eliminating the race condition.

Fixes: bc77b240b3 ("net/mlx5e: Add fragmented memory support for RX multi packet WQE")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reported-by: Karsten Nielsen <karsten@foo-bar.dk>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-31 13:40:54 -07:00
Paul Blakey 1f2856cde6 net/mlx5: CT: Fix header-rewrite re-use for tupels
Tuple entries that don't have nat configured for them
which are added to the ct nat table will always create
a new modify header, as we don't check for possible
re-use on them. The same for tuples that have nat configured
for them but are added to ct table.

Fix the above by only avoiding wasteful re-use lookup
for actually natted entries in ct nat table.

Fixes: 7fac5c2ece ("net/mlx5: CT: Avoid reusing modify header context for natted entries")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-31 13:40:54 -07:00
Maor Dickman 66cb64e292 net/mlx5e: TC NIC mode, fix tc chains miss table
The cited commit changed promisc table to be created on demand with the
highest priority in the NIC table replacing the vlan table, this caused
tc NIC tables miss flow to skip the prmoisc table because it use vlan
table as miss table.

OVS offload in NIC mode use promisc by default so any unicast packet
which will be handled by tc NIC tables miss flow will skip the promisc
rule and will be dropped.

Fix this by adding new empty table in new tc level with low priority and
point the nic tc chain miss to it, the new table is managed so it will
point to vlan table if promisc is disabled and to promisc table if enabled.

Fixes: 1c46d7409f ("net/mlx5e: Optimize promiscuous mode")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-31 13:40:53 -07:00
Leon Romanovsky 80b2bd737d net/mlx5: Don't use already freed action pointer
The call to mlx5dr_action_destroy() releases "action" memory. That
pointer is set to miss_action later and generates the following smatch
error:

 drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c:53 set_miss_action()
 warn: 'action' was already freed.

Make sure that the pointer is always valid by setting NULL after destroy.

Fixes: 6a48faeeca ("net/mlx5: Add direct rule fs_cmd implementation")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-31 13:40:53 -07:00
Sarthak Kukreti 4caae58406 dm verity: set DM_TARGET_IMMUTABLE feature flag
The device-mapper framework provides a mechanism to mark targets as
immutable (and hence fail table reloads that try to change the target
type). Add the DM_TARGET_IMMUTABLE flag to the dm-verity target's
feature flags to prevent switching the verity target with a different
target type.

Fixes: a4ffc15219 ("dm: add verity target")
Cc: stable@vger.kernel.org
Signed-off-by: Sarthak Kukreti <sarthakkukreti@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-05-31 16:22:30 -04:00
Enzo Matsumiya 0d5106a80e cifs: remove repeated debug message on cifs_put_smb_ses()
Similar message is printed a few lines later in the same function

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-31 15:09:51 -05:00
Lorenzo Pieralisi 833e53a4ff MAINTAINERS: Update Lorenzo Pieralisi's email address
I will soon lose my @arm.com email address, so to prevent any possible
issue let's update all kernel references (inclusive of .mailmap) to my
@kernel.org alias ahead of time.

My @arm.com address is still working and will likely resume to work at some
point in the future; nonetheless, it is safer to switch to the @kernel.org
alias from now onwards so that continuity is guaranteed.

Link: https://lore.kernel.org/r/20220528151411.29810-1-lorenzo.pieralisi@arm.com
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
2022-05-31 15:06:19 -05:00
Bjorn Helgaas 12068bb346 PCI/PM: Fix bridge_d3_blacklist[] Elo i2 overwrite of Gigabyte X299
92597f97a4 ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold") omitted
braces around the new Elo i2 entry, so it overwrote the existing Gigabyte
X299 entry.  Add the appropriate braces.

Found by:

  $ make W=1 drivers/pci/pci.o
    CC      drivers/pci/pci.o
  drivers/pci/pci.c:2974:12: error: initialized field overwritten [-Werror=override-init]
   2974 |   .ident = "Elo i2",
        |            ^~~~~~~~

Link: https://lore.kernel.org/r/20220526221258.GA409855@bhelgaas
Fixes: 92597f97a4 ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org  # v5.15+
2022-05-31 15:06:19 -05:00
Bjorn Helgaas f4fd559de3 Revert "PCI: brcmstb: Split brcm_pcie_setup() into two funcs"
This reverts commit 830aa6f29f.

This is part of a revert of the following commits:

  11ed8b8624 ("PCI: brcmstb: Do not turn off WOL regulators on suspend")
  93e41f3fca ("PCI: brcmstb: Add control of subdevice voltage regulators")
  67211aadcb ("PCI: brcmstb: Add mechanism to turn on subdev regulators")
  830aa6f29f ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs")

Cyril reported that 830aa6f29f ("PCI: brcmstb: Split brcm_pcie_setup()
into two funcs"), which appeared in v5.17-rc1, broke booting on the
Raspberry Pi Compute Module 4.  Apparently 830aa6f29f panics with an
Asynchronous SError Interrupt, and after further commits here is a black
screen on HDMI and no output on the serial console.

This does not seem to affect the Raspberry Pi 4 B.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215925
Link: https://lore.kernel.org/r/20220511201856.808690-5-helgaas@kernel.org
Reported-by: Cyril Brulebois <kibi@debian.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-05-31 15:06:19 -05:00
Bjorn Helgaas 420be2f7eb Revert "PCI: brcmstb: Add mechanism to turn on subdev regulators"
This reverts commit 67211aadcb.

This is part of a revert of the following commits:

  11ed8b8624 ("PCI: brcmstb: Do not turn off WOL regulators on suspend")
  93e41f3fca ("PCI: brcmstb: Add control of subdevice voltage regulators")
  67211aadcb ("PCI: brcmstb: Add mechanism to turn on subdev regulators")
  830aa6f29f ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs")

Cyril reported that 830aa6f29f ("PCI: brcmstb: Split brcm_pcie_setup()
into two funcs"), which appeared in v5.17-rc1, broke booting on the
Raspberry Pi Compute Module 4.  Apparently 830aa6f29f panics with an
Asynchronous SError Interrupt, and after further commits here is a black
screen on HDMI and no output on the serial console.

This does not seem to affect the Raspberry Pi 4 B.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215925
Link: https://lore.kernel.org/r/20220511201856.808690-4-helgaas@kernel.org
Reported-by: Cyril Brulebois <kibi@debian.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-05-31 15:06:19 -05:00
Bjorn Helgaas 212942609d Revert "PCI: brcmstb: Add control of subdevice voltage regulators"
This reverts commit 93e41f3fca.

This is part of a revert of the following commits:

  11ed8b8624 ("PCI: brcmstb: Do not turn off WOL regulators on suspend")
  93e41f3fca ("PCI: brcmstb: Add control of subdevice voltage regulators")
  67211aadcb ("PCI: brcmstb: Add mechanism to turn on subdev regulators")
  830aa6f29f ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs")

Cyril reported that 830aa6f29f ("PCI: brcmstb: Split brcm_pcie_setup()
into two funcs"), which appeared in v5.17-rc1, broke booting on the
Raspberry Pi Compute Module 4.  Apparently 830aa6f29f panics with an
Asynchronous SError Interrupt, and after further commits here is a black
screen on HDMI and no output on the serial console.

This does not seem to affect the Raspberry Pi 4 B.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215925
Link: https://lore.kernel.org/r/20220511201856.808690-3-helgaas@kernel.org
Reported-by: Cyril Brulebois <kibi@debian.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-05-31 15:06:19 -05:00
Bjorn Helgaas 7894025c78 Revert "PCI: brcmstb: Do not turn off WOL regulators on suspend"
This reverts commit 11ed8b8624.

This is part of a revert of the following commits:

  11ed8b8624 ("PCI: brcmstb: Do not turn off WOL regulators on suspend")
  93e41f3fca ("PCI: brcmstb: Add control of subdevice voltage regulators")
  67211aadcb ("PCI: brcmstb: Add mechanism to turn on subdev regulators")
  830aa6f29f ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs")

Cyril reported that 830aa6f29f ("PCI: brcmstb: Split brcm_pcie_setup()
into two funcs"), which appeared in v5.17-rc1, broke booting on the
Raspberry Pi Compute Module 4.  Apparently 830aa6f29f panics with an
Asynchronous SError Interrupt, and after further commits here is a black
screen on HDMI and no output on the serial console.

This does not seem to affect the Raspberry Pi 4 B.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215925
Link: https://lore.kernel.org/r/20220511201856.808690-2-helgaas@kernel.org
Reported-by: Cyril Brulebois <kibi@debian.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-05-31 15:06:19 -05:00
Mike Snitzer 9571f829f3 dm table: fix dm_table_supports_poll to return false if no data devices
It was reported that the "generic/250" test in xfstests (which uses
the dm-error target) demonstrates a regression where the kernel
crashes in bioset_exit().

Since commit cfc97abcbe ("dm: conditionally enable
BIOSET_PERCPU_CACHE for dm_io bioset") the bioset_init() for the dm_io
bioset will setup the bioset's per-cpu alloc cache if all devices have
QUEUE_FLAG_POLL set.

But there was an bug where a target that doesn't have any data devices
(and that doesn't even set the .iterate_devices dm target callback)
will incorrectly return true from dm_table_supports_poll().

Fix this by updating dm_table_supports_poll() to follow dm-table.c's
well-worn pattern for testing that _all_ targets in a DM table do in
fact have underlying devices that set QUEUE_FLAG_POLL.

NOTE: An additional block fix is still needed so that
bio_alloc_cache_destroy() clears the bioset's ->cache member.
Otherwise, a DM device's table reload that transitions the DM device's
bioset from using a per-cpu alloc cache to _not_ using one will result
in bioset_exit() crashing in bio_alloc_cache_destroy() because dm's
dm_io bioset ("io_bs") was left with a stale ->cache member.

Fixes: cfc97abcbe ("dm: conditionally enable BIOSET_PERCPU_CACHE for dm_io bioset")
Reported-by: Matthew Wilcox <willy@infradead.org>
Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-05-31 14:44:17 -04:00
Linus Torvalds e1cbc3b96a IOMMU Updates for Linux v5.19
Including:
 
 	- Intel VT-d driver updates
 	  - Domain force snooping improvement.
 	  - Cleanups, no intentional functional changes.
 
 	- ARM SMMU driver updates
 	  - Add new Qualcomm device-tree compatible strings
 	  - Add new Nvidia device-tree compatible string for Tegra234
 	  - Fix UAF in SMMUv3 shared virtual addressing code
 	  - Force identity-mapped domains for users of ye olde SMMU
 	    legacy binding
 	  - Minor cleanups
 
 	- Patches to fix a BUG_ON in the vfio_iommu_group_notifier
 	  - Groundwork for upcoming iommufd framework
 	  - Introduction of DMA ownership so that an entire IOMMU group
 	    is either controlled by the kernel or by user-space
 
 	- MT8195 and MT8186 support in the Mediatek IOMMU driver
 
 	- Patches to make forcing of cache-coherent DMA more coherent
 	  between IOMMU drivers
 
 	- Fixes for thunderbolt device DMA protection
 
 	- Various smaller fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmKWCbUACgkQK/BELZcB
 GuPHmRAAuoH9iK/jrC3SgrqpBfH2iRN7ovIX8dFvgbQWX27lhXF4gvj2/nYdIvPK
 75j/LmdibuzV3Iez4kjbGKNG1AikwK3dKIH21a84f3ctnoamQyL6nMfCVBFaVD/D
 kvPpTHyjbGPNf6KZyWQdkJ5DXD1aoG1DKkBnslH5pTNPqGuNqbcnRTg0YxiJFLBv
 5w2B6jL06XRzunh+Sp1Dbj+po8ROjLRCEU+tdrndO8W/Dyp6+ZNNuxL9/3BM9zMj
 py0M4piFtGnhmJSdym1eeHm7r1YRjkZw+MN+e8NcrcSihmDutEWo7nRRxA5uVaa+
 3O2DNERqCvQUYxfNRUOKwzV8v51GYQHEPhvOe/MLgaEQDmDmlF2dHNGm93eCMdrv
 m1cT011oU7pa4qHomwLyTJxSsR7FzJ37igq/WeY++MBhl+frqfzEQPVxF+W7GLb8
 QvT/+woCPzLVpJbE7s0FUD4nbPd8c1dAz4+HO1DajxILIOTq1bnPIorSjgXODRjq
 yzsiP1rAg0L0PsL7pXn3cPMzNCE//xtOsRsAGmaVv6wBoMLyWVFCU/wjPEdjrSWA
 nXpAuCL84uxCEl/KLYMsg9UhjT6ko7CuKdsybIG9zNIiUau43uSqgTen0xCpYt0i
 m//O/X3tPyxmoLKRW+XVehGOrBZW+qrQny6hk/Zex+6UJQqVMTA=
 =W0hj
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Intel VT-d driver updates:
     - Domain force snooping improvement.
     - Cleanups, no intentional functional changes.

 - ARM SMMU driver updates:
     - Add new Qualcomm device-tree compatible strings
     - Add new Nvidia device-tree compatible string for Tegra234
     - Fix UAF in SMMUv3 shared virtual addressing code
     - Force identity-mapped domains for users of ye olde SMMU legacy
       binding
     - Minor cleanups

 - Fix a BUG_ON in the vfio_iommu_group_notifier:
     - Groundwork for upcoming iommufd framework
     - Introduction of DMA ownership so that an entire IOMMU group is
       either controlled by the kernel or by user-space

 - MT8195 and MT8186 support in the Mediatek IOMMU driver

 - Make forcing of cache-coherent DMA more coherent between IOMMU
   drivers

 - Fixes for thunderbolt device DMA protection

 - Various smaller fixes and cleanups

* tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
  iommu/amd: Increase timeout waiting for GA log enablement
  iommu/s390: Tolerate repeat attach_dev calls
  iommu/vt-d: Remove hard coding PGSNP bit in PASID entries
  iommu/vt-d: Remove domain_update_iommu_snooping()
  iommu/vt-d: Check domain force_snooping against attached devices
  iommu/vt-d: Block force-snoop domain attaching if no SC support
  iommu/vt-d: Size Page Request Queue to avoid overflow condition
  iommu/vt-d: Fold dmar_insert_one_dev_info() into its caller
  iommu/vt-d: Change return type of dmar_insert_one_dev_info()
  iommu/vt-d: Remove unneeded validity check on dev
  iommu/dma: Explicitly sort PCI DMA windows
  iommu/dma: Fix iova map result check bug
  iommu/mediatek: Fix NULL pointer dereference when printing dev_name
  iommu: iommu_group_claim_dma_owner() must always assign a domain
  iommu/arm-smmu: Force identity domains for legacy binding
  iommu/arm-smmu: Support Tegra234 SMMU
  dt-bindings: arm-smmu: Add compatible for Tegra234 SOC
  dt-bindings: arm-smmu: Document nvidia,memory-controller property
  iommu/arm-smmu-qcom: Add SC8280XP support
  dt-bindings: arm-smmu: Add compatible for Qualcomm SC8280XP
  ...
2022-05-31 09:56:54 -07:00
Linus Torvalds 3335d55502 Microblaze patches for 5.19-rc1
- Fix issues with freestanding
 - Wire memblock_dump_all()
 - Add support for memory reservation from DT
 -----BEGIN PGP SIGNATURE-----
 
 iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCYpW+twAKCRDKSWXLKUoM
 IVdaAJ9xpATIDEDM4S6jq5EpjHbsexA7IgCdG1ewTy0ynVvP1sbVUZzkUaxbe/0=
 =6/ET
 -----END PGP SIGNATURE-----

Merge tag 'microblaze-v5.19' of git://git.monstr.eu/linux-2.6-microblaze

Pull microblaze updates from Michal Simek:

 - Fix issues with freestanding

 - Wire memblock_dump_all()

 - Add support for memory reservation from DT

* tag 'microblaze-v5.19' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: fix typos in comments
  microblaze: Add support for reserved memory defined by DT
  microblaze: Wire memblock_dump_all()
  microblaze: Use simple memmove/memcpy implementation from lib/string.c
  microblaze: Do loop unrolling for optimized memset implementation
  microblaze: Use simple memset implementation from lib/string.c
2022-05-31 09:46:23 -07:00
Mike Christie b2ffa407ed vhost: rename vhost_work_dev_flush
This patch renames vhost_work_dev_flush to just vhost_dev_flush to
relfect that it flushes everything on the device and that drivers
don't know/care that polls are based on vhost_works. Drivers just
flush the entire device and polls, and works for vhost-scsi
management TMFs and IO net virtqueues, etc all are flushed.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220517180850.198915-9-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:10 -04:00
Mike Christie f3a1aad9a4 vhost-test: drop flush after vhost_dev_cleanup
The flush after vhost_dev_cleanup is not needed because:

1. It doesn't do anything. vhost_dev_cleanup will stop the worker thread
so the flush call will just return since the worker has not device.

2. It's not needed. The comment about jobs re-queueing themselves does
not look correct because handle_vq does not requeue work.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220517180850.198915-8-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:10 -04:00
Mike Christie c3d284cf78 vhost-scsi: drop flush after vhost_dev_cleanup
The flush after vhost_dev_cleanup is not needed because:

1. It doesn't do anything. vhost_dev_cleanup will stop the worker thread
so the flush call will just return since the worker has not device.

2. It's not needed for the re-queue case. vhost_scsi_evt_handle_kick grabs
the mutex and if the backend is NULL will return without queueing a work.
vhost_scsi_clear_endpoint will set the backend to NULL under the vq->mutex
then drops the mutex and does a flush. So we know when
vhost_scsi_clear_endpoint has dropped the mutex after clearing the backend
no evt related work will be able to requeue. The flush would then make sure
any queued evts are run and return.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220517180850.198915-7-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:10 -04:00
Andrey Ryabinin 15538ba5ff vhost_vsock: simplify vhost_vsock_flush()
vhost_vsock_flush() calls vhost_work_dev_flush(vsock->vqs[i].poll.dev)
before vhost_work_dev_flush(&vsock->dev). This seems pointless
as vsock->vqs[i].poll.dev is the same as &vsock->dev and several flushes
in a row doesn't do anything useful, one is just enough.

Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220517180850.198915-6-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:10 -04:00
Andrey Ryabinin c5514758dd vhost_test: remove vhost_test_flush_vq()
vhost_test_flush_vq() just a simple wrapper around vhost_work_dev_flush()
which seems have no value. It's just easier to call vhost_work_dev_flush()
directly. Besides there is no point in obtaining vhost_dev pointer
via 'n->vqs[index].poll.dev' while we can just use &n->dev.
It's the same pointers, see vhost_test_open()/vhost_dev_init().

Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220517180850.198915-5-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-05-31 12:45:10 -04:00
Andrey Ryabinin 2c029f3298 vhost_net: get rid of vhost_net_flush_vq() and extra flush calls
vhost_net_flush_vq() calls vhost_work_dev_flush() twice passing
vhost_dev pointer obtained via 'n->poll[index].dev' and
'n->vqs[index].vq.poll.dev'. This is actually the same pointer,
initialized in vhost_net_open()/vhost_dev_init()/vhost_poll_init()

Remove vhost_net_flush_vq() and call vhost_work_dev_flush() directly.
Do the flushes only once instead of several flush calls in a row
which seems rather useless.

Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
[drop vhost_dev forward declaration in vhost.h]
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220517180850.198915-4-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:10 -04:00
Mike Christie 6ca84326c2 vhost: flush dev once during vhost_dev_stop
When vhost_work_dev_flush returns all work queued at that time will have
completed. There is then no need to flush after every vhost_poll_stop
call, and we can move the flush call to after the loop that stops the
pollers.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220517180850.198915-3-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:10 -04:00
Andrey Ryabinin 6fcf224c37 vhost: get rid of vhost_poll_flush() wrapper
vhost_poll_flush() is a simple wrapper around vhost_work_dev_flush().
It gives wrong impression that we are doing some work over vhost_poll,
while in fact it flushes vhost_poll->dev.
It only complicate understanding of the code and leads to mistakes
like flushing the same vhost_dev several times in a row.

Just remove vhost_poll_flush() and call vhost_work_dev_flush() directly.

Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
[merge vhost_poll_flush removal from Stefano Garzarella]
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220517180850.198915-2-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:10 -04:00
Dan Carpenter f4a8686ec7 vhost-vdpa: return -EFAULT on copy_to_user() failure
The copy_to_user() function returns the number of bytes remaining to be
copied.  However, we need to return a negative error code, -EFAULT, to
the user.

Fixes: 87f4c217413a ("vhost-vdpa: introduce uAPI to get the number of virtqueue groups")
Fixes: e96ef636f154 ("vhost-vdpa: introduce uAPI to get the number of address spaces")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <YotG1vXKXXSayr63@kili>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-05-31 12:45:10 -04:00
Dan Carpenter 1f97b97850 vdpasim: Off by one in vdpasim_set_group_asid()
The > comparison needs to be >= to prevent an out of bounds access
of the vdpasim->iommu[] array.  The vdpasim->iommu[] is allocated in
vdpasim_create() and it has vdpasim->dev_attr.nas elements.

Fixes: 87e5afeac247 ("vdpasim: control virtqueue support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <YotGQU1q224RKZR8@kili>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31 12:45:10 -04:00
keliu 4f58afd6eb virtio: Directly use ida_alloc()/free()
Use ida_alloc()/ida_free() instead of deprecated
ida_simple_get()/ida_simple_remove() .

Signed-off-by: keliu <liuke94@huawei.com>
Message-Id: <20220527073302.2474073-1-liuke94@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:10 -04:00
Jason Wang 619e9e14ba virtio: use WARN_ON() to warning illegal status value
We used to use BUG_ON() in virtio_device_ready() to detect illegal
status value, this seems sub-optimal since the value is under the
control of the device. Switch to use WARN_ON() instead.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Vineeth Vijayan <vneethv@linux.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220527060120.20964-10-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-05-31 12:45:10 -04:00
Jason Wang 8b4ec69d7e virtio: harden vring IRQ
This is a rework on the previous IRQ hardening that is done for
virtio-pci where several drawbacks were found and were reverted:

1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
   that is used by some device such as virtio-blk
2) done only for PCI transport

The vq->broken is re-used in this patch for implementing the IRQ
hardening. The vq->broken is set to true during both initialization
and reset. And the vq->broken is set to false in
virtio_device_ready(). Then vring_interrupt() can check and return
when vq->broken is true. And in this case, switch to return IRQ_NONE
to let the interrupt core aware of such invalid interrupt to prevent
IRQ storm.

The reason of using a per queue variable instead of a per device one
is that we may need it for per queue reset hardening in the future.

Note that the hardening is only done for vring interrupt since the
config interrupt hardening is already done in commit 22b7050a02
("virtio: defer config changed notifications"). But the method that is
used by config interrupt can't be reused by the vring interrupt
handler because it uses spinlock to do the synchronization which is
expensive.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Vineeth Vijayan <vneethv@linux.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220527060120.20964-9-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-05-31 12:45:10 -04:00
Jason Wang be83f04d25 virtio: allow to unbreak virtqueue
This patch allows the new introduced __virtio_break_device() to
unbreak the virtqueue.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Vineeth Vijayan <vneethv@linux.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220527060120.20964-8-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-05-31 12:45:10 -04:00
Jason Wang 3a232277c1 virtio-ccw: implement synchronize_cbs()
This patch tries to implement the synchronize_cbs() for ccw. For the
vring_interrupt() that is called via virtio_airq_handler(), the
synchronization is simply done via the airq_info's lock. For the
vring_interrupt() that is called via virtio_ccw_int_handler(), a per
device rwlock is introduced and used in the synchronization method.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Vineeth Vijayan <vneethv@linux.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220527060120.20964-7-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:09 -04:00
Jason Wang 9e9b289328 virtio-mmio: implement synchronize_cbs()
Simply synchronize the platform irq that is used by us.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Vineeth Vijayan <vneethv@linux.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220527060120.20964-6-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-05-31 12:45:09 -04:00
Jason Wang 48b3dd2438 virtio-pci: implement synchronize_cbs()
We can simply reuse vp_synchronize_vectors() for .synchronize_cbs().

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Vineeth Vijayan <vneethv@linux.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220527060120.20964-5-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-05-31 12:45:09 -04:00
Jason Wang 48b69959a8 virtio: introduce config op to synchronize vring callbacks
This patch introduces new virtio config op to vring
callbacks. Transport specific method is required to make sure the
write before this function is visible to the vring_interrupt() that is
called after the return of this function. For the transport that
doesn't provide synchronize_vqs(), use synchornize_rcu() which
synchronize with IRQ implicitly as a fallback.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Vineeth Vijayan <vneethv@linux.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220527060120.20964-4-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-05-31 12:45:09 -04:00
Jason Wang 0aa96837c3 virtio: use virtio_reset_device() when possible
This allows us to do common extension without duplicating code.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Vineeth Vijayan <vneethv@linux.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220527060120.20964-3-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2022-05-31 12:45:09 -04:00
Stefano Garzarella 2536b2ca15 virtio: use virtio_device_ready() in virtio_device_restore()
It will allow us to do extension on virtio_device_ready() without
duplicating code.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Vineeth Vijayan <vneethv@linux.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220527060120.20964-2-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-05-31 12:45:09 -04:00
Eugenio Pérez 2424369738 vdpasim: allow to enable a vq repeatedly
Code must be resilient to enable a queue many times.

At the moment the queue is resetting so it's definitely not the expected
behavior.

v2: set vq->ready = 0 at disable.

Fixes: 2c53d0f64c ("vdpasim: vDPA device simulator")
Cc: stable@vger.kernel.org
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20220519145919.772896-1-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-05-31 12:45:09 -04:00
Zhu Lingshan ac33f84ba5 vDPA/ifcvf: fix uninitialized config_vector warning
Static checkers are not informed that config_vector is controlled
by vf->msix_vector_status, which can only be
MSIX_VECTOR_SHARED_VQ_AND_CONFIG, MSIX_VECTOR_SHARED_VQ_AND_CONFIG
and MSIX_VECTOR_DEV_SHARED.

This commit uses an "if...elseif...else" code block to tell the
checkers that it is a complete set, and config_vector can be
initialized anyway

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <20220424072806.1083189-1-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31 12:45:09 -04:00
Cindy Lu ffbda8e9df vdpa/vp_vdpa : add vdpa tool support in vp_vdpa
this patch is to add the support for vdpa tool in vp_vdpa
here is the example steps

modprobe vp_vdpa
modprobe vhost_vdpa
echo 0000:00:06.0>/sys/bus/pci/drivers/virtio-pci/unbind
echo 1af4 1041 > /sys/bus/pci/drivers/vp-vdpa/new_id

vdpa dev add name vdpa1 mgmtdev pci/0000:00:06.0

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20220429091030.547434-1-lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31 12:45:09 -04:00
Solomon Tan 0619eda83d virtio: Replace long long int with long long
This patch addresses the checkpatch.pl warning that long long is
preferred over long long int.

Signed-off-by: Solomon Tan <solomonbstoner@protonmail.ch>
Message-Id: <YlzTUQa06sP94zxB@ArchDesktop>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:09 -04:00
Solomon Tan 3153234097 virtio: Replace unsigned with unsigned int
This patch addresses the checkpatch.pl warning where unsigned int is
preferred over unsigned.

Signed-off-by: Solomon Tan <solomonbstoner@protonmail.ch>
Message-Id: <YlzS49Wo8JMDhKOt@ArchDesktop>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:09 -04:00
lei he 4e0d352af0 virtio-crypto: enable retry for virtio-crypto-dev
Enable retry for virtio-crypto-dev, so that crypto-engine
can process cipher-requests parallelly.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: lei he <helei.sig11@bytedance.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20220506131627.180784-6-pizhenwei@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:09 -04:00
lei he a36bd0ad9f virtio-crypto: adjust dst_len at ops callback
For some akcipher operations(eg, decryption of pkcs1pad(rsa)),
the length of returned result maybe less than akcipher_req->dst_len,
we need to recalculate the actual dst_len through the virt-queue
protocol.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: lei he <helei.sig11@bytedance.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20220506131627.180784-5-pizhenwei@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:09 -04:00
zhenwei pi 977231e8d4 virtio-crypto: wait ctrl queue instead of busy polling
Originally, after submitting request into virtio crypto control
queue, the guest side polls the result from the virt queue. This
works like following:
    CPU0   CPU1               ...             CPUx  CPUy
     |      |                                  |     |
     \      \                                  /     /
      \--------spin_lock(&vcrypto->ctrl_lock)-------/
                           |
                 virtqueue add & kick
                           |
                  busy poll virtqueue
                           |
              spin_unlock(&vcrypto->ctrl_lock)
                          ...

There are two problems:
1, The queue depth is always 1, the performance of a virtio crypto
   device gets limited. Multi user processes share a single control
   queue, and hit spin lock race from control queue. Test on Intel
   Platinum 8260, a single worker gets ~35K/s create/close session
   operations, and 8 workers get ~40K/s operations with 800% CPU
   utilization.
2, The control request is supposed to get handled immediately, but
   in the current implementation of QEMU(v6.2), the vCPU thread kicks
   another thread to do this work, the latency also gets unstable.
   Tracking latency of virtio_crypto_alg_akcipher_close_session in 5s:
        usecs               : count     distribution
         0 -> 1          : 0        |                        |
         2 -> 3          : 7        |                        |
         4 -> 7          : 72       |                        |
         8 -> 15         : 186485   |************************|
        16 -> 31         : 687      |                        |
        32 -> 63         : 5        |                        |
        64 -> 127        : 3        |                        |
       128 -> 255        : 1        |                        |
       256 -> 511        : 0        |                        |
       512 -> 1023       : 0        |                        |
      1024 -> 2047       : 0        |                        |
      2048 -> 4095       : 0        |                        |
      4096 -> 8191       : 0        |                        |
      8192 -> 16383      : 2        |                        |
This means that a CPU may hold vcrypto->ctrl_lock as long as 8192~16383us.

To improve the performance of control queue, a request on control queue
waits completion instead of busy polling to reduce lock racing, and gets
completed by control queue callback.
    CPU0   CPU1               ...             CPUx  CPUy
     |      |                                  |     |
     \      \                                  /     /
      \--------spin_lock(&vcrypto->ctrl_lock)-------/
                           |
                 virtqueue add & kick
                           |
      ---------spin_unlock(&vcrypto->ctrl_lock)------
     /      /                                  \     \
     |      |                                  |     |
    wait   wait                               wait  wait

Test this patch, the guest side get ~200K/s operations with 300% CPU
utilization.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20220506131627.180784-4-pizhenwei@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:09 -04:00
zhenwei pi 0756ad15b1 virtio-crypto: use private buffer for control request
Originally, all of the control requests share a single buffer(
ctrl & input & ctrl_status fields in struct virtio_crypto), this
allows queue depth 1 only, the performance of control queue gets
limited by this design.

In this patch, each request allocates request buffer dynamically, and
free buffer after request, so the scope protected by ctrl_lock also
get optimized here.
It's possible to optimize control queue depth in the next step.

A necessary comment is already in code, still describe it again:
/*
 * Note: there are padding fields in request, clear them to zero before
 * sending to host to avoid to divulge any information.
 * Ex, virtio_crypto_ctrl_request::ctrl::u::destroy_session::padding[48]
 */
So use kzalloc to allocate buffer of struct virtio_crypto_ctrl_request.

Potentially dereferencing uninitialized variables:
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20220506131627.180784-3-pizhenwei@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:09 -04:00
zhenwei pi 6fd763d155 virtio-crypto: change code style
Use temporary variable to make code easy to read and maintain.
	/* Pad cipher's parameters */
        vcrypto->ctrl.u.sym_create_session.op_type =
                cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
        vcrypto->ctrl.u.sym_create_session.u.cipher.para.algo =
                vcrypto->ctrl.header.algo;
        vcrypto->ctrl.u.sym_create_session.u.cipher.para.keylen =
                cpu_to_le32(keylen);
        vcrypto->ctrl.u.sym_create_session.u.cipher.para.op =
                cpu_to_le32(op);
-->
	sym_create_session = &ctrl->u.sym_create_session;
	sym_create_session->op_type = cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
	sym_create_session->u.cipher.para.algo = ctrl->header.algo;
	sym_create_session->u.cipher.para.keylen = cpu_to_le32(keylen);
	sym_create_session->u.cipher.para.op = cpu_to_le32(op);

The new style shows more obviously:
- the variable we want to operate.
- an assignment statement in a single line.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20220506131627.180784-2-pizhenwei@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:08 -04:00
Murilo Opsfelder Araujo 7e415282b4 virtio-pci: Remove wrong address verification in vp_del_vqs()
GCC 12 enhanced -Waddress when comparing array address to null [0],
which warns:

    drivers/virtio/virtio_pci_common.c: In function ‘vp_del_vqs’:
    drivers/virtio/virtio_pci_common.c:257:29: warning: the comparison will always evaluate as ‘true’ for the pointer operand in ‘vp_dev->msix_affinity_masks + (sizetype)((long unsigned int)i * 256)’ must not be NULL [-Waddress]
      257 |                         if (vp_dev->msix_affinity_masks[i])
          |                             ^~~~~~

In fact, the verification is comparing the result of a pointer
arithmetic, the address "msix_affinity_masks + i", which will always
evaluate to true.

Under the hood, free_cpumask_var() calls kfree(), which is safe to pass
NULL, not requiring non-null verification.  So remove the verification
to make compiler happy (happy compiler, happy life).

[0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102103

Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Message-Id: <20220415023002.49805-1-muriloo@linux.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Christophe de Dinechin <dinechin@redhat.com>
2022-05-31 12:45:08 -04:00