Commit Graph

240 Commits

Author SHA1 Message Date
Haiyang Zhang 1cb9d3b618 hv_netvsc: Add support for XDP_REDIRECT
Handle XDP_REDIRECT action in netvsc driver.
Also, transparently pass ndo_xdp_xmit to VF when available.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/1649362894-20077-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-11 18:25:47 -07:00
Andrea Parri (Microsoft) 26894cd971 hv_netvsc: Print value of invalid ID in netvsc_send_{completion,tx_complete}()
That being useful for debugging purposes.

Notice that the packet descriptor is in "private" guest memory, so
that Hyper-V can not tamper with it.

While at it, remove two unnecessary u64-casts.

Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-07 21:04:11 -07:00
Saurabh Sengar 8cf5ab362d net: netvsc: remove break after return
In function netvsc_process_raw_pkt for VM_PKT_DATA_USING_XFER_PAGES
case there is already a 'return' statement which results 'break'
as dead code

Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/1646933534-29493-1-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-11 22:54:48 -08:00
Tianyu Lan b539324f6f Netvsc: Call hv_unmap_memory() in the netvsc_device_remove()
netvsc_device_remove() calls vunmap() inside which should not be
called in the interrupt context. Current code calls hv_unmap_memory()
in the free_netvsc_device() which is rcu callback and maybe called
in the interrupt context. This will trigger BUG_ON(in_interrupt())
in the vunmap(). Fix it via moving hv_unmap_memory() to netvsc_device_
remove().

Fixes: 846da38de0 ("net: netvsc: Add Isolation VM support for netvsc driver")
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09 11:54:05 +00:00
Linus Torvalds cb3f09f9af hyperv-next for 5.17
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmHhw7oTHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXrjSB/979LV4Dn1PMcFYsSdlFEMeHcjzJdw/
 kFnLPXMaPJyfg6QPuf83jxzw9uxw8fcePMdVq/FFBtmVV9fJMAv62B8jaGS1p58c
 WnAg+7zsTN+xEoJn+tskSSon8BNMWVrl41zP3K4Ged+5j8UEBk62GB8Orz1qkpwL
 fTh3/+xAvczJeD4zZb1dAm4WnmcQJ4vhg45p07jX6owvnwQAikMFl45aSW54I5o8
 vAxGzFgdsZ2NtExnRNKh3b3DozA8JUE89KckBSZnDtq4rH8Fyy6Wij56Hc6v6Cml
 SUohiNbHX7hsNwit/lxL8wuF97IiA0pQSABobEg3rxfTghTUep51LlaN
 =/m4A
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed-20220114' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:

 - More patches for Hyper-V isolation VM support (Tianyu Lan)

 - Bug fixes and clean-up patches from various people

* tag 'hyperv-next-signed-20220114' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  scsi: storvsc: Fix storvsc_queuecommand() memory leak
  x86/hyperv: Properly deal with empty cpumasks in hyperv_flush_tlb_multi()
  Drivers: hv: vmbus: Initialize request offers message for Isolation VM
  scsi: storvsc: Fix unsigned comparison to zero
  swiotlb: Add CONFIG_HAS_IOMEM check around swiotlb_mem_remap()
  x86/hyperv: Fix definition of hv_ghcb_pg variable
  Drivers: hv: Fix definition of hypercall input & output arg variables
  net: netvsc: Add Isolation VM support for netvsc driver
  scsi: storvsc: Add Isolation VM support for storvsc driver
  hyper-v: Enable swiotlb bounce buffer for Isolation VM
  x86/hyper-v: Add hyperv Isolation VM check in the cc_platform_has()
  swiotlb: Add swiotlb bounce buffer remap function for HV IVM
2022-01-16 15:53:00 +02:00
Tianyu Lan 846da38de0 net: netvsc: Add Isolation VM support for netvsc driver
In Isolation VM, all shared memory with host needs to mark visible
to host via hvcall. vmbus_establish_gpadl() has already done it for
netvsc rx/tx ring buffer. The page buffer used by vmbus_sendpacket_
pagebuffer() stills need to be handled. Use DMA API to map/umap
these memory during sending/receiving packet and Hyper-V swiotlb
bounce buffer dma address will be returned. The swiotlb bounce buffer
has been masked to be visible to host during boot up.

rx/tx ring buffer is allocated via vzalloc() and they need to be
mapped into unencrypted address space(above vTOM) before sharing
with host and accessing. Add hv_map/unmap_memory() to map/umap rx
/tx ring buffer.

Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20211213071407.314309-6-ltykernel@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-12-20 18:01:09 +00:00
Christophe JAILLET e9268a9439 hv_netvsc: Use bitmap_zalloc() when applicable
'send_section_map' is a bitmap. So use 'bitmap_zalloc()' to simplify code,
improve the semantic and avoid some open-coded arithmetic in allocator
arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

While at it, change an '== NULL' test into a '!'.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 14:32:54 +00:00
Tianyu Lan d4dccf353d Drivers: hv: vmbus: Mark vmbus ring buffer visible to host in Isolation VM
Mark vmbus ring buffer visible with set_memory_decrypted() when
establish gpadl handle.

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Link: https://lore.kernel.org/r/20211025122116.264793-5-ltykernel@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-10-28 11:22:23 +00:00
Andrea Parri (Microsoft) bf5fd8cae3 scsi: storvsc: Use blk_mq_unique_tag() to generate requestIDs
Use blk_mq_unique_tag() to generate requestIDs for StorVSC, avoiding
all issues with allocating enough entries in the VMbus requestor.

Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20210510210841.370472-1-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-05-14 17:39:32 +00:00
Andres Beltran adae1e931a Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer
Pointers to ring-buffer packets sent by Hyper-V are used within the
guest VM. Hyper-V can send packets with erroneous values or modify
packet fields after they are processed by the guest. To defend
against these scenarios, return a copy of the incoming VMBus packet
after validating its length and offset fields in hv_pkt_iter_first().
In this way, the packet can no longer be modified by the host.

Signed-off-by: Andres Beltran <lkmlabelt@gmail.com>
Co-developed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210408161439.341988-1-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-05-14 17:37:46 +00:00
Haiyang Zhang d0922bf798 hv_netvsc: Add error handling while switching data path
Add error handling in case of failure to send switching data path message
to the host.

Reported-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29 16:35:59 -07:00
Shachar Raindel bd49fea758 hv_netvsc: Add a comment clarifying batching logic
The batching logic in netvsc_send is non-trivial, due to
a combination of the Linux API and the underlying hypervisor
interface. Add a comment explaining why the code is written this
way.

Signed-off-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-14 14:32:37 -07:00
Linus Torvalds 9c5b80b795 hyperv-next for 5.12
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmArly8THHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXkRfCADB0PA4xlfVF0Na/iZoBFdNFr3EMU4K
 NddGJYyk0o+gipUIj2xu7TksVw8c1/cWilXOUBe7oZRKw2/fC/0hpDwvLpPtD/wP
 +Tc2DcIgwquMvsSksyqpMOb0YjNNhWCx9A9xPWawpUdg20IfbK/ekRHlFI5MsEww
 7tFS+MHY4QbsPv0WggoK61PGnhGCBt/85Lv4I08ZGohA6uirwC4fNIKp83SgFNtf
 1hHbvpapAFEXwZiKFbzwpue20jWJg+tlTiEFpen3exjBICoagrLLaz3F0SZJvbxl
 2YY32zbBsQe4Izre5PVuOlMoNFRom9NSzEzdZT10g7HNtVrwKVNLcohS
 =MyO4
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed-20210216' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull Hyper-V updates from Wei Liu:

 - VMBus hardening patches from Andrea Parri and Andres Beltran.

 - Patches to make Linux boot as the root partition on Microsoft
   Hypervisor from Wei Liu.

 - One patch to add a new sysfs interface to support hibernation on
   Hyper-V from Dexuan Cui.

 - Two miscellaneous clean-up patches from Colin and Gustavo.

* tag 'hyperv-next-signed-20210216' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (31 commits)
  Revert "Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer"
  iommu/hyperv: setup an IO-APIC IRQ remapping domain for root partition
  x86/hyperv: implement an MSI domain for root partition
  asm-generic/hyperv: import data structures for mapping device interrupts
  asm-generic/hyperv: introduce hv_device_id and auxiliary structures
  asm-generic/hyperv: update hv_interrupt_entry
  asm-generic/hyperv: update hv_msi_entry
  x86/hyperv: implement and use hv_smp_prepare_cpus
  x86/hyperv: provide a bunch of helper functions
  ACPI / NUMA: add a stub function for node_to_pxm()
  x86/hyperv: handling hypercall page setup for root
  x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary
  x86/hyperv: allocate output arg pages if required
  clocksource/hyperv: use MSR-based access if running as root
  Drivers: hv: vmbus: skip VMBus initialization if Linux is root
  x86/hyperv: detect if Linux is the root partition
  asm-generic/hyperv: change HV_CPU_POWER_MANAGEMENT to HV_CPU_MANAGEMENT
  hv: hyperv.h: Replace one-element array with flexible-array in struct icmsg_negotiate
  hv_netvsc: Restrict configurations on isolated guests
  Drivers: hv: vmbus: Enforce 'VMBus version >= 5.2' on isolated guests
  ...
2021-02-21 13:24:39 -08:00
Wei Liu 3019270282 Revert "Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer"
This reverts commit a8c3209998.

It is reported that the said commit caused regression in netvsc.

Reported-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-02-15 10:49:11 +00:00
Andrea Parri (Microsoft) 96854bbda2 hv_netvsc: Restrict configurations on isolated guests
Restrict the NVSP protocol version(s) that will be negotiated with the
host to be NVSP_PROTOCOL_VERSION_61 or greater if the guest is running
isolated.  Moreover, do not advertise the SR-IOV capability and ignore
NVSP_MSG_4_TYPE_SEND_VF_ASSOCIATION messages in isolated guests, which
are not supposed to support SR-IOV.  This reduces the footprint of the
code that will be exercised by Confidential VMs and hence the exposure
to bugs and vulnerabilities.

Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210201144814.2701-5-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-02-11 08:47:05 +00:00
David S. Miller dc9d87581d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-02-10 13:30:12 -08:00
Andres Beltran a8c3209998 Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer
Pointers to ring-buffer packets sent by Hyper-V are used within the
guest VM. Hyper-V can send packets with erroneous values or modify
packet fields after they are processed by the guest. To defend
against these scenarios, return a copy of the incoming VMBus packet
after validating its length and offset fields in hv_pkt_iter_first().
In this way, the packet can no longer be modified by the host.

Signed-off-by: Andres Beltran <lkmlabelt@gmail.com>
Co-developed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: netdev@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201208045311.10244-1-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-02-05 09:55:42 +00:00
Andrea Parri (Microsoft) 0102eeedb7 hv_netvsc: Allocate the recv_buf buffers after NVSP_MSG1_TYPE_SEND_RECV_BUF
The recv_buf buffers are allocated in netvsc_device_add().  Later in
netvsc_init_buf() the response to NVSP_MSG1_TYPE_SEND_RECV_BUF allows
the host to set up a recv_section_size that could be bigger than the
(default) value used for that allocation.  The host-controlled value
could be used by a malicious host to bypass the check on the packet's
length in netvsc_receive() and hence to overflow the recv_buf buffer.

Move the allocation of the recv_buf buffers into netvsc_init_but().

Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Fixes: 0ba35fe91c ("hv_netvsc: Copy packets sent by Hyper-V out of the receive buffer")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-04 20:37:04 -08:00
Andrea Parri (Microsoft) 12bc8dfb83 hv_netvsc: Reset the RSC count if NVSP_STAT_FAIL in netvsc_receive()
Commit 4414418595 ("hv_netvsc: Add validation for untrusted Hyper-V
values") added validation to rndis_filter_receive_data() (and
rndis_filter_receive()) which introduced NVSP_STAT_FAIL-scenarios where
the count is not updated/reset.  Fix this omission, and prevent similar
scenarios from occurring in the future.

Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Fixes: 4414418595 ("hv_netvsc: Add validation for untrusted Hyper-V values")
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20210203113602.558916-1-parri.andrea@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-04 18:59:17 -08:00
Andrea Parri (Microsoft) 0ba35fe91c hv_netvsc: Copy packets sent by Hyper-V out of the receive buffer
Pointers to receive-buffer packets sent by Hyper-V are used within the
guest VM.  Hyper-V can send packets with erroneous values or modify
packet fields after they are processed by the guest.  To defend against
these scenarios, copy (sections of) the incoming packet after validating
their length and offset fields in netvsc_filter_receive().  In this way,
the packet can no longer be modified by the host.

Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20210126162907.21056-1-parri.andrea@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-29 16:44:07 -08:00
Andrea Parri (Microsoft) 505e3f00c3 hv_netvsc: Add (more) validation for untrusted Hyper-V values
For additional robustness in the face of Hyper-V errors or malicious
behavior, validate all values that originate from packets that Hyper-V
has sent to the guest.  Ensure that invalid values cannot cause indexing
off the end of an array, or subvert an existing validation via integer
overflow.  Ensure that outgoing packets do not have any leftover guest
memory that has not been zeroed out.

Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20210114202628.119541-1-parri.andrea@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-18 19:47:47 -08:00
Long Li 8b31f8c982 hv_netvsc: Wait for completion on request SWITCH_DATA_PATH
The completion indicates if NVSP_MSG4_TYPE_SWITCH_DATA_PATH has been
processed by the VSP. The traffic is steered to VF or synthetic after we
receive this completion.

Signed-off-by: Long Li <longli@microsoft.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-12 19:58:36 -08:00
Linus Torvalds 571b12dd1a hyperv-next for 5.11
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAl/XZvgTHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXrvTB/90iEm2NKQFNcrrVAIbo/tz4e214i7E
 aOhZlz/JZvLB05BB82FvlNTRzvgx2ilimdsHGA9PGsZLPQ2LPfPyp2/ivTq/h77U
 W0/ZZ+AmpNhZFm9D95t64RqwsieAIXloEo/oCH7JuRDhu9BMp9tAO1sq42SqtkN4
 e0Dkj1oQK7Ql+lA343/hrPP36jws/okrcvRuOJoCux97HWxE4GhJyjS3aZDPVCa4
 /0zWjte2UmDin94+Ql/BfZHN5Uo/pdZ+08iGkXNBibeny1qNwbUCAYRK51S8MQwO
 IvxGR+JGGaY9R/ahc7Fbv4UQWM8w3KAlOdA/Cc5eHNFgowNDErRrPTKQ
 =pgm9
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed-20201214' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull Hyper-V updates from Wei Liu:

 - harden VMBus (Andres Beltran)

 - clean up VMBus driver (Matheus Castello)

 - fix hv_balloon reporting (Vitaly Kuznetsov)

 - fix a potential OOB issue (Andrea Parri)

 - remove an obsolete TODO item (Stefan Eschenbacher)

* tag 'hyperv-next-signed-20201214' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  hv_balloon: do adjust_managed_page_count() when ballooning/un-ballooning
  hv_balloon: simplify math in alloc_balloon_pages()
  drivers/hv: remove obsolete TODO and fix misleading typo in comment
  drivers: hv: vmbus: Fix checkpatch SPLIT_STRING
  hv_netvsc: Validate number of allocated sub-channels
  drivers: hv: vmbus: Fix call msleep using < 20ms
  drivers: hv: vmbus: Fix checkpatch LINE_SPACING
  drivers: hv: vmbus: Replace symbolic permissions by octal permissions
  drivers: hv: Fix hyperv_record_panic_msg path on comment
  hv_netvsc: Use vmbus_requestor to generate transaction IDs for VMBus hardening
  scsi: storvsc: Use vmbus_requestor to generate transaction IDs for VMBus hardening
  Drivers: hv: vmbus: Add vmbus_requestor data structure for VMBus hardening
2020-12-16 11:49:46 -08:00
Björn Töpel b02e5a0ebb xsk: Propagate napi_id to XDP socket Rx path
Add napi_id to the xdp_rxq_info structure, and make sure the XDP
socket pick up the napi_id in the Rx path. The napi_id is used to find
the corresponding NAPI structure for socket busy polling.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/bpf/20201130185205.196029-7-bjorn.topel@gmail.com
2020-12-01 00:09:25 +01:00
Andres Beltran 4d18fcc95f hv_netvsc: Use vmbus_requestor to generate transaction IDs for VMBus hardening
Currently, pointers to guest memory are passed to Hyper-V as
transaction IDs in netvsc. In the face of errors or malicious
behavior in Hyper-V, netvsc should not expose or trust the transaction
IDs returned by Hyper-V to be valid guest memory addresses. Instead,
use small integers generated by vmbus_requestor as requests
(transaction) IDs.

Signed-off-by: Andres Beltran <lkmlabelt@gmail.com>
Co-developed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20201109100402.8946-4-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2020-11-17 10:54:18 +00:00
Linus Torvalds 4907a43da8 hyperv-next for 5.10
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAl+FqrsTHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXnN8B/4sRg7j9OTzVBlDiXF2vj6vbuplTIH6
 JR6S5f4PNjUg4gV6ghzSnsx1zqNhPSOr78zDqYto8vv+wqqj3thmld8+gAnSbKtt
 yoAa7mhbbN1ryJiwPlZzvX4ApzGZPC7byqEi3+zPIcag6TEl8eyYJOmvY3x1zv8x
 CsAb57oCC4erD0n4xlTyfuc8TLpO+EiU53PXbR9AovKQHe4m2/8LWyEbmrm5cRUR
 gx8RxoLkkrqK0unzcmanbm47QodiaOTUpycs3IvaBeWZQsqSgFZdI1RAdTZNg+U+
 GT8eMRXAwpgDpilPm/0n1O0PKGAsVh9Lbw8Btb/ggqnjTUlA4Z3Df23E
 =Wy5n
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull Hyper-V updates from Wei Liu:

 - a series from Boqun Feng to support page size larger than 4K

 - a few miscellaneous clean-ups

* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  hv: clocksource: Add notrace attribute to read_hv_sched_clock_*() functions
  x86/hyperv: Remove aliases with X64 in their name
  PCI: hv: Document missing hv_pci_protocol_negotiation() parameter
  scsi: storvsc: Support PAGE_SIZE larger than 4K
  Driver: hv: util: Use VMBUS_RING_SIZE() for ringbuffer sizes
  HID: hyperv: Use VMBUS_RING_SIZE() for ringbuffer sizes
  Input: hyperv-keyboard: Use VMBUS_RING_SIZE() for ringbuffer sizes
  hv_netvsc: Use HV_HYP_PAGE_SIZE for Hyper-V communication
  hv: hyperv.h: Introduce some hvpfn helper functions
  Drivers: hv: vmbus: Move virt_to_hvpfn() to hyperv header
  Drivers: hv: Use HV_HYP_PAGE in hv_synic_enable_regs()
  Drivers: hv: vmbus: Introduce types of GPADL
  Drivers: hv: vmbus: Move __vmbus_open()
  Drivers: hv: vmbus: Always use HV_HYP_PAGE_SIZE for gpadl
  drivers: hv: remove cast from hyperv_die_event
2020-10-14 10:32:10 -07:00
Boqun Feng 11d8620e08 hv_netvsc: Use HV_HYP_PAGE_SIZE for Hyper-V communication
When communicating with Hyper-V, HV_HYP_PAGE_SIZE should be used since
that's the page size used by Hyper-V and Hyper-V expects all
page-related data using the unit of HY_HYP_PAGE_SIZE, for example, the
"pfn" in hv_page_buffer is actually the HV_HYP_PAGE (i.e. the Hyper-V
page) number.

In order to support guest whose page size is not 4k, we need to make
hv_netvsc always use HV_HYP_PAGE_SIZE for Hyper-V communication.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200916034817.30282-8-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2020-09-28 08:55:13 +00:00
Andres Beltran 4414418595 hv_netvsc: Add validation for untrusted Hyper-V values
For additional robustness in the face of Hyper-V errors or malicious
behavior, validate all values that originate from packets that Hyper-V
has sent to the guest in the host-to-guest ring buffer. Ensure that
invalid values cannot cause indexing off the end of an array, or
subvert an existing validation via integer overflow. Ensure that
outgoing packets do not have any leftover guest memory that has not
been zeroed out.

Signed-off-by: Andres Beltran <lkmlabelt@gmail.com>
Co-developed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-17 16:21:26 -07:00
Andrea Parri (Microsoft) ac50476717 hv_netvsc: Disable NAPI before closing the VMBus channel
vmbus_chan_sched() might call the netvsc driver callback function that
ends up scheduling NAPI work.  This "work" can access the channel ring
buffer, so we must ensure that any such work is completed and that the
ring buffer is no longer being accessed before freeing the ring buffer
data structure in the channel closure path.  To this end, disable NAPI
before calling vmbus_close() in netvsc_device_remove().

Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: <netdev@vger.kernel.org>
Link: https://lore.kernel.org/r/20200406001514.19876-5-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2020-04-23 13:17:11 +00:00
Haiyang Zhang f87238d30c hv_netvsc: Remove unnecessary round_up for recv_completion_cnt
The vzalloc_node(), already rounds the total size to whole pages, and
sizeof(u64) is smaller than sizeof(struct recv_comp_data). So
round_up of recv_completion_cnt is not necessary, and may cause extra
memory allocation.

To save memory, remove this unnecessary round_up for recv_completion_cnt.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 19:43:42 -07:00
Haiyang Zhang f6f13c125e hv_netvsc: Fix unwanted wakeup in netvsc_attach()
When netvsc_attach() is called by operations like changing MTU, etc.,
an extra wakeup may happen while netvsc_attach() calling
rndis_filter_device_add() which sends rndis messages when queue is
stopped in netvsc_detach(). The completion message will wake up queue 0.

We can reproduce the issue by changing MTU etc., then the wake_queue
counter from "ethtool -S" will increase beyond stop_queue counter:
     stop_queue: 0
     wake_queue: 1
The issue causes queue wake up, and counter increment, no other ill
effects in current code. So we didn't see any network problem for now.

To fix this, initialize tx_disable to true, and set it to false when
the NIC is ready to be attached or registered.

Fixes: 7b2ee50c0c ("hv_netvsc: common detach logic")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-23 16:32:37 -08:00
Haiyang Zhang 351e158139 hv_netvsc: Add XDP support
This patch adds support of XDP in native mode for hv_netvsc driver, and
transparently sets the XDP program on the associated VF NIC as well.

Setting / unsetting XDP program on synthetic NIC (netvsc) propagates to
VF NIC automatically. Setting / unsetting XDP program on VF NIC directly
is not recommended, also not propagated to synthetic NIC, and may be
overwritten by setting of synthetic NIC.

The Azure/Hyper-V synthetic NIC receive buffer doesn't provide headroom
for XDP. We thought about re-use the RNDIS header space, but it's too
small. So we decided to copy the packets to a page buffer for XDP. And,
most of our VMs on Azure have Accelerated  Network (SRIOV) enabled, so
most of the packets run on VF NIC. The synthetic NIC is considered as a
fallback data-path. So the data copy on netvsc won't impact performance
significantly.

XDP program cannot run with LRO (RSC) enabled, so you need to disable LRO
before running XDP:
        ethtool -K eth0 lro off

XDP actions not yet supported:
        XDP_REDIRECT

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:43:19 +01:00
Haiyang Zhang 171c1fd98d hv_netvsc: Fix send_table offset in case of a host bug
If negotiated NVSP version <= NVSP_PROTOCOL_VERSION_6, the offset may
be wrong (too small) due to a host bug. This can cause missing the
end of the send indirection table, and add multiple zero entries from
leading zeros before the data region. This bug adds extra burden on
channel 0.

So fix the offset by computing it from the data structure sizes. This
will ensure netvsc driver runs normally on unfixed hosts, and future
fixed hosts.

Fixes: 5b54dac856 ("hyperv: Add support for virtual Receive Side Scaling (vRSS)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 19:32:23 -08:00
Haiyang Zhang 71f21959dd hv_netvsc: Fix offset usage in netvsc_send_table()
To reach the data region, the existing code adds offset in struct
nvsp_5_send_indirect_table on the beginning of this struct. But the
offset should be based on the beginning of its container,
struct nvsp_message. This bug causes the first table entry missing,
and adds an extra zero from the zero pad after the data region.
This can put extra burden on the channel 0.

So, correct the offset usage. Also add a boundary check to ensure
not reading beyond data region.

Fixes: 5b54dac856 ("hyperv: Add support for virtual Receive Side Scaling (vRSS)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 19:32:23 -08:00
Thomas Gleixner 9952f6918d treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not see http www gnu org
  licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 228 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190528171438.107155473@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:29:52 -07:00
David S. Miller a9e41a5296 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor conflict with the DSA legacy code removal.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-07 17:22:09 -07:00
Haiyang Zhang 93aa4792c3 hv_netvsc: fix race that may miss tx queue wakeup
When the ring buffer is almost full due to RX completion messages, a
TX packet may reach the "low watermark" and cause the queue stopped.
If the TX completion arrives earlier than queue stopping, the wakeup
may be missed.

This patch moves the check for the last pending packet to cover both
EAGAIN and success cases, so the queue will be reliably waked up when
necessary.

Reported-and-tested-by: Stephan Klein <stephan.klein@wegfinder.at>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-03 23:50:25 -04:00
David S. Miller f83f715195 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor comment merge conflict in mlx5.

Staging driver has a fixup due to the skb->xmit_more changes
in 'net-next', but was removed in 'net'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-05 14:14:19 -07:00
Florian Westphal 6b16f9ee89 net: move skb->xmit_more hint to softnet data
There are two reasons for this.

First, the xmit_more flag conceptually doesn't fit into the skb, as
xmit_more is not a property related to the skb.
Its only a hint to the driver that the stack is about to transmit another
packet immediately.

Second, it was only done this way to not have to pass another argument
to ndo_start_xmit().

We can place xmit_more in the softnet data, next to the device recursion.
The recursion counter is already written to on each transmit. The "more"
indicator is placed right next to it.

Drivers can use the netdev_xmit_more() helper instead of skb->xmit_more
to check the "more packets coming" hint.

skb->xmit_more is retained (but always 0) to not cause build breakage.

This change takes care of the simple s/skb->xmit_more/netdev_xmit_more()/
conversions.  Remaining drivers are converted in the next patches.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-01 18:35:02 -07:00
Haiyang Zhang 1b704c4a1b hv_netvsc: Fix unwanted wakeup after tx_disable
After queue stopped, the wakeup mechanism may wake it up again
when ring buffer usage is lower than a threshold. This may cause
send path panic on NULL pointer when we stopped all tx queues in
netvsc_detach and start removing the netvsc device.

This patch fix it by adding a tx_disable flag to prevent unwanted
queue wakeup.

Fixes: 7b2ee50c0c ("hv_netvsc: common detach logic")
Reported-by: Mohammed Gamal <mgamal@redhat.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-29 13:34:01 -07:00
Adrian Vladu 52d3b49491 hv_netvsc: fix typos in code comments
Fix all typos from hyperv netvsc code comments.

Signed-off-by: Adrian Vladu <avladu@cloudbasesolutions.com>

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Alessandro Pilotti" <apilotti@cloudbasesolutions.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-23 13:21:34 -05:00
Haiyang Zhang 17d9125689 hv_netvsc: Fix hash key value reset after other ops
Changing mtu, channels, or buffer sizes ops call to netvsc_attach(),
rndis_set_subchannel(), which always reset the hash key to default
value. That will override hash key changed previously. This patch
fixes the problem by save the hash key, then restore it when we re-
add the netvsc device.

Fixes: ff4a441990 ("netvsc: allow get/set of RSS indirection table")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
[sl: fix up subject line]
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-23 13:21:28 -05:00
Haiyang Zhang c8e4eff467 hv_netvsc: Add support for LRO/RSC in the vSwitch
LRO/RSC in the vSwitch is a feature available in Windows Server 2019
hosts and later. It reduces the per packet processing overhead by
coalescing multiple TCP segments when possible. This patch adds netvsc
driver support for this feature.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-22 17:23:15 -07:00
Stephen Hemminger 00d7ddba11 hv_netvsc: pair VF based on serial number
Matching network device based on MAC address is problematic
since a non VF network device can be creted with a duplicate MAC
address causing confusion and problems.  The VMBus API does provide
a serial number that is a better matching method.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-17 07:59:41 -07:00
Haiyang Zhang 6b81b193b8 hv_netvsc: Fix napi reschedule while receive completion is busy
If out ring is full temporarily and receive completion cannot go out,
we may still need to reschedule napi if certain conditions are met.
Otherwise the napi poll might be stopped forever, and cause network
disconnect.

Fixes: 7426b1a518 ("netvsc: optimize receive completions")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:23:30 -07:00
Stephen Hemminger 3ffe64f1a6 hv_netvsc: split sub-channel setup into async and sync
When doing device hotplug the sub channel must be async to avoid
deadlock issues because device is discovered in softirq context.

When doing changes to MTU and number of channels, the setup
must be synchronous to avoid races such as when MTU and device
settings are done in a single ip command.

Reported-by: Thomas Walker <Thomas.Walker@twosigma.com>
Fixes: 8195b1396e ("hv_netvsc: fix deadlock on hotplug")
Fixes: 732e49850c ("netvsc: fix race on sub channel creation")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-30 21:28:36 +09:00
Linus Torvalds 5f85942c2e SCSI misc on 20180610
This is mostly updates to the usual drivers: ufs, qedf, mpt3sas, lpfc,
 xfcp, hisi_sas, cxlflash, qla2xxx.  In the absence of Nic, we're also
 taking target updates which are mostly minor except for the tcmu
 refactor. The only real core change to worry about is the removal of
 high page bouncing (in sas, storvsc and iscsi).  This has been well
 tested and no problems have shown up so far.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWx1pbCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishUucAP42pccS
 ziKyiOizuxv9fZ4Q+nXd1A9zhI5tqqpkHjcQegEA40qiZSi3EKGKR8W0UpX7Ntmo
 tqrZJGojx9lnrAM2RbQ=
 =NMXg
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates to the usual drivers: ufs, qedf, mpt3sas, lpfc,
  xfcp, hisi_sas, cxlflash, qla2xxx.

  In the absence of Nic, we're also taking target updates which are
  mostly minor except for the tcmu refactor.

  The only real core change to worry about is the removal of high page
  bouncing (in sas, storvsc and iscsi). This has been well tested and no
  problems have shown up so far"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (268 commits)
  scsi: lpfc: update driver version to 12.0.0.4
  scsi: lpfc: Fix port initialization failure.
  scsi: lpfc: Fix 16gb hbas failing cq create.
  scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc
  scsi: lpfc: correct oversubscription of nvme io requests for an adapter
  scsi: lpfc: Fix MDS diagnostics failure (Rx < Tx)
  scsi: hisi_sas: Mark PHY as in reset for nexus reset
  scsi: hisi_sas: Fix return value when get_free_slot() failed
  scsi: hisi_sas: Terminate STP reject quickly for v2 hw
  scsi: hisi_sas: Add v2 hw force PHY function for internal ATA command
  scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot
  scsi: hisi_sas: Try wait commands before before controller reset
  scsi: hisi_sas: Init disks after controller reset
  scsi: hisi_sas: Create a scsi_host_template per HW module
  scsi: hisi_sas: Reset disks when discovered
  scsi: hisi_sas: Add LED feature for v3 hw
  scsi: hisi_sas: Change common allocation mode of device id
  scsi: hisi_sas: change slot index allocation mode
  scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate()
  scsi: hisi_sas: fix a typo in hisi_sas_task_prep()
  ...
2018-06-10 13:01:12 -07:00
Stephen Hemminger c347b9273c hv_netvsc: simplify receive side calling arguments
The calls up from the napi poll reading the receive ring had many
places where an argument was being recreated. I.e the caller already
had the value and wasn't passing it, then the callee would use
known relationship to determine the same value. Simpler and faster
to just pass arguments needed.

Also, add const in a couple places where message is being only read.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 14:46:26 -04:00
Haiyang Zhang 0dcec221dd hv_netvsc: Add NetVSP v6 and v6.1 into version negotiation
This patch adds the NetVSP v6 and 6.1 message structures, and includes
these versions into NetVSC/NetVSP version negotiation process.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18 21:20:44 -04:00
Long Li 6b1f8376dc scsi: netvsc: Use the vmbus function to calculate ring buffer percentage
In Vmbus, we have defined a function to calculate available ring buffer
percentage to write.

Use that function and remove netvsc's private version.

[mkp: typo]

Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:52 -04:00