OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Haiyang Zhang	715e2ec532	hv_netvsc: Clean up an unused parameter in rndis_filter_set_rss_param() This patch removes the parameter, num_queue in rndis_filter_set_rss_param(), which is no longer in use. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 20:39:12 -07:00
Haiyang Zhang	4823eb2f3a	hv_netvsc: Add ethtool handler to set and get UDP hash levels The patch add the functions to switch UDP hash level between L3 and L4 by ethtool command. UDP over IPv4 and v6 can be set differently. The default hash level is L4. We currently only allow switching TX hash level from within the guests. On Azure, fragmented UDP packets have high loss rate with L4 hashing. Using L3 hashing is recommended in this case. For example, for UDP over IPv4 on eth0: To include UDP port numbers in hasing: ethtool -N eth0 rx-flow-hash udp4 sdfn To exclude UDP port numbers in hasing: ethtool -N eth0 rx-flow-hash udp4 sd To show UDP hash level: ethtool -n eth0 rx-flow-hash udp4 Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-22 14:08:12 -07:00
stephen hemminger	cad5c19770	netvsc: keep track of some non-fatal overload conditions Add ethtool statistics for case where send chimmeny buffer is exhausted and driver has to fall back to doing scatter/gather send. Also, add statistic for case where ring buffer is full and receive completions are delayed. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-11 14:00:07 -07:00
stephen hemminger	8b5327975a	netvsc: allow controlling send/recv buffer size Control the size of the buffer areas via ethtool ring settings. They aren't really traditional hardware rings, but host API breaks receive and send buffer into chunks. The final size of the chunks are controlled by the host. The default value of send and receive buffer area for host DMA is much larger than it needs to be. Experimentation shows that 4M receive and 1M send is sufficient. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-11 14:00:06 -07:00
stephen hemminger	6123c66854	netvsc: delay setup of VF device When VF device is discovered, delay bring it automatically up in order to allow userspace to some simple changes (like renaming). Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-11 13:59:42 -07:00
David S. Miller	3118e6e19d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The UDP offload conflict is dealt with by simply taking what is in net-next where we have removed all of the UFO handling code entirely. The TCP conflict was a case of local variables in a function being removed from both net and net-next. In netvsc we had an assignment right next to where a missing set of u64 stats sync object inits were added. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-09 16:28:45 -07:00
stephen hemminger	7b83f52047	netvsc: make sure and unregister datapath Go back to switching datapath directly in the notifier callback. Otherwise datapath might not get switched on unregister. No need for calling the NOTIFY_PEERS notifier since that is only for a gratitious ARP/ND packet; but that is not required with Hyper-V because both VF and synthetic NIC have the same MAC address. Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Fixes: `0c195567a8` ("netvsc: transparent VF management") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-08 18:09:52 -07:00
stephen hemminger	732e49850c	netvsc: fix race on sub channel creation The existing sub channel code did not wait for all the sub-channels to completely initialize. This could lead to race causing crash in napi_netif_del() from bad list. The existing code would send an init message, then wait only for the initial response that the init message was received. It thought it was waiting for sub channels but really the init response did the wakeup. The new code keeps track of the number of open channels and waits until that many are open. Other issues here were: * host might return less sub-channels than was requested. * the new init status is not valid until after init was completed. Fixes: `b3e6b82a00` ("hv_netvsc: Wait for sub-channels to be processed during probe") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-06 21:23:21 -07:00
stephen hemminger	0c195567a8	netvsc: transparent VF management This patch implements transparent fail over from synthetic NIC to SR-IOV virtual function NIC in Hyper-V environment. It is a better alternative to using bonding as is done now. Instead, the receive and transmit fail over is done internally inside the driver. Using bonding driver has lots of issues because it depends on the script being run early enough in the boot process and with sufficient information to make the association. This patch moves all that functionality into the kernel. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 16:55:33 -07:00
stephen hemminger	7426b1a518	netvsc: optimize receive completions Optimize how receive completion ring are managed. * Allocate only as many slots as needed for all buffers from host * Allocate before setting up sub channel for better error detection * Don't need to keep copy of initial receive section message * Precompute the watermark for when receive flushing is needed * Replace division with conditional test * Replace atomic per-device variable with per-channel check. * Handle corner case where receive completion send fails if ring buffer to host is full. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-29 15:25:43 -07:00
stephen hemminger	02b6de01af	netvsc: remove unnecessary indirection of page_buffer The internal API was passing struct hv_page_buffer ** when only simple struct hv_page_buffer * was necessary for passing an array. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-29 15:25:43 -07:00
stephen hemminger	867047c451	netvsc: fix warnings reported by lockdep This includes a bunch of fixups for issues reported by lockdep. * ethtool routines can assume RTNL * send is done with RCU lock (and BH disable) * avoid refetching internal device struct (netvsc) instead pass it as a parameter. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-29 15:25:43 -07:00
stephen hemminger	658677f17c	netvsc: remove no longer used max_num_rss queues This value has been calculated in rndis_device_attach since 4.11. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-24 17:39:20 -07:00
stephen hemminger	3962981f48	netvsc: add rtnl annotations in rndis The rndis functions are used when changing device state. Therefore the references from network device to internal state are protected by RTNL mutex. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-19 22:20:05 -07:00
stephen hemminger	35fbbccfb4	netvsc: save pointer to parent netvsc_device in channel table Keep back pointer in the per-channel data structure to avoid any possible RCU related issues when napi poll is called but netvsc_device is in RCU limbo. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-19 22:20:05 -07:00
stephen hemminger	2a926f7912	netvsc: need rcu_derefence when accessing internal device info The netvsc_device structure should be accessed by rcu_dereference in the send path. Change arguments to netvsc_send() to make this easier to do correctly. Remove no longer needed hv_device_to_netvsc_device. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-19 22:20:05 -07:00
stephen hemminger	9749fed5d4	netvsc: use ERR_PTR to avoid dereference issues The rndis_filter_device_add function is called both in probe context and RTNL context,and creates the netvsc_device inner structure. It is easier to get the RTNL lock annotation correct if it returns the object directly, rather than implicitly by updating network device private data. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-19 22:20:05 -07:00
stephen hemminger	ea383bf146	netvsc: change logic for change mtu and set_queues Use device detach/attach to ensure that no packets are handed to device during state changes. Call rndis_filter_open/close directly as part of later VF related changes. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-19 22:20:05 -07:00
Haiyang Zhang	53fa1a6f33	hv_netvsc: Fix the carrier state error when data path is off When the VF NIC is opened, the synthetic NIC's carrier state is set to off. This tells the host to transitions data path to the VF device. But if startup script or user manipulates the admin state of the netvsc device directly for example: # ifconfig eth0 down # ifconfig eth0 up Then the carrier state of the synthetic NIC would be on, even though the data path was still over the VF NIC. This patch sets the carrier state of synthetic NIC with consideration of the related VF state. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-22 13:30:37 -04:00
Haiyang Zhang	dedb459e13	hv_netvsc: Remove unnecessary var link_state from struct netvsc_device_info We simply use rndis_device->link_state in the netdev_dbg. The variable, link_state from struct netvsc_device_info, is not used anywhere else. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-22 13:30:37 -04:00
David S. Miller	0ddead90b2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The conflicts were two cases of overlapping changes in batman-adv and the qed driver. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-15 11:59:32 -04:00
stephen hemminger	2d05b56097	netvsc: use typed pointer for internal state The element netvsc_device:extension is always a pointer to RNDIS information. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-09 12:15:02 -04:00
stephen hemminger	4f19c0d807	netvsc: move filter setting to rndis_device The work queue and handling of network filter parameters should be in rndis_device. This gets rid of warning from RCU checks, eliminates a race and cleans up code. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-08 11:45:48 -04:00
David S. Miller	b1513c3531	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 22:39:08 -04:00
stephen hemminger	fdfb70d275	netvsc: fix calculation of available send sections My change (introduced in 4.11) to use find_first_clear_bit incorrectly assumed that the size argument was words, not bits. The effect was only a small limited number of the available send sections were being actually used. This can cause performance loss with some workloads. Since map_words is now used only during initialization, it can be on stack instead of in per-device data. Fixes: `b58a185801` ("netvsc: simplify get next send section") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-25 11:56:59 -04:00
Haiyang Zhang	8db91f6a9b	hv_netvsc: Fix the queue index computation in forwarding case If the outgoing skb has a RX queue mapping available, we use the queue number directly, other than put it through Send Indirection Table. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:05:19 -04:00
stephen hemminger	43c7bd1ffc	netvsc: use refcount_t for keeping track of sub channels Rather than a lock and variable, use a refcount_t to keep track of the number of sub channels. Don't need to wait for subchannels on device removal since wait was already done in device_add. Also fix the error handling; don't wait forever in case of an error on request to create sub channels. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 19:38:56 -07:00
stephen hemminger	a0be450e19	netvsc: uses RCU instead of removal flag It is cleaner to use RCU protected pointer (nvdev_ctx->nvdev) to indicate device is in removed state, rather than having a separate boolean flag. By using the pointer the context can be checked by static checkers and dynamic lockdep. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 19:38:56 -07:00
stephen hemminger	545a8e79bd	netvsc: use RCU to protect inner device structure The netvsc driver has an internal structure (netvsc_device) which is created when device is opened and released when device is closed. And also opened/released when MTU or number of channels change. Since this is referenced in the receive and transmit path, it is safer to use RCU to protect/prevent use after free problems. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 19:38:56 -07:00
stephen hemminger	f4f1c23d6e	netvsc: fix NAPI performance regression When using NAPI, the single stream performance declined signifcantly because the poll routine was updating host after every burst of packets. This excess signalling caused host throttling. This fix restores the old behavior. Host is only signalled after the ring has been emptied. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 19:38:55 -07:00
stephen hemminger	76f5ed881c	netvsc: remove unused #define Not used anywhere. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 21:39:51 -07:00
David S. Miller	101c431492	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/broadcom/genet/bcmgenet.c net/core/sock.c Conflicts were overlapping changes in bcmgenet and the lockdep handling of sockets. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-15 11:59:10 -07:00
stephen hemminger	7ce1012466	netvsc: handle select_queue when device is being removed Move the send indirection table from the inner device (netvsc) to the network device context. It is possible that netvsc_device is not present (remove in progress). This solves potential use after free issues when packet is being created during MTU change, shutdown, or queue count changes. Fixes: `d8e18ee0fa` ("netvsc: enhance transmit select_queue") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-12 23:13:41 -07:00
stephen hemminger	15a863bf74	netvsc: implement NAPI Use NAPI (softirq), to handle receive packets and send completions. Previously this was handled by tasklet. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-06 17:13:13 -08:00
Simon Xiao	6c80f3fc23	netvsc: report per-channel stats in ethtool statistics Report packets and bytes transferred through a vmbus channel via ethtool. This supersedes need for per-cpu statistics. Example: $ ethtool -S eth0 NIC statistics: ... tx_queue_0_packets: 3523179 tx_queue_0_bytes: 505370920 rx_queue_0_packets: 41430490 rx_queue_0_bytes: 62714661254 tx_queue_1_packets: 0 tx_queue_1_bytes: 0 rx_queue_1_packets: 0 rx_queue_1_bytes: 0 ... Reviewed-by: Long Li <longli@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Simon Xiao <sixiao@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:29:01 -05:00
stephen hemminger	793e395555	netvsc: account for packets/bytes transmitted after completion Most drivers do not increment transmit statistics until after the transmit is completed. This will also be necessary for BQL support. Slight additional complexity because the netvsc driver aggregates multiple packets into one transmit. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:29:01 -05:00
stephen hemminger	46b4f7f5d1	netvsc: eliminate per-device outstanding send counter Since now keep track of per-queue outstanding sends, we can avoid one atomic update by removing no longer needed per-device atomic. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:29:01 -05:00
stephen hemminger	2289f0aa70	netvsc: simplify rndis_filter_remove All caller's already have pointer to netvsc_device so pass it. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:29:00 -05:00
stephen hemminger	2c7f83ca71	netvsc: don't pass void * to internal device_add All the caller's/callee's know that the format of the device_add parameter is a netvsc_device_info struct. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:29:00 -05:00
stephen hemminger	dc54a08cd3	netvsc: optimize receive path Do manual optimizations of receive path: - remove checks for impossible conditions (but keep checks for bad data from host) - pass argument down, rather than having callee recompute what is already known - remove indirection about receive buffer datalength - remove dependence on VLAN_TAG_PRESENCE - use _hot/_cold and likely/unlikely Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:29:00 -05:00
stephen hemminger	b8b835a89b	netvsc: group all per-channel state together Put all the per-channel state together in one data struct. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:28:59 -05:00
stephen hemminger	ff4a441990	netvsc: allow get/set of RSS indirection table Allow setting receive indirection table. Also uses the system standard for initialization. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:28:59 -05:00
stephen hemminger	2b01888d1b	netvsc: allow more flexible setting of number of channels This allows for number of channels to be managed in a manner similar to existing hardware drivers. It also removes the restriction of maximum 8 channels and allows as many as the host will allow. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:28:58 -05:00
stephen hemminger	962f3fee83	netvsc: add ethtool ops to get/set RSS key For some cases it is useful to be able to change RSS key value. For example, replacing RSS key with a symmetric hash. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:28:58 -05:00
stephen hemminger	23312a3be9	netvsc: negotiate checksum and segmentation parameters Redo how Hyper-V network driver negotiates offload features. Query the host to determine offload settings, and use the result. Also: * disable IPv4 header checksum offload (not used by Linux) * enable TSO only if host supports * enable UDP checksum offload if supported * don't advertise support for checksumming of non-IP protocols * adjust GSO maximum segment size * enable HIGHDMA Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:28:57 -05:00
stephen hemminger	0b307ebd68	netvsc: remove no longer needed receive staging buffers The ring buffer mapping now handles the wraparound case inside get_next_pkt_raw. Therefore it is not necessary to have an additional special receive staging buffer. See commit 1562edaed8c164ca5199 ("Drivers: hv: ring_buffer: count on wrap around mappings") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 16:28:57 -05:00
Jarod Wilson	d0c2c9973e	net: use core MTU range checking in virt drivers hyperv_net: - set min/max_mtu, per Haiyang, after rndis_filter_device_add virtio_net: - set min/max_mtu - remove virtnet_change_mtu vmxnet3: - set min/max_mtu xen-netback: - min_mtu = 0, max_mtu = 65517 xen-netfront: - min_mtu = 0, max_mtu = 65535 unisys/visor: - clean up defines a little to not clash with network core or add redundat definitions CC: netdev@vger.kernel.org CC: virtualization@lists.linux-foundation.org CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: "Michael S. Tsirkin" <mst@redhat.com> CC: Shrikrishna Khare <skhare@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Paul Durrant <paul.durrant@citrix.com> CC: David Kershner <david.kershner@unisys.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 14:51:09 -04:00
Stephen Hemminger	c6a77ff82f	hv_netvsc: fix comments Typo's and spelling errors. Also remove old comment from staging era. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-24 09:36:12 -04:00
Stephen Hemminger	f7ad75b753	hv_netvsc: count multicast packets received Useful for debugging issues with multicast and SR-IOV to keep track of number of received multicast packets. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:39:49 -04:00
Stephen Hemminger	9cbcc42806	hv_netvsc: remove VF in flight counters Since VF reference is now protected by RCU, no longer need the VF usage counter and can use device flags to see whether to inject or not. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:39:49 -04:00

1 2 3

137 Commits