linux-sg2042

Commit Graph

Author	SHA1	Message	Date
David S. Miller	e1ea2f9856	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Several conflicts here. NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to nfp_fl_output() needed some adjustments because the code block is in an else block now. Parallel additions to net/pkt_cls.h and net/sch_generic.h A bug fix in __tcp_retransmit_skb() conflicted with some of the rbtree changes in net-next. The tc action RCU callback fixes in 'net' had some overlap with some of the recent tcf_block reworking. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-30 21:09:24 +09:00
Mahesh Bandewar	fe89aa6b25	ipvlan: implement VEPA mode This is very similar to the Macvlan VEPA mode, however, there is some difference. IPvlan uses the mac-address of the lower device, so the VEPA mode has implications of ICMP-redirects for packets destined for its immediate neighbors sharing same master since the packets will have same source and dest mac. The external switch/router will send redirect msg. Having said that, this will be useful tool in terms of debugging since IPvlan will not switch packets within its slaves and rely completely on the external entity as intended in 802.1Qbg. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 18:39:57 +09:00
Mahesh Bandewar	a190d04db9	ipvlan: introduce 'private' attribute for all existing modes. IPvlan has always operated in bridge mode. However there are scenarios where each slave should be able to talk through the master device but not necessarily across each other. Think of an environment where each of a namespace is a private and independant customer. In this scenario the machine which is hosting these namespaces neither want to tell who their neighbor is nor the individual namespaces care to talk to neighbor on short-circuited network path. This patch implements the mode that is very similar to the 'private' mode in macvlan where individual slaves can send and receive traffic through the master device, just that they can not talk among slave devices. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 18:39:57 +09:00
Wei Yongjun	2660d226d9	net: aquantia: Make local functions static Fixes the following sparse warnings: drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c:224:5: warning: symbol 'aq_ethtool_get_coalesce' was not declared. Should it be static? drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c:245:5: warning: symbol 'aq_ethtool_set_coalesce' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 17:53:00 +09:00
Florian Fainelli	5c1a6eaf0d	net: dsa: b53: Export b53_configure_vlan() bcm_sf2 and b53 replicate the same operations: clear all VLANs and set their ports to the default VLAN tag (1 for these devices) so export the b53 function doing just that. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 12:13:30 +09:00
Felix Manlunas	641da8ed3d	liquidio: get rid of false alarm "Unknown cmd 27" in dmesg Creating a macvtap interface with the liquidio VF driver as lower device causes this alarming message to show up in dmesg: liquidio_link_ctrl_cmd_completion Unknown cmd 27 That's actually a false alarm because cmd 27 is the value of the macro OCTNET_CMD_SET_UC_LIST which is known. It's a control command sent from host to NIC firmware to set the unicast MAC address list of the macvtap lower device. Make the false alarm go away by adding a case for OCTNET_CMD_SET_UC_LIST in liquidio_link_ctrl_cmd_completion(). Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 12:12:33 +09:00
Haiyang Zhang	a6fb6aa3cf	hv_netvsc: Set tx_table to equal weight after subchannels open In some cases, like internal vSwitch, the host doesn't provide send indirection table updates. This patch sets the table to be equal weight after subchannels are all open. Otherwise, all workload will be on one TX channel. As tested, this patch has largely increased the throughput over internal vSwitch. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 12:09:23 +09:00
Matteo Croce	90e229ef61	ppp: allow usage in namespaces Check for CAP_NET_ADMIN with ns_capable() instead of capable() to allow usage of ppp in user namespace other than the init one. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 11:55:32 +09:00
David S. Miller	87e3de1e4e	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2017-10-27 This patchset is a proposal of how the Traffic Control subsystem can be used to offload the configuration of the Credit Based Shaper (defined in the IEEE 802.1Q-2014 Section 8.6.8.2) into supported network devices. As part of this work, we've assessed previous public discussions related to TSN enabling: patches from Henrik Austad (Cisco), the presentation from Eric Mann at Linux Plumbers 2012, patches from Gangfeng Huang (National Instruments) and the current state of the OpenAVNU project (https://github.com/AVnu/OpenAvnu/). Overview ======== Time-sensitive Networking (TSN) is a set of standards that aim to address resources availability for providing bandwidth reservation and bounded latency on Ethernet based LANs. The proposal described here aims to cover mainly what is needed to enable the following standards: 802.1Qat and 802.1Qav. The initial target of this work is the Intel i210 NIC, but other controllers' datasheet were also taken into account, like the Renesas RZ/A1H RZ/A1M group and the Synopsis DesignWare Ethernet QoS controller. Proposal ======== Feature-wise, what is covered here is the configuration interfaces for HW implementations of the Credit-Based shaper (CBS, 802.1Qav). CBS is a per-queue shaper. Given that this feature is related to traffic shaping, and that the traffic control subsystem already provides a queueing discipline that offloads config into the device driver (i.e. mqprio), designing a new qdisc for the specific purpose of offloading the config for the CBS shaper seemed like a good fit. For steering traffic into the correct queues, we use the socket option SO_PRIORITY and then a mechanism to map priority to traffic classes / Tx queues. The qdisc mqprio is currently used in our tests. As for the CBS config interface, this patchset is proposing a new qdisc called 'cbs'. Its 'tc' cmd line is: $ tc qdisc add dev IFACE parent ID cbs locredit N hicredit M sendslope S \ idleslope I Note that the parameters for this qdisc are the ones defined by the 802.1Q-2014 spec, so no hardware specific functionality is exposed here. Per-stream shaping, as defined by IEEE 802.1Q-2014 Section 34.6.1, is not yet covered by this proposal. v2: Merged patch 6 of the original series into patch 4 based on feedback from David Miller. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 11:20:28 +09:00
Arjun Vynipadath	c69fe40780	cxgb3: Check and handle the dma mapping errors This patch adds checks at approprate places whether dma_map() call has succeeded or not. Original Work by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 11:10:17 +09:00
Francois Romieu	509708310c	r8169: Add support for interrupt coalesce tuning (ethtool -C) Kirr: In particular with ethtool -C <ifname> rx-usecs 0 rx-frames 0 now it is possible to disable RX delays when NIC usage requires low-latency. See this thread for context: https://www.spinics.net/lists/netdev/msg217665.html My specific case is that: We have many computers with gigabit Realtek NICs. For 2 such computers connected to a gigabit store-and-forward switch the minimum round-trip time for small pings (`ping -i 0 -w 3 -s 56 -q peer`) is ~ 30μs. However it turned out that when Ethernet frame length transitions 127 -> 128 bytes (`ping -i 0 -w 3 -s {81 -> 82} -q peer`) the lowest RTT transitions step-wise to ~ 270μs. As David Light said this is RX interrupt mitigation done by NIC which creates the latency. For workloads when low-latency is required with e.g. Intel, BCM etc NIC drivers one just uses `ethtool -C rx-usecs ...` to reduce the time NIC delays before interrupting CPU, but it turned out `ethtool -C` is not supported by r8169 driver. Like Stéphane ANCELOT I've traced the problem down to IntrMitigate being hardcoded to != 0 for our chips (we have 8168 based NICs): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/realtek/r8169.c#n5460 static void rtl_hw_start_8169(struct net_device dev) { ... / * Undocumented corner. Supposedly: * (TxTimer << 12) \| (TxPackets << 8) \| (RxTimer << 4) \| RxPackets / RTL_W16(IntrMitigate, 0x0000); https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/realtek/r8169.c#n6346 static void rtl_hw_start_8168(struct net_device dev) { ... RTL_W16(IntrMitigate, 0x5151); and then I've also found https://www.spinics.net/lists/netdev/msg217665.html and original Francois' patch: https://www.spinics.net/lists/netdev/msg217984.html https://www.spinics.net/lists/netdev/msg218207.html So could we please finally get support for tuning r8169 interrupt coalescing in tree? (so that next poor soul who hits the problem does not need to go all the way to dig into driver sources and internet wildly and finally patch locally -RTL_W16(IntrMitigate, 0x5151); +RTL_W16(IntrMitigate, 0x5100); guessing whether it is right or not and also having to care to deploy the patch everywhere it needs to be used, etc...). To do so I've took original Francois's patch from 2012 and reworked it a bit: - updated to latest net-next.git; - adjusted scaling setup based on feedback from Hayes to pick up scaling vector depending not only on link speed but also on CPlusCmd[0:1] and to adjust CPlusCmd[0:1] correspondingly when setting timings; - improved a bit (I think so) error handling. I've tested the patch on "RTL8168d/8111d" (XID 083000c0) and with it and `ethtool -C rx-usecs 0 rx-frames 0` on both ends it improves: - minimum RTT latency: ~270μs -> ~30μs (small packet), ~330μs -> ~110μs (full 1.5K ethernet frame) - average RTT latency: ~480μs -> ~50μs (small packet), ~560μs -> ~125μs (full 1.5K ethernet frame) ( before: root@neo1:# ping -i 0 -w 3 -s 82 -q neo2 PING neo2.kirr.nexedi.com (192.168.102.21) 82(110) bytes of data. --- neo2.kirr.nexedi.com ping statistics --- 5906 packets transmitted, 5905 received, 0% packet loss, time 2999ms rtt min/avg/max/mdev = 0.274/0.485/0.607/0.026 ms, ipg/ewma 0.508/0.489 ms root@neo1:# ping -i 0 -w 3 -s 1472 -q neo2 PING neo2.kirr.nexedi.com (192.168.102.21) 1472(1500) bytes of data. --- neo2.kirr.nexedi.com ping statistics --- 5073 packets transmitted, 5073 received, 0% packet loss, time 2999ms rtt min/avg/max/mdev = 0.330/0.566/0.710/0.028 ms, ipg/ewma 0.591/0.544 ms after: root@neo1# ping -i 0 -w 3 -s 82 -q neo2 PING neo2.kirr.nexedi.com (192.168.102.21) 82(110) bytes of data. --- neo2.kirr.nexedi.com ping statistics --- 45815 packets transmitted, 45815 received, 0% packet loss, time 3000ms rtt min/avg/max/mdev = 0.036/0.051/0.368/0.010 ms, ipg/ewma 0.065/0.053 ms root@neo1:# ping -i 0 -w 3 -s 1472 -q neo2 PING neo2.kirr.nexedi.com (192.168.102.21) 1472(1500) bytes of data. --- neo2.kirr.nexedi.com ping statistics --- 21250 packets transmitted, 21250 received, 0% packet loss, time 3000ms rtt min/avg/max/mdev = 0.112/0.125/0.390/0.007 ms, ipg/ewma 0.141/0.125 ms the small -> 1.5K latency growth is understandable as it takes ~15μs to transmit 1.5K on 1Gbps on the wire and with 2 hosts and 1 switch and ICMP ECHO + ECHO reply the packet has to travel 4 ethernet segments which is already 60μs; probably something a bit else is also there as e.g. on Linux, even with `cpupower frequency-set -g performance`, on some computers I've noticed the kernel can be spending more time in software-only mode when incoming packets go in less frequently. E.g. this program can demonstrate the effect for ICMP ECHO processing: https://lab.nexedi.com/kirr/bcc/blob/43cfc13b/tools/pinglat.py (later this was found to be partly due to C-states exit latencies) ) We have this patch running in our testing setup for 1 months already without any issues observed. It remains to be clarified whether RX and TX timers use the same base. For now I've set them equally, but Francois's original patch version suggests it could be not the same. I've got no feedback at all to my original posting of this patch and questions https://www.spinics.net/lists/netdev/msg457173.html neither from Francois, nor from any people from Realtek during one month. So I suggest we simply apply it to net-next.git now. Cc: Francois Romieu <romieu@fr.zoreil.com> Cc: Hayes Wang <hayeswang@realtek.com> Cc: Realtek linux nic maintainers <nic_swsd@realtek.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Stéphane ANCELOT <sancelot@free.fr> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Kirill Smelkov <kirr@nexedi.com> Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 11:07:58 +09:00
Girish Moodalbail	dea6e19f4e	tap: reference to KVA of an unloaded module causes kernel panic The commit `9a393b5d59` ("tap: tap as an independent module") created a separate tap module that implements tap functionality and exports interfaces that will be used by macvtap and ipvtap modules to create create respective tap devices. However, that patch introduced a regression wherein the modules macvtap and ipvtap can be removed (through modprobe -r) while there are applications using the respective /dev/tapX devices. These applications cause kernel to hold reference to /dev/tapX through 'struct cdev macvtap_cdev' and 'struct cdev ipvtap_dev' defined in macvtap and ipvtap modules respectively. So, when the application is later closed the kernel panics because we are referencing KVA that is present in the unloaded modules. ----------8<------- Example ----------8<---------- $ sudo ip li add name mv0 link enp7s0 type macvtap $ sudo ip li show mv0 \|grep mv0\| awk -e '{print $1 $2}' 14:mv0@enp7s0: $ cat /dev/tap14 & $ lsmod \|egrep -i 'tap\|vlan' macvtap 16384 0 macvlan 24576 1 macvtap tap 24576 3 macvtap $ sudo modprobe -r macvtap $ fg cat /dev/tap14 ^C <...system panics...> BUG: unable to handle kernel paging request at ffffffffa038c500 IP: cdev_put+0xf/0x30 ----------8<-----------------8<---------- The fix is to set cdev.owner to the module that creates the tap device (either macvtap or ipvtap). With this set, the operations (in fs/char_dev.c) on char device holds and releases the module through cdev_get() and cdev_put() and will not allow the module to unload prematurely. Fixes: `9a393b5d59` (tap: tap as an independent module) Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:17:21 +09:00
Kees Cook	267146d447	drivers/net: smsc: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: "David S. Miller" <davem@davemloft.net> Cc: "yuval.shaia@oracle.com" <yuval.shaia@oracle.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Philippe Reynes <tremyfr@gmail.com> Cc: Allen Pais <allen.lkml@gmail.com> Cc: Tobias Klauser <tklauser@distanz.ch> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:09:50 +09:00
Kees Cook	8089c6f477	drivers/net: packetengines: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: "David S. Miller" <davem@davemloft.net> Cc: Allen Pais <allen.lkml@gmail.com> Cc: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn> Cc: Philippe Reynes <tremyfr@gmail.com> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:09:49 +09:00
Kees Cook	15735c9d8a	drivers/net: natsemi: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: "David S. Miller" <davem@davemloft.net> Cc: Allen Pais <allen.lkml@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Philippe Reynes <tremyfr@gmail.com> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:09:49 +09:00
Kees Cook	0365b047de	drivers/net: mellanox: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Matan Barak <matanb@mellanox.com> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: netdev@vger.kernel.org Cc: linux-rdma@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:09:49 +09:00
Kees Cook	34309b36e4	drivers/net: korina: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: "David S. Miller" <davem@davemloft.net> Cc: Roman Yeryomin <leroi.lists@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:09:49 +09:00
Kees Cook	8b3718dc2c	drivers/net: fealnx: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: "David S. Miller" <davem@davemloft.net> Cc: "yuval.shaia@oracle.com" <yuval.shaia@oracle.com> Cc: Allen Pais <allen.lkml@gmail.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Philippe Reynes <tremyfr@gmail.com> Cc: Johannes Berg <johannes.berg@intel.com> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:09:49 +09:00
Kees Cook	9cb618c295	drivers/net: dlink: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Denis Kirjanov <kda@linux-powerpc.org> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:09:49 +09:00
Kees Cook	0e23daeb64	drivers/net: chelsio/cxgb*: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Santosh Raspatur <santosh@chelsio.com> Cc: Ganesh Goudar <ganeshgr@chelsio.com> Cc: Casey Leedom <leedom@chelsio.com> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:09:49 +09:00
Kees Cook	70a42ac1c2	drivers/net: appletalk/cops: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Allen Pais <allen.lkml@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:09:49 +09:00
Kees Cook	c6c52ba151	drivers/net: amd: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Allen Pais <allen.lkml@gmail.com> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:09:49 +09:00
Kees Cook	c63144e4dd	drivers/net: 8390: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:09:49 +09:00
Jason Wang	63b9ab65bd	tuntap: properly align skb->head before building skb An unaligned alloc_frag->offset caused by previous allocation will result an unaligned skb->head. This will lead unaligned skb_shared_info and then unaligned dataref which requires to be aligned for accessing on some architecture. Fix this by aligning alloc_frag->offset before the frag refilling. Fixes: `0bbd7dad34` ("tun: make tun_build_skb() thread safe") Cc: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Cc: Wei Wei <dotweiba@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Reported-by: Wei Wei <dotweiba@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:05:28 +09:00
Bhadram Varka	a830405ee4	stmmac: copy unicast mac address to MAC registers Currently stmmac driver not copying the valid ethernet MAC address to MAC registers. This patch takes care of updating the MAC register with MAC address. Signed-off-by: Bhadram Varka <vbhadram@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 19:04:29 +09:00
Pablo Cascón	a267eaebfc	nfp: inform the VF driver needs to be restarted after changing the MAC Add message to inform the VF MAC was changed and the need to restart the VF driver for the changes to be effective. Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 18:59:48 +09:00
Felix Manlunas	aa28667cfb	liquidio: fix kernel panic in VF driver Doing ifconfig down on VF driver in the middle of receiving line rate traffic causes a kernel panic: LiquidIO_VF 0000:02:00.3: should not come here should not get rx when poll mode = 0 for vf BUG: unable to handle kernel NULL pointer dereference at (null) . . . Call Trace: <IRQ> ? tasklet_action+0x102/0x120 __do_softirq+0x91/0x292 irq_exit+0xb6/0xc0 do_IRQ+0x4f/0xd0 common_interrupt+0x93/0x93 </IRQ> RIP: 0010:cpuidle_enter_state+0x142/0x2f0 RSP: 0018:ffffffffa6403e20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff59 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 000000000000001f RDX: 0000000000000000 RSI: 000000002ab7519f RDI: 0000000000000000 RBP: ffffffffa6403e58 R08: 0000000000000084 R09: 0000000000000018 R10: ffffffffa6403df0 R11: 00000000000003c7 R12: 0000000000000003 R13: ffffd27ebd806800 R14: ffffffffa64d40d8 R15: 0000007be072823f cpuidle_enter+0x17/0x20 call_cpuidle+0x23/0x40 do_idle+0x18c/0x1f0 cpu_startup_entry+0x64/0x70 rest_init+0xa5/0xb0 start_kernel+0x45e/0x46b x86_64_start_reservations+0x24/0x26 x86_64_start_kernel+0x6f/0x72 secondary_startup_64+0xa5/0xa5 Code: Bad RIP value. RIP: (null) RSP: ffff9246ed003f28 CR2: 0000000000000000 ---[ end trace 92731e80f31b7d7d ]--- Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0x24000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception in interrupt Reason is: in the function assigned to net_device_ops->ndo_stop, the steps for bringing down the interface are done in the wrong order. The step that notifies the NIC firmware to stop forwarding packets to host is done too late. Fix it by moving that step to the beginning. Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 18:52:46 +09:00
Michael Chan	952c5719aa	bnxt_en: Fix randconfig build errors. Fix undefined symbols when CONFIG_VLAN_8021Q or CONFIG_INET is not set. Fixes: `8c95f773b4` ("bnxt_en: add support for Flower based vxlan encap/decap offload") Reported-by: Jakub Kicinski <kubakici@wp.pl> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 18:24:15 +09:00
Andre Guedes	05f9d3e1ae	igb: Add support for CBS offload This patch adds support for Credit-Based Shaper (CBS) qdisc offload from Traffic Control system. This support enable us to leverage the Forwarding and Queuing for Time-Sensitive Streams (FQTSS) features from Intel i210 Ethernet Controller. FQTSS is the former 802.1Qav standard which was merged into 802.1Q in 2014. It enables traffic prioritization and bandwidth reservation via the Credit-Based Shaper which is implemented in hardware by i210 controller. The patch introduces the igb_setup_tc() function which implements the support for CBS qdisc hardware offload in the IGB driver. CBS offload is the only traffic control offload supported by the driver at the moment. FQTSS transmission mode from i210 controller is automatically enabled by the IGB driver when the CBS is enabled for the first hardware queue. Likewise, FQTSS mode is automatically disabled when CBS is disabled for the last hardware queue. Changing FQTSS mode requires NIC reset. FQTSS feature is supported by i210 controller only. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Tested-by: Henrik Austad <henrik@austad.us> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-27 09:49:36 -07:00
Intiyaz Basha	c859e21a35	liquidio: xmit_more support Defer ringing the Tx doorbell if skb->xmit_more is set unless the Tx queue is full or stopped. To keep latency low, use a deferral limit of 8 packets. We chose 8 because Octeon can fetch at most 8 packets in a single PCI read, and our tests show that 8 results in low latency. Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com> Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:35:23 +09:00
John Allen	2a1bf51111	ibmvnic: Fix failover error path for non-fatal resets For all non-fatal reset conditions, the hypervisor will send a failover when we attempt to initialize the crq and the vnic client is expected to handle that failover instead of the existing non-fatal reset. To handle this, we need to return from init with a return code that indicates that we have hit this case. Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:23:58 +09:00
John Allen	c26eba03e4	ibmvnic: Update reset infrastructure to support tunable parameters Update ibmvnic reset infrastructure to include a new reset option that will allow changing of tunable parameters. There currently is no way to request different capabilities from the vnic server on the fly so this patch achieves this by resetting the driver and attempting to log in with the requested changes. If the reset operation fails, the old values of the tunable parameters are stored in the "fallback" struct and we attempt to login with the fallback values. Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:23:58 +09:00
Subash Abhinov Kasiviswanathan	ca32fb034c	net: qualcomm: rmnet: Add support for GRO Add gro_cells so that rmnet devices can call gro_cells_receive instead of netif_receive_skb. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:10:23 +09:00
Subash Abhinov Kasiviswanathan	192c4b5d48	net: qualcomm: rmnet: Add support for 64 bit stats Implement 64 bit per cpu stats. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:10:23 +09:00
Subash Abhinov Kasiviswanathan	85355d775f	net: qualcomm: rmnet: Always assign rmnet dev in deaggregation path The rmnet device needs to assigned for all packets in the deaggregation path based on the mux id, so the check is not needed. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:10:23 +09:00
Subash Abhinov Kasiviswanathan	2ffbbf0f91	net: qualcomm: rmnet: Fix the return value of rmnet_rx_handler() Since packet is always consumed by rmnet_rx_handler(), we always return RX_HANDLER_CONSUMED. There is no need to pass on this value through multiple functions. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:10:23 +09:00
David S. Miller	8ab190fbe9	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2017-10-26 This series contains fixes to e1000, igb, ixgbe and i40e. Vincenzo Maffione fixes a potential race condition which would result in the interface being up but transmits are disabled in the hardware. Colin Ian King fixes a possible NULL pointer dereference in e1000, which was found by Coverity. Jean-Philippe Brucker fixes a possible kernel panic when a driver cannot map a transmit buffer, which is caused by an erroneous test. Alex provides a fix for ixgbe, which is a partial revert of the commit `ffed21bcee` ("ixgbe: Don't bother clearing buffer memory for descriptor rings") because the previous commit messed up the exception handling path by adding the count back in when we did not need to. Also fixed a typo, where the transmit ITR setting was being used to determine if we were using adaptive receive interrupt moderation or not. Lastly, fixed a memory leak by including programming descriptors in the cleaned count. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:05:34 +09:00
Sathya Perla	cd66358e52	bnxt_en: alloc tc_info{} struct only when tc flower is enabled TC flower is not enabled on VFs and when there's no FW support. Alloc the tc_info{} struct at init time only when TC flower is being enabled. Signed-off-by: Sathya Perla <sathya.perla@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:45 +09:00
Sathya Perla	5a84acbebb	bnxt_en: query cfa flow stats periodically to compute 'lastused' attribute This patch implements periodic querying of cfa flow stats in batches to compute the 'lastused' attribute of TC flow stats. Signed-off-by: Sathya Perla <sathya.perla@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:45 +09:00
Sathya Perla	f484f6782e	bnxt_en: add hwrm FW cmds for cfa_encap_record and decap_filter Add routines for issuing the hwrm_cfa_encap_record_alloc/free and hwrm_cfa_decap_filter_alloc/free FW cmds needed for supporting vxlan encap/decap offload. Signed-off-by: Sathya Perla <sathya.perla@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:45 +09:00
Sathya Perla	8c95f773b4	bnxt_en: add support for Flower based vxlan encap/decap offload This patch adds IPv4 vxlan encap/decap action support to TC-flower offload. For vxlan encap, the driver maintains a tunnel encap hash-table. When a new flow with a tunnel encap action arrives, this table is looked up; if an encap entry exists, it uses the already programmed encap_record_handle as the tunnel_handle in the hwrm_cfa_flow_alloc cmd. Else, a new encap node is added and the L2 header fields are queried via a route lookup. hwrm_cfa_encap_record_alloc cmd is used to create a new encap record and the encap_record_handle is used as the tunnel_handle while adding the flow. For vxlan decap, the driver maintains a tunnel decap hash-table. When a new flow with a tunnel decap action arrives, this table is looked up; if a decap entry exists, it uses the already programmed decap_filter_handle as the tunnel_handle in the hwrm_cfa_flow_alloc cmd. Else, a new decap node is added and a decap_filter_handle is alloc'd via the hwrm_cfa_decap_filter_alloc cmd. This handle is used as the tunnel_handle while adding the flow. The code to issue the HWRM FW cmds is introduced in a follow-up patch. Signed-off-by: Sathya Perla <sathya.perla@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:45 +09:00
Michael Chan	f8503969d2	bnxt_en: Refactor and simplify coalescing code. The mapping of the ethtool coalescing parameters to hardware parameters is now done in bnxt_hwrm_set_coal_params(). The same function can handle both RX and TX settings. The code is now more clear. Some adjustments have been made to get better hardware settings. The coal_frames setting is now accurately set in hardware. The max_timer is set to coal_ticks value. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:45 +09:00
Michael Chan	18775aa8a9	bnxt_en: Reorganize the coalescing parameters. The current IRQ coalescing logic is a little messy. The ethtool parameters are mapped to hardware parameters in a way that is difficult to understand. The first step is to better organize the parameters by adding the new structure bnxt_coal. The structure is used by both the RX and TX sets of coalescing parameters. Adjust the default coal_ticks to 14 us and 28 us for RX and TX. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:45 +09:00
Vasundhara Volam	49f7972fd1	bnxt_en: Add ethtool reset method This is a firmware internal reset after driver is unloaded. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:45 +09:00
Michael Chan	7eb9bb3a0c	bnxt_en: Check maximum supported MTU from firmware. Some NICs have a firmware enforced maximum MTU setting by management firmware. Set up netdev->max_mtu accordingly. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:44 +09:00
Michael Chan	c1a7bdff17	bnxt_en: Optimize .ndo_set_mac_address() for VFs. No need to call bnxt_approve_mac() which will send a message to the PF if the MAC address hasn't changed. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:44 +09:00
Michael Chan	431aa1eb20	bnxt_en: Get firmware package version one time. The current code retrieves the firmware package version from firmware everytime ethtool -i is run. There is no reason to do that as the firmware will not change while the driver is loaded. Get the version once at init time. Also, display the full 4-part firmware version string and remove the less useful interface spec version. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:44 +09:00
Michael Chan	e0ad8fc598	bnxt_en: Check for zero length value in bnxt_get_nvram_item(). Return -EINVAL if the length is zero and not proceed to do essentially nothing. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:44 +09:00
Rob Miller	618784e3ee	bnxt_en: adding PCI ID for SMARTNIC VF support Signed-off-by: Rob Miller <rmiller@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:44 +09:00
Ray Jui	8ed693b7bb	bnxt_en: Add PCIe device ID for bcm58804 Add new PCIe device ID and chip number for bcm58804 Signed-off-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-28 00:02:44 +09:00

1 2 3 4 5 ...

71772 Commits