OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
sudarsana.kalluru@cavium.com	f870a3c672	qed*: Fix possible overflow for status block id field. Value for status block id could be more than 256 in 100G mode, need to update its data type from u8 to u16. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-04 12:31:02 -04:00
stephen hemminger	2be0f26445	netvsc: make sure napi enabled before vmbus_open This fixes a race where vmbus callback for new packet arriving could occur before NAPI is initialized. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-04 11:08:36 -04:00
Pavel Belous	5900eca1ac	aquantia: Fix driver name reported by ethtool V2: using "aquantia" subsystem tag. The command "ethtool -i ethX" should display driver name (driver: atlantic) instead vendor name (driver: aquantia). Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-04 11:07:20 -04:00
Zhu Yanjun	5d826b7b98	forcedeth: remove unnecessary carrier status check Since netif_carrier_on() will do nothing if device's carrier is already on, so it's unnecessary to do carrier status check. It's the same for netif_carrier_off(). Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-04 10:57:41 -04:00
Nathan Fontenot	7c3e7de3f3	ibmvnic: Move queue restarting in ibmvnic_tx_complete Restart of the subqueue should occur outside of the loop processing any tx buffers instead of doing this in the middle of the loop. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 11:33:06 -04:00
Thomas Falcon	94ca305fd8	ibmvnic: Record SKB RX queue during poll Map each RX SKB to the RX queue associated with the driver's RX SCRQ. This should improve the RX CPU load balancing issues seen by the performance team. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 11:33:05 -04:00
Nathan Fontenot	ca05e31674	ibmvnic: Continue skb processing after skb completion error There is not a need to stop processing skbs if we encounter a skb that has a receive completion error. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 11:33:05 -04:00
Nathan Fontenot	161b8a8138	ibmvnic: Check for driver reset first in ibmvnic_xmit Move the check for the driver resetting to the first thing in ibmvnic_xmit(). Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 11:33:05 -04:00
Nathan Fontenot	46293b940f	ibmvnic: Wait for any pending scrqs entries at driver close When closing the ibmvnic driver we need to wait for any pending sub crq entries to ensure they are handled. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 11:33:05 -04:00
Nathan Fontenot	b41b83e9a7	ibmvnic: Clean up tx pools when closing When closing the ibmvnic driver, most notably during the reset path, the tx pools need to be cleaned to ensure there are no hanging skbs that need to be free'ed. The need for this was found during debugging a loss of network traffic after handling a driver reset. The underlying cause was some skbs in the tx pool that were never free'ed. As a result the upper network layers never tried a re-send since it believed the driver still had the skb. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 11:33:04 -04:00
Nathan Fontenot	e0ebe942f4	ibmvnic: Whitespace correction in release_rx_pools Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 11:33:04 -04:00
Nathan Fontenot	c7bac00b40	ibmvnic: Delete napi's when releasing driver resources The napi structs allocated at drivier initializatio need to be free'ed when releasing the drivers resources. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 11:33:04 -04:00
Nathan Fontenot	ed651a1087	ibmvnic: Updated reset handling The ibmvnic driver has multiple handlers for resetting the driver depending on the reason the reset is needed (failover, lpm, fatal erors,...). All of the reset handlers do essentially the same thing, this patch moves this work to a common reset handler. By doing this we also allow the driver to better handle situations where we can get a reset while handling a reset. The updated reset handling works by adding a reset work item to the list of resets and then scheduling work to perform the reset. This step is necessary because we can receive a reset in interrupt context and we want to handle the reset out of interrupt context. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 11:33:04 -04:00
Nathan Fontenot	90c8014c2b	ibmvnic: Replace is_closed with state field Replace the is_closed flag in the ibmvnic adapter strcut with a more comprehensive state field that tracks the current state of the driver. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 11:33:04 -04:00
Nathan Fontenot	bfc32f2973	ibmvnic: Move resource initialization to its own routine Move all of the calls to initialize resources for the driver to a separate routine. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 11:33:03 -04:00
Daniele Palmas	4c54dc0277	net: usb: qmi_wwan: add Telit ME910 support This patch adds support for Telit ME910 PID 0x1100. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 09:51:25 -04:00
YueHaibing	37a7fdf289	tg3: don't clear stats while tg3_close Now tg3 NIC's stats will be cleared after ifdown/ifup. bond_get_stats traverse its salves to get statistics,cumulative the increment.If a tg3 NIC is added to bonding as a slave,ifdown/ifup will cause bonding's stats become tremendous value (ex.1638.3 PiB) because of negative increment. Fixes: `92feeabf3f` ("tg3: Save stats across chip resets") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 09:51:25 -04:00
Daniel Borkmann	4d463c4dbc	xdp: use common helper for netlink extended ack reporting Small follow-up to `d74a32acd5` ("xdp: use netlink extended ACK reporting") in order to let drivers all use the same NL_SET_ERR_MSG_MOD() helper macro for reporting. This also ensures that we consistently add the driver's prefix for dumping the report in user space to indicate that the error message is driver specific and not coming from core code. Furthermore, NL_SET_ERR_MSG_MOD() now reuses NL_SET_ERR_MSG() and thus makes all macros check the pointer as suggested. References: https://www.spinics.net/lists/netdev/msg433267.html Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 09:51:24 -04:00
David Cai	f6fec61eb5	smsc911x: Adding support for Micochip LAN9250 Ethernet controller Adding support for Microchip LAN9250 Ethernet controller. Signed-off-by: David Cai <david.cai@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-03 09:41:52 -04:00
Linus Torvalds	89c9fea3c8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: tty: fix comment for __tty_alloc_driver() init/main: properly align the multi-line comment init/main: Fix double "the" in comment Fix dead URLs to ftp.kernel.org drivers: Clean up duplicated email address treewide: Fix typo in xml/driver-api/basics.xml tools/testing/selftests/powerpc: remove redundant CFLAGS in Makefile: "-Wall -O2 -Wall" -> "-O2 -Wall" selftests/timers: Spelling s/privledges/privileges/ HID: picoLCD: Spelling s/REPORT_WRTIE_MEMORY/REPORT_WRITE_MEMORY/ net: phy: dp83848: Fix Typo UBI: Fix typos Documentation: ftrace.txt: Correct nice value of 120 priority net: fec: Fix typo in error msg and comment treewide: Fix typos in printk	2017-05-02 19:09:35 -07:00
Linus Torvalds	8d65b08deb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Millar: "Here are some highlights from the 2065 networking commits that happened this development cycle: 1) XDP support for IXGBE (John Fastabend) and thunderx (Sunil Kowuri) 2) Add a generic XDP driver, so that anyone can test XDP even if they lack a networking device whose driver has explicit XDP support (me). 3) Sparc64 now has an eBPF JIT too (me) 4) Add a BPF program testing framework via BPF_PROG_TEST_RUN (Alexei Starovoitov) 5) Make netfitler network namespace teardown less expensive (Florian Westphal) 6) Add symmetric hashing support to nft_hash (Laura Garcia Liebana) 7) Implement NAPI and GRO in netvsc driver (Stephen Hemminger) 8) Support TC flower offload statistics in mlxsw (Arkadi Sharshevsky) 9) Multiqueue support in stmmac driver (Joao Pinto) 10) Remove TCP timewait recycling, it never really could possibly work well in the real world and timestamp randomization really zaps any hint of usability this feature had (Soheil Hassas Yeganeh) 11) Support level3 vs level4 ECMP route hashing in ipv4 (Nikolay Aleksandrov) 12) Add socket busy poll support to epoll (Sridhar Samudrala) 13) Netlink extended ACK support (Johannes Berg, Pablo Neira Ayuso, and several others) 14) IPSEC hw offload infrastructure (Steffen Klassert)" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2065 commits) tipc: refactor function tipc_sk_recv_stream() tipc: refactor function tipc_sk_recvmsg() net: thunderx: Optimize page recycling for XDP net: thunderx: Support for XDP header adjustment net: thunderx: Add support for XDP_TX net: thunderx: Add support for XDP_DROP net: thunderx: Add basic XDP support net: thunderx: Cleanup receive buffer allocation net: thunderx: Optimize CQE_TX handling net: thunderx: Optimize RBDR descriptor handling net: thunderx: Support for page recycling ipx: call ipxitf_put() in ioctl error path net: sched: add helpers to handle extended actions qed*: Fix issues in the ptp filter config implementation. qede: Fix concurrency issue in PTP Tx path processing. stmmac: Add support for SIMATIC IOT2000 platform net: hns: fix ethtool_get_strings overflow in hns driver tcp: fix wraparound issue in tcp_lp bpf, arm64: fix jit branch offset related to ldimm64 bpf, arm64: implement jiting of BPF_XADD ...	2017-05-02 16:40:27 -07:00
Linus Torvalds	5a0387a8a8	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "Here is the crypto update for 4.12: API: - Add batch registration for acomp/scomp - Change acomp testing to non-unique compressed result - Extend algorithm name limit to 128 bytes - Require setkey before accept(2) in algif_aead Algorithms: - Add support for deflate rfc1950 (zlib) Drivers: - Add accelerated crct10dif for powerpc - Add crc32 in stm32 - Add sha384/sha512 in ccp - Add 3des/gcm(aes) for v5 devices in ccp - Add Queue Interface (QI) backend support in caam - Add new Exynos RNG driver - Add ThunderX ZIP driver - Add driver for hardware random generator on MT7623 SoC" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (101 commits) crypto: stm32 - Fix OF module alias information crypto: algif_aead - Require setkey before accept(2) crypto: scomp - add support for deflate rfc1950 (zlib) crypto: scomp - allow registration of multiple scomps crypto: ccp - Change ISR handler method for a v5 CCP crypto: ccp - Change ISR handler method for a v3 CCP crypto: crypto4xx - rename ce_ring_contol to ce_ring_control crypto: testmgr - Allow ecb(cipher_null) in FIPS mode Revert "crypto: arm64/sha - Add constant operand modifier to ASM_EXPORT" crypto: ccp - Disable interrupts early on unload crypto: ccp - Use only the relevant interrupt bits hwrng: mtk - Add driver for hardware random generator on MT7623 SoC dt-bindings: hwrng: Add Mediatek hardware random generator bindings crypto: crct10dif-vpmsum - Fix missing preempt_disable() crypto: testmgr - replace compression known answer test crypto: acomp - allow registration of multiple acomps hwrng: n2 - Use devm_kcalloc() in n2rng_probe() crypto: chcr - Fix error handling related to 'chcr_alloc_shash' padata: get_next is never NULL crypto: exynos - Add new Exynos RNG driver ...	2017-05-02 15:53:46 -07:00
Michael S. Tsirkin	9b2bbdb227	virtio: wrap find_vqs We are going to add more parameters to find_vqs, let's wrap the call so we don't need to tweak all drivers every time. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-05-02 23:41:42 +03:00
Sunil Goutham	773225388d	net: thunderx: Optimize page recycling for XDP Driver follows a method of taking one extra reference on the page for recycling which is fine in usual packet path where each 64KB page is segmented into multiple receive buffers. But in XDP mode since there is just one receive buffer per page taking extra page reference itself becomes big bottleneck consuming ~50% of CPU cycles due to atomic operations. This patch adds a internal ref count in pgcache for each page and additional page references are taken in a batch instead of just one at a time. Internal i.e 'pgcache->ref_count' and page's i.e 'page->_refcount' counters are compared to check page's recyclability. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:22 -04:00
Sunil Goutham	e3d06ff9ec	net: thunderx: Support for XDP header adjustment When in XDP mode reserve XDP_PACKET_HEADROOM bytes at the start of receive buffer for XDP program to modify headers and adjust packet start. Additional code changes done to handle such packets. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:22 -04:00
Sunil Goutham	16f2bccda7	net: thunderx: Add support for XDP_TX Adds support for XDP_TX i.e transmits packet out of the XDP TX queue mapped to the corresponding Rx queue on which packet is received. Since SQ for XDP TX will be used only on a single cpu i.e SQ description creation and freeing, using atomic free count is not necessary and will become a bottleneck. Hence added a separate 'xdp_free_cnt' used for SQs designated for XDP to track descriptor free count. Changes also include - A new entry 'xdp_page' is added to save transmitted packet's page pointer for later cleanup. - XDP Tx SQ's doorbell is ringed once per NAPI instance. - Retrieving designated SQ for packets being sent out by stack via 'nicvf_xmit'. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:22 -04:00
Sunil Goutham	c56d91ce38	net: thunderx: Add support for XDP_DROP Adds support for XDP_DROP. Also since in XDP mode there is just a single buffer per page, made changes to recycle DMA mapping info as well along with pages. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:21 -04:00
Sunil Goutham	05c773f52b	net: thunderx: Add basic XDP support Adds basic XDP support i.e attaching a BPF program to an interface. Also takes care of allocating separate Tx queues for XDP path and for network stack packet transmission. This patch doesn't support handling of any of the XDP actions, all are treated as XDP_PASS i.e packets will be handed over to the network stack. Changes also involve allocating one receive buffer per page in XDP mode and multiple in normal mode i.e when no BPF program is attached. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:21 -04:00
Sunil Goutham	927987f39f	net: thunderx: Cleanup receive buffer allocation Get rid of unnecessary double pointer references and type casting in receive buffer allocation code. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:21 -04:00
Sunil Goutham	0dada88b8c	net: thunderx: Optimize CQE_TX handling Optimized CQE handling with below changes - Feeing descriptors back to SQ in bulk i.e once per NAPI instance instead for every CQE_TX, this will reduce number of atomic updates to 'sq->free_cnt'. - Checking errors in CQE_TX and CQE_RX before calling appropriate fn()s to update error stats i.e reduce branching. Also removed debug messages in packet handling path which otherwise causes issues if DEBUG is enabled. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:21 -04:00
Sunil Goutham	5e848e4c5d	net: thunderx: Optimize RBDR descriptor handling Receive buffer's physical address or iova will anyway not go beyond 49bits, since it is the max supported HW address. As per perf, updating bitfields i.e buf_addr:42 in RBDR descriptor entry consumes lots of cpu cycles, hence changed it to a 64bit field with alignment requirements taken care of. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:20 -04:00
Sunil Goutham	5836b44297	net: thunderx: Support for page recycling Adds support for page recycling for allocating receive buffers to reduce cost of refilling RBDR ring. Also got rid of using compound pages when pagesize is 4K, only order-0 pages now. Only page is recycled, DMA mappings still needs to be done for every receive buffer allocated due to following constraints - Cannot have just one receive buffer per 64KB page. - There is just one buffer ring shared across 8 Rx queues, so buffers of same page can go to any Rx queue. - HW gives buffer address where packet has been DMA'ed and not the index into buffer ring. This makes it not possible to resue DMA mapping info. So unfortunately have to go through costly mapping route for every buffer. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:20 -04:00
sudarsana.kalluru@cavium.com	8d3f87d8cd	qed*: Fix issues in the ptp filter config implementation. PTP hardware filter configuration performed by the driver for a given user requested config is not correct for some of the PTP modes. Following changes are needed for PTP config-filter implementation. 1. NIG_REG_TX_PTP_EN register - Bits 0/1/2 respectively enables TimeSync/"V1 frame format support"/"V2 frame format support" on the TX side. Set the associated bits based on the user request. 2. ptp4l application fails to operate in Peer Delay mode. Following changes are needed to fix this, a. Driver should enable (set to 0) DA #1-related bits for IPv4, IPv6 and MAC destination addresses in these registers: NIG_REG_TX_LLH_PTP_RULE_MASK NIG_REG_LLH_PTP_RULE_MASK b. NIG_REG_LLH_PTP_PARAM_MASK/NIG_REG_TX_LLH_PTP_PARAM_MASK should be set to 0x0 in all modes. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:33:01 -04:00
sudarsana.kalluru@cavium.com	461eec1201	qede: Fix concurrency issue in PTP Tx path processing. PTP Tx timestamping data structures are not protected against the concurrent access in the Tx paths. Protecting the same using atomic bit locks. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:33:01 -04:00
Jan Kiszka	212c7fd614	stmmac: Add support for SIMATIC IOT2000 platform The IOT2000 is industrial controller platform, derived from the Intel Galileo Gen2 board. The variant IOT2020 comes with one LAN port, the IOT2040 has two of them. They can be told apart based on the board asset tag in the DMI table. Based on patch by Sascha Weisenberger. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Sascha Weisenberger <sascha.weisenberger@siemens.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:31:27 -04:00
Timmy Li	412b65d15a	net: hns: fix ethtool_get_strings overflow in hns driver hns_get_sset_count() returns HNS_NET_STATS_CNT and the data space allocated is not enough for ethtool_get_strings(), which will cause random memory corruption. When SLAB and DEBUG_SLAB are both enabled, memory corruptions like the the following can be observed without this patch: [ 43.115200] Slab corruption (Not tainted): Acpi-ParseExt start=ffff801fb0b69030, len=80 [ 43.115206] Redzone: 0x9f911029d006462/0x5f78745f31657070. [ 43.115208] Last user: [<5f7272655f746b70>](0x5f7272655f746b70) [ 43.115214] 010: 70 70 65 31 5f 74 78 5f 70 6b 74 00 6b 6b 6b 6b ppe1_tx_pkt.kkkk [ 43.115217] 030: 70 70 65 31 5f 74 78 5f 70 6b 74 5f 6f 6b 00 6b ppe1_tx_pkt_ok.k [ 43.115218] Next obj: start=ffff801fb0b69098, len=80 [ 43.115220] Redzone: 0x706d655f6f666966/0x9f911029d74e35b. [ 43.115229] Last user: [<ffff0000084b11b0>](acpi_os_release_object+0x28/0x38) [ 43.115231] 000: 74 79 00 6b 6b 6b 6b 6b 70 70 65 31 5f 74 78 5f ty.kkkkkppe1_tx_ [ 43.115232] 010: 70 6b 74 5f 65 72 72 5f 63 73 75 6d 5f 66 61 69 pkt_err_csum_fai Signed-off-by: Timmy Li <lixiaoping3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:08:21 -04:00
Vivien Didelot	931d182239	net: dsa: mv88e6xxx: add VTU support for 88E6390 The 6390 family of chips use only 2 of the 3 VTU Data registers to pack the MemberTag and PortState VLAN data. This means that they must be written or read before or after each VTU/STU operations. Implement this variant to add support for VTU with such chips. These chips have a 13th bit for the VID thus set their max_vid to 8191. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:13 -04:00
Vivien Didelot	1ac758648b	net: dsa: mv88e6xxx: support the VTU Page bit Newer chips such as the 88E6390 have a VTU Page bit in the VTU VID register to specify a 13th bit for the VID. This can be used to support 8K VLANs. When dumping the whole VTU, all VID bits must be set to one, including this VTU Page bit. Add support for VID greater than 4095. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:13 -04:00
Vivien Didelot	567aa59a8b	net: dsa: mv88e6xxx: simplify VTU entry getter Make the code which fetches or initializes a new VTU entry more concise. This allows us the get rid of the old underscore prefix naming. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:13 -04:00
Vivien Didelot	bf7d71c045	net: dsa: mv88e6xxx: make VTU helpers static Now that we have chip operations for VTU accesses, mark all helpers from global1_vtu.c as static. Only the various implementations of the GetNext, LoadPurge and Flush operations need to be exposed. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:12 -04:00
Vivien Didelot	0ad5daf6ba	net: dsa: mv88e6xxx: add VTU Load/Purge operation Add a new vtu_loadpurge operation to the chip info structure to differ the various implementations of the VTU accesses. Now that the STU handling is abstracted behind VTU operations, kill the obsolete MV88E6XXX_FLAG_STU flag. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:12 -04:00
Vivien Didelot	f1394b78a6	net: dsa: mv88e6xxx: add VTU GetNext operation Add a new vtu_getnext operation to the chip info structure to differ the various implementations of the VTU accesses. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:12 -04:00
Vivien Didelot	021e64ff76	net: dsa: mv88e6xxx: load STU entry with VTU entry Now that the code writes both VTU and STU data when loading a VTU entry, load the corresponding STU entry at the same time. This allows us to get rid of the STU management in the _mv88e6xxx_vtu_new helper and thus remove the separate implementations of STU Load/Purge and STU GetNext, as well as the unused family checks. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:11 -04:00
Vivien Didelot	ef6fcea37f	net: dsa: mv88e6xxx: get STU entry on VTU GetNext Now that the code reads both VTU and STU data on VTU GetNext operation, fetch the STU entry data of a VTU entry at the same time. The STU data bits are masked with the VTU data bits and they are now all read at the same time a VTU GetNext operation is issued. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:11 -04:00
Vivien Didelot	66a8e1f933	net: dsa: mv88e6xxx: move STU GetNext operation Extract the generic portion of code to issue an STU GetNext operation, which will be used in other implementations. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:11 -04:00
Vivien Didelot	c499a64f34	net: dsa: mv88e6xxx: move VTU Data accessors The code to access the VTU Data registers currently only supports the 88E6185 family and alike: 2-bit membership adjacent to 2-bit port state. Even though the 88E6352 family introduced an indirect table to program the VLAN Spanning Tree states, the usage of the VTU Data registers remains the same regardless the VTU or STU operation. Now that the mv88e6xxx_vtu_entry structure contains both port membership and states data, factorize the code to access them in global1_vtu.c. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:11 -04:00
Vivien Didelot	f169e5ee5f	net: dsa: mv88e6xxx: move generic VTU GetNext Even though every switch model has a different way to access the VTU Data bits, the base implementation of the VTU GetNext operation remains the same: wait, write the first VID to iterate from, start the operation, and read the next VID. Move this generic implementation into global1_vtu.c and abstract the handling of the start VID (similarly to the ATU GetNext implementation), before introducing a new chip operation for specific chips. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:10 -04:00
Vivien Didelot	3afb4bde6f	net: dsa: mv88e6xxx: move VTU VID accessors Add helpers to access the VTU VID register in the global1_vtu.c file. At the same time, move mv88e6xxx_g1_vtu_vid_write at the beginning of _mv88e6xxx_vtu_loadpurge, which adds no functional changes but makes future patches simpler. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:10 -04:00
Vivien Didelot	d2ca1ea18d	net: dsa: mv88e6xxx: move VTU SID accessors Add helpers to access the VTU SID register in the global1_vtu.c file. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:10 -04:00
Vivien Didelot	8ee51f6b4f	net: dsa: mv88e6xxx: move VTU FID accessors Add helpers to access the VTU FID register in the global1_vtu.c file. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:10 -04:00
Vivien Didelot	b486d7c95c	net: dsa: mv88e6xxx: move VTU flush Move the VTU flush operation to global1_vtu.c and call it from a mv88e6xxx_vtu_setup helper, similarly to the ATU and PVT setup. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:09 -04:00
Vivien Didelot	332aa5ccc8	net: dsa: mv88e6xxx: move VTU Operation accessors Move the helper functions to access the Global 1 VTU Operation register to a new global1_vtu.c file, and get rid of the old underscore prefix naming convention. This file will be extended will all VTU/STU related code. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:09 -04:00
Vivien Didelot	bd00e053ae	net: dsa: mv88e6xxx: split VTU entry data member VLAN aware Marvell chips can program 802.1Q VLAN membership as well as 802.1s per VLAN Spanning Tree state using the same 3 VTU Data registers. Some chips such as 88E6185 use different Data registers offsets for ports state and membership, and program them in a single operation. Other chips such as 88E6352 use the same register layout but program them in distinct operations (an indirect table is used for 802.1s.) Newer chips such as 88E6390 use the same offsets for both state and membership in distinct operations, thus require multiple data accesses. To correctly abstract this, split the "data" structure member of mv88e6xxx_vtu_entry in two "state" and "member" members, before adding VTU support for newer chips. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:09 -04:00
Vivien Didelot	3cf3c8469f	net: dsa: mv88e6xxx: add max VID to info Some chips don't have a VLAN Table Unit, most of them do have a 4K table, some others as the 88E6390 family has a 13th bit for the VID. Add a new max_vid member to the info structure, used to check the presence of a VTU as well as the value used to iterate from in VTU GetNext operations. This makes the MV88E6XXX_FLAG_VTU obsolete, thus remove it. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 15:03:09 -04:00
Ido Schimmel	b1e455260c	mlxsw: spectrum_router: Simplify VRF enslavement When a netdev is enslaved to a VRF master, its router interface (RIF) needs to be destroyed (if exists) and a new one created using the corresponding virtual router (VR). >From the driver's perspective, the above is equivalent to an inetaddr event sent for this netdev. Therefore, when a port netdev (or its uppers) are enslaved to a VRF master, call the same function that would've been called had a NETDEV_UP was sent for this netdev in the inetaddr notification chain. This patch also fixes a bug when a LAG netdev with an existing RIF is enslaved to a VRF. Before this patch, each LAG port would drop the reference on the RIF, but would re-join the same one (in the wrong VR) soon after. With this patch, the corresponding RIF is first destroyed and a new one is created using the correct VR. Fixes: `7179eb5acd` ("mlxsw: spectrum_router: Add support for VRFs") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 11:47:58 -04:00
David S. Miller	cedf90c0cc	mlx5-updates-2017-04-30 Or says: ================ mlx5 neigh update This series (whose code name is 'neigh update') from Hadar, enhances the mlx5 TC IP tunnel offloads to deal with changes to tunnel destination neighbours used in offloaded flows which involved encapsulation. In order to keep track on the validity state of such neighbours, we register a netevent notifier callback and act on NEIGH_UPDATE events: if a neighbour becomes valid, offload the related flows to HW (the other way around when neigh becomes invalid) and similarly when a neigh mac addresses changes. Since this traffic is offloaded from the host OS, the neighbour for the IP tunnel destination can mistakenly become STALE and deleted by the kernel since its 'used' value wasn't changed. To address that, we proactively update the neighbour 'used' value every DELAY_PROBE_TIME seconds, using time stamps generated by the existing driver code for HW flow counters. We use the DELAY_PROBE_TIME_UPDATE event to adjust the frequency of the updates. Prior to the core of the series, there's a patch from Saeed that introduces an extendable vport representor implementation scheme. It provides a separation between the eswitch to the netdev related aspects of the representors. We would like to thank Ido Schimmel and Ilya Lesokhin for their coaching && advice through the long design and review cycles while we struggled to understand and (hopefully correctly) implement the locking around the different driver flows(..) . - Or. ================= Misc Updates: From Tariq: Some small performance and trivial code optimization for mlx5 netdev driver - Optimize poll ICOSQ completion queue - Use prefetchw when a write is to follow - Use u8 as ownership type in mlx5e_get_cqe() From Eran: - Disable LRO by default on specific setups From Eli: - Small cleanup for E-Switch to avoid redundant allocation Thanks, Saeed. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJZBeGlAAoJEEg/ir3gV/o+AuMH/2OfS+diXrTMU90tQGfucurg cPuv2qSGzSzcyFOjMUKL/hgj+y8HcUNZVSqts/LPhUj5CbR8On2jo/3XZb405SMX 1eAffpD5JxDGm4xQCADwkDMRVOuWNbXlTceRMbD+8vDtboTDEZwbaXX20En6MIEW qcQPX0wv7/u8VpxGlhch2RRvl7+zBhhGCLNpwzkE5K+RggIPguE+F2hLpm+SH9PD AOHSuwGWnE40In8dZIPqsEnUsmsC9LmUQ17ip8S8dEtGDHMvwDelsKGZcRGMuwoz wSxpVTWn5JaU0EwN0t+rYtERFH18SqboG3OFwqb27qtRns6ALXIx3Dj1bb9sVBo= =3qOg -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2017-04-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2017-04-30 Or says: ================ mlx5 neigh update This series (whose code name is 'neigh update') from Hadar, enhances the mlx5 TC IP tunnel offloads to deal with changes to tunnel destination neighbours used in offloaded flows which involved encapsulation. In order to keep track on the validity state of such neighbours, we register a netevent notifier callback and act on NEIGH_UPDATE events: if a neighbour becomes valid, offload the related flows to HW (the other way around when neigh becomes invalid) and similarly when a neigh mac addresses changes. Since this traffic is offloaded from the host OS, the neighbour for the IP tunnel destination can mistakenly become STALE and deleted by the kernel since its 'used' value wasn't changed. To address that, we proactively update the neighbour 'used' value every DELAY_PROBE_TIME seconds, using time stamps generated by the existing driver code for HW flow counters. We use the DELAY_PROBE_TIME_UPDATE event to adjust the frequency of the updates. Prior to the core of the series, there's a patch from Saeed that introduces an extendable vport representor implementation scheme. It provides a separation between the eswitch to the netdev related aspects of the representors. We would like to thank Ido Schimmel and Ilya Lesokhin for their coaching && advice through the long design and review cycles while we struggled to understand and (hopefully correctly) implement the locking around the different driver flows(..) . - Or. ================= Misc Updates: From Tariq: Some small performance and trivial code optimization for mlx5 netdev driver - Optimize poll ICOSQ completion queue - Use prefetchw when a write is to follow - Use u8 as ownership type in mlx5e_get_cqe() From Eran: - Disable LRO by default on specific setups From Eli: - Small cleanup for E-Switch to avoid redundant allocation Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 11:47:10 -04:00
Mintz, Yuval	07ff2ed03b	qed: Prevent warning without CONFIG_RFS_ACCEL After removing the PTP related initialization from slowpath start, the remaining PTT entry is required only in case CONFIG_RFS_ACCEL is set. Otherwise, it leads to a warning due to it being unused. Fixes: `d179bd1699` ("qed: Acquire/release ptt_ptp lock when enabling/disabling PTP") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 11:42:51 -04:00
Ram Amrani	20b1bd96e9	qed: output the DPM status and WID count Output to the RDMA driver whether DPM mode is enabled or disabled in the HW and if so what is the number of WIDs it supports Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 11:42:15 -04:00
Ram Amrani	107392b75f	qed: align DPI configuration to HW requirements When calculating doorbell BAR partitioning round up the number of CPUs to the nearest power of 2 so the size of the DPI (per user section) configured in the hardware will be stored properly and not truncated. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 11:42:15 -04:00
Ram Amrani	e015d58b44	qed: verify RoCE resource bitmaps are released Add mechanism to verify RoCE resources are released prior to freeing the bitmaps. If this is not the case, print what resources were not released. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 11:42:14 -04:00
Ram Amrani	105361943d	qed: add error handling flow to TID deregistratin posting failure If the posting of the ramrod for the purpose of TID deregistration fails, abort the deregistration operation without using the FW's return code. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 11:42:14 -04:00
Ram Amrani	ba0154e964	qed: remove unused SQ error state The internal RoCE SQE QP state isn't being used. Instead we mark the QP as in regular error state. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 11:42:13 -04:00
Ram Amrani	793ea8a9c7	qed: configure the RoCE max message size Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 11:42:13 -04:00
Karim Eshapa	2faf265753	benet: Use time_before_eq for time comparison Use time_before_eq for time comparison more safe and dealing with timer wrapping to be future-proof. Signed-off-by: Karim Eshapa <karim.eshapa@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 11:12:46 -04:00
Jakub Kicinski	9861ce039c	virtio_net: make use of extended ack message reporting Try to carry error messages to the user via the netlink extended ack message attribute. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 10:35:48 -04:00
Jakub Kicinski	d957c0f711	nfp: make use of extended ack message reporting Try to carry error messages to the user via the netlink extended ack message attribute. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 10:35:47 -04:00
Abhishek Shah	733336262b	net: phy: Allow BCM5481x PHYs to setup internal TX/RX clock delay This patch allows users to enable/disable internal TX and/or RX clock delay for BCM5481x series PHYs so as to satisfy RGMII timing specifications. On a particular platform, whether TX and/or RX clock delay is required depends on how PHY connected to the MAC IP. This requirement can be specified through "phy-mode" property in the platform device tree. Signed-off-by: Abhishek Shah <abhishek.shah@broadcom.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:56:19 -04:00
Colin Ian King	d832565032	net: sunhme: fix spelling mistakes: "ParityErro" -> "ParityError" trivial fix to spelling mistakes in printk message. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:55:47 -04:00
Scott Wood	9b70de6d02	bnx2x: Align RX buffers The bnx2x driver is not providing proper alignment on the receive buffers it passes to build_skb(), causing skb_shared_info to be misaligned. skb_shared_info contains an atomic, and while PPC normally supports unaligned accesses, it does not support unaligned atomics. Aligning the size of rx buffers will ensure that page_frag_alloc() returns aligned addresses. This can be reproduced on PPC by setting the network MTU to 1450 (or other non-multiple-of-4) and then generating sufficient inbound network traffic (one or two large "wget"s usually does it), producing the following oops: Unable to handle kernel paging request for unaligned access at address 0xc00000ffc43af656 Faulting instruction address: 0xc00000000080ef8c Oops: Kernel access of bad area, sig: 7 [#1] SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: vmx_crypto powernv_rng rng_core powernv_op_panel leds_powernv led_class nfsd ip_tables x_tables autofs4 xfs lpfc bnx2x mdio libcrc32c crc_t10dif crct10dif_generic crct10dif_common CPU: 104 PID: 0 Comm: swapper/104 Not tainted 4.11.0-rc8-00088-g4c761da #2 task: c00000ffd4892400 task.stack: c00000ffd4920000 NIP: c00000000080ef8c LR: c00000000080eee8 CTR: c0000000001f8320 REGS: c00000ffffc33710 TRAP: 0600 Not tainted (4.11.0-rc8-00088-g4c761da) MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24082042 XER: 00000000 CFAR: c00000000080eea0 DAR: c00000ffc43af656 DSISR: 00000000 SOFTE: 1 GPR00: c000000000907f64 c00000ffffc33990 c000000000dd3b00 c00000ffcaf22100 GPR04: c00000ffcaf22e00 0000000000000000 0000000000000000 0000000000000000 GPR08: 0000000000b80008 c00000ffc43af636 c00000ffc43af656 0000000000000000 GPR12: c0000000001f6f00 c00000000fe1a000 000000000000049f 000000000000c51f GPR16: 00000000ffffef33 0000000000000000 0000000000008a43 0000000000000001 GPR20: c00000ffc58a90c0 0000000000000000 000000000000dd86 0000000000000000 GPR24: c000007fd0ed10c0 00000000ffffffff 0000000000000158 000000000000014a GPR28: c00000ffc43af010 c00000ffc9144000 c00000ffcaf22e00 c00000ffcaf22100 NIP [c00000000080ef8c] __skb_clone+0xdc/0x140 LR [c00000000080eee8] __skb_clone+0x38/0x140 Call Trace: [c00000ffffc33990] [c00000000080fb74] skb_clone+0x74/0x110 (unreliable) [c00000ffffc339c0] [c000000000907f64] packet_rcv+0x144/0x510 [c00000ffffc33a40] [c000000000827b64] __netif_receive_skb_core+0x5b4/0xd80 [c00000ffffc33b00] [c00000000082b2bc] netif_receive_skb_internal+0x2c/0xc0 [c00000ffffc33b40] [c00000000082c49c] napi_gro_receive+0x11c/0x260 [c00000ffffc33b80] [d000000066483d68] bnx2x_poll+0xcf8/0x17b0 [bnx2x] [c00000ffffc33d00] [c00000000082babc] net_rx_action+0x31c/0x480 [c00000ffffc33e10] [c0000000000d5a44] __do_softirq+0x164/0x3d0 [c00000ffffc33f00] [c0000000000d60a8] irq_exit+0x108/0x120 [c00000ffffc33f20] [c000000000015b98] __do_irq+0x98/0x200 [c00000ffffc33f90] [c000000000027f14] call_do_irq+0x14/0x24 [c00000ffd4923a90] [c000000000015d94] do_IRQ+0x94/0x110 [c00000ffd4923ae0] [c000000000008d90] hardware_interrupt_common+0x150/0x160 Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:50:19 -04:00
Dan Carpenter	77041e89ce	liquidio: silence a locking static checker warning Presumably we never hit this return, but static checkers complain that we need to unlock so we may as well fix that. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:41:07 -04:00
Dan Carpenter	66117a9d9a	qed: Unlock on error in qed_vf_pf_acquire() My static checker complains that we're holding a mutex on this error path. Let's goto exit instead of returning directly. Fixes: `b0bccb69eb` ("qed: Change locking scheme for VF channel") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:40:45 -04:00
lipeng	804ffe5c61	net: hns: support deferred probe when no mdio In the hip06 and hip07 SoCs, phy connect to mdio bus.The mdio module is probed with module_init, and, as such, is not guaranteed to probe before the HNS driver. So we need to support deferred probe. We check for probe deferral in the mac init, so we not init DSAF when there is no mdio, and free all resource, to later learn that we need to defer the probe. Signed-off-by: lipeng <lipeng321@huawei.com> Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com> Reviewed-by: Matthias Brugger <mbrugger@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:39:24 -04:00
lipeng	2fdd6bafe3	net: hns: support deferred probe when can not obtain irq In the hip06 and hip07 SoCs, the interrupt lines from the DSAF controllers are connected to mbigen hw module. The mbigen module is probed with module_init, and, as such, is not guaranteed to probe before the HNS driver. So we need to support deferred probe. Signed-off-by: lipeng <lipeng321@huawei.com> Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com> Reviewed-by: Matthias Brugger <mbrugger@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:39:24 -04:00
Jakub Kicinski	dbf637ff39	nfp: provide 256 bytes of XDP headroom in all configurations For legacy reasons NFP FW may be compiled to DMA packets to a constant offset into the buffer and use the space before it for metadata. This ensures that packets data always start at a certain offset regardless of the amount of preceding metadata. If rx offset is set to 0 there may still be up to 64 bytes of metadata but metadata will start at the beginning of the buffer, instead of: data_start_offset = rx_offset - meta_len Even though we make the buffers larger to accommodate up to 64 bytes of metadata, if there is only N bytes of metadata, we will end up with N bytes of headroom and 64 - N bytes of tailroom. Therefore we can't rely on that space for XDP headroom. Make sure we always allocate full 256 bytes. This, unfortunately, means we can't fit the headroom on an u8 any more. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:37:00 -04:00
Jakub Kicinski	85cb207ee3	nfp: don't completely refuse to work with old flashes Right now the required Service Process ABI version is still tied to max ID of known commands. For new NSP commands we are adding we are checking if NSP version is recent enough on command-by-command basis. The driver doesn't have to force the device to have the very latest flash, anything newer than 0.8 should do. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:37:00 -04:00
Jakub Kicinski	d38df0d364	nfp: avoid reading TX queue indexes from the device Reading TX queue indexes from the device memory on each interrupt is expensive. It's doubly expensive with XDP running since we have two TX rings to check there. If the software indexes indicate that the TX queue is completely empty, however, we don't need to look at the device completion index at all. The queuing CPU is doing a wmb() before kicking the device TX so we should be safe to assume on the CPU handling the completions will never see old value of the software copy of the index. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:37:00 -04:00
Jakub Kicinski	92e68195eb	nfp: do simple XDP TX buffer recycling On the RX path we follow the "drop if allocation of replacement buffer fails" rule. With XDP we extended that to the TX action, so if XDP prog returned TX but allocation of replacement RX buffer failed, we will drop the packet. To improve our XDP TX performance extend the idea of rings being always full to XDP TX rings. Pre-fill the XDP TX rings with RX buffers, and when XDP prog returns TX action swap the RX buffer with the next buffer from the TX ring. XDP TX complete will no longer free the buffers but let them sit on the TX ring and wait for swap with RX buffer, instead. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:37:00 -04:00
Jakub Kicinski	d78005a50f	nfp: drop rx_ring param from buffer allocation We will soon allocate RX buffers for caching on XDP TX rings. The rx_ring parameter passed to nfp_net_rx_alloc_one() is not actually used, remove it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:37:00 -04:00
Jakub Kicinski	46c505188b	nfp: replace -ENOTSUPP with -EOPNOTSUPP As Or points out in commit `423b3aecf2` ("net/mlx4: Change ENOTSUPP to EOPNOTSUPP"), ENOTSUPP is NFS specific error. Replace it with EOPNOTSUPP. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:37:00 -04:00
Willem de Bruijn	1d11e732e7	virtio-net: use netif_tx_napi_add for tx napi Avoid hashing the tx napi struct into napi_hash[], which is used for busy polling receive queues. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:33:04 -04:00
Girish Moodalbail	5e0740c445	geneve: fix incorrect setting of UDP checksum flag Creating a geneve link with 'udpcsum' set results in a creation of link for which UDP checksum will NOT be computed on outbound packets, as can be seen below. 11: gen0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN link/ether c2:85:27:b6:b4:15 brd ff:ff:ff:ff:ff:ff promiscuity 0 geneve id 200 remote 192.168.13.1 dstport 6081 noudpcsum Similarly, creating a link with 'noudpcsum' set results in a creation of link for which UDP checksum will be computed on outbound packets. Fixes: `9b4437a5b8` ("geneve: Unify LWT and netdev handling.") Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:31:16 -04:00
Jiri Benc	baf4d78607	vxlan: do not output confusing error message The message "Cannot bind port X, err=Y" creates only confusion. In metadata based mode, failure of IPv6 socket creation is okay if IPv6 is disabled and no error message should be printed. But when IPv6 tunnel was requested, such failure is fatal. The vxlan_socket_create does not know when the error is harmless and when it's not. Instead of passing such information down to vxlan_socket_create, remove the message completely. It's not useful. We propagate the error code up to the user space and the port number comes from the user space. There's nothing in the message that the process creating vxlan interface does not know. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:30:13 -04:00
Jiri Benc	d074bf9600	vxlan: correctly handle ipv6.disable module parameter When IPv6 is compiled but disabled at runtime, __vxlan_sock_add returns -EAFNOSUPPORT. For metadata based tunnels, this causes failure of the whole operation of bringing up the tunnel. Ignore failure of IPv6 socket creation for metadata based tunnels caused by IPv6 not being available. Fixes: `b1be00a6c3` ("vxlan: support both IPv4 and IPv6 sockets in a single vxlan device") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:30:13 -04:00
Andy Shevchenko	90a1bb9816	bnx2x: Get rid of useless temporary variable Replace pattern int status; ... status = func(...); return status; by return func(...); No functional change intented. Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:28:00 -04:00
Andy Shevchenko	b77f016726	bnx2x: Reuse bnx2x_null_format_ver() Reuse bnx2x_null_format_ver() in functions where it's appropriated instead of open coded variant. Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:28:00 -04:00
Andy Shevchenko	55b218c12c	bnx2x: Replace custom scnprintf() Use scnprintf() when printing version instead of custom open coded variants. Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:28:00 -04:00
Alexandre Belloni	ae3696c167	net: macb: fix phy interrupt parsing Since `83a77e9ec4`, the phydev irq is explicitly set to PHY_POLL when there is no pdata. It doesn't work on DT enabled platforms because the phydev irq is already set by libphy before. Fixes: `83a77e9ec4` ("net: macb: Added PCI wrapper for Platform Driver.") Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:21:49 -04:00
David S. Miller	8dd5b698c2	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-04-30 This series contains updates to i40e and i40evf only. Jake provides majority of the changes in this series, starting with the renaming of a flag to avoid confusion. Then renamed a variable to a more meaningful name to clarify what is actually being done and to reduce confusion. Amortizes the wait time when initializing or disabling lots of VFs by using i40e_reset_all_vfs() and i40e_vsi_stop_rings_no_wait(). Cleaned up a unnecessary delay since pci_disable_sriov() already has its own delay, so need to add a additional delay when removing VFs. Avoid using the same name flags for both vsi->state and pf->state, to make code review easier and assist future work to use the correct state field when checking bits. Use DECLARE_BITMAP() to ensure that we always allocate enough space for flags. Replace hw_disabled_flags with the new _AUTO_DISABLED flags, which are more readable because we are not setting an *_ENABLED flag to disable the feature. Alex corrects a oversight where we were not reprogramming the ports after a reset, which was causing us to lose all of the receive tunnel offloads. Arnd Bergmann moves the declaration of a local variable to avoid a warning seen on architectures with larger pages about an unused variable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 11:33:37 -04:00
Eli Cohen	0a0ab1d2cc	net/mlx5: E-Switch, Avoid redundant memory allocation struct esw_mc_addr is a small struct that can be part of struct mlx5_eswitch. Define it as a field and not as a pointer and save the kzalloc call and then error flow handling. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:21 +03:00
Eran Ben Elisha	0f6e4cf674	net/mlx5e: Disable HW LRO when PCI is slower than link on striding RQ We will activate the HW LRO only on servers with PCI BW > MAX LINK BW, or when PCI BW > 16Gbps. On other cases we do not want LRO by default as LRO sessions might get timeout and add redundant software overhead. Tested: ethtool -k <ifs-name> \| grep large-receive-offload On systems with and without the limitations. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:20 +03:00
Tariq Toukan	b1b03bded1	net/mlx5e: Use u8 as ownership type in mlx5e_get_cqe() CQE ownership indication is as small as a single bit. Use u8 to speedup the comparison. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:19 +03:00
Tariq Toukan	ad78af9b88	net/mlx5e: Use prefetchw when a write is to follow "prefetchw()" prefetches the cacheline for write. Use it for skb->data, as soon we'll be copying the packet header there. Performance: Single-stream packet-rate tested with pktgen. Packets are dropped in tc level to zoom into driver data-path. Larger gain is expected for smaller packets, as less time is spent on handling SKB fragments, making the path shorter and the improvement more significant. --------------------------------------------- packet size \| before \| after \| gain \| 64B \| 4,113,306 \| 4,778,720 \| 16% \| 1024B \| 3,633,819 \| 3,950,593 \| 8.7% \| Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:17 +03:00
Tariq Toukan	1f5b1e47ee	net/mlx5e: Optimize poll ICOSQ completion queue UMR operations are more frequent and important. Check them first, and add a compiler branch predictor hint. According to current design, ICOSQ CQ can contain at most one pending CQE per napi. Poll function is optimized accordingly. Performance: Single-stream packet-rate tested with pktgen. Packets are dropped in tc level to zoom into driver data-path. Larger gain is expected for larger packet sizes, as BW is higher and UMR posts are more frequent. --------------------------------------------- packet size \| before \| after \| gain \| 64B \| 4,092,370 \| 4,113,306 \| 0.5% \| 1024B \| 3,421,435 \| 3,633,819 \| 6.2% \| Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:16 +03:00
Hadar Hen Zion	a2fa1fe5ad	net/mlx5e: Act on delay probe time updates The user can change delay_first_probe_time parameter through sysctl. Listen to NETEVENT_DELAY_PROBE_TIME_UPDATE notifications and update the intervals for updating the neighbours 'used' value periodic task and for flow HW counters query periodic task. Both of the intervals will be update only in case the new delay prob time value is lower the current interval. Since the driver saves only one min interval value and not per device, the users will be able to set lower interval value for updating neighbour 'used' value periodic task but they won't be able to schedule a higher interval for this periodic task. The used interval for scheduling neighbour 'used' value periodic task is the minimal delay prob time parameter ever seen by the driver. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:15 +03:00
Hadar Hen Zion	f6dfb4c3f2	net/mlx5e: Update neighbour 'used' state using HW flow rules counters When IP tunnel encapsulation rules are offloaded, the kernel can't see the traffic of the offloaded flow. The neighbour for the IP tunnel destination of the offloaded flow can mistakenly become STALE and deleted by the kernel since its 'used' value wasn't changed. To make sure that a neighbour which is used by the HW won't become STALE, we proactively update the neighbour 'used' value every DELAY_PROBE_TIME period, when packets were matched and counted by the HW for one of the tunnel encap flows related to this neighbour. The periodic task that updates the used neighbours is scheduled when a tunnel encap rule is successfully offloaded into HW and keeps re-scheduling itself as long as the representor's neighbours list isn't empty. Add, remove, lookup and status change operations done over the representor's neighbours list or the neighbour hash entry encaps list are all serialized by RTNL lock. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:14 +03:00
Hadar Hen Zion	232c001398	net/mlx5e: Add support to neighbour update flow In order to offload TC encap rules, the driver does a lookup for the IP tunnel neighbour according to the output device and the destination IP given by the user. To keep tracking after the validity state of such neighbours, we keep the neighbours information (pair of device pointer and destination IP) in a hash table maintained at the relevant egress representor and register to get NETEVENT_NEIGH_UPDATE events. When getting neighbour update netevent, we search for a match among the cached neighbours entries used for encapsulation. In case the neighbour isn't valid, we can't offload the flow into the HW. We cache the flow (requested matching and actions) in the driver and offload the rule later, when the neighbour is resolved and becomes valid. When a flow is only cached in the driver and not offloaded into HW yet, we use EAGAIN return value to mark it internally, the TC ndo still returns success. Listen to kernel neighbour update netevents to trace relevant neighbours validity state: 1. If a neighbour becomes valid, offload the related rules to HW. 2. If the neighbour becomes invalid, remove the related rules from HW. 3. If the neighbour mac address was changed, update the encap header. Remove all the offloaded rules using the old encap header from the HW and insert new rules to HW with updated encap header. Access to the neighbors hash table is protected by RTNL lock of its caller or by the table's spinlock. Details of the locking/synchronization among the different actions applied on the neighbour table: Add/remove operations - protected by RTNL lock of its caller (all TC commands are protected by RTNL lock). Add and remove operations are initiated only when the user inserts/removes a TC rule into/from the driver. Lookup/remove operations - since the lookup operation is done from netevent notifier block, RTNL lock can't be used (atomic context). Use the table's spin lock to protect lookups from TC user removal operation. bh is used since netevent can be called from a softirq context. Lookup/add operations - The hash table access functions are taking care of the protection between lookup and add operations. When adding/removing encap headers and rules to/from the HW, RTNL lock is used. It can happen when: 1. The user inserts/removes a TC rule into/from the driver (TC commands are protected by RTNL lock of it's caller). 2. The driver gets neighbour notification event, which reports about neighbour validity status change. Before adding/removing encap headers and rules to/from the HW, RTNL lock is taken. A neighbour hash table entry should be freed when its encap list is empty. Since The neighbour update netevent notification schedules a neighbour update work that uses the neighbour hash entry, it can't be freed unconditionally when the encap list becomes empty during TC delete rule flow. Use reference count to protect from freeing neighbour hash table entry while it's still in use. When the user asks to unregister a netdvice used by one of the neigbours, neighbour removal notification is received. Then we take a reference on the neighbour and don't free it until the relevant encap entries (and flows) are marked as invalid (not offloaded) and removed from HW. As long as the encap entry is still valid (checked under RTNL lock) we can safely access the neighbour device saved on mlx5e_neigh struct. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:13 +03:00
Hadar Hen Zion	37b498ff23	net/mlx5e: Add neighbour hash table to the representors Add hash table to the representors which is to be used by the next patch to save neighbours information in the driver. In order to offload IP tunnel encapsulation rules, the driver must find the tunnel dst neighbour according to the output device and the destination address given by the user. The next patch will cache the neighbors information in the driver to allow support in neigh update flow for tunnel encap rules. The neighbour entries are also saved in a list so we easily iterate over them when querying statistics in order to provide 'used' feedback to the kernel neighbour NUD core. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:12 +03:00
Hadar Hen Zion	033354d501	net/mlx5e: Read neigh parameters with proper locking The nud_state and hardware address fields are protected by the neighbour lock, we should acquire it before accessing those parameters. Use this lock to avoid inconsistency between the neighbour validity state and it's hardware address. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:10 +03:00
Hadar Hen Zion	0b67a38fe6	net/mlx5e: Use flag to properly monitor a flow rule offloading state Instead of relaying on the 'flow->rule' pointer value which can be valid or invalid (in case the FW returns an error while trying to offload the rule), monitor the rule state using a flag. In downstream patch which adds support to IP tunneling neigh update flow, a TC rule could be cached in the driver and not offloaded into the HW. In this case, the flow handle pointer stays NULL. Check the offloaded flag to properly deal with rules which are currently not offloaded when querying rule statistics. This patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:09 +03:00
Hadar Hen Zion	1a8552bd81	net/mlx5e: Remove output device parameter from create encap header helpers definition Passing output device parameter to the helper functions that deal with creation of encapsulation headers is redundant. Output device parameter can be defined inside those helpers, no need to pass it. Refactor the code by removing the parameter from the function signature. This patch doesn't change any functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-30 16:03:08 +03:00

1 2 3 4 5 ...

67792 Commits