linux-sg2042

Commit Graph

Author	SHA1	Message	Date
David S. Miller	337f1b07db	Merge branch 'tcp-xmit-timer-rearming' Neal Cardwell says: ==================== tcp: fix xmit timer rearming to avoid stalls This patch series is a bug fix for a TCP loss recovery performance bug reported independently in recent netdev threads: (i) July 26, 2017: netdev thread "TCP fast retransmit issues" (ii) July 26, 2017: netdev thread: "[PATCH V2 net-next] TLP: Don't reschedule PTO when there's one outstanding TLP retransmission" Many thanks to Klavs Klavsen and Mao Wenan for the detailed reports, traces, and packetdrill test cases, which enabled us to root-cause this issue and verify the fix. - v1 -> v2: - In patch 2/3, changed an unclear comment in the pre-existing code in tcp_schedule_loss_probe() to be more clear (thanks to Eric Dumazet for suggesting we improve this). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:38:31 -07:00
Neal Cardwell	df92c8394e	tcp: fix xmit timer to only be reset if data ACKed/SACKed Fix a TCP loss recovery performance bug raised recently on the netdev list, in two threads: (i) July 26, 2017: netdev thread "TCP fast retransmit issues" (ii) July 26, 2017: netdev thread: "[PATCH V2 net-next] TLP: Don't reschedule PTO when there's one outstanding TLP retransmission" The basic problem is that incoming TCP packets that did not indicate forward progress could cause the xmit timer (TLP or RTO) to be rearmed and pushed back in time. In certain corner cases this could result in the following problems noted in these threads: - Repeated ACKs coming in with bogus SACKs corrupted by middleboxes could cause TCP to repeatedly schedule TLPs forever. We kept sending TLPs after every ~200ms, which elicited bogus SACKs, which caused more TLPs, ad infinitum; we never fired an RTO to fill in the holes. - Incoming data segments could, in some cases, cause us to reschedule our RTO or TLP timer further out in time, for no good reason. This could cause repeated inbound data to result in stalls in outbound data, in the presence of packet loss. This commit fixes these bugs by changing the TLP and RTO ACK processing to: (a) Only reschedule the xmit timer once per ACK. (b) Only reschedule the xmit timer if tcp_clean_rtx_queue() deems the ACK indicates sufficient forward progress (a packet was cumulatively ACKed, or we got a SACK for a packet that was sent before the most recent retransmit of the write queue head). This brings us back into closer compliance with the RFCs, since, as the comment for tcp_rearm_rto() notes, we should only restart the RTO timer after forward progress on the connection. Previously we were restarting the xmit timer even in these cases where there was no forward progress. As a side benefit, this commit simplifies and speeds up the TCP timer arming logic. We had been calling inet_csk_reset_xmit_timer() three times on normal ACKs that cumulatively acknowledged some data: 1) Once near the top of tcp_ack() to switch from TLP timer to RTO: if (icsk->icsk_pending == ICSK_TIME_LOSS_PROBE) tcp_rearm_rto(sk); 2) Once in tcp_clean_rtx_queue(), to update the RTO: if (flag & FLAG_ACKED) { tcp_rearm_rto(sk); 3) Once in tcp_ack() after tcp_fastretrans_alert() to switch from RTO to TLP: if (icsk->icsk_pending == ICSK_TIME_RETRANS) tcp_schedule_loss_probe(sk); This commit, by only rescheduling the xmit timer once per ACK, simplifies the code and reduces CPU overhead. This commit was tested in an A/B test with Google web server traffic. SNMP stats and request latency metrics were within noise levels, substantiating that for normal web traffic patterns this is a rare issue. This commit was also tested with packetdrill tests to verify that it fixes the timer behavior in the corner cases discussed in the netdev threads mentioned above. This patch is a bug fix patch intended to be queued for -stable relases. Fixes: `6ba8a3b19e` ("tcp: Tail loss probe (TLP)") Reported-by: Klavs Klavsen <kl@vsen.dk> Reported-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:38:31 -07:00
Neal Cardwell	a2815817ff	tcp: enable xmit timer fix by having TLP use time when RTO should fire Have tcp_schedule_loss_probe() base the TLP scheduling decision based on when the RTO should fire. This is to enable the upcoming xmit timer fix in this series, where tcp_schedule_loss_probe() cannot assume that the last timer installed was an RTO timer (because we are no longer doing the "rearm RTO, rearm RTO, rearm TLP" dance on every ACK). So tcp_schedule_loss_probe() must independently figure out when an RTO would want to fire. In the new TLP implementation following in this series, we cannot assume that icsk_timeout was set based on an RTO; after processing a cumulative ACK the icsk_timeout we see can be from a previous TLP or RTO. So we need to independently recalculate the RTO time (instead of reading it out of icsk_timeout). Removing this dependency on the nature of icsk_timeout makes things a little easier to reason about anyway. Note that the old and new code should be equivalent, since they are both saying: "if the RTO is in the future, but at an earlier time than the normal TLP time, then set the TLP timer to fire when the RTO would have fired". Fixes: `6ba8a3b19e` ("tcp: Tail loss probe (TLP)") Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:38:30 -07:00
Neal Cardwell	e1a10ef7fa	tcp: introduce tcp_rto_delta_us() helper for xmit timer fix Pure refactor. This helper will be required in the xmit timer fix later in the patch series. (Because the TLP logic will want to make this calculation.) Fixes: `6ba8a3b19e` ("tcp: Tail loss probe (TLP)") Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:38:30 -07:00
Xin Long	b91d532928	ipv6: set rt6i_protocol properly in the route when it is installed After commit `c2ed1880fd` ("net: ipv6: check route protocol when deleting routes"), ipv6 route checks rt protocol when trying to remove a rt entry. It introduced a side effect causing 'ip -6 route flush cache' not to work well. When flushing caches with iproute, all route caches get dumped from kernel then removed one by one by sending DELROUTE requests to kernel for each cache. The thing is iproute sends the request with the cache whose proto is set with RTPROT_REDIRECT by rt6_fill_node() when kernel dumps it. But in kernel the rt_cache protocol is still 0, which causes the cache not to be matched and removed. So the real reason is rt6i_protocol in the route is not set when it is allocated. As David Ahern's suggestion, this patch is to set rt6i_protocol properly in the route when it is installed and remove the codes setting rtm_protocol according to rt6i_flags in rt6_fill_node. This is also an improvement to keep rt6i_protocol consistent with rtm_protocol. Fixes: `c2ed1880fd` ("net: ipv6: check route protocol when deleting routes") Reported-by: Jianlin Shi <jishi@redhat.com> Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:10:18 -07:00
Eric Dumazet	2dda640040	net: fix keepalive code vs TCP_FASTOPEN_CONNECT syzkaller was able to trigger a divide by 0 in TCP stack [1] Issue here is that keepalive timer needs to be updated to not attempt to send a probe if the connection setup was deferred using TCP_FASTOPEN_CONNECT socket option added in linux-4.11 [1] divide error: 0000 [#1] SMP CPU: 18 PID: 0 Comm: swapper/18 Not tainted task: ffff986f62f4b040 ti: ffff986f62fa2000 task.ti: ffff986f62fa2000 RIP: 0010:[<ffffffff8409cc0d>] [<ffffffff8409cc0d>] __tcp_select_window+0x8d/0x160 Call Trace: <IRQ> [<ffffffff8409d951>] tcp_transmit_skb+0x11/0x20 [<ffffffff8409da21>] tcp_xmit_probe_skb+0xc1/0xe0 [<ffffffff840a0ee8>] tcp_write_wakeup+0x68/0x160 [<ffffffff840a151b>] tcp_keepalive_timer+0x17b/0x230 [<ffffffff83b3f799>] call_timer_fn+0x39/0xf0 [<ffffffff83b40797>] run_timer_softirq+0x1d7/0x280 [<ffffffff83a04ddb>] __do_softirq+0xcb/0x257 [<ffffffff83ae03ac>] irq_exit+0x9c/0xb0 [<ffffffff83a04c1a>] smp_apic_timer_interrupt+0x6a/0x80 [<ffffffff83a03eaf>] apic_timer_interrupt+0x7f/0x90 <EOI> [<ffffffff83fed2ea>] ? cpuidle_enter_state+0x13a/0x3b0 [<ffffffff83fed2cd>] ? cpuidle_enter_state+0x11d/0x3b0 Tested: Following packetdrill no longer crashes the kernel `echo 0 >/proc/sys/net/ipv4/tcp_timestamps` // Cache warmup: send a Fast Open cookie request 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 fcntl(3, F_SETFL, O_RDWR\|O_NONBLOCK) = 0 +0 setsockopt(3, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0 +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation is now in progress) +0 > S 0:0(0) <mss 1460,nop,nop,sackOK,nop,wscale 8,FO,nop,nop> +.01 < S. 123:123(0) ack 1 win 14600 <mss 1460,nop,nop,sackOK,nop,wscale 6,FO abcd1234,nop,nop> +0 > . 1:1(0) ack 1 +0 close(3) = 0 +0 > F. 1:1(0) ack 1 +0 < F. 1:1(0) ack 2 win 92 +0 > . 2:2(0) ack 2 +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4 +0 fcntl(4, F_SETFL, O_RDWR\|O_NONBLOCK) = 0 +0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0 +0 setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0 +.01 connect(4, ..., ...) = 0 +0 setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [5], 4) = 0 +10 close(4) = 0 `echo 1 >/proc/sys/net/ipv4/tcp_timestamps` Fixes: `19f6d3f3c8` ("net/tcp-fastopen: Add new API support") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Wei Wang <weiwan@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 09:34:51 -07:00
David S. Miller	4d2bbb0ef8	Here is a batman-adv bugfix: - fix TT sync flag inconsistency problems, which can lead to excess packets, by Linus Luessing -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlmB3ikWHHN3QHNpbW9u d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoUCUEADIvPfBxzon12xBV4JbBpTMnIwl 09heUwE1FxCj2UoTV2QJ+O9ZQb1yc7ARJksymP30T96q9bNlDiNFVquXcuQLA+ba aoSowejLfTHLvOlP3Wrik5PVLBdKqeBpg7zYNtAuNI09DSryey1ipaC4Xemq9ZmC 8Hjam9RAQh5FV3RCeviTzpusXt31/xjkTx3vrf1WGx2uY/MSKNwwfPcRAyfG+Ye5 9kxpR8sBwRxJ+6+MEWRBjqi2Lk/BuD4urViAwGj88uDkKKByXT/pfULD3foTgx0d smZWWMun9jQl4XDk321oOp5HOqg29IFcF/mwuWQIvBDO8AJBRndYHw07lx0cB5tD I8PzTG5+kE3QXBeTBEUAuszaavDI9DKVdnr6h/8VbeEXIr73WHA0VixnTIv3zgI0 AiYHIrvUstDB7Cl5457blKbljgzXMN1ssBtD/UMNFhrEUKet9qIZskwlLyhPt2s3 qti70ZoURPz9ve57pNgZDZUR5lvvUhuSNUD4ZfGt5BC+YErX59jAswrRXHU5Zx/i V5pSsRQ2BfxZUZrnmKONg3RW+hmsJJYOSzNXiO5/0zeVuorju7C+0L2goONX2bpr fEzICfDJi+IXmYke0AAjiJq7KwobU703Nd3rT8eovjVlSSJj+mdyYSVYQF9dD2uE DdtwpJh37PwbHOeQqg== =a4HS -----END PGP SIGNATURE----- Merge tag 'batadv-net-for-davem-20170802' of git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== Here is a batman-adv bugfix: - fix TT sync flag inconsistency problems, which can lead to excess packets, by Linus Luessing ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 09:23:11 -07:00
Yuchung Cheng	ed254971ed	tcp: avoid setting cwnd to invalid ssthresh after cwnd reduction states If the sender switches the congestion control during ECN-triggered cwnd-reduction state (CA_CWR), upon exiting recovery cwnd is set to the ssthresh value calculated by the previous congestion control. If the previous congestion control is BBR that always keep ssthresh to TCP_INIFINITE_SSTHRESH, cwnd ends up being infinite. The safe step is to avoid assigning invalid ssthresh value when recovery ends. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:51:07 -07:00
Thomas Falcon	4d96f12a07	ibmvnic: Initialize SCRQ's during login renegotiation SCRQ resources are freed during renegotiation, but they are not re-allocated afterwards due to some changes in the initialization process. Fix that by re-allocating the memory after renegotation. SCRQ's can also be freed if a server capabilities request fails. If this were encountered during a device reset for example, SCRQ's may not be re-allocated. This operation is not necessary anymore so remove it. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:47:45 -07:00
Hector Martin	bed9ff1659	usb: qmi_wwan: add D-Link DWM-222 device ID Signed-off-by: Hector Martin <marcan@marcan.st> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:46:47 -07:00
David S. Miller	db803a46ae	Merge branch 'mlx4-misc-fixes' Tariq Toukan says: ==================== mlx4 misc fixes This patchset contains misc bug fixes from the team to the mlx4 Core and Eth drivers. Patch 1 by Inbar fixes a wrong ethtool indication for Wake-on-LAN. The other 3 patches by Jack add a missing capability description, and fixes the off-by-1 misalignment for the following capabilities descriptions. Series generated against net commit: `cc75f8514d` samples/bpf: fix bpf tunnel cleanup ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:44:10 -07:00
Jack Morgenstein	bff0c6840c	net/mlx4_core: Fixes missing capability bit in flags2 capability dump The cited commit introduced the following new enum value in file include/linux/mlx4/device.h: QUERY_DEV_CAP_DIAG_RPRT_PER_PORT However, it failed to introduce a corresponding entry in function dump_dev_cap_flags2() for outputting a line in the message log when this capability bit is set. The change here fixes that omission. Fixes: `c7c122ed67` ("net/mlx4: Add diagnostic counters capability bit") Reported-by: Mukesh Kacker <mukesh.kacker@oracle.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:44:09 -07:00
Jack Morgenstein	f9fb9d0b85	net/mlx4_core: Fix namespace misalignment in QinQ VST support commit The cited commit introduced the following new enum value in file include/linux/mlx4/device.h: MLX4_DEV_CAP_FLAG2_SVLAN_BY_QP However the value of MLX4_DEV_CAP_FLAG2_SVLAN_BY_QP needs to stay consistent with the value used in another namespace in function dump_dev_cap_flags2(), which is manually kept in sync. The change here restores that consistency. Fixes: `7c3d21c815` ("net/mlx4_core: Preparation for VF vlan protocol 802.1ad") Reported-by: Mukesh Kacker <mukesh.kacker@oracle.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:44:09 -07:00
Jack Morgenstein	5886259c12	net/mlx4_core: Fix sl_to_vl_change bit offset in flags2 dump The index value in function dump_dev_cap_flags2() for outputting "sl to vl mapping table change event support" needs to be consistent with the value of the enumerated constant MLX4_DEV_CAP_FLAG2_SL_TO_VL_CHANGE_EVENT defined in file include/linux/mlx4_device.h The change here restores that consistency. Fixes: `fd10ed8e6f` ("IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets") Reported-by: Mukesh Kacker <mukesh.kacker@oracle.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:44:09 -07:00
Inbar Karmy	c994f778bb	net/mlx4_en: Fix wrong indication of Wake-on-LAN (WoL) support Currently when WoL is supported but disabled, ethtool reports: "Supports Wake-on: d". Fix the indication of Wol support, so that the indication remains "g" all the time if the NIC supports WoL. Tested: As accepted, when NIC supports WoL- ethtool reports: Supports Wake-on: g Wake-on: d when NIC doesn't support WoL- ethtool reports: Supports Wake-on: d Wake-on: d Fixes: `14c07b1358` ("mlx4: Wake on LAN support") Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:44:09 -07:00
David S. Miller	9075bd206c	Merge branch 'lan78xx-Fixes' Nisar Sayed says: ==================== lan78xx: Fixes to lan78xx driver This series of patches are for lan78xx driver. These patches fixes potential issues associated with lan78xx driver ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:39:58 -07:00
Nisar Sayed	0573f94b1a	lan78xx: Fix to handle hard_header_len update Fix to handle hard_header_len update When ifconfig up/down sequence is initiated hard_header_len get updated incrementally for each ifconfig up /down sequence, this leads invalid hard_header_len, moving to lan78xx_bind to have one time update of hard_header_len addresses the issue. Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:39:58 -07:00
Nisar Sayed	fb52c3b597	lan78xx: USB fast connect/disconnect crash fix USB fast connect/disconnect crash fix When USB plugged/unplugged at fast rate, lan78xx_mdio_init() in lan78xx_bind() failing case is not handled. Whenever lan78xx_mdio_init() failed, dev->mdiobus will be freed, however since lan78xx_bind() not consider as error and try to proceed for further initialization in lan78xx_probe() which leads system hung/crash. Also when register_netdev() failed, netdev is freed without calling lan78xx_unbind(). Hence halting the failed cases right manner fixes the system crash/hung issue. Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:39:58 -07:00
David S. Miller	12c5d0c048	Merge branch 'net-Fix-64-bit-statistics-seqcount-init' Florian Fainelli says: ==================== drivers: net: Fix 64-bit statistics seqcount init This patch series fixes a bunch of drivers to have their 64-bit statistics seqcount cookie be initialized correctly. Most of these drivers (except b44, gtp) are probably used on 64-bit only hosts and so the lockdep splat might have never been seen. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 20:06:07 -07:00
Florian Fainelli	87173cd6cf	ipvlan: Fix 64-bit statistics seqcount initialization On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that by using the proper helper function: netdev_alloc_pcpu_stats(). Fixes: `2ad7bf3638` ("ipvlan: Initial check-in of the IPVLAN driver.") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 20:06:07 -07:00
Florian Fainelli	4a0dee1ffe	netvsc: Initialize 64-bit stats seqcount On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. In commit `6c80f3fc23` ("netvsc: report per-channel stats in ethtool statistics") netdev_alloc_pcpu_stats() was removed in favor of open-coding the 64-bits statistics, except that u64_stats_init() was missed. Fixes: `6c80f3fc23` ("netvsc: report per-channel stats in ethtool statistics") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 20:06:07 -07:00
Florian Fainelli	790cb2ebb3	gtp: Initialize 64-bit per-cpu stats correctly On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that by using netdev_alloc_pcpu_stats() instead of an open coded allocation. Fixes: `459aa660eb` ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 20:06:07 -07:00
Florian Fainelli	7ceb781a1a	nfp: Initialize RX and TX ring 64-bit stats seqcounts On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: `4c3523623d` ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 20:06:07 -07:00
Florian Fainelli	7c3a4626eb	ixgbe: Initialize 64-bit stats seqcounts On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: `4197aa7bb8` ("ixgbevf: provide 64 bit statistics") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 20:06:07 -07:00
Florian Fainelli	7d6d067790	i40e: Initialize 64-bit statistics TX ring seqcount On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: `980e9b1186` ("i40e: Add support for 64 bit netstats") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 20:06:06 -07:00
Florian Fainelli	e43c9f23ef	b44: Initialize 64-bit stats seqcount On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: `eeda858552` ("b44: add 64 bit stats") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 20:06:06 -07:00
K. Den	1bff8a0c1f	gue: fix remcsum when GRO on and CHECKSUM_PARTIAL boundary is outer UDP In the case that GRO is turned on and the original received packet is CHECKSUM_PARTIAL, if the outer UDP header is exactly at the last csum-unnecessary point, which for instance could occur if the packet comes from another Linux guest on the same Linux host, we have to do either remcsum_adjust or set up CHECKSUM_PARTIAL again with its csum_start properly reset considering RCO. However, since `b7fe10e5eb` ("gro: Fix remcsum offload to deal with frags in GRO") that barrier in such case could be skipped if GRO turned on, hence we pass over it and the inner L4 validation mistakenly reckons it as a bad csum. This patch makes remcsum_offload being reset at the same time of GRO remcsum cleanup, so as to make it work in such case as before. Fixes: `b7fe10e5eb` ("gro: Fix remcsum offload to deal with frags in GRO") Signed-off-by: Koichiro Den <den@klaipeden.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 16:09:14 -07:00
K. Den	be73b3043b	vxlan: fix remcsum when GRO on and CHECKSUM_PARTIAL boundary is outer UDP In the case that GRO is turned on and the original received packet is CHECKSUM_PARTIAL, if the outer UDP header is exactly at the last csum-unnecessary point, which for instance could occur if the packet comes from another Linux guest on the same Linux host, we have to do either remcsum_adjust or set up CHECKSUM_PARTIAL again with its csum_start properly reset considering RCO. However, since b7fe10e5ebac("gro: Fix remcsum offload to deal with frags in GRO") that barrier in such case could be skipped if GRO turned on, hence we pass over it and the inner L4 validation mistakenly reckons it as a bad csum. This patch makes remcsum_offload being reset at the same time of GRO remcsum cleanup, so as to make it work in such case as before. Fixes: `b7fe10e5eb` ("gro: Fix remcsum offload to deal with frags in GRO") Signed-off-by: Koichiro Den <den@klaipeden.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 16:09:14 -07:00
yujuan.qi	40413955ee	Cipso: cipso_v4_optptr enter infinite loop in for(),if((optlen > 0) && (optptr[1] == 0)), enter infinite loop. Test: receive a packet which the ip length > 20 and the first byte of ip option is 0, produce this issue Signed-off-by: yujuan.qi <yujuan.qi@mediatek.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 15:31:23 -07:00
David S. Miller	fdaa419bb2	Merge branch 'ethernet-ti-cpts-fix-tx-timestamping-timeout' Grygorii Strashko says: ==================== net: ethernet: ti: cpts: fix tx timestamping timeout With the low Ethernet connection speed cpdma notification about packet processing can be received before CPTS TX timestamp event, which is set when packet actually left CPSW while cpdma notification is sent when packet pushed in CPSW fifo. As result, when connection is slow and CPU is fast enough TX timestamping is not working properly. Issue was discovered using timestamping tool on am57x boards with Ethernet link speed forced to 100M and on am335x-evm with Ethernet link speed forced to 10M. Patch3 - This series fixes it by introducing TX SKB queue to store PTP SKBs for which Ethernet Transmit Event hasn't been received yet and then re-check this queue with new Ethernet Transmit Events by scheduling CPTS overflow work more often until TX SKB queue is not empty. Patch 1,2 - As CPTS overflow work is time critical task it important to ensure that its scheduling is not delayed. Unfortunately, There could be significant delay in CPTS work schedule under high system load and on -RT which could cause CPTS misbehavior due to internal counter overflow and there is no way to tune CPTS overflow work execution policy and priority manually. The kthread_worker can be used instead of workqueues, as it creates separate named kthread for each worker and its its execution policy and priority can be configured using chrt tool. Instead of modifying CPTS driver itself it was proposed to it was proposed to add PTP auxiliary worker to the PHC subsystem [1], so other drivers can benefit from this feature also. [1] https://www.spinics.net/lists/netdev/msg445392.html changes in v4: - fixed memleak in ptp_clock_register() - undocumented change in cpts_find_ts() moved to separate patch (minor fix) changes in v3: - patch 1: added proper error handling in ptp_clock_register. minor comments applied. changes in v2: - added PTP auxiliary worker to the PHC subsystem Links v3: https://www.spinics.net/lists/netdev/msg446058.html v2: https://www.spinics.net/lists/netdev/msg445859.html v1: https://www.spinics.net/lists/netdev/msg445387.html ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 15:22:56 -07:00
Grygorii Strashko	a93439cce2	net: ethernet: ti: cpts: fix fifo read in cpts_find_ts Now the call chain cpts_find_ts() \|- cpts_fifo_read(cpts, CPTS_EV_PUSH) will stop reading CPTS FIFO if PUSH event is found. But this is not expected and CPTS FIFI should be completely drained here. This is most probably copy-paste error and it has no negative impact as CPTS_EV_PUSH should not be present in FIFO without TS_PUSH request and cpts_systim_read() and cpts_find_ts() synchronized by spin_lock. Correct above by calling cpts_fifo_read() with -1 parameter, so it will read all CPTS event from FIFO. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 15:22:55 -07:00
Grygorii Strashko	0d5f54fec0	net: ethernet: ti: cpts: fix tx timestamping timeout With the low speed Ethernet connection CPDMA notification about packet processing can be received before CPTS TX timestamp event, which is set when packet actually left CPSW while cpdma notification is sent when packet pushed in CPSW fifo. As result, when connection is slow and CPU is fast enough TX timestamping is not working properly. Fix it, by introducing TX SKB queue to store PTP SKBs for which Ethernet Transmit Event hasn't been received yet and then re-check this queue with new Ethernet Transmit Events by scheduling CPTS overflow work more often (every 1 jiffies) until TX SKB queue is not empty. Side effect of this change is: - User space tools require to take into account possible delay in TX timestamp processing (for example ptp4l works with tx_timestamp_timeout=400 under net traffic and tx_timestamp_timeout=25 in idle). Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 15:22:55 -07:00
Grygorii Strashko	999f129289	net: ethernet: ti: cpts: convert to use ptp auxiliary worker There could be significant delay in CPTS work schedule under high system load and on -RT which could cause CPTS misbehavior due to internal counter overflow. Usage of own kthread_worker allows to avoid such kind of issues and makes it possible to tune priority of CPTS kthread_worker thread on -RT (thread name "cpts"). Hence, the CPTS driver is converted to use PTP auxiliary worker as PHC subsystem implements such functionality in a generic way now. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 15:22:55 -07:00
Grygorii Strashko	d9535cb7b7	ptp: introduce ptp auxiliary worker Many PTP drivers required to perform some asynchronous or periodic work, like periodically handling PHC counter overflow or handle delayed timestamp for RX/TX network packets. In most of the cases, such work is implemented using workqueues. Unfortunately, Kernel workqueues might introduce significant delay in work scheduling under high system load and on -RT, which could cause misbehavior of PTP drivers due to internal counter overflow, for example, and there is no way to tune its execution policy and priority manuallly. Hence, The kthread_worker can be used insted of workqueues, as it create separte named kthread for each worker and its its execution policy and priority can be configured using chrt tool. This prblem was reported for two drivers TI CPSW CPTS and dp83640, so instead of modifying each of these driver it was proposed to add PTP auxiliary worker to the PHC subsystem. The patch adds PTP auxiliary worker in PHC subsystem using kthread_worker and kthread_delayed_work and introduces two new PHC subsystem APIs: - long (do_aux_work)(struct ptp_clock_info ptp) callback in ptp_clock_info structure, which driver should assign if it require to perform asynchronous or periodic work. Driver should return the delay of the PTP next auxiliary work scheduling time (>=0) or negative value in case further scheduling is not required. - int ptp_schedule_worker(struct ptp_clock *ptp, unsigned long delay) which allows schedule PTP auxiliary work. The name of kthread_worker thread corresponds PTP PHC device name "ptp%d". Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 15:22:55 -07:00
Linus Torvalds	bc78d646e7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Handle notifier registry failures properly in tun/tap driver, from Tonghao Zhang. 2) Fix bpf verifier handling of subtraction bounds and add a testcase for this, from Edward Cree. 3) Increase reset timeout in ftgmac100 driver, from Ben Herrenschmidt. 4) Fix use after free in prd_retire_rx_blk_timer_exired() in AF_PACKET, from Cong Wang. 5) Fix SElinux regression due to recent UDP optimizations, from Paolo Abeni. 6) We accidently increment IPSTATS_MIB_FRAGFAILS in the ipv6 code paths, fix from Stefano Brivio. 7) Fix some mem leaks in dccp, from Xin Long. 8) Adjust MDIO_BUS kconfig deps to avoid build errors, from Arnd Bergmann. 9) Mac address length check and buffer size fixes from Cong Wang. 10) Don't leak sockets in ipv6 udp early demux, from Paolo Abeni. 11) Fix return value when copy_from_user() fails in bpf_prog_get_info_by_fd(), from Daniel Borkmann. 12) Handle PHY_HALTED properly in phy library state machine, from Florian Fainelli. 13) Fix OOPS in fib_sync_down_dev(), from Ido Schimmel. 14) Fix truesize calculation in virtio_net which led to performance regressions, from Michael S Tsirkin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits) samples/bpf: fix bpf tunnel cleanup udp6: fix jumbogram reception ppp: Fix a scheduling-while-atomic bug in del_chan Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config" virtio_net: fix truesize for mergeable buffers mv643xx_eth: fix of_irq_to_resource() error check MAINTAINERS: Add more files to the PHY LIBRARY section ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev() net: phy: Correctly process PHY_HALTED in phy_stop_machine() sunhme: fix up GREG_STAT and GREG_IMASK register offsets bpf: fix bpf_prog_get_info_by_fd to dump correct xlated_prog_len tcp: avoid bogus gcc-7 array-bounds warning net: tc35815: fix spelling mistake: "Intterrupt" -> "Interrupt" bpf: don't indicate success when copy_from_user fails udp6: fix socket leak on early demux net: thunderx: Fix BGX transmit stall due to underflow Revert "vhost: cache used event for better performance" team: use a larger struct for mac address net: check dev->addr_len for dev_set_mac_address() phy: bcm-ns-usb3: fix MDIO_BUS dependency ...	2017-07-31 22:36:42 -07:00
William Tu	cc75f8514d	samples/bpf: fix bpf tunnel cleanup test_tunnel_bpf.sh fails to remove the vxlan11 tunnel device, causing the next geneve tunnelling test case fails. In addition, the geneve reserved bit in tcbpf2_kern.c should be zero, according to the RFC. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 22:02:47 -07:00
Paolo Abeni	cb891fa6a1	udp6: fix jumbogram reception Since commit `67a51780ae` ("ipv6: udp: leverage scratch area helpers") udp6_recvmsg() read the skb len from the scratch area, to avoid a cache miss. But the UDP6 rx path support RFC 2675 UDPv6 jumbograms, and their length exceeds the 16 bits available in the scratch area. As a side effect the length returned by recvmsg() is: <ingress datagram len> % (1<<16) This commit addresses the issue allocating one more bit in the IP6CB flags field and setting it for incoming jumbograms. Such field is still in the first cacheline, so at recvmsg() time we can check it and fallback to access skb->len if required, without a measurable overhead. Fixes: `67a51780ae` ("ipv6: udp: leverage scratch area helpers") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 22:01:21 -07:00
Gao Feng	ddab82821f	ppp: Fix a scheduling-while-atomic bug in del_chan The PPTP set the pptp_sock_destruct as the sock's sk_destruct, it would trigger this bug when __sk_free is invoked in atomic context, because of the call path pptp_sock_destruct->del_chan->synchronize_rcu. Now move the synchronize_rcu to pptp_release from del_chan. This is the only one case which would free the sock and need the synchronize_rcu. The following is the panic I met with kernel 3.3.8, but this issue should exist in current kernel too according to the codes. BUG: scheduling while atomic __schedule_bug+0x5e/0x64 __schedule+0x55/0x580 ? ppp_unregister_channel+0x1cd5/0x1de0 [ppp_generic] ? dev_hard_start_xmit+0x423/0x530 ? sch_direct_xmit+0x73/0x170 __cond_resched+0x16/0x30 _cond_resched+0x22/0x30 wait_for_common+0x18/0x110 ? call_rcu_bh+0x10/0x10 wait_for_completion+0x12/0x20 wait_rcu_gp+0x34/0x40 ? wait_rcu_gp+0x40/0x40 synchronize_sched+0x1e/0x20 0xf8417298 0xf8417484 ? sock_queue_rcv_skb+0x109/0x130 __sk_free+0x16/0x110 ? udp_queue_rcv_skb+0x1f2/0x290 sk_free+0x16/0x20 __udp4_lib_rcv+0x3b8/0x650 Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 21:59:01 -07:00
Florian Fainelli	00d51094b8	Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config" This reverts commit `28b45910cc` ("net: bcmgenet: Remove init parameter from bcmgenet_mii_config") because in the process of moving from dev_info() to dev_info_once() we essentially lost the helpful printed messages once the second instance of the driver is loaded. dev_info_once() does not actually print the message once per device instance, but once period. Fixes: `28b45910cc` ("net: bcmgenet: Remove init parameter from bcmgenet_mii_config") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Doug Berger <opendmb@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 18:03:36 -07:00
Michael S. Tsirkin	1daa8790d0	virtio_net: fix truesize for mergeable buffers Seth Forshee noticed a performance degradation with some workloads. This turns out to be due to packet drops. Euan Kemp noticed that this is because we drop all packets where length exceeds the truesize, but for some packets we add in extra memory without updating the truesize. This in turn was kept around unchanged from `ab7db91705` ("virtio-net: auto-tune mergeable rx buffer size for improved performance"). That commit had an internal reason not to account for the extra space: not enough bits to do it. No longer true so let's account for the allocated length exactly. Many thanks to Seth Forshee for the report and bisecting and Euan Kemp for debugging the issue. Fixes: `680557cf79` ("virtio_net: rework mergeable buffer handling") Reported-by: Euan Kemp <euan.kemp@coreos.com> Tested-by: Euan Kemp <euan.kemp@coreos.com> Reported-by: Seth Forshee <seth.forshee@canonical.com> Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 18:02:46 -07:00
Sergei Shtylyov	cfbcb61f6c	mv643xx_eth: fix of_irq_to_resource() error check of_irq_to_resource() has recently been fixed to return negative error #'s along with 0 in case of failure, however the Marvell MV643xx Ethernet driver still only regards 0 as invalid IRQ -- fix it up. Fixes: `7a4228bbff` ("of: irq: use of_irq_get() in of_irq_to_resource()") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 17:56:47 -07:00
Florian Fainelli	13332db54f	MAINTAINERS: Add more files to the PHY LIBRARY section Include missing files that are provided by, used, or directly maintained within the PHY LIBRARY, this include uapi header, header files used by Device Tree code etc. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 17:52:47 -07:00
Ido Schimmel	71ed7ee35a	ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev() Michał reported a NULL pointer deref during fib_sync_down_dev() when unregistering a netdevice. The problem is that we don't check for 'in_dev' being NULL, which can happen in very specific cases. Usually routes are flushed upon NETDEV_DOWN sent in either the netdev or the inetaddr notification chains. However, if an interface isn't configured with any IP address, then it's possible for host routes to be flushed following NETDEV_UNREGISTER, after NULLing dev->ip_ptr in inetdev_destroy(). To reproduce: $ ip link add type dummy $ ip route add local 1.1.1.0/24 dev dummy0 $ ip link del dev dummy0 Fix this by checking for the presence of 'in_dev' before referencing it. Fixes: `982acb9756` ("ipv4: fib: Notify about nexthop status changes") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Tested-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 17:51:11 -07:00
Florian Fainelli	7ad813f208	net: phy: Correctly process PHY_HALTED in phy_stop_machine() Marc reported that he was not getting the PHY library adjust_link() callback function to run when calling phy_stop() + phy_disconnect() which does not indeed happen because we set the state machine to PHY_HALTED but we don't get to run it to process this state past that point. Fix this with a synchronous call to phy_state_machine() in order to have the state machine actually act on PHY_HALTED, set the PHY device's link down, turn the network device's carrier off and finally call the adjust_link() function. Reported-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Fixes: `a390d1f379` ("phylib: convert state_queue work to delayed_work") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 17:27:10 -07:00
Mark Cave-Ayland	96a734bae0	sunhme: fix up GREG_STAT and GREG_IMASK register offsets Update the values to match those from the STP2002QFP documentation. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 16:23:05 -07:00
Linus Torvalds	2e7ca2064c	Merge branch 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Several cgroup bug fixes. - cgroup core was calling a migration callback on empty migrations, which could make cpuset crash. - There was a very subtle bug where the controller interface files aren't created directly when cgroup2 is mounted. Because later operations create them, this bug didn't get noticed earlier. - Failed writes to cgroup.subtree_control were incorrectly returning zero" * 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: fix error return value from cgroup_subtree_control() cgroup: create dfl_root files on subsys registration cgroup: don't call migration methods if there are no tasks to migrate	2017-07-31 14:03:05 -07:00
Linus Torvalds	ff2620f778	Merge branch 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fixes from Tejun Heo: "Two notable fixes. - While adding NUMA affinity support to unbound workqueues, the assumption that an unbound workqueue with max_active == 1 is ordered was broken. The plan was to use explicit alloc_ordered_workqueue() for those cases. Unfortunately, I forgot to update the documentation properly and we grew a handful of use cases which depend on that assumption. While we want to convert them to alloc_ordered_workqueue(), we don't really lose anything by enforcing ordered execution on unbound max_active == 1 workqueues and it doesn't make sense to risk subtle bugs. Restore the assumption. - Workqueue assumes that CPU <-> NUMA node mapping remains static. This is a general assumption - we don't have any synchronization mechanism around CPU <-> node mapping. Unfortunately, powerpc may change the mapping dynamically leading to crashes. Michael added a workaround so that we at least don't crash while powerpc hotplug code gets updated" * 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Work around edge cases for calc of pool's cpumask workqueue: implicit ordered attribute should be overridable workqueue: restore WQ_UNBOUND/max_active==1 to be ordered	2017-07-31 13:37:28 -07:00
Linus Torvalds	3dcc4c7d42	Merge branch 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "Dan found a really old bug where libata hotplug code wasn't sanitizing index value from userland and may end up indexing with a negative number. It is scary but fortunately can only be triggered by root. Other than that, minor fixes" * 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: fix a couple of doc build warnings libata: array underflow in ata_find_dev() ata: sata_rcar: add gen[23] fallback compatibility strings libata: remove unused rc in ata_eh_handle_port_resume libata: Cleanup ata_read_log_page() ata: fix gemini Kconfig dependencies	2017-07-31 13:33:21 -07:00
Jonathan Corbet	2f60e1ab2f	libata: fix a couple of doc build warnings The kerneldoc comments for a couple of functions in drivers/ata/libata-eh.c had fallen behind the current implementation, resulting in these doc build warnings: ./drivers/ata/libata-eh.c:1449: warning: No description found for parameter 'link' ./drivers/ata/libata-eh.c:1449: warning: Excess function parameter 'ap' description in 'ata_eh_done' ./drivers/ata/libata-eh.c:1590: warning: No description found for parameter 'qc' ./drivers/ata/libata-eh.c:1590: warning: Excess function parameter 'dev' description in 'ata_eh_request_sense' Update the comments and make the warnings go away. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-31 08:03:06 -07:00
Linus Lüssing	54e22f265e	batman-adv: fix TT sync flag inconsistencies This patch fixes an issue in the translation table code potentially leading to a TT Request + Response storm. The issue may occur for nodes involving BLA and an inconsistent configuration of the batman-adv AP isolation feature. However, since the new multicast optimizations, a single, malformed packet may lead to a mesh-wide, persistent Denial-of-Service, too. The issue occurs because nodes are currently OR-ing the TT sync flags of all originators announcing a specific MAC address via the translation table. When an intermediate node now receives a TT Request and wants to answer this on behalf of the destination node, then this intermediate node now responds with an altered flag field and broken CRC. The next OGM of the real destination will lead to a CRC mismatch and triggering a TT Request and Response again. Furthermore, the OR-ing is currently never undone as long as at least one originator announcing the according MAC address remains, leading to the potential persistency of this issue. This patch fixes this issue by storing the flags used in the CRC calculation on a a per TT orig entry basis to be able to respond with the correct, original flags in an intermediate TT Response for one thing. And to be able to correctly unset sync flags once all nodes announcing a sync flag vanish for another. Fixes: `e9c00136a4` ("batman-adv: fix tt_global_entries flags update") Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Acked-by: Antonio Quartulli <a@unstable.cc> [sw: typo in commit message] Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2017-07-31 11:17:38 +02:00

1 2 3 4 5 ...

692259 Commits All Branches Search

692259 Commits

All Branches