OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Jakub Kicinski	f33b0e196e	ethtool: fix kdoc attr name Add missing 't' in attrtype. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 14:21:51 -07:00
Jonathon Reinhart	2671fa4dc0	netfilter: conntrack: Make global sysctls readonly in non-init netns These sysctls point to global variables: - NF_SYSCTL_CT_MAX (&nf_conntrack_max) - NF_SYSCTL_CT_EXPECT_MAX (&nf_ct_expect_max) - NF_SYSCTL_CT_BUCKETS (&nf_conntrack_htable_size_user) Because their data pointers are not updated to point to per-netns structures, they must be marked read-only in a non-init_net ns. Otherwise, changes in any net namespace are reflected in (leaked into) all other net namespaces. This problem has existed since the introduction of net namespaces. The current logic marks them read-only only if the net namespace is owned by an unprivileged user (other than init_user_ns). Commit `d0febd81ae` ("netfilter: conntrack: re-visit sysctls in unprivileged namespaces") "exposes all sysctls even if the namespace is unpriviliged." Since we need to mark them readonly in any case, we can forego the unprivileged user check altogether. Fixes: `d0febd81ae` ("netfilter: conntrack: re-visit sysctls in unprivileged namespaces") Signed-off-by: Jonathon Reinhart <Jonathon.Reinhart@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:27:11 -07:00
Jonathon Reinhart	31c4d2f160	net: Ensure net namespace isolation of sysctls This adds an ensure_safe_net_sysctl() check during register_net_sysctl() to validate that sysctl table entries for a non-init_net netns are sufficiently isolated. To be netns-safe, an entry must adhere to at least (and usually exactly) one of these rules: 1. It is marked read-only inside the netns. 2. Its data pointer does not point to kernel/module global data. An entry which fails both of these checks is indicative of a bug, whereby a child netns can affect global net sysctl values. If such an entry is found, this code will issue a warning to the kernel log, and force the entry to be read-only to prevent a leak. To test, simply create a new netns: $ sudo ip netns add dummy As it sits now, this patch will WARN for two sysctls which will be addressed in a subsequent patch: - /proc/sys/net/netfilter/nf_conntrack_max - /proc/sys/net/netfilter/nf_conntrack_expect_max Signed-off-by: Jonathon Reinhart <Jonathon.Reinhart@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:27:11 -07:00
Andrea Mayer	0d77036057	net: seg6: trivial fix of a spelling mistake in comment There is a comment spelling mistake "interfarence" -> "interference" in function parse_nla_action(). Fix it. Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-12 13:17:09 -07:00
Cong Wang	aadb2bb83f	sock_map: Fix a potential use-after-free in sock_map_close() The last refcnt of the psock can be gone right after sock_map_remove_links(), so sk_psock_stop() could trigger a UAF. The reason why I placed sk_psock_stop() there is to avoid RCU read critical section, and more importantly, some callee of sock_map_remove_links() is supposed to be called with RCU read lock, we can not simply get rid of RCU read lock here. Therefore, the only choice we have is to grab an additional refcnt with sk_psock_get() and put it back after sk_psock_stop(). Fixes: `799aa7f98d` ("skmsg: Avoid lock_sock() in sk_psock_backlog()") Reported-by: syzbot+7b6548ae483d6f4c64ae@syzkaller.appspotmail.com Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210408030556.45134-1-xiyou.wangcong@gmail.com	2021-04-12 17:35:26 +02:00
Cong Wang	51e0158a54	skmsg: Pass psock pointer to ->psock_update_sk_prot() Using sk_psock() to retrieve psock pointer from sock requires RCU read lock, but we already get psock pointer before calling ->psock_update_sk_prot() in both cases, so we can just pass it without bothering sk_psock(). Fixes: `8a59f9d1e3` ("sock: Introduce sk->sk_prot->psock_update_sk_prot()") Reported-by: syzbot+320a3bc8d80f478c37e4@syzkaller.appspotmail.com Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: syzbot+320a3bc8d80f478c37e4@syzkaller.appspotmail.com Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210407032111.33398-1-xiyou.wangcong@gmail.com	2021-04-12 17:34:27 +02:00
Andrew Lunn	c97a31f66e	ethtool: wire in generic SFP module access If the device has a sfp bus attached, call its sfp_get_module_eeprom_by_page() function, otherwise use the ethtool op for the device. This follows how the IOCTL works. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Vladyslav Tarasiuk	96d971e307	ethtool: Add fallback to get_module_eeprom from netlink command In case netlink get_module_eeprom_by_page() callback is not implemented by the driver, try to call old get_module_info() and get_module_eeprom() pair. Recalculate parameters to get_module_eeprom() offset and len using page number and their sizes. Return error if this can't be done. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Andrew Lunn	95dfc7effd	net: ethtool: Export helpers for getting EEPROM info There are two ways to retrieve information from SFP EEPROMs. Many devices make use of the common code, and assign the sfp_bus pointer in the netdev to point to the bus holding the SFP device. Some MAC drivers directly implement ops in there ethool structure. Export within net/ethtool the two helpers used to call these methods, so that they can also be used in the new netlink code. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Vladyslav Tarasiuk	c781ff12a2	ethtool: Allow network drivers to dump arbitrary EEPROM data Define get_module_eeprom_by_page() ethtool callback and implement netlink infrastructure. get_module_eeprom_by_page() allows network drivers to dump a part of module's EEPROM specified by page and bank numbers along with offset and length. It is effectively a netlink replacement for get_module_info() and get_module_eeprom() pair, which is needed due to emergence of complex non-linear EEPROM layouts. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-11 16:34:56 -07:00
Florian Westphal	d163a925eb	netfilter: arp_tables: add pre_exit hook for table unregister Same problem that also existed in iptables/ip(6)tables, when arptable_filter is removed there is no longer a wait period before the table/ruleset is free'd. Unregister the hook in pre_exit, then remove the table in the exit function. This used to work correctly because the old nf_hook_unregister API did unconditional synchronize_net. The per-net hook unregister function uses call_rcu instead. Fixes: `b9e69e1273` ("netfilter: xtables: don't hook tables by default") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-10 21:18:24 +02:00
Florian Westphal	7ee3c61dcd	netfilter: bridge: add pre_exit hooks for ebtable unregistration Just like ip/ip6/arptables, the hooks have to be removed, then synchronize_rcu() has to be called to make sure no more packets are being processed before the ruleset data is released. Place the hook unregistration in the pre_exit hook, then call the new ebtables pre_exit function from there. Years ago, when first netns support got added for netfilter+ebtables, this used an older (now removed) netfilter hook unregister API, that did a unconditional synchronize_rcu(). Now that all is done with call_rcu, ebtable_{filter,nat,broute} pernet exit handlers may free the ebtable ruleset while packets are still in flight. This can only happens on module removal, not during netns exit. The new function expects the table name, not the table struct. This is because upcoming patch set (targeting -next) will remove all net->xt.{nat,filter,broute}_table instances, this makes it necessary to avoid external references to those member variables. The existing APIs will be converted, so follow the upcoming scheme of passing name + hook type instead. Fixes: `aee12a0a37` ("ebtables: remove nf_hook_register usage") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-10 21:16:54 +02:00
Eric Dumazet	b895bdf5d6	netfilter: nft_limit: avoid possible divide error in nft_limit_init div_u64() divides u64 by u32. nft_limit_init() wants to divide u64 by u64, use the appropriate math function (div64_u64) divide error: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 8390 Comm: syz-executor188 Not tainted 5.12.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:div_u64_rem include/linux/math64.h:28 [inline] RIP: 0010:div_u64 include/linux/math64.h:127 [inline] RIP: 0010:nft_limit_init+0x2a2/0x5e0 net/netfilter/nft_limit.c:85 Code: ef 4c 01 eb 41 0f 92 c7 48 89 de e8 38 a5 22 fa 4d 85 ff 0f 85 97 02 00 00 e8 ea 9e 22 fa 4c 0f af f3 45 89 ed 31 d2 4c 89 f0 <49> f7 f5 49 89 c6 e8 d3 9e 22 fa 48 8d 7d 48 48 b8 00 00 00 00 00 RSP: 0018:ffffc90009447198 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000200000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff875152e6 RDI: 0000000000000003 RBP: ffff888020f80908 R08: 0000200000000000 R09: 0000000000000000 R10: ffffffff875152d8 R11: 0000000000000000 R12: ffffc90009447270 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 000000000097a300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000200001c4 CR3: 0000000026a52000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: nf_tables_newexpr net/netfilter/nf_tables_api.c:2675 [inline] nft_expr_init+0x145/0x2d0 net/netfilter/nf_tables_api.c:2713 nft_set_elem_expr_alloc+0x27/0x280 net/netfilter/nf_tables_api.c:5160 nf_tables_newset+0x1997/0x3150 net/netfilter/nf_tables_api.c:4321 nfnetlink_rcv_batch+0x85a/0x21b0 net/netfilter/nfnetlink.c:456 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:580 [inline] nfnetlink_rcv+0x3af/0x420 net/netfilter/nfnetlink.c:598 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: `c26844eda9` ("netfilter: nf_tables: Fix nft limit burst handling") Fixes: `3e0f64b7dd` ("netfilter: nft_limit: fix packet ratelimiting") Signed-off-by: Eric Dumazet <edumazet@google.com> Diagnosed-by: Luigi Rizzo <lrizzo@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-10 21:15:35 +02:00
Jakub Kicinski	8859a44ea0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts: MAINTAINERS - keep Chandrasekar drivers/net/ethernet/mellanox/mlx5/core/en_main.c - simple fix + trust the code re-added to param.c in -next is fine include/linux/bpf.h - trivial include/linux/ethtool.h - trivial, fix kdoc while at it include/linux/skmsg.h - move to relevant place in tcp.c, comment re-wrapped net/core/skmsg.c - add the sk = sk // sk = NULL around calls net/tipc/crypto.c - trivial Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-04-09 20:48:35 -07:00
Eric Dumazet	a7150e3822	Revert "tcp: Reset tcp connections in SYN-SENT state" This reverts commit `e880f8b3a2`. 1) Patch has not been properly tested, and is wrong [1] 2) Patch submission did not include TCP maintainer (this is me) [1] divide error: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 8426 Comm: syz-executor478 Not tainted 5.12.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__tcp_select_window+0x56d/0xad0 net/ipv4/tcp_output.c:3015 Code: 44 89 ff e8 d5 cd f0 f9 45 39 e7 0f 8d 20 ff ff ff e8 f7 c7 f0 f9 44 89 e3 e9 13 ff ff ff e8 ea c7 f0 f9 44 89 e0 44 89 e3 99 <f7> 7c 24 04 29 d3 e9 fc fe ff ff e8 d3 c7 f0 f9 41 f7 dc bf 1f 00 RSP: 0018:ffffc9000184fac0 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff87832e76 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff87832e14 R11: 0000000000000000 R12: 0000000000000000 R13: 1ffff92000309f5c R14: 0000000000000000 R15: 0000000000000000 FS: 00000000023eb300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc2b5f426c0 CR3: 000000001c5cf000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: tcp_select_window net/ipv4/tcp_output.c:264 [inline] __tcp_transmit_skb+0xa82/0x38f0 net/ipv4/tcp_output.c:1351 tcp_transmit_skb net/ipv4/tcp_output.c:1423 [inline] tcp_send_active_reset+0x475/0x8e0 net/ipv4/tcp_output.c:3449 tcp_disconnect+0x15a9/0x1e60 net/ipv4/tcp.c:2955 inet_shutdown+0x260/0x430 net/ipv4/af_inet.c:905 __sys_shutdown_sock net/socket.c:2189 [inline] __sys_shutdown_sock net/socket.c:2183 [inline] __sys_shutdown+0xf1/0x1b0 net/socket.c:2201 __do_sys_shutdown net/socket.c:2209 [inline] __se_sys_shutdown net/socket.c:2207 [inline] __x64_sys_shutdown+0x50/0x70 net/socket.c:2207 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: `e880f8b3a2` ("tcp: Reset tcp connections in SYN-SENT state") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Manoj Basapathi <manojbm@codeaurora.org> Cc: Sauvik Saha <ssaha@codeaurora.org> Link: https://lore.kernel.org/r/20210409170237.274904-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-04-09 16:35:31 -07:00
Florian Westphal	b98b33043c	net: dccp: use net_generic storage DCCP is virtually never used, so no need to use space in struct net for it. Put the pernet ipv4/v6 socket in the dccp ipv4/ipv6 modules instead. Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/20210408174502.1625-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-04-09 16:34:56 -07:00
Paolo Abeni	27f0ad7169	net: fix hangup on napi_disable for threaded napi napi_disable() is subject to an hangup, when the threaded mode is enabled and the napi is under heavy traffic. If the relevant napi has been scheduled and the napi_disable() kicks in before the next napi_threaded_wait() completes - so that the latter quits due to the napi_disable_pending() condition, the existing code leaves the NAPI_STATE_SCHED bit set and the napi_disable() loop waiting for such bit will hang. This patch addresses the issue by dropping the NAPI_STATE_DISABLE bit test in napi_thread_wait(). The later napi_threaded_poll() iteration will take care of clearing the NAPI_STATE_SCHED. This also addresses a related problem reported by Jakub: before this patch a napi_disable()/napi_enable() pair killed the napi thread, effectively disabling the threaded mode. On the patched kernel napi_disable() simply stops scheduling the relevant thread. v1 -> v2: - let the main napi_thread_poll() loop clear the SCHED bit Reported-by: Jakub Kicinski <kuba@kernel.org> Fixes: `29863d41bb` ("net: implement threaded-able napi poll loop support") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/883923fa22745a9589e8610962b7dc59df09fb1f.1617981844.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-04-09 12:50:31 -07:00
Muhammad Usama Anjum	864db232dc	net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh nlh is being checked for validtity two times when it is dereferenced in this function. Check for validity again when updating the flags through nlh pointer to make the dereferencing safe. CC: <stable@vger.kernel.org> Addresses-Coverity: ("NULL pointer dereference") Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-08 16:41:46 -07:00
Sami Tolvanen	4f0f586bf0	treewide: Change list_sort to use const pointers list_sort() internally casts the comparison function passed to it to a different type with constant struct list_head pointers, and uses this pointer to call the functions, which trips indirect call Control-Flow Integrity (CFI) checking. Instead of removing the consts, this change defines the list_cmp_func_t type and changes the comparison function types of all list_sort() callers to use const pointers, thus avoiding type mismatches. Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-10-samitolvanen@google.com	2021-04-08 16:04:22 -07:00
David S. Miller	4438669eb7	bluetooth-next pull request for net-next: - Proper support for BCM4330 and BMC4334 - Various improvements for firmware download of Intel controllers - Update management interface revision to 20 - Support for AOSP HCI vendor commands - Initial Virtio support Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmBvL7UZHGx1aXoudm9u LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKU+eD/4qGiwnwQVia4hjnQ5PZ0c/ sBmZvzzZE43RNRvy8gAwcVLypLaiIkEYml4ruVdF6RGoqaDA6kRB7D0tdk8EtZc4 6h72LX0zefnyZNj/Fcl6+dKuq7bSHjdOK0G+oXFmpvYfScPsI6Oq0ntpYOFYwUE0 cDdeZyb/sW4f6L5AXUgLjD9AIjzlBOJEU5HcRL/uY+E/fH1oe757/IhUDerXjTIh 9EhY28mrI7KeoEcJ1NIgvjPnRCxPRLAKgrzc8NXkUsLeNZSIywrwjgRfTvCZ+qQf g/MmNv+n7yFlUzA7Fu2d8YAgRtgF9nZc45geqOSHqcSZURORSs0CzEoYinqwT7f0 xdMBi/UoSzJsLksgv+OfXOyoQ4lPNJ47pcH/edmT/ZJ/Ai2yTqUUSODHQ3MLC4Dp kQj34thQvfTf5b6+9HwKmBjlfMK0QGPqWjp6dth2bzos/ugurDP+XimAv0urvFz+ NkES5kCMUs/1z7Yh+Nj1MyU3Wfdwol0wOh+PXBTAw2fxSy6kmKYGyLw7HEdWegB0 K2w/ZS58g0EGK89ZSiqlZKmXtAd22dQqS6JlZje+9YTtEtRWu+rAF1dddS8vJDjl Rq3VSQ1B6Qw0rdGpyfxdgm/sWwq5S1YgQQcAZgVaeo9ZiIhfscVD5v5L0Ol6k9ms 3eCK/2brsX0IaV61eVnoOw== =jiCz -----END PGP SIGNATURE----- Merge tag 'for-net-next-2021-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - Proper support for BCM4330 and BMC4334 - Various improvements for firmware download of Intel controllers - Update management interface revision to 20 - Support for AOSP HCI vendor commands - Initial Virtio support ==================== Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-08 14:19:32 -07:00
Pavel Tikhomirov	1ffbc7ea91	net: sched: sch_teql: fix null-pointer dereference Reproduce: modprobe sch_teql tc qdisc add dev teql0 root teql0 This leads to (for instance in Centos 7 VM) OOPS: [ 532.366633] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 532.366733] IP: [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.366825] PGD 80000001376d5067 PUD 137e37067 PMD 0 [ 532.366906] Oops: 0000 [#1] SMP [ 532.366987] Modules linked in: sch_teql ... [ 532.367945] CPU: 1 PID: 3026 Comm: tc Kdump: loaded Tainted: G ------------ T 3.10.0-1062.7.1.el7.x86_64 #1 [ 532.368041] Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.2 04/01/2014 [ 532.368125] task: ffff8b7d37d31070 ti: ffff8b7c9fdbc000 task.ti: ffff8b7c9fdbc000 [ 532.368224] RIP: 0010:[<ffffffffc06124a8>] [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.368320] RSP: 0018:ffff8b7c9fdbf8e0 EFLAGS: 00010286 [ 532.368394] RAX: ffffffffc0612490 RBX: ffff8b7cb1565e00 RCX: ffff8b7d35ba2000 [ 532.368476] RDX: ffff8b7d35ba2000 RSI: 0000000000000000 RDI: ffff8b7cb1565e00 [ 532.368557] RBP: ffff8b7c9fdbf8f8 R08: ffff8b7d3fd1f140 R09: ffff8b7d3b001600 [ 532.368638] R10: ffff8b7d3b001600 R11: ffffffff84c7d65b R12: 00000000ffffffd8 [ 532.368719] R13: 0000000000008000 R14: ffff8b7d35ba2000 R15: ffff8b7c9fdbf9a8 [ 532.368800] FS: 00007f6a4e872740(0000) GS:ffff8b7d3fd00000(0000) knlGS:0000000000000000 [ 532.368885] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 532.368961] CR2: 00000000000000a8 CR3: 00000001396ee000 CR4: 00000000000206e0 [ 532.369046] Call Trace: [ 532.369159] [<ffffffff84c8192e>] qdisc_create+0x36e/0x450 [ 532.369268] [<ffffffff846a9b49>] ? ns_capable+0x29/0x50 [ 532.369366] [<ffffffff849afde2>] ? nla_parse+0x32/0x120 [ 532.369442] [<ffffffff84c81b4c>] tc_modify_qdisc+0x13c/0x610 [ 532.371508] [<ffffffff84c693e7>] rtnetlink_rcv_msg+0xa7/0x260 [ 532.372668] [<ffffffff84907b65>] ? sock_has_perm+0x75/0x90 [ 532.373790] [<ffffffff84c69340>] ? rtnl_newlink+0x890/0x890 [ 532.374914] [<ffffffff84c8da7b>] netlink_rcv_skb+0xab/0xc0 [ 532.376055] [<ffffffff84c63708>] rtnetlink_rcv+0x28/0x30 [ 532.377204] [<ffffffff84c8d400>] netlink_unicast+0x170/0x210 [ 532.378333] [<ffffffff84c8d7a8>] netlink_sendmsg+0x308/0x420 [ 532.379465] [<ffffffff84c2f3a6>] sock_sendmsg+0xb6/0xf0 [ 532.380710] [<ffffffffc034a56e>] ? __xfs_filemap_fault+0x8e/0x1d0 [xfs] [ 532.381868] [<ffffffffc034a75c>] ? xfs_filemap_fault+0x2c/0x30 [xfs] [ 532.383037] [<ffffffff847ec23a>] ? __do_fault.isra.61+0x8a/0x100 [ 532.384144] [<ffffffff84c30269>] ___sys_sendmsg+0x3e9/0x400 [ 532.385268] [<ffffffff847f3fad>] ? handle_mm_fault+0x39d/0x9b0 [ 532.386387] [<ffffffff84d88678>] ? __do_page_fault+0x238/0x500 [ 532.387472] [<ffffffff84c31921>] __sys_sendmsg+0x51/0x90 [ 532.388560] [<ffffffff84c31972>] SyS_sendmsg+0x12/0x20 [ 532.389636] [<ffffffff84d8dede>] system_call_fastpath+0x25/0x2a [ 532.390704] [<ffffffff84d8de21>] ? system_call_after_swapgs+0xae/0x146 [ 532.391753] Code: 00 00 00 00 00 00 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b b7 48 01 00 00 48 89 fb <48> 8b 8e a8 00 00 00 48 85 c9 74 43 48 89 ca eb 0f 0f 1f 80 00 [ 532.394036] RIP [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.395127] RSP <ffff8b7c9fdbf8e0> [ 532.396179] CR2: 00000000000000a8 Null pointer dereference happens on master->slaves dereference in teql_destroy() as master is null-pointer. When qdisc_create() calls teql_qdisc_init() it imediately fails after check "if (m->dev == dev)" because both devices are teql0, and it does not set qdisc_priv(sch)->m leaving it zero on error path, then qdisc_create() imediately calls teql_destroy() which does not expect zero master pointer and we get OOPS. Fixes: `87b60cfacf` ("net_sched: fix error recovery at qdisc creation") Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-08 14:14:42 -07:00
David S. Miller	971e305711	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2021-04-08 The following pull-request contains BPF updates for your net tree. We've added 4 non-merge commits during the last 2 day(s) which contain a total of 4 files changed, 31 insertions(+), 10 deletions(-). The main changes are: 1) Validate and reject invalid JIT branch displacements, from Piotr Krysiuk. 2) Fix incorrect unhash restore as well as fwd_alloc memory accounting in sock map, from John Fastabend. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-08 14:10:53 -07:00
David S. Miller	ac075bdd68	Various small fixes: * S1G beacon validation * potential leak in nl80211 * fast-RX confusion with 4-addr mode * erroneous WARN_ON that userspace can trigger * wrong time units in virt_wifi * rfkill userspace API breakage * TXQ AC confusing that led to traffic stopped forever * connection monitoring time after/before confusion * netlink beacon head validation buffer overrun -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAmBvFsAACgkQB8qZga/f l8QJRA//f7HTL6lgRMck229bKoKUO9ICRGifYdAZfIwFBzNZguBPwxiUeIZ0rFWf LDJgee+QsKv7YkVoUXYX0K7vA9DXbadcDWgQqB8pJrxjpibvxUtF2fA0YPEftrLu 1cCl9dgtJHaZF+SZKbEEUpDz19gZtSMTRdbJXLyhGWcZ+itXX6cFoa8hCWMxwWVi cftZJl1lXqJVzJTr/4/lENt51dwvha/z4jZodnTl2PZjOFrho4/L5UbJ7FlTmbg4 FiI9fpI+VCd6VJO/o5g8v6i7QMI0yJ1eMjuFYcOmDzagJqxD6diU31U/hpi94DGI Fw2RCWTMk2S8ol6SszvyIvt5GD6YM28MFJo69uBd6U30QjynvxDbHBqzpbd23Sqe 34QlHOfn+WbMK1DOAAEeTrYeRsm9dem9DSkAW0uf9YrTQF0E3HyJAMX6Zgd1h3Ic lWk3f/h9GiD4u/cK9vhISylu5e12F84sYND1Nk5yeMN/ihzyWAQGQiQN7pS2crvX JBzv2obMqiUeW6LPtaiPzCBrkdTnQ8kgHwuuagmKhLwF2vDVyQJEitmeVGHl/FdV O+89KodIVGHXd17cK9ddRieg8Fp8ecKCqxv/+aNN7Rsda/6mQPGTaVSFk/++FI6Q h8NMWi5qVqHShYWcF8Q1FMdPwBOeW/pZs5iqKYhp+60qTGDc9LU= =rGEc -----END PGP SIGNATURE----- Merge tag 'mac80211-for-net-2021-04-08.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes berg says: ==================== Various small fixes: * S1G beacon validation * potential leak in nl80211 * fast-RX confusion with 4-addr mode * erroneous WARN_ON that userspace can trigger * wrong time units in virt_wifi * rfkill userspace API breakage * TXQ AC confusing that led to traffic stopped forever * connection monitoring time after/before confusion * netlink beacon head validation buffer overrun ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-08 14:08:37 -07:00
David S. Miller	4667bf7135	This cleanup patchset includes the following patches: - for kerneldoc in batadv_priv, by Linus Luessing - drop unused header preempt.h, by Sven Eckelmann - Fix misspelled "wont", by Sven Eckelmann -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAmBu7mIWHHN3QHNpbW9u d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoTwhEACATSUDT505T88n3BFqXRhX3/gV F40bFT+Q0SSo5a5iJFBY2KgSJF+cAJ4L16F+CCYhfI0Ju7lRwFqEg/J6wDRG/MpH Cjb4Prn2N0nD/YeR3bPbUiyiSUDlDVmjDXqaQ03FghzzbXrONyVaT5xBsHb1p7CV cCyMJIPzM9FjM8qUXTh10MmHt0LvYIL/2JKp0W1FlOHST3GIkmX32S5CAhOZQD6N gYUCZ+bQqEox5TRYw0PajZIZXYs8xuPgcRNAzGgESAlewlWC8xUy2L6wOfAF6v3J 89dT+1maUxv8l+fftBChSOvKx7tHvpJ2W8XPLKtZNeO9qDFUtZ0EKMCGEMtgvDEd peLSOCGCp0/75rC+Z8NocUaQqjFicLbmZo5s57bi368S0BOmTO+NBYZM2ftyDnfz QpZenbzDi9TawcVmSF7v+QYoMZJkr3qYQ+/nH4DKJN4u3FdfRDp3bs7s64qDj9er s4ufATNsh7ybwQl6BjRyeanG3j8hYywZObnydr26iVol6Nafo45CYFq1z8SzU7EI 1/10eEwxgNUO3HFm+8vCvpkUvDu3W10XROLvd28rQuzG5+bMPnYQ2ZlUstRASftb XlNH2JPez/ZUKmftpbqOiQN1UDQ+h8RnPb7dMmKnn6gp5KrJ7yvXKU9C7tz/BYh+ 546TfEVyyPTmSlawYw== =dg37 -----END PGP SIGNATURE----- Merge tag 'batadv-next-pullrequest-20210408' of git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== This cleanup patchset includes the following patches: - for kerneldoc in batadv_priv, by Linus Luessing - drop unused header preempt.h, by Sven Eckelmann - Fix misspelled "wont", by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-08 14:03:17 -07:00
Stephen Hemminger	3583a4e8d7	ipv6: report errors for iftoken via netlink extack Setting iftoken can fail for several different reasons but there and there was no report to user as to the cause. Add netlink extended errors to the processing of the request. This requires adding additional argument through rtnl_af_ops set_link_af callback. Reported-by: Hongren Zheng <li@zenithal.me> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-08 13:52:36 -07:00
Vlad Buslov	b3650bf76a	net: sched: fix err handler in tcf_action_init() With recent changes that separated action module load from action initialization tcf_action_init() function error handling code was modified to manually release the loaded modules if loading/initialization of any further action in same batch failed. For the case when all modules successfully loaded and some of the actions were initialized before one of them failed in init handler. In this case for all previous actions the module will be released twice by the error handler: First time by the loop that manually calls module_put() for all ops, and second time by the action destroy code that puts the module after destroying the action. Reproduction: $ sudo tc actions add action simple sdata \"2\" index 2 $ sudo tc actions add action simple sdata \"1\" index 1 \ action simple sdata \"2\" index 2 RTNETLINK answers: File exists We have an error talking to the kernel $ sudo tc actions ls action simple total acts 1 action order 0: Simple <"2"> index 2 ref 1 bind 0 $ sudo tc actions flush action simple $ sudo tc actions ls action simple $ sudo tc actions add action simple sdata \"2\" index 2 Error: Failed to load TC action module. We have an error talking to the kernel $ lsmod \| grep simple act_simple 20480 -1 Fix the issue by modifying module reference counting handling in action initialization code: - Get module reference in tcf_idr_create() and put it in tcf_idr_release() instead of taking over the reference held by the caller. - Modify users of tcf_action_init_1() to always release the module reference which they obtain before calling init function instead of assuming that created action takes over the reference. - Finally, modify tcf_action_init_1() to not release the module reference when overwriting existing action as this is no longer necessary since both upper and lower layers obtain and manage their own module references independently. Fixes: `d349f99768` ("net_sched: fix RTNL deadlock again caused by request_module()") Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-08 13:47:33 -07:00
Vlad Buslov	87c750e8c3	net: sched: fix action overwrite reference counting Action init code increments reference counter when it changes an action. This is the desired behavior for cls API which needs to obtain action reference for every classifier that points to action. However, act API just needs to change the action and releases the reference before returning. This sequence breaks when the requested action doesn't exist, which causes act API init code to create new action with specified index, but action is still released before returning and is deleted (unless it was referenced concurrently by cls API). Reproduction: $ sudo tc actions ls action gact $ sudo tc actions change action gact drop index 1 $ sudo tc actions ls action gact Extend tcf_action_init() to accept 'init_res' array and initialize it with action->ops->init() result. In tcf_action_add() remove pointers to created actions from actions array before passing it to tcf_action_put_many(). Fixes: `cae422f379` ("net: sched: use reference counting action init") Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-08 13:47:33 -07:00
Vlad Buslov	4ba86128ba	Revert "net: sched: bump refcount for new action in ACT replace mode" This reverts commit `6855e8213e`. Following commit in series fixes the issue without introducing regression in error rollback of tcf_action_destroy(). Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-08 13:47:33 -07:00
Johannes Berg	9a6847ba17	nl80211: fix beacon head validation If the beacon head attribute (NL80211_ATTR_BEACON_HEAD) is too short to even contain the frame control field, we access uninitialized data beyond the buffer. Fix this by checking the minimal required size first. We used to do this until S1G support was added, where the fixed data portion has a different size. Reported-and-tested-by: syzbot+72b99dcf4607e8c770f3@syzkaller.appspotmail.com Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: `1d47f1198d` ("nl80211: correctly validate S1G beacon head") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20210408154518.d9b06d39b4ee.Iff908997b2a4067e8d456b3cb96cab9771d252b8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 16:43:05 +02:00
Ilan Peer	8a16ffdc4c	cfg80211: Remove wrong RNR IE validation check Remove a wrong length check for RNR information element as it can have arbitrary length. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Link: https://lore.kernel.org/r/20210408143224.c7eeaf1a5270.Iead7762982e941a1cbff93f68bf8b5139447ff0c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 15:33:00 +02:00
Johannes Berg	db878e27a9	mac80211: bail out if cipher schemes are invalid If any of the cipher schemes specified by the driver are invalid, bail out and fail the registration rather than just warning. Otherwise, we might later crash when we try to use the invalid cipher scheme, e.g. if the hdr_len is (significantly) less than the pn_offs + pn_len, we'd have an out-of-bounds access in RX validation. Fixes: `2475b1cc0d` ("mac80211: add generic cipher scheme support") Link: https://lore.kernel.org/r/20210408143149.38a3a13a1b19.I6b7f5790fa0958ed8049cf02ac2a535c61e9bc96@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 15:32:48 +02:00
Emmanuel Grumbach	d6843d1ee2	mac80211: clear the beacon's CRC after channel switch After channel switch, we should consider any beacon with a CSA IE as a new switch. If the CSA IE is a leftover from before the switch that the AP forgot to remove, we'll get a CSA-to-Self. This caused issues in iwlwifi where the firmware saw a beacon with a CSA-to-Self with mode = 1 on the new channel after a switch. The firmware considered this a new switch and closed its queues. Since the beacon didn't change between before and after the switch, we wouldn't handle it (the CRC is the same) and we wouldn't let the firmware open its queues again or disconnect if the CSA IE stays for too long. Clear the CRC valid state after we switch to make sure that we handle the beacon and handle the CSA IE as required. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://lore.kernel.org/r/20210408143124.b9e68aa98304.I465afb55ca2c7d59f7bf610c6046a1fd732b4c28@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 15:32:37 +02:00
Johan Almbladh	96a7109a16	mac80211: Set priority and queue mapping for injected frames Some drivers, for example mt76, use the skb priority field, and expects that to be consistent with the skb queue mapping. On some frame injection code paths that was not true, and it broke frame injection. Now the skb queue mapping is set according to the skb priority value when the frame is injected. The skb priority value is also derived from the frame data for all frame types, as it was done prior to commit `dbd50a851c` (only allocate one queue when using iTXQs). Fixes frame injection with the mt76 driver on MT7610E chipset. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Link: https://lore.kernel.org/r/20210401164455.978245-1-johan.almbladh@anyfinetworks.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 15:32:04 +02:00
Sriram R	55f8205e7d	mac80211: Allow concurrent monitor iface and ethernet rx decap Some HW/driver can support passing ethernet rx decap frames and raw 802.11 frames for the monitor interface concurrently and via separate RX calls to mac80211. Packets going to the monitor interface(s) would be in 802.11 format and thus not have the RX_FLAG_8023 set, and 802.11 format monitoring frames should have RX_FLAG_ONLY_MONITOR set. Drivers doing such can enable the SUPPORTS_CONC_MON_RX_DECAP to allow using ethernet decap offload while a monitor interface is active, currently RX decapsulation offload gets disabled when a monitor interface is added. Signed-off-by: Sriram R <srirrama@codeaurora.org> Link: https://lore.kernel.org/r/1617068116-32253-1-git-send-email-srirrama@codeaurora.org [add proper documentation, rewrite commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 15:21:05 +02:00
Johannes Berg	abaf94ecc9	nl80211: fix potential leak of ACL params In case nl80211_parse_unsol_bcast_probe_resp() results in an error, need to "goto out" instead of just returning to free possibly allocated data. Fixes: `7443dcd1f1` ("nl80211: Unsolicited broadcast probe response support") Link: https://lore.kernel.org/r/20210408142833.d8bc2e2e454a.If290b1ba85789726a671ff0b237726d4851b5b0f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 14:44:56 +02:00
Johannes Berg	b5ac014649	cfg80211: check S1G beacon compat element length We need to check the length of this element so that we don't access data beyond its end. Fix that. Fixes: `9eaffe5078` ("cfg80211: convert S1G beacon to scan results") Link: https://lore.kernel.org/r/20210408142826.f6f4525012de.I9fdeff0afdc683a6024e5ea49d2daa3cd2459d11@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 14:44:54 +02:00
Emmanuel Grumbach	6f779a66dc	cfg80211: allow specifying a reason for hw_rfkill rfkill now allows to report a reason for the hw_rfkill state. Allow cfg80211 drivers to specify this reason. Keep the current API to use the default reason (RFKILL_HARD_BLOCK_SIGNAL). Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://lore.kernel.org/r/20210322204633.102581-4-emmanuel.grumbach@intel.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 13:05:56 +02:00
Marcel Holtmann	a61d67188f	Bluetooth: Allow Microsoft extension to indicate curve validation Some controllers don't support the Simple Pairing Options feature that can indicate the support for P-192 and P-256 public key validation. However they might support the Microsoft vendor extension that can indicate the validiation capability as well. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2021-04-08 12:26:34 +02:00
Marcel Holtmann	ba29d0360a	Bluetooth: Set defaults for le_scan_{int,window}_adv_monitor The le_scan_{int,window}_adv_monitor settings have not been set with a sensible default. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2021-04-08 12:26:34 +02:00
Lorenzo Bianconi	196900fd97	mac80211: set sk_pacing_shift for 802.3 txpath Similar to 802.11 txpath, set socket sk_pacing_shift for 802.3 tx path. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/7230abc48dcf940657838546cdaef7dce691ecdd.1615240733.git.lorenzo@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 12:03:06 +02:00
Johannes Berg	73bc9e0af5	mac80211: don't apply flow control on management frames In some cases (depending on the driver, but it's true e.g. for iwlwifi) we're using an internal TXQ for management packets, mostly to simplify the code and to have a place to queue them. However, it appears that in certain cases we can confuse the code and management frames are dropped, which is certainly not what we want. Short-circuit the processing of management frames. To keep the impact minimal, only put them on the frags queue and check the tid == management only for doing that and to skip the airtime fairness checks, if applicable. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20210319232800.0e876c800866.Id2b66eb5a17f3869b776c39b5ca713272ea09d5d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 12:02:10 +02:00
Aloka Dixit	272cd0e8d4	nl80211: Add missing line in nl80211_fils_discovery_policy Add NL80211_FILS_DISCOVERY_ATTR_TMPL explicitly in nl80211_fils_discovery_policy definition. Signed-off-by: Aloka Dixit <alokad@codeaurora.org> Link: https://lore.kernel.org/r/20210222212059.22492-1-alokad@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 11:36:48 +02:00
Colin Ian King	958574cbcc	mac80211: remove redundant assignment of variable result The variable result is being assigned a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20210328213729.65819-1-colin.king@canonical.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 11:36:28 +02:00
Wei Yongjun	026dfac85f	mac80211: minstrel_ht: remove unused variable 'mg' in minstrel_ht_next_jump_rate() GCC reports the following warning with W=1: net/mac80211/rc80211_minstrel_ht.c:871:34: warning: variable 'mg' set but not used [-Wunused-but-set-variable] 871 \| struct minstrel_mcs_group_data *mg; \| ^~ This variable is not used in function , this commit remove it to fix the warning. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Link: https://lore.kernel.org/r/20210326024843.987941-1-weiyongjun1@huawei.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 11:20:45 +02:00
Qiheng Lin	81d94f47be	cfg80211: regulatory: use DEFINE_SPINLOCK() for spinlock spinlock can be initialized automatically with DEFINE_SPINLOCK() rather than explicitly calling spin_lock_init(). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Qiheng Lin <linqiheng@huawei.com> Link: https://lore.kernel.org/r/20210325143854.13186-1-linqiheng@huawei.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 10:17:32 +02:00
Guobin Huang	ed7247f309	rfkill: use DEFINE_SPINLOCK() for spinlock spinlock can be initialized automatically with DEFINE_SPINLOCK() rather than explicitly calling spin_lock_init(). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Guobin Huang <huangguobin4@huawei.com> Link: https://lore.kernel.org/r/1617711116-49370-1-git-send-email-huangguobin4@huawei.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 10:16:53 +02:00
Du Cheng	1b5ab825d9	cfg80211: remove WARN_ON() in cfg80211_sme_connect A WARN_ON(wdev->conn) would trigger in cfg80211_sme_connect(), if multiple send_msg(NL80211_CMD_CONNECT) system calls are made from the userland, which should be anticipated and handled by the wireless driver. Remove this WARN_ON() to prevent kernel panic if kernel is configured to "panic_on_warn". Bug reported by syzbot. Reported-by: syzbot+5f9392825de654244975@syzkaller.appspotmail.com Signed-off-by: Du Cheng <ducheng2@gmail.com> Link: https://lore.kernel.org/r/20210407162756.6101-1-ducheng2@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 10:14:55 +02:00
Ben Greear	7d73cd946d	mac80211: fix time-is-after bug in mlme The incorrect timeout check caused probing to happen when it did not need to happen. This in turn caused tx performance drop for around 5 seconds in ath10k-ct driver. Possibly that tx drop is due to a secondary issue, but fixing the probe to not happen when traffic is running fixes the symptom. Signed-off-by: Ben Greear <greearb@candelatech.com> Fixes: `9abf4e4983` ("mac80211: optimize station connection monitor") Acked-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20210330230749.14097-1-greearb@candelatech.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 10:14:53 +02:00
Johannes Berg	1153a74768	mac80211: fix TXQ AC confusion Normally, TXQs have txq->tid = tid; txq->ac = ieee80211_ac_from_tid(tid); However, the special management TXQ actually has txq->tid = IEEE80211_NUM_TIDS; // 16 txq->ac = IEEE80211_AC_VO; This makes sense, but ieee80211_ac_from_tid(16) is the same as ieee80211_ac_from_tid(0) which is just IEEE80211_AC_BE. Now, normally this is fine. However, if the netdev queues were stopped, then the code in ieee80211_tx_dequeue() will propagate the stop from the interface (vif->txqs_stopped[]) if the AC 2 (ieee80211_ac_from_tid(txq->tid)) is marked as stopped. On wake, however, __ieee80211_wake_txqs() will wake the TXQ if AC 0 (txq->ac) is woken up. If a driver stops all queues with ieee80211_stop_tx_queues() and then wakes them again with ieee80211_wake_tx_queues(), the ieee80211_wake_txqs() tasklet will run to resync queue and TXQ state. If all queues were woken, then what'll happen is that _ieee80211_wake_txqs() will run in order of HW queues 0-3, typically (and certainly for iwlwifi) corresponding to ACs 0-3, so it'll call __ieee80211_wake_txqs() for each AC in order 0-3. When __ieee80211_wake_txqs() is called for AC 0 (VO) that'll wake up the management TXQ (remember its tid is 16), and the driver's wake_tx_queue() will be called. That tries to get a frame, which will immediately stop the TXQ again, because now we check against AC 2, and AC 2 hasn't yet been marked as woken up again in sdata->vif.txqs_stopped[] since we're only in the __ieee80211_wake_txqs() call for AC 0. Thus, the management TXQ will never be started again. Fix this by checking txq->ac directly instead of calculating the AC as ieee80211_ac_from_tid(txq->tid). Fixes: `adf8ed01e4` ("mac80211: add an optional TXQ for other PS-buffered frames") Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20210323210500.bf4d50afea4a.I136ffde910486301f8818f5442e3c9bf8670a9c4@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 10:14:48 +02:00
Johannes Berg	71826654ce	rfkill: revert back to old userspace API by default Recompiling with the new extended version of struct rfkill_event broke systemd in two ways: - It used "sizeof(struct rfkill_event)" to read the event, but then complained if it actually got something != 8, this broke it on new kernels (that include the updated API); - It used sizeof(struct rfkill_event) to write a command, but didn't implement the intended expansion protocol where the kernel returns only how many bytes it accepted, and errored out due to the unexpected smaller size on kernels that didn't include the updated API. Even though systemd has now been fixed, that fix may not be always deployed, and other applications could potentially have similar issues. As such, in the interest of avoiding regressions, revert the default API "struct rfkill_event" back to the original size. Instead, add a new "struct rfkill_event_ext" that extends it by the new field, and even more clearly document that applications should be prepared for extensions in two ways: * write might only accept fewer bytes on older kernels, and will return how many to let userspace know which data may have been ignored; * read might return anything between 8 (the original size) and whatever size the application sized its buffer at, indicating how much event data was supported by the kernel. Perhaps that will help avoid such issues in the future and we won't have to come up with another version of the struct if we ever need to extend it again. Applications that want to take advantage of the new field will have to be modified to use struct rfkill_event_ext instead now, which comes with the danger of them having already been updated to use it from 'struct rfkill_event', but I found no evidence of that, and it's still relatively new. Cc: stable@vger.kernel.org # 5.11 Reported-by: Takashi Iwai <tiwai@suse.de> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v12.0.0-r4 (x86-64) Link: https://lore.kernel.org/r/20210319232510.f1a139cfdd9c.Ic5c7c9d1d28972059e132ea653a21a427c326678@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 10:14:45 +02:00
Seevalamuthu Mariappan	dd0b455381	mac80211: clear sta->fast_rx when STA removed from 4-addr VLAN In some race conditions, with more clients and traffic configuration, below crash is seen when making the interface down. sta->fast_rx wasn't cleared when STA gets removed from 4-addr AP_VLAN interface. The crash is due to try accessing 4-addr AP_VLAN interface's net_device (fast_rx->dev) which has been deleted already. Resolve this by clearing sta->fast_rx pointer when STA removes from a 4-addr VLAN. [ 239.449529] Unable to handle kernel NULL pointer dereference at virtual address 00000004 [ 239.449531] pgd = 80204000 ... [ 239.481496] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.60 #227 [ 239.481591] Hardware name: Generic DT based system [ 239.487665] task: be05b700 ti: be08e000 task.ti: be08e000 [ 239.492360] PC is at get_rps_cpu+0x2d4/0x31c [ 239.497823] LR is at 0xbe08fc54 ... [ 239.778574] [<80739740>] (get_rps_cpu) from [<8073cb10>] (netif_receive_skb_internal+0x8c/0xac) [ 239.786722] [<8073cb10>] (netif_receive_skb_internal) from [<8073d578>] (napi_gro_receive+0x48/0xc4) [ 239.795267] [<8073d578>] (napi_gro_receive) from [<c7b83e8c>] (ieee80211_mark_rx_ba_filtered_frames+0xbcc/0x12d4 [mac80211]) [ 239.804776] [<c7b83e8c>] (ieee80211_mark_rx_ba_filtered_frames [mac80211]) from [<c7b84d4c>] (ieee80211_rx_napi+0x7b8/0x8c8 [mac8 0211]) [ 239.815857] [<c7b84d4c>] (ieee80211_rx_napi [mac80211]) from [<c7f63d7c>] (ath11k_dp_process_rx+0x7bc/0x8c8 [ath11k]) [ 239.827757] [<c7f63d7c>] (ath11k_dp_process_rx [ath11k]) from [<c7f5b6c4>] (ath11k_dp_service_srng+0x2c0/0x2e0 [ath11k]) [ 239.838484] [<c7f5b6c4>] (ath11k_dp_service_srng [ath11k]) from [<7f55b7dc>] (ath11k_ahb_ext_grp_napi_poll+0x20/0x84 [ath11k_ahb] ) [ 239.849419] [<7f55b7dc>] (ath11k_ahb_ext_grp_napi_poll [ath11k_ahb]) from [<8073ce1c>] (net_rx_action+0xe0/0x28c) [ 239.860945] [<8073ce1c>] (net_rx_action) from [<80324868>] (__do_softirq+0xe4/0x228) [ 239.871269] [<80324868>] (__do_softirq) from [<80324c48>] (irq_exit+0x98/0x108) [ 239.879080] [<80324c48>] (irq_exit) from [<8035c59c>] (__handle_domain_irq+0x90/0xb4) [ 239.886114] [<8035c59c>] (__handle_domain_irq) from [<8030137c>] (gic_handle_irq+0x50/0x94) [ 239.894100] [<8030137c>] (gic_handle_irq) from [<803024c0>] (__irq_svc+0x40/0x74) Signed-off-by: Seevalamuthu Mariappan <seevalam@codeaurora.org> Link: https://lore.kernel.org/r/1616163532-3881-1-git-send-email-seevalam@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2021-04-08 10:14:42 +02:00
David S. Miller	5d1dbacde1	Merge tag 'ieee802154-for-davem-2021-04-07' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan Stefan Schmidt says: ==================== pull-request: ieee802154 for net 2021-04-07 An update from ieee802154 for your net tree. Most of these are coming from the flood of syzkaller reports lately got for the ieee802154 subsystem. There are likely to come more for this, but this is a good batch to get out for now. Alexander Aring created a patchset to avoid llsec handling on a monitor interface, which we do not support. Alex Shi removed a unused macro. Pavel Skripkin fixed another protection fault found by syzkaller. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 15:04:55 -07:00
Danielle Ratson	fde32dbe71	ethtool: Add lanes parameter for ETHTOOL_LINK_MODE_10000baseR_FEC_BIT Lanes field is missing for ETHTOOL_LINK_MODE_10000baseR_FEC_BIT link mode and it causes a failure when trying to set 'speed 10000 lanes 1' on Spectrum-2 machines when autoneg is set to on. Add the lanes parameter for ETHTOOL_LINK_MODE_10000baseR_FEC_BIT link mode. Fixes: `c8907043c6` ("ethtool: Get link mode in use instead of speed and duplex parameters") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:53:04 -07:00
Danielle Ratson	a975d7d8a3	ethtool: Remove link_mode param and derive link params from driver Some drivers clear the 'ethtool_link_ksettings' struct in their get_link_ksettings() callback, before populating it with actual values. Such drivers will set the new 'link_mode' field to zero, resulting in user space receiving wrong link mode information given that zero is a valid value for the field. Another problem is that some drivers (notably tun) can report random values in the 'link_mode' field. This can result in a general protection fault when the field is used as an index to the 'link_mode_params' array [1]. This happens because such drivers implement their set_link_ksettings() callback by simply overwriting their private copy of 'ethtool_link_ksettings' struct with the one they get from the stack, which is not always properly initialized. Fix these problems by removing 'link_mode' from 'ethtool_link_ksettings' and instead have drivers call ethtool_params_from_link_mode() with the current link mode. The function will derive the link parameters (e.g., speed) from the link mode and fill them in the 'ethtool_link_ksettings' struct. v3: * Remove link_mode parameter and derive the link parameters in the driver instead of passing link_mode parameter to ethtool and derive it there. v2: * Introduce 'cap_link_mode_supported' instead of adding a validity field to 'ethtool_link_ksettings' struct. [1] general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967] CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446 Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 +38 d0 7c 08 84 d2 0f 85 b9 RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000 RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960 RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000 R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210 FS: 0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37 ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586 ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656 ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620 dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842 dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060 sock_ioctl+0x477/0x6a0 net/socket.c:1177 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `c8907043c6` ("ethtool: Get link mode in use instead of speed and duplex parameters") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:53:04 -07:00
Andrei Vagin	0854fa82c9	net: remove the new_ifindex argument from dev_change_net_namespace Here is only one place where we want to specify new_ifindex. In all other cases, callers pass 0 as new_ifindex. It looks reasonable to add a low-level function with new_ifindex and to convert dev_change_net_namespace to a static inline wrapper. Fixes: `eeb85a14ee` ("net: Allow to specify ifindex when device is moved to another namespace") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Andrei Vagin <avagin@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:43:28 -07:00
Andrei Vagin	7e4a51319d	net: introduce nla_policy for IFLA_NEW_IFINDEX In this case, we don't need to check that new_ifindex is positive in validate_linkmsg. Fixes: `eeb85a14ee` ("net: Allow to specify ifindex when device is moved to another namespace") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Andrei Vagin <avagin@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:42:12 -07:00
Zheng Yongjun	a79ace4b31	net: tipc: Fix spelling errors in net/tipc module These patches fix a series of spelling errors in net/tipc module. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:29:29 -07:00
Kurt Kanzenbach	9d6803921a	net: hsr: Reset MAC header for Tx path Reset MAC header in HSR Tx path. This is needed, because direct packet transmission, e.g. by specifying PACKET_QDISC_BYPASS does not reset the MAC header. This has been observed using the following setup: \|$ ip link add name hsr0 type hsr slave1 lan0 slave2 lan1 supervision 45 version 1 \|$ ifconfig hsr0 up \|$ ./test hsr0 The test binary is using mmap'ed sockets and is specifying the PACKET_QDISC_BYPASS socket option. This patch resolves the following warning on a non-patched kernel: \|[ 112.725394] ------------[ cut here ]------------ \|[ 112.731418] WARNING: CPU: 1 PID: 257 at net/hsr/hsr_forward.c:560 hsr_forward_skb+0x484/0x568 \|[ 112.739962] net/hsr/hsr_forward.c:560: Malformed frame (port_src hsr0) The warning can be safely removed, because the other call sites of hsr_forward_skb() make sure that the skb is prepared correctly. Fixes: `d346a3fae3` ("packet: introduce PACKET_QDISC_BYPASS socket option") Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:25:12 -07:00
Davide Caratti	07f8252fe0	mptcp: drop all sub-options except ADD_ADDR when the echo bit is set Current Linux carries echo-ed ADD_ADDR over pure TCP ACKs, so there is no need to add a DSS element that would fit only ADD_ADDR with IPv4 address. Drop the DSS from echo-ed ADD_ADDR, regardless of the IP version. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:09:40 -07:00
Geliang Tang	761c124ed9	mptcp: unify add_addr(6)_generate_hmac The length of the IPv4 address is 4 octets and IPv6 is 16. That's the only difference between add_addr_generate_hmac and add_addr6_generate_hmac. This patch dropped the duplicate code and unify them into one. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:09:40 -07:00
Geliang Tang	1b1a6ef597	mptcp: drop MPTCP_ADDR_IPVERSION_4/6 Since the type of the address family in struct mptcp_options_received became sa_family_t, we should set AF_INET/AF_INET6 to it, instead of using MPTCP_ADDR_IPVERSION_4/6. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:09:40 -07:00
Geliang Tang	f7dafee185	mptcp: use mptcp_addr_info in mptcp_options_received This patch added a new struct mptcp_addr_info member addr in struct mptcp_options_received, and dropped the original family, addr_id, addr, addr6 and port fields in it. Then we can pass the parameter mp_opt.addr directly to mptcp_pm_add_addr_received and mptcp_pm_add_addr_echoed. Since the port number became big-endian now, use htons to convert the incoming port number to it. Also use ntohs to convert it when passing it to add_addr_generate_hmac or printing it out. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:09:39 -07:00
Geliang Tang	fef6b7ecfb	mptcp: drop OPTION_MPTCP_ADD_ADDR6 Since the family field was added in struct mptcp_out_options, no need to use OPTION_MPTCP_ADD_ADDR6 to identify the IPv6 address. Drop it. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:09:39 -07:00
Geliang Tang	30f60bae80	mptcp: use mptcp_addr_info in mptcp_out_options This patch moved the mptcp_addr_info struct from protocol.h to mptcp.h, added a new struct mptcp_addr_info member addr in struct mptcp_out_options, and dropped the original addr, addr6, addr_id and port fields in it. Then we can use opts->addr to get the adding address from PM directly using mptcp_pm_add_addr_signal. Since the port number became big-endian now, use ntohs to convert it before sending it out with the ADD_ADDR suboption. Also convert it when passing it to add_addr_generate_hmac or printing it out. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:09:39 -07:00
Geliang Tang	daa83ab039	mptcp: move flags and ifindex out of mptcp_addr_info This patch moved the flags and ifindex fields from struct mptcp_addr_info to struct mptcp_pm_addr_entry. Add the flags and ifindex values as two new parameters to __mptcp_subflow_connect. In mptcp_pm_create_subflow_or_signal_addr, pass the local address entry's flags and ifindex fields to __mptcp_subflow_connect. In mptcp_pm_nl_add_addr_received, just pass two zeros to it. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:09:39 -07:00
Aditya Pakki	0c85a7e874	net/rds: Avoid potential use after free in rds_send_remove_from_sock In case of rs failure in rds_send_remove_from_sock(), the 'rm' resource is freed and later under spinlock, causing potential use-after-free. Set the free pointer to NULL to avoid undefined behavior. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-07 14:01:24 -07:00
Wong Vee Khee	63cf323899	ethtool: fix incorrect datatype in set_eee ops The member 'tx_lpi_timer' is defined with __u32 datatype in the ethtool header file. Hence, we should use ethnl_update_u32() in set_eee ops. Fixes: `fd77be7bd4` ("ethtool: set EEE settings with EEE_SET request") Cc: <stable@vger.kernel.org> # 5.10.x Cc: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-06 16:42:25 -07:00
David S. Miller	5106efe6ed	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following batch contains Netfilter/IPVS updates for your net-next tree: 1) Simplify log infrastructure modularity: Merge ipv4, ipv6, bridge, netdev and ARP families to nf_log_syslog.c. Add module softdeps. This fixes a rare deadlock condition that might occur when log module autoload is required. From Florian Westphal. 2) Moves part of netfilter related pernet data from struct net to net_generic() infrastructure. All of these users can be modules, so if they are not loaded there is no need to waste space. Size reduction is 7 cachelines on x86_64, also from Florian. 2) Update nftables audit support to report events once per table, to get it aligned with iptables. From Richard Guy Briggs. 3) Check for stale routes from the flowtable garbage collector path. This is fixing IPv6 which breaks due missing check for the dst_cookie. 4) Add a nfnl_fill_hdr() function to simplify netlink + nfnetlink headers setup. 5) Remove documentation on several statified functions. 6) Remove printk on netns creation for the FTP IPVS tracker, from Florian Westphal. 7) Remove unnecessary nf_tables_destroy_list_lock spinlock initialization, from Yang Yingliang. 7) Remove a duplicated forward declaration in ipset, from Wan Jiabing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-06 16:36:41 -07:00
John Fastabend	144748eb0c	bpf, sockmap: Fix incorrect fwd_alloc accounting Incorrect accounting fwd_alloc can result in a warning when the socket is torn down, [18455.319240] WARNING: CPU: 0 PID: 24075 at net/core/stream.c:208 sk_stream_kill_queues+0x21f/0x230 [...] [18455.319543] Call Trace: [18455.319556] inet_csk_destroy_sock+0xba/0x1f0 [18455.319577] tcp_rcv_state_process+0x1b4e/0x2380 [18455.319593] ? lock_downgrade+0x3a0/0x3a0 [18455.319617] ? tcp_finish_connect+0x1e0/0x1e0 [18455.319631] ? sk_reset_timer+0x15/0x70 [18455.319646] ? tcp_schedule_loss_probe+0x1b2/0x240 [18455.319663] ? lock_release+0xb2/0x3f0 [18455.319676] ? __release_sock+0x8a/0x1b0 [18455.319690] ? lock_downgrade+0x3a0/0x3a0 [18455.319704] ? lock_release+0x3f0/0x3f0 [18455.319717] ? __tcp_close+0x2c6/0x790 [18455.319736] ? tcp_v4_do_rcv+0x168/0x370 [18455.319750] tcp_v4_do_rcv+0x168/0x370 [18455.319767] __release_sock+0xbc/0x1b0 [18455.319785] __tcp_close+0x2ee/0x790 [18455.319805] tcp_close+0x20/0x80 This currently happens because on redirect case we do skb_set_owner_r() with the original sock. This increments the fwd_alloc memory accounting on the original sock. Then on redirect we may push this into the queue of the psock we are redirecting to. When the skb is flushed from the queue we give the memory back to the original sock. The problem is if the original sock is destroyed/closed with skbs on another psocks queue then the original sock will not have a way to reclaim the memory before being destroyed. Then above warning will be thrown sockA sockB sk_psock_strp_read() sk_psock_verdict_apply() -- SK_REDIRECT -- sk_psock_skb_redirect() skb_queue_tail(psock_other->ingress_skb..) sk_close() sock_map_unref() sk_psock_put() sk_psock_drop() sk_psock_zap_ingress() At this point we have torn down our own psock, but have the outstanding skb in psock_other. Note that SK_PASS doesn't have this problem because the sk_psock_drop() logic releases the skb, its still associated with our psock. To resolve lets only account for sockets on the ingress queue that are still associated with the current socket. On the redirect case we will check memory limits per `6fa9201a89`, but will omit fwd_alloc accounting until skb is actually enqueued. When the skb is sent via skb_send_sock_locked or received with sk_psock_skb_ingress memory will be claimed on psock_other. Fixes: `6fa9201a89` ("bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/161731444013.68884.4021114312848535993.stgit@john-XPS-13-9370	2021-04-07 01:29:06 +02:00
Xin Long	2a2403ca3a	tipc: increment the tmp aead refcnt before attaching it Li Shuang found a NULL pointer dereference crash in her testing: [] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [] RIP: 0010:tipc_crypto_rcv_complete+0xc8/0x7e0 [tipc] [] Call Trace: [] <IRQ> [] tipc_crypto_rcv+0x2d9/0x8f0 [tipc] [] tipc_rcv+0x2fc/0x1120 [tipc] [] tipc_udp_recv+0xc6/0x1e0 [tipc] [] udpv6_queue_rcv_one_skb+0x16a/0x460 [] udp6_unicast_rcv_skb.isra.35+0x41/0xa0 [] ip6_protocol_deliver_rcu+0x23b/0x4c0 [] ip6_input+0x3d/0xb0 [] ipv6_rcv+0x395/0x510 [] __netif_receive_skb_core+0x5fc/0xc40 This is caused by NULL returned by tipc_aead_get(), and then crashed when dereferencing it later in tipc_crypto_rcv_complete(). This might happen when tipc_crypto_rcv_complete() is called by two threads at the same time: the tmp attached by tipc_crypto_key_attach() in one thread may be released by the one attached by that in the other thread. This patch is to fix it by incrementing the tmp's refcnt before attaching it instead of calling tipc_aead_get() after attaching it. Fixes: `fc1b6d6de2` ("tipc: introduce TIPC encryption & authentication") Reported-by: Li Shuang <shuali@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-06 16:25:34 -07:00
Manoj Basapathi	e880f8b3a2	tcp: Reset tcp connections in SYN-SENT state Userspace sends tcp connection (sock) destroy on network switch i.e switching the default network of the device between multiple networks(Cellular/Wifi/Ethernet). Kernel though doesn't send reset for the connections in SYN-SENT state and these connections continue to remain. Even as per RFC 793, there is no hard rule to not send RST on ABORT in this state. Modify tcp_abort and tcp_disconnect behavior to send RST for connections in syn-sent state to avoid lingering connections on network switch. Signed-off-by: Manoj Basapathi <manojbm@codeaurora.org> Signed-off-by: Sauvik Saha <ssaha@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-06 16:17:20 -07:00
Cong Wang	928dc40680	bpf, udp: Remove some pointless comments These comments in udp_bpf_update_proto() are copied from the original TCP code and apparently do not apply to UDP. Just remove them. Reported-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210403052715.13854-1-xiyou.wangcong@gmail.com	2021-04-06 23:26:04 +02:00
Marcel Holtmann	f67743f9e0	Bluetooth: Add support for reading AOSP vendor capabilities When drivers indicate support for AOSP vendor extension, initialize them and read its capabilities. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2021-04-06 14:11:23 -07:00
Pavel Skripkin	1165affd48	net: mac802154: Fix general protection fault syzbot found general protection fault in crypto_destroy_tfm()[1]. It was caused by wrong clean up loop in llsec_key_alloc(). If one of the tfm array members is in IS_ERR() range it will cause general protection fault in clean up function [1]. Call Trace: crypto_free_aead include/crypto/aead.h:191 [inline] [1] llsec_key_alloc net/mac802154/llsec.c:156 [inline] mac802154_llsec_key_add+0x9e0/0xcc0 net/mac802154/llsec.c:249 ieee802154_add_llsec_key+0x56/0x80 net/mac802154/cfg.c:338 rdev_add_llsec_key net/ieee802154/rdev-ops.h:260 [inline] nl802154_add_llsec_key+0x3d3/0x560 net/ieee802154/nl802154.c:1584 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Reported-by: syzbot+9ec037722d2603a9f52e@syzkaller.appspotmail.com Acked-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210304152125.1052825-1-paskripkin@gmail.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:42:16 +02:00
Alexander Aring	1534efc7bb	net: ieee802154: stop dump llsec params for monitors This patch stops dumping llsec params for monitors which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Reported-by: syzbot+cde43a581a8e5f317bc2@syzkaller.appspotmail.com Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-16-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:34:38 +02:00
Alexander Aring	9dde130937	net: ieee802154: forbid monitor for del llsec seclevel This patch forbids to del llsec seclevel for monitor interfaces which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Reported-by: syzbot+fbf4fc11a819824e027b@syzkaller.appspotmail.com Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-15-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:32:52 +02:00
Alexander Aring	9ec87e3224	net: ieee802154: forbid monitor for add llsec seclevel This patch forbids to add llsec seclevel for monitor interfaces which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-14-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:30:50 +02:00
Alexander Aring	4c9b4f55ad	net: ieee802154: stop dump llsec seclevels for monitors This patch stops dumping llsec seclevels for monitors which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-13-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:28:34 +02:00
Alexander Aring	6fb8045319	net: ieee802154: forbid monitor for del llsec devkey This patch forbids to del llsec devkey for monitor interfaces which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-12-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:26:29 +02:00
Alexander Aring	a347b3b394	net: ieee802154: forbid monitor for add llsec devkey This patch forbids to add llsec devkey for monitor interfaces which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-11-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:24:08 +02:00
Alexander Aring	080d1a57a9	net: ieee802154: stop dump llsec devkeys for monitors This patch stops dumping llsec devkeys for monitors which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-10-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:21:16 +02:00
Alexander Aring	ad8f9de1f3	net: ieee802154: forbid monitor for del llsec dev This patch forbids to del llsec dev for monitor interfaces which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-9-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:19:10 +02:00
Alexander Aring	5303f956b0	net: ieee802154: forbid monitor for add llsec dev This patch forbids to add llsec dev for monitor interfaces which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-8-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:12:41 +02:00
Alexander Aring	5582d641e6	net: ieee802154: stop dump llsec devs for monitors This patch stops dumping llsec devs for monitors which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-7-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:10:09 +02:00
Alexander Aring	b6e2949544	net: ieee802154: forbid monitor for del llsec key This patch forbids to del llsec key for monitor interfaces which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-6-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:07:31 +02:00
Alexander Aring	08470c5453	net: ieee802154: forbid monitor for add llsec key This patch forbids to add llsec key for monitor interfaces which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-5-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:03:16 +02:00
Alexander Aring	fb3c5cdf88	net: ieee802154: stop dump llsec keys for monitors This patch stops dumping llsec keys for monitors which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-4-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 22:00:50 +02:00
Alexander Aring	88c17855ac	net: ieee802154: forbid monitor for set llsec params This patch forbids to set llsec params for monitor interfaces which we don't support yet. Reported-by: syzbot+8b6719da8a04beeafcc3@syzkaller.appspotmail.com Signed-off-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20210405003054.256017-3-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2021-04-06 21:57:29 +02:00
Jiapeng Chong	dee9f6ade3	sunrpc: Remove unused function ip_map_lookup Fix the following clang warnings: net/sunrpc/svcauth_unix.c:306:30: warning: unused function 'ip_map_lookup' [-Wunused-function]. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2021-04-06 11:24:31 -04:00
Sathish Narasimman	8ce85ada0a	Bluetooth: LL privacy allow RPA allow RPA to add bd address to whitelist Signed-off-by: Sathish Narasimman <sathish.narasimman@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-04-06 10:48:15 +02:00
Sathish Narasimman	abb638b311	Bluetooth: Handle own address type change with HCI_ENABLE_LL_PRIVACY own_address_type has to changed to 0x02 and 0x03 only when HCI_ENABLE_LL_PRIVACY flag is set. Signed-off-by: Sathish Narasimman <sathish.narasimman@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-04-06 10:47:45 +02:00
Daniel Winkler	b6f1b79dea	Bluetooth: Do not set cur_adv_instance in adv param MGMT request We set hdev->cur_adv_instance in the adv param MGMT request to allow the callback to the hci param request to set the tx power to the correct instance. Now that the callbacks use the advertising handle from the hci request (as they should), this workaround is no longer necessary. Furthermore, this change resolves a race condition that is more prevalent when using the extended advertising MGMT calls - if hdev->cur_adv_instance is set in the params request, then when the data request is called, we believe our new instance is already active. This treats it as an update and immediately schedules the instance with the controller, which has a potential race with the software rotation adv update. By not setting hdev->cur_adv_instance too early, the new instance is queued as it should be, to be used when the rotation comes around again. This change is tested on harrison peak to confirm that it resolves the race condition on registration, and that there is no regression in single- and multi-advertising automated tests. Reviewed-by: Miao-chen Chou <mcchou@chromium.org> Signed-off-by: Daniel Winkler <danielwinkler@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-04-06 10:43:26 +02:00
Daniel Winkler	25e70886c2	Bluetooth: Use ext adv handle from requests in CCs Some extended advertising hci command complete events are still using hdev->cur_adv_instance to map the request to the correct advertisement handle. However, with extended advertising, "current instance" doesn't make sense as we can have multiple concurrent advertisements. This change switches these command complete handlers to use the advertising handle from the request/event, to ensure we will always use the correct advertising handle regardless of the state of hdev->cur_adv_instance. This change is tested on hatch and kefka chromebooks and run through single- and multi-advertising automated tests to confirm callbacks report tx power to the correct advertising handle, etc. Reviewed-by: Miao-chen Chou <mcchou@chromium.org> Signed-off-by: Daniel Winkler <danielwinkler@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-04-06 10:43:25 +02:00
Kai Ye	93917fd224	Bluetooth: use the correct print format for L2CAP debug statements Use the correct print format. Printing an unsigned int value should use %u instead of %d. For details, please read document: Documentation/core-api/printk-formats.rst Signed-off-by: Kai Ye <yekai13@huawei.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-04-06 10:40:20 +02:00
Florian Westphal	1379940bf8	netfilter: conntrack: move ecache dwork to net_generic infra dwork struct is large (>128 byte) and not needed when conntrack module is not loaded. Place it in net_generic data instead. The struct net dwork member is now obsolete and will be removed in a followup patch. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-06 00:34:53 +02:00
Florian Westphal	7b5974709f	netfilter: conntrack: move sysctl pointer to net_generic infra No need to keep this in struct net, place it in the net_generic data. The sysctl pointer is removed from struct net in a followup patch. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-06 00:34:53 +02:00
Florian Westphal	1d610d4d31	netfilter: x_tables: move known table lists to net_generic infra Will reduce struct net size by 208 bytes. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-06 00:34:52 +02:00
Florian Westphal	0854db2aae	netfilter: nf_tables: use net_generic infra for transaction data This moves all nf_tables pernet data from struct net to a net_generic extension, with the exception of the gencursor. The latter is used in the data path and also outside of the nf_tables core. All others are only used from the configuration plane. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-06 00:34:52 +02:00
Florian Westphal	5b53951cfc	netfilter: ebtables: use net_generic infra ebtables currently uses net->xt.tables[BRIDGE], but upcoming patch will move net->xt.tables away from struct net. To avoid exposing x_tables internals to ebtables, use a private list instead. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-06 00:34:52 +02:00
Florian Westphal	7b1957b049	netfilter: nf_defrag_ipv4: use net_generic infra This allows followup patch to remove the defrag_ipv4 member from struct net. It also allows to auto-remove the hooks later on by adding a _disable() function. This will be done later in a follow patch series. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-06 00:34:52 +02:00
Florian Westphal	8b0adbe3e3	netfilter: nf_defrag_ipv6: use net_generic infra This allows followup patch to remove these members from struct net. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-06 00:34:51 +02:00
Florian Westphal	ebfbe67568	netfilter: cttimeout: use net_generic infra reduce size of struct net and make this self-contained. The member in struct net is kept to minimize changes to struct net layout, it will be removed in a separate patch. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-06 00:34:51 +02:00
Florian Westphal	1be05ea766	netfilter: nfnetlink: use net_generic infra No need to place it in struct net, nfnetlink is a module and usage doesn't occur in fastpath. Also remove rcu usage: Not a single reader of net->nfnl uses rcu accessors. When exit_batch callbacks are executed the net namespace is already dead so no calls to these functions are possible anymore (else we'd get NULL deref crash too). If the module is removed, then modules that call any of those functions have been removed too so no calls to nfnl functions are possible either. The nfnl and nfl_stash pointers in struct net are no longer used, they will be removed in a followup patch to minimize changes to struct net (causes rebuild for entire network stack). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-06 00:34:51 +02:00
Florian Westphal	237c609f87	netfilter: nfnetlink: add and use nfnetlink_broadcast This removes the only reference of net->nfnl outside of the nfnetlink module. This allows to move net->nfnl to net_generic infra. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-06 00:34:51 +02:00
Tetsuo Handa	08c27f3322	batman-adv: initialize "struct batadv_tvlv_tt_vlan_data"->reserved field KMSAN found uninitialized value at batadv_tt_prepare_tvlv_local_data() [1], for commit `ced72933a5` ("batman-adv: use CRC32C instead of CRC16 in TT code") inserted 'reserved' field into "struct batadv_tvlv_tt_data" and commit `7ea7b4a142` ("batman-adv: make the TT CRC logic VLAN specific") moved that field to "struct batadv_tvlv_tt_vlan_data" but left that field uninitialized. [1] https://syzkaller.appspot.com/bug?id=07f3e6dba96f0eb3cabab986adcd8a58b9bdbe9d Reported-by: syzbot <syzbot+50ee810676e6a089487b@syzkaller.appspotmail.com> Tested-by: syzbot <syzbot+50ee810676e6a089487b@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: `ced72933a5` ("batman-adv: use CRC32C instead of CRC16 in TT code") Fixes: `7ea7b4a142` ("batman-adv: make the TT CRC logic VLAN specific") Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-05 15:06:03 -07:00
Andrei Vagin	eeb85a14ee	net: Allow to specify ifindex when device is moved to another namespace Currently, we can specify ifindex on link creation. This change allows to specify ifindex when a device is moved to another network namespace. Even now, a device ifindex can be changed if there is another device with the same ifindex in the target namespace. So this change doesn't introduce completely new behavior, it adds more control to the process. CRIU users want to restore containers with pre-created network devices. A user will provide network devices and instructions where they have to be restored, then CRIU will restore network namespaces and move devices into them. The problem is that devices have to be restored with the same indexes that they have before C/R. Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Suggested-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Andrei Vagin <avagin@gmail.com> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-05 14:49:40 -07:00
Zheng Yongjun	d3295869c4	net: nfc: Fix spelling errors in net/nfc module These patches fix a series of spelling errors in net/nfc module. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-05 12:58:31 -07:00
Maciej Żenczykowski	630e4576f8	net-ipv6: bugfix - raw & sctp - switch to ipv6_can_nonlocal_bind() Found by virtue of ipv6 raw sockets not honouring the per-socket IP{,V6}_FREEBIND setting. Based on hits found via: git grep '[.]ip_nonlocal_bind' We fix both raw ipv6 sockets to honour IP{,V6}_FREEBIND and IP{,V6}_TRANSPARENT, and we fix sctp sockets to honour IP{,V6}_TRANSPARENT (they already honoured FREEBIND), and not just the ipv6 'ip_nonlocal_bind' sysctl. The helper is defined as: static inline bool ipv6_can_nonlocal_bind(struct net net, struct inet_sock inet) { return net->ipv6.sysctl.ip_nonlocal_bind \|\| inet->freebind \|\| inet->transparent; } so this change only widens the accepted opt-outs and is thus a clean bugfix. I'm not entirely sure what 'fixes' tag to add, since this is AFAICT an ancient bug, but IMHO this should be applied to stable kernels as far back as possible. As such I'm adding a 'fixes' tag with the commit that originally added the helper, which happened in 4.19. Backporting to older LTS kernels (at least 4.9 and 4.14) would presumably require open-coding it or backporting the helper as well. Other possibly relevant commits: v4.18-rc6-1502-g83ba4645152d net: add helpers checking if socket can be bound to nonlocal address v4.18-rc6-1431-gd0c1f01138c4 net/ipv6: allow any source address for sendmsg pktinfo with ip_nonlocal_bind v4.14-rc5-271-gb71d21c274ef sctp: full support for ipv6 ip_nonlocal_bind & IP_FREEBIND v4.7-rc7-1883-g9b9742022888 sctp: support ipv6 nonlocal bind v4.1-12247-g35a256fee52c ipv6: Nonlocal bind Cc: Lorenzo Colitti <lorenzo@google.com> Fixes: `83ba464515` ("net: add helpers checking if socket can be bound to nonlocal address") Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-By: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-05 12:56:52 -07:00
Ilya Maximets	4d51419d49	openvswitch: fix send of uninitialized stack memory in ct limit reply 'struct ovs_zone_limit' has more members than initialized in ovs_ct_limit_get_default_limit(). The rest of the memory is a random kernel stack content that ends up being sent to userspace. Fix that by using designated initializer that will clear all non-specified fields. Fixes: `11efd5cb04` ("openvswitch: Support conntrack zone limit") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-05 12:54:42 -07:00
Wu XiangCheng	85d091a794	tipc: Fix a kernel-doc warning in name_table.c Fix kernel-doc warning: Documentation/networking/tipc:66: /home/sfr/next/next/net/tipc/name_table.c :558: WARNING: Unexpected indentation. Documentation/networking/tipc:66: /home/sfr/next/next/net/tipc/name_table.c :559: WARNING: Block quote ends without a blank line; unexpected unindent. Due to blank line missing. Fixes: `908148bc50` ("tipc: refactor tipc_sendmsg() and tipc_lookup_anycast()") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/netdev/20210318172255.63185609@canb.auug.org.au/ Signed-off-by: Wu XiangCheng <bobwxc@email.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-05 12:51:34 -07:00
Taehee Yoo	4b4b84468a	mld: change lockdep annotation for ip6_sf_socklist and ipv6_mc_socklist struct ip6_sf_socklist and ipv6_mc_socklist are per-socket MLD data. These data are protected by rtnl lock, socket lock, and RCU. So, when these are used, it verifies whether rtnl lock is acquired or not. ip6_mc_msfget() is called by do_ipv6_getsockopt(). But caller doesn't acquire rtnl lock. So, when these data are used in the ip6_mc_msfget() lockdep warns about it. But accessing these is actually safe because socket lock was acquired by do_ipv6_getsockopt(). So, it changes lockdep annotation from rtnl lock to socket lock. (rtnl_dereference -> sock_dereference) Locking graph for mld data is like below: When writing mld data: do_ipv6_setsockopt() rtnl_lock lock_sock (mld functions) idev->mc_lock(if per-interface mld data is modified) When reading mld data: do_ipv6_getsockopt() lock_sock ip6_mc_msfget() Splat looks like: ============================= WARNING: suspicious RCU usage 5.12.0-rc4+ #503 Not tainted ----------------------------- net/ipv6/mcast.c:610 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by mcast-listener-/923: #0: ffff888007958a70 (sk_lock-AF_INET6){+.+.}-{0:0}, at: ipv6_get_msfilter+0xaf/0x190 stack backtrace: CPU: 1 PID: 923 Comm: mcast-listener- Not tainted 5.12.0-rc4+ #503 Call Trace: dump_stack+0xa4/0xe5 ip6_mc_msfget+0x553/0x6c0 ? ipv6_sock_mc_join_ssm+0x10/0x10 ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0 ? mark_held_locks+0xb7/0x120 ? lockdep_hardirqs_on_prepare+0x27c/0x3e0 ? __local_bh_enable_ip+0xa5/0xf0 ? lock_sock_nested+0x82/0xf0 ipv6_get_msfilter+0xc3/0x190 ? compat_ipv6_get_msfilter+0x300/0x300 ? lock_downgrade+0x690/0x690 do_ipv6_getsockopt.isra.6.constprop.13+0x1809/0x29e0 ? do_ipv6_mcast_group_source+0x150/0x150 ? register_lock_class+0x1750/0x1750 ? kvm_sched_clock_read+0x14/0x30 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0x18/0x170 ? find_held_lock+0x3a/0x1c0 ? lock_downgrade+0x690/0x690 ? ipv6_getsockopt+0xdb/0x1b0 ipv6_getsockopt+0xdb/0x1b0 [ ... ] Fixes: `88e2ca3080` ("mld: convert ifmcaddr6 to RCU") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-05 12:50:04 -07:00
Benjamin Coddington	98b5cee373	SUNRPC: Ensure the transport backchannel association If the server sends CB_ calls on a connection that is not associated with the backchannel, refuse to process the call and shut down the connection. This avoids a NULL dereference crash in xprt_complete_bc_request(). There's not much more we can do in this situation unless we want to look into allowing all connections to be associated with the fore and back channel. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2021-04-05 09:04:21 -04:00
Eryu Guan	6b996476f3	sunrpc: honor rpc_task's timeout value in rpcb_create() Currently rpcbind client is created without setting rpc timeout (thus using the default value). But if the rpc_task already has a customized timeout in its tk_client field, it's also ignored. Let's use the same timeout setting in rpc_task->tk_client->cl_timeout for rpcbind connection. Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2021-04-05 09:04:21 -04:00
Trond Myklebust	d737e5d418	SUNRPC: Set TCP_CORK until the transmit queue is empty When we have multiple RPC requests queued up, it makes sense to set the TCP_CORK option while the transmit queue is non-empty. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2021-04-05 09:04:20 -04:00
Greg Kroah-Hartman	9594408763	Merge 5.12-rc6 into tty-next We need the serial/tty fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-05 08:59:21 +02:00
Christophe JAILLET	7d42e84eb9	net: openvswitch: Use 'skb_push_rcsum()' instead of hand coding it 'skb_push()'/'skb_postpush_rcsum()' can be replaced by an equivalent 'skb_push_rcsum()' which is less verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-04 01:43:02 -07:00
Pablo Neira Ayuso	8c56049fec	netfilter: nftables: remove documentation on static functions Since `4f16d25c68` ("netfilter: nftables: add nft_parse_register_load() and use it") and `345023b0db` ("netfilter: nftables: add nft_parse_register_store() and use it"), the following functions are not exported symbols anymore: - nft_parse_register() - nft_validate_register_load() - nft_validate_register_store() Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-03 20:36:03 +02:00
Dan Carpenter	dadf33c9f6	netfilter: nftables: fix a warning message in nf_tables_commit_audit_collect() The first argument of a WARN_ONCE() is a condition. This WARN_ONCE() will only print the table name, and is potentially problematic if the table name has a %s in it. Fixes: `c520292f29` ("audit: log nftables configuration change events once per table") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-03 20:19:37 +02:00
Florian Westphal	daf47a7c10	netfilter: ipvs: do not printk on netns creation This causes dmesg spew during normal operation, so remove this. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Julian Anastasov <ja@ssi.bg> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-04-03 20:17:11 +02:00
Florian Westphal	dc87efdb1a	mptcp: add mptcp reset option support The MPTCP reset option allows to carry a mptcp-specific error code that provides more information on the nature of a connection reset. Reset option data received gets stored in the subflow context so it can be sent to userspace via the 'subflow closed' netlink event. When a subflow is closed, the desired error code that should be sent to the peer is also placed in the subflow context structure. If a reset is sent before subflow establishment could complete, e.g. on HMAC failure during an MP_JOIN operation, the mptcp skb extension is used to store the reset information. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-02 14:21:50 -07:00
Paolo Abeni	781bf13d4f	mptcp: remove unneeded check on first subflow Currently we explicitly check for the first subflow being NULL in a couple of places, even if we don't need any special actions in such scenario. Just drop the unneeded checks, to avoid confusion. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-02 14:21:50 -07:00
Paolo Abeni	5695eb8891	mptcp: add active MPC mibs We are not currently tracking the active MPTCP connection attempts. Let's add the related counters. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-02 14:21:50 -07:00
Paolo Abeni	a16195e35c	mptcp: add mib for token creation fallback If the MPTCP protocol is unable to create a new token, the socket fallback to plain TCP, let's keep track of such events via a specific MIB. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-02 14:21:50 -07:00
Yunjian Wang	990b03b05b	net: cls_api: Fix uninitialised struct field bo->unlocked_driver_cb The 'unlocked_driver_cb' struct field in 'bo' is not being initialized in tcf_block_offload_init(). The uninitialized 'unlocked_driver_cb' will be used when calling unlocked_driver_cb(). So initialize 'bo' to zero to avoid the issue. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: `0fdcf78d59` ("net: use flow_indr_dev_setup_offload()") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-02 14:14:22 -07:00
David S. Miller	c2bcb4cf02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-04-01 The following pull-request contains BPF updates for your net-next tree. We've added 68 non-merge commits during the last 7 day(s) which contain a total of 70 files changed, 2944 insertions(+), 1139 deletions(-). The main changes are: 1) UDP support for sockmap, from Cong. 2) Verifier merge conflict resolution fix, from Daniel. 3) xsk selftests enhancements, from Maciej. 4) Unstable helpers aka kernel func calling, from Martin. 5) Batches ops for LPM map, from Pedro. 6) Fix race in bpf_get_local_storage, from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-02 11:03:07 -07:00
Luiz Augusto von Dentz	0ae8ef674e	Bluetooth: SMP: Fix variable dereferenced before check 'conn' This fixes kbuild findings: smatch warnings: net/bluetooth/smp.c:1633 smp_user_confirm_reply() warn: variable dereferenced before check 'conn' (see line 1631) Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-04-02 11:09:12 +02:00
Archie Pusaka	06752d1678	Bluetooth: Check inquiry status before sending one There is a possibility where HCI_INQUIRY flag is set but we still send HCI_OP_INQUIRY anyway. Such a case can be reproduced by connecting to an LE device while active scanning. When the device is discovered, we initiate a connection, stop LE Scan, and send Discovery MGMT with status disabled, but we don't cancel the inquiry. Signed-off-by: Archie Pusaka <apusaka@chromium.org> Reviewed-by: Sonny Sasaka <sonnysasaka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-04-02 11:06:17 +02:00
Meng Yu	149b3f13b4	Bluetooth: Coding style fix 1. Add space when needed; 2. Block comments style fix; 3. Move open brace '{' following function definitions to the next line; 4. Remove unnecessary braces '{}' for single statement blocks. Signed-off-by: Meng Yu <yumeng18@huawei.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-04-02 11:03:04 +02:00
Meng Yu	82a1242619	Bluetooth: Remove 'return' in void function void function return statements are not generally useful. Signed-off-by: Meng Yu <yumeng18@huawei.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-04-02 11:01:35 +02:00
Paolo Abeni	0a3cc57978	mptcp: revert "mptcp: provide subflow aware release function" This change reverts commit `ad98dd3705` ("mptcp: provide subflow aware release function"). The latter introduced a deadlock spotted by syzkaller and is not needed anymore after the previous commit. Fixes: `ad98dd3705` ("mptcp: provide subflow aware release function") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-01 16:02:50 -07:00
Paolo Abeni	86581852d7	mptcp: forbit mcast-related sockopt on MPTCP sockets Unrolling mcast state at msk dismantel time is bug prone, as syzkaller reported: ====================================================== WARNING: possible circular locking dependency detected 5.11.0-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor905/8822 is trying to acquire lock: ffffffff8d678fe8 (rtnl_mutex){+.+.}-{3:3}, at: ipv6_sock_mc_close+0xd7/0x110 net/ipv6/mcast.c:323 but task is already holding lock: ffff888024390120 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1600 [inline] ffff888024390120 (sk_lock-AF_INET6){+.+.}-{0:0}, at: mptcp6_release+0x57/0x130 net/mptcp/protocol.c:3507 which lock already depends on the new lock. Instead we can simply forbit any mcast-related setsockopt Fixes: `717e79c867` ("mptcp: Add setsockopt()/getsockopt() socket operations") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-01 16:02:50 -07:00
Wan Jiabing	ec7e48ca4b	net: smc: Remove repeated struct declaration struct smc_clc_msg_local is declared twice. One is declared at 301st line. The blew one is not needed. Remove the duplicate. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-01 15:52:38 -07:00
Norman Maurer	98184612ac	net: udp: Add support for getsockopt(..., ..., UDP_GRO, ..., ...); Support for UDP_GRO was added in the past but the implementation for getsockopt was missed which did lead to an error when we tried to retrieve the setting for UDP_GRO. This patch adds the missing switch case for UDP_GRO Fixes: `e20cf8d3f1` ("udp: implement GRO for plain UDP sockets.") Signed-off-by: Norman Maurer <norman_maurer@apple.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-01 15:50:50 -07:00
Xu Jia	b7a320c3a1	net: ipv6: Refactor in rt6_age_examine_exception The logic in rt6_age_examine_exception is confusing. The commit is to refactor the code. Signed-off-by: Xu Jia <xujia39@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-01 15:45:26 -07:00
Hoang Le	f20a46c304	tipc: fix unique bearer names sanity check When enabling a bearer by name, we don't sanity check its name with higher slot in bearer list. This may have the effect that the name of an already enabled bearer bypasses the check. To fix the above issue, we just perform an extra checking with all existing bearers. Fixes: `cb30a63384` ("tipc: refactor function tipc_enable_bearer()") Cc: stable@vger.kernel.org Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-01 15:43:31 -07:00
Cong Wang	122e6c79ef	sock_map: Update sock type checks for UDP Now UDP supports sockmap and redirection, we can safely update the sock type checks for it accordingly. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-15-xiyou.wangcong@gmail.com	2021-04-01 10:56:14 -07:00
Cong Wang	1f5be6b3b0	udp: Implement udp_bpf_recvmsg() for sockmap We have to implement udp_bpf_recvmsg() to replace the ->recvmsg() to retrieve skmsg from ingress_msg. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-14-xiyou.wangcong@gmail.com	2021-04-01 10:56:14 -07:00
Cong Wang	2bc793e327	skmsg: Extract __tcp_bpf_recvmsg() and tcp_bpf_wait_data() Although these two functions are only used by TCP, they are not specific to TCP at all, both operate on skmsg and ingress_msg, so fit in net/core/skmsg.c very well. And we will need them for non-TCP, so rename and move them to skmsg.c and export them to modules. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331023237.41094-13-xiyou.wangcong@gmail.com	2021-04-01 10:56:14 -07:00
Cong Wang	d7f571188e	udp: Implement ->read_sock() for sockmap This is similar to tcp_read_sock(), except we do not need to worry about connections, we just need to retrieve skb from UDP receive queue. Note, the return value of ->read_sock() is unused in sk_psock_verdict_data_ready(), and UDP still does not support splice() due to lack of ->splice_read(), so users can not reach udp_read_sock() directly. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-12-xiyou.wangcong@gmail.com	2021-04-01 10:56:14 -07:00
Cong Wang	8a59f9d1e3	sock: Introduce sk->sk_prot->psock_update_sk_prot() Currently sockmap calls into each protocol to update the struct proto and replace it. This certainly won't work when the protocol is implemented as a module, for example, AF_UNIX. Introduce a new ops sk->sk_prot->psock_update_sk_prot(), so each protocol can implement its own way to replace the struct proto. This also helps get rid of symbol dependencies on CONFIG_INET. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331023237.41094-11-xiyou.wangcong@gmail.com	2021-04-01 10:56:14 -07:00
Cong Wang	a7ba4558e6	sock_map: Introduce BPF_SK_SKB_VERDICT Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is confusing and more importantly we still want to distinguish them from user-space. So we can just reuse the stream verdict code but introduce a new type of eBPF program, skb_verdict. Users are not allowed to attach stream_verdict and skb_verdict programs to the same map. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-10-xiyou.wangcong@gmail.com	2021-04-01 10:56:14 -07:00
Cong Wang	b017055255	sock_map: Kill sock_map_link_no_progs() Now we can fold sock_map_link_no_progs() into sock_map_link() and get rid of sock_map_link_no_progs(). Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331023237.41094-9-xiyou.wangcong@gmail.com	2021-04-01 10:56:13 -07:00
Cong Wang	2004fdbd8a	sock_map: Simplify sock_map_link() a bit sock_map_link() passes down map progs, but it is confusing to see both map progs and psock progs. Make the map progs more obvious by retrieving it directly with sock_map_progs() inside sock_map_link(). Now it is aligned with sock_map_link_no_progs() too. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-8-xiyou.wangcong@gmail.com	2021-04-01 10:56:13 -07:00
Cong Wang	190179f65b	skmsg: Use GFP_KERNEL in sk_psock_create_ingress_msg() This function is only called in process context. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-7-xiyou.wangcong@gmail.com	2021-04-01 10:56:13 -07:00
Cong Wang	7786dfc41a	skmsg: Use rcu work for destroying psock The RCU callback sk_psock_destroy() only queues work psock->gc, so we can just switch to rcu work to simplify the code. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-6-xiyou.wangcong@gmail.com	2021-04-01 10:56:13 -07:00
Cong Wang	799aa7f98d	skmsg: Avoid lock_sock() in sk_psock_backlog() We do not have to lock the sock to avoid losing sk_socket, instead we can purge all the ingress queues when we close the socket. Sending or receiving packets after orphaning socket makes no sense. We do purge these queues when psock refcnt reaches zero but here we want to purge them explicitly in sock_map_close(). There are also some nasty race conditions on testing bit SK_PSOCK_TX_ENABLED and queuing/canceling the psock work, we can expand psock->ingress_lock a bit to protect them too. As noticed by John, we still have to lock the psock->work, because the same work item could be running concurrently on different CPU's. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-5-xiyou.wangcong@gmail.com	2021-04-01 10:56:13 -07:00
Cong Wang	0739cd28f2	net: Introduce skb_send_sock() for sock_map We only have skb_send_sock_locked() which requires callers to use lock_sock(). Introduce a variant skb_send_sock() which locks on its own, callers do not need to lock it any more. This will save us from adding a ->sendmsg_locked for each protocol. To reuse the code, pass function pointers to __skb_send_sock() and build skb_send_sock() and skb_send_sock_locked() on top. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-4-xiyou.wangcong@gmail.com	2021-04-01 10:56:13 -07:00
Cong Wang	b01fd6e802	skmsg: Introduce a spinlock to protect ingress_msg Currently we rely on lock_sock to protect ingress_msg, it is too big for this, we can actually just use a spinlock to protect this list like protecting other skb queues. __tcp_bpf_recvmsg() is still special because of peeking, it still has to use lock_sock. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-3-xiyou.wangcong@gmail.com	2021-04-01 10:56:13 -07:00
Cong Wang	37f0e514db	skmsg: Lock ingress_skb when purging Currently we purge the ingress_skb queue only when psock refcnt goes down to 0, so locking the queue is not necessary, but in order to be called during ->close, we have to lock it here. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-2-xiyou.wangcong@gmail.com	2021-04-01 10:56:13 -07:00
Ong Boon Leong	622d13694b	xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory model xdp_return_frame() may be called outside of NAPI context to return xdpf back to page_pool. xdp_return_frame() calls __xdp_return() with napi_direct = false. For page_pool memory model, __xdp_return() calls xdp_return_frame_no_direct() unconditionally and below false negative kernel BUG throw happened under preempt-rt build: [ 430.450355] BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/3884 [ 430.451678] caller is __xdp_return+0x1ff/0x2e0 [ 430.452111] CPU: 0 PID: 3884 Comm: modprobe Tainted: G U E 5.12.0-rc2+ #45 Changes in v2: - This patch fixes the issue by making xdp_return_frame_no_direct() is only called if napi_direct = true, as recommended for better by Jesper Dangaard Brouer. Thanks! Fixes: `2539650fad` ("xdp: Helpers for disabling napi_direct of xdp_return_frame") Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 15:15:23 -07:00
Eric Dumazet	0d7a7b2014	ipv6: remove extra dev_hold() for fallback tunnels My previous commits added a dev_hold() in tunnels ndo_init(), but forgot to remove it from special functions setting up fallback tunnels. Fallback tunnels do call their respective ndo_init() This leads to various reports like : unregister_netdevice: waiting for ip6gre0 to become free. Usage count = 2 Fixes: `48bb569726` ("ip6_tunnel: sit: proper dev_{hold\|put} in ndo_[un]init methods") Fixes: `6289a98f08` ("sit: proper dev_{hold\|put} in ndo_[un]init methods") Fixes: `40cb881b5a` ("ip6_vti: proper dev_{hold\|put} in ndo_[un]init methods") Fixes: `7f700334be` ("ip6_gre: proper dev_{hold\|put} in ndo_[un]init methods") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:53:11 -07:00
Yang Yingliang	ac1db7acea	net/tipc: fix missing destroy_workqueue() on error in tipc_crypto_start() Add the missing destroy_workqueue() before return from tipc_crypto_start() in the error handling case. Fixes: `1ef6f7c939` ("tipc: add automatic session key exchange") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:52:00 -07:00
Eric Dumazet	a6175633a2	ipv6: convert elligible sysctls to u8 Convert most sysctls that can fit in a byte. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:48:20 -07:00
Eric Dumazet	1c3289c931	tcp: convert tcp_comp_sack_nr sysctl to u8 tcp_comp_sack_nr max value was already 255. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:48:20 -07:00
Eric Dumazet	7d4b37ebb9	ipv4: convert igmp_link_local_mcast_reports sysctl to u8 This sysctl is a bool, can use less storage. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:48:20 -07:00
Eric Dumazet	be205fe6ec	ipv4: convert fib_multipath_{use_neigh\|hash_policy} sysctls to u8 Make room for better packing of netns_ipv4 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:48:20 -07:00
Eric Dumazet	cd04bd0222	ipv4: convert udp_l3mdev_accept sysctl to u8 Reduce footprint of sysctls. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:48:20 -07:00
Eric Dumazet	b2908fac5b	ipv4: convert fib_notify_on_flag_change sysctl to u8 Reduce footprint of sysctls. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:48:19 -07:00
David S. Miller	c9170f1321	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2021-03-31 1) Fix ipv4 pmtu checks for xfrm anf vti interfaces. From Eyal Birger. 2) There are situations where the socket passed to xfrm_output_resume() is not the same as the one attached to the skb. Use the socket passed to xfrm_output_resume() to avoid lookup failures when xfrm is used with VRFs. From Evan Nimmo. 3) Make the xfrm_state_hash_generation sequence counter per network namespace because but its write serialization lock is also per network namespace. Write protection is insufficient otherwise. From Ahmed S. Darwish. 4) Fixup sctp featue flags when used with esp offload. From Xin Long. 5) xfrm BEET mode doesn't support fragments for inner packets. This is a limitation of the protocol, so no fix possible. Warn at least to notify the user about that situation. From Xin Long. 6) Fix NULL pointer dereference on policy lookup when namespaces are uses in combination with esp offload. 7) Fix incorrect transformation on esp offload when packets get segmented at layer 3. 8) Fix some user triggered usages of WARN_ONCE in the xfrm compat layer. From Dmitry Safonov. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:37:51 -07:00
Matthew Wilcox (Oracle)	3cbf7530a1	qrtr: Convert qrtr_ports from IDR to XArray The XArray interface is easier for this driver to use. Also fixes a bug reported by the improper use of GFP_ATOMIC. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:32:20 -07:00
Lv Yunlong	bdc2ab5c61	net/rds: Fix a use after free in rds_message_map_pages In rds_message_map_pages, the rm is freed by rds_message_put(rm). But rm is still used by rm->data.op_sg in return value. My patch assigns ERR_CAST(rm->data.op_sg) to err before the rm is freed to avoid the uaf. Fixes: `7dba92037b` ("net/rds: Use ERR_PTR for rds_message_alloc_sgs()") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:26:56 -07:00
Eric Dumazet	48bb569726	ip6_tunnel: sit: proper dev_{hold\|put} in ndo_[un]init methods Same reasons than for the previous commits : `6289a98f08` ("sit: proper dev_{hold\|put} in ndo_[un]init methods") `40cb881b5a` ("ip6_vti: proper dev_{hold\|put} in ndo_[un]init methods") `7f700334be` ("ip6_gre: proper dev_{hold\|put} in ndo_[un]init methods") After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger a warning [1] Issue here is that: - all dev_put() should be paired with a corresponding prior dev_hold(). - A driver doing a dev_put() in its ndo_uninit() MUST also do a dev_hold() in its ndo_init(), only when ndo_init() is returning 0. Otherwise, register_netdevice() would call ndo_uninit() in its error path and release a refcount too soon. [1] WARNING: CPU: 1 PID: 21059 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Modules linked in: CPU: 1 PID: 21059 Comm: syz-executor.4 Not tainted 5.12.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Code: 1d 6a 5a e8 09 31 ff 89 de e8 8d 1a ab fd 84 db 75 e0 e8 d4 13 ab fd 48 c7 c7 a0 e1 c1 89 c6 05 4a 5a e8 09 01 e8 2e 36 fb 04 <0f> 0b eb c4 e8 b8 13 ab fd 0f b6 1d 39 5a e8 09 31 ff 89 de e8 58 RSP: 0018:ffffc900025aefe8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000040000 RSI: ffffffff815c51f5 RDI: fffff520004b5def RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815bdf8e R11: 0000000000000000 R12: ffff888023488568 R13: ffff8880254e9000 R14: 00000000dfd82cfd R15: ffff88802ee2d7c0 FS: 00007f13bc590700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f0943e74000 CR3: 0000000025273000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __refcount_dec include/linux/refcount.h:344 [inline] refcount_dec include/linux/refcount.h:359 [inline] dev_put include/linux/netdevice.h:4135 [inline] ip6_tnl_dev_uninit+0x370/0x3d0 net/ipv6/ip6_tunnel.c:387 register_netdevice+0xadf/0x1500 net/core/dev.c:10308 ip6_tnl_create2+0x1b5/0x400 net/ipv6/ip6_tunnel.c:263 ip6_tnl_newlink+0x312/0x580 net/ipv6/ip6_tunnel.c:2052 __rtnl_newlink+0x1062/0x1710 net/core/rtnetlink.c:3443 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3491 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: `919067cc84` ("net: add CONFIG_PCPU_DEV_REFCNT") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:16:43 -07:00
Jakub Kicinski	1e5d1f69d9	ethtool: support FEC settings over netlink Add FEC API to netlink. This is not a 1-to-1 conversion. FEC settings already depend on link modes to tell user which modes are supported. Take this further an use link modes for manual configuration. Old struct ethtool_fecparam is still used to talk to the drivers, so we need to translate back and forth. We can revisit the internal API if number of FEC encodings starts to grow. Enforce only one active FEC bit (by using a bit position rather than another mask). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:15:23 -07:00
Tong Zhu	d47ec7a0a7	neighbour: Disregard DEAD dst in neigh_update After a short network outage, the dst_entry is timed out and put in DST_OBSOLETE_DEAD. We are in this code because arp reply comes from this neighbour after network recovers. There is a potential race condition that dst_entry is still in DST_OBSOLETE_DEAD. With that, another neighbour lookup causes more harm than good. In best case all packets in arp_queue are lost. This is counterproductive to the original goal of finding a better path for those packets. I observed a worst case with 4.x kernel where a dst_entry in DST_OBSOLETE_DEAD state is associated with loopback net_device. It leads to an ethernet header with all zero addresses. A packet with all zero source MAC address is quite deadly with mac80211, ath9k and 802.11 block ack. It fails ieee80211_find_sta_by_ifaddr in ath9k (xmit.c). Ath9k flushes tx queue (ath_tx_complete_aggr). BAW (block ack window) is not updated. BAW logic is damaged and ath9k transmission is disabled. Signed-off-by: Tong Zhu <zhutong@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:10:46 -07:00
Pablo Neira Ayuso	19c28b1374	netfilter: add helper function to set up the nfnetlink header and use it This patch adds a helper function to set up the netlink and nfnetlink headers. Update existing codebase to use it. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 22:34:11 +02:00
Pablo Neira Ayuso	802b805162	netfilter: nftables: add helper function to set the base sequence number This patch adds a helper function to calculate the base sequence number field that is stored in the nfnetlink header. Use the helper function whenever possible. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 22:34:11 +02:00
Yang Yingliang	7726c9ce71	netfilter: nftables: remove unnecessary spin_lock_init() The spinlock nf_tables_destroy_list_lock is initialized statically. It is unnecessary to initialize by spin_lock_init(). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 22:34:11 +02:00
Pablo Neira Ayuso	8b9229d158	netfilter: flowtable: dst_check() from garbage collector path Move dst_check() to the garbage collector path. Stale routes trigger the flow entry teardown state which makes affected flows go back to the classic forwarding path to re-evaluate flow offloading. IPv6 requires the dst cookie to work, store it in the flow_tuple, otherwise dst_check() always fails. Fixes: `e5075c0bad` ("netfilter: flowtable: call dst_check() to fall back to classic forwarding") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 22:34:11 +02:00
Richard Guy Briggs	c520292f29	audit: log nftables configuration change events once per table Reduce logging of nftables events to a level similar to iptables. Restore the table field to list the table, adding the generation. Indicate the op as the most significant operation in the event. A couple of sample events: type=PROCTITLE msg=audit(2021-03-18 09:30:49.801:143) : proctitle=/usr/bin/python3 -s /usr/sbin/firewalld --nofork --nopid type=SYSCALL msg=audit(2021-03-18 09:30:49.801:143) : arch=x86_64 syscall=sendmsg success=yes exit=172 a0=0x6 a1=0x7ffdcfcbe650 a2=0x0 a3=0x7ffdcfcbd52c items=0 ppid=1 pid=367 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=roo t sgid=root fsgid=root tty=(none) ses=unset comm=firewalld exe=/usr/bin/python3.9 subj=system_u:system_r:firewalld_t:s0 key=(null) type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : table=firewalld:2 family=ipv6 entries=1 op=nft_register_table pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : table=firewalld:2 family=ipv4 entries=1 op=nft_register_table pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : table=firewalld:2 family=inet entries=1 op=nft_register_table pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld type=PROCTITLE msg=audit(2021-03-18 09:30:49.839:144) : proctitle=/usr/bin/python3 -s /usr/sbin/firewalld --nofork --nopid type=SYSCALL msg=audit(2021-03-18 09:30:49.839:144) : arch=x86_64 syscall=sendmsg success=yes exit=22792 a0=0x6 a1=0x7ffdcfcbe650 a2=0x0 a3=0x7ffdcfcbd52c items=0 ppid=1 pid=367 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=r oot sgid=root fsgid=root tty=(none) ses=unset comm=firewalld exe=/usr/bin/python3.9 subj=system_u:system_r:firewalld_t:s0 key=(null) type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : table=firewalld:3 family=ipv6 entries=30 op=nft_register_chain pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : table=firewalld:3 family=ipv4 entries=30 op=nft_register_chain pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : table=firewalld:3 family=inet entries=165 op=nft_register_chain pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld The issue was originally documented in https://github.com/linux-audit/audit-kernel/issues/124 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 22:34:11 +02:00
Florian Westphal	cefa31a9d4	netfilter: nft_log: perform module load from nf_tables modprobe calls from the nf_logger_find_get() API causes deadlock in very special cases because they occur with the nf_tables transaction mutex held. In the specific case of nf_log, deadlock is via: A nf_tables -> transaction mutex -> nft_log -> modprobe -> nf_log_syslog \ -> pernet_ops rwsem -> wait for C B netlink event -> rtnl_mutex -> nf_tables transaction mutex -> wait for A C close() -> ip6mr_sk_done -> rtnl_mutex -> wait for B Earlier patch added NFLOG/xt_LOG module softdeps to avoid the need to load the backend module during a transaction. For nft_log we would have to add a softdep for both nfnetlink_log or nf_log_syslog, since we do not know in advance which of the two backends are going to be configured. This defers the modprobe op until after the transaction mutex is released. Tested-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 22:34:11 +02:00
Florian Westphal	a38b5b56d6	netfilter: nf_log: add module softdeps xt_LOG has no direct dependency on the syslog-based logger, it relies on the nf_log core to probe the requested backend. Now that all syslog-based loggers reside in the same module, we can just add a soft dependency on nf_log_syslog and let modprobe take care of it. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 22:34:10 +02:00
Florian Westphal	e465cccd0b	netfilter: nf_log_common: merge with nf_log_syslog Remove nf_log_common. Now that all per-af modules have been merged there is no longer a need to provide a helper module. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 22:34:10 +02:00
Florian Westphal	77ccee96a6	netfilter: nf_log_bridge: merge with nf_log_syslog Provide bridge log support from nf_log_syslog. After the merge there is no need to load the "real packet loggers", all of them now reside in the same module. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 22:34:05 +02:00
Chuck Lever	e3eded5e81	svcrdma: Clean up dto_q critical section in svc_rdma_recvfrom() This, to me, seems less cluttered and less redundant. I was hoping it could help reduce lock contention on the dto_q lock by reducing the size of the critical section, but alas, the only improvement is readability. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2021-03-31 15:58:48 -04:00
Chuck Lever	5533c4f4b9	svcrdma: Remove svc_rdma_recv_ctxt::rc_pages and ::rc_arg These fields are no longer used. The size of struct svc_rdma_recv_ctxt is now less than 300 bytes on x86_64, down from 2440 bytes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2021-03-31 15:57:48 -04:00
Chuck Lever	9af723be86	svcrdma: Remove sc_read_complete_q Now that svc_rdma_recvfrom() waits for Read completion, sc_read_complete_q is no longer used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2021-03-31 15:57:48 -04:00
Chuck Lever	7d81ee8722	svcrdma: Single-stage RDMA Read Currently the generic RPC server layer calls svc_rdma_recvfrom() twice to retrieve an RPC message that uses Read chunks. I'm not exactly sure why this design was chosen originally. Instead, let's wait for the Read chunk completion inline in the first call to svc_rdma_recvfrom(). The goal is to eliminate some page allocator churn. rdma_read_complete() replaces pages in the second svc_rqst by calling put_page() repeatedly while the upper layer waits for the request to be constructed, which adds unnecessary NFS WRITE round- trip latency. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Tom Talpey <tom@talpey.com>	2021-03-31 15:57:39 -04:00
Geliang Tang	740d798e87	mptcp: remove id 0 address This patch added a new function mptcp_nl_remove_id_zero_address to remove the id 0 address. In this function, traverse all the existing msk sockets to find the msk matched the input IP address. Then fill the removing list with id 0, and pass it to mptcp_pm_remove_addr and mptcp_pm_remove_subflow. Suggested-by: Paolo Abeni <pabeni@redhat.com> Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:42:23 -07:00
Geliang Tang	9f12e97bf1	mptcp: unify RM_ADDR and RM_SUBFLOW receiving There are some duplicate code in mptcp_pm_nl_rm_addr_received and mptcp_pm_nl_rm_subflow_received. This patch unifies them into a new function named mptcp_pm_nl_rm_addr_or_subflow. In it, use the input parameter rm_type to identify it's now removing an address or a subflow. Suggested-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:42:23 -07:00
Geliang Tang	774c8a8dcb	mptcp: remove all subflows involving id 0 address There's only one subflow involving the non-zero id address, but there may be multi subflows involving the id 0 address. Here's an example: local_id=0, remote_id=0 local_id=1, remote_id=0 local_id=0, remote_id=1 If the removing address id is 0, all the subflows involving the id 0 address need to be removed. In mptcp_pm_nl_rm_addr_received/mptcp_pm_nl_rm_subflow_received, the "break" prevents the iteration to the next subflow, so this patch dropped them. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:42:23 -07:00
Eric Dumazet	b8128656a5	net: fix icmp_echo_enable_probe sysctl sysctl_icmp_echo_enable_probe is an u8. ipv4_net_table entry should use .maxlen = sizeof(u8). .proc_handler = proc_dou8vec_minmax, Fixes: `f1b8fa9fa5` ("net: add sysctl for enabling RFC 8335 PROBE messages") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:38:43 -07:00
Paolo Abeni	78352f73dc	udp: never accept GSO_FRAGLIST packets Currently the UDP protocol delivers GSO_FRAGLIST packets to the sockets without the expected segmentation. This change addresses the issue introducing and maintaining a couple of new fields to explicitly accept SKB_GSO_UDP_L4 or GSO_FRAGLIST packets. Additionally updates udp_unexpected_gso() accordingly. UDP sockets enabling UDP_GRO stil keep accept_udp_fraglist zeroed. v1 -> v2: - use 2 bits instead of a whole GSO bitmask (Willem) Fixes: `9fd1ff5d2a` ("udp: Support UDP fraglist GRO/GSO.") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:06:49 -07:00
Paolo Abeni	e0e3070a9b	udp: properly complete L4 GRO over UDP tunnel packet After the previous patch, the stack can do L4 UDP aggregation on top of a UDP tunnel. In such scenario, udp{4,6}_gro_complete will be called twice. This function will enter its is_flist branch immediately, even though that is only correct on the second call, as GSO_FRAGLIST is only relevant for the inner packet. Instead, we need to try first UDP tunnel-based aggregation, if the GRO packet requires that. This patch changes udp{4,6}_gro_complete to skip the frag list processing when while encap_mark == 1, identifying processing of the outer tunnel header. Additionally, clears the field in udp_gro_complete() so that we can enter the frag list path on the next round, for the inner header. v1 -> v2: - hopefully clarified the commit message Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:06:49 -07:00
Paolo Abeni	18f25dc399	udp: skip L4 aggregation for UDP tunnel packets If NETIF_F_GRO_FRAGLIST or NETIF_F_GRO_UDP_FWD are enabled, and there are UDP tunnels available in the system, udp_gro_receive() could end-up doing L4 aggregation (either SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST) at the outer UDP tunnel level for packets effectively carrying and UDP tunnel header. That could cause inner protocol corruption. If e.g. the relevant packets carry a vxlan header, different vxlan ids will be ignored/ aggregated to the same GSO packet. Inner headers will be ignored, too, so that e.g. TCP over vxlan push packets will be held in the GRO engine till the next flush, etc. Just skip the SKB_GSO_UDP_L4 and SKB_GSO_FRAGLIST code path if the current packet could land in a UDP tunnel, and let udp_gro_receive() do GRO via udp_sk(sk)->gro_receive. The check implemented in this patch is broader than what is strictly needed, as the existing UDP tunnel could be e.g. configured on top of a different device: we could end-up skipping GRO at-all for some packets. Anyhow, that is a very thin corner case and covering it will add quite a bit of complexity. v1 -> v2: - hopefully clarify the commit message Fixes: `9fd1ff5d2a` ("udp: Support UDP fraglist GRO/GSO.") Fixes: `36707061d6` ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets") Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:06:49 -07:00
Paolo Abeni	000ac44da7	udp: fixup csum for GSO receive slow path When UDP packets generated locally by a socket with UDP_SEGMENT traverse the following path: UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) -> UDP tunnel (rx) -> UDP socket (no UDP_GRO) ip_summed will be set to CHECKSUM_PARTIAL at creation time and such checksum mode will be preserved in the above path up to the UDP tunnel receive code where we have: __iptunnel_pull_header() -> skb_pull_rcsum() -> skb_postpull_rcsum() -> __skb_postpull_rcsum() The latter will convert the skb to CHECKSUM_NONE. The UDP GSO packet will be later segmented as part of the rx socket receive operation, and will present a CHECKSUM_NONE after segmentation. Additionally the segmented packets UDP CB still refers to the original GSO packet len. Overall that causes unexpected/wrong csum validation errors later in the UDP receive path. We could possibly address the issue with some additional checks and csum mangling in the UDP tunnel code. Since the issue affects only this UDP receive slow path, let's set a suitable csum status there. Note that SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST packets lacking an UDP encapsulation present a valid checksum when landing to udp_queue_rcv_skb(), as the UDP checksum has been validated by the GRO engine. v2 -> v3: - even more verbose commit message and comments v1 -> v2: - restrict the csum update to the packets strictly needing them - hopefully clarify the commit message and code comments Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:06:49 -07:00
Wang Qing	b9aa074b89	net/decnet: Delete obsolete TODO file The TODO file here has not been updated from 2005, and the function development described in the file have been implemented or abandoned. Its existence will mislead developers seeking to view outdated information. Signed-off-by: Wang Qing <wangqing@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 16:54:50 -07:00
Wang Qing	8d9e5bbf5c	net/ax25: Delete obsolete TODO file The TODO file here has not been updated for 13 years, and the function development described in the file have been implemented or abandoned. Its existence will mislead developers seeking to view outdated information. Signed-off-by: Wang Qing <wangqing@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 16:54:50 -07:00
Pablo Neira Ayuso	fbea31808c	netfilter: conntrack: do not print icmpv6 as unknown via /proc /proc/net/nf_conntrack shows icmpv6 as unknown. Fixes: `09ec82f5af` ("netfilter: conntrack: remove protocol name from l4proto struct") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 01:12:47 +02:00
Pablo Neira Ayuso	0e07e25b48	netfilter: flowtable: fix NAT IPv6 offload mangling Fix out-of-bound access in the address array. Fixes: `5c27d8d76c` ("netfilter: nf_flow_table_offload: add IPv6 support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 01:12:47 +02:00
Florian Westphal	1510618e45	netfilter: nf_log_netdev: merge with nf_log_syslog Provide netdev family support from the nf_log_syslog module. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 00:37:27 +02:00
Florian Westphal	f5466caab9	netfilter: nf_log_ipv6: merge with nf_log_syslog This removes the nf_log_ipv6 module, the functionality is now provided by nf_log_syslog. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 00:37:27 +02:00
Florian Westphal	f11d61e795	netfilter: nf_log_arp: merge with nf_log_syslog similar to previous change: nf_log_syslog now covers ARP logging as well, the old nf_log_arp module is removed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 00:37:27 +02:00
Florian Westphal	db3187ae21	netfilter: nf_log_ipv4: rename to nf_log_syslog Netfilter has multiple log modules: nf_log_arp nf_log_bridge nf_log_ipv4 nf_log_ipv6 nf_log_netdev nfnetlink_log nf_log_common With the exception of nfnetlink_log (packet is sent to userspace for dissection/logging), all of them log to the kernel ringbuffer. This is the first part of a series to merge all modules except nfnetlink_log into a single module: nf_log_syslog. This allows to reduce code. After the series, only two log modules remain: nfnetlink_log and nf_log_syslog. The latter provides the same functionality as the old per-af log modules. This renames nf_log_ipv4 to nf_log_syslog. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-03-31 00:37:27 +02:00
Paolo Abeni	9adc89af72	net: let skb_orphan_partial wake-up waiters. Currently the mentioned helper can end-up freeing the socket wmem without waking-up any processes waiting for more write memory. If the partially orphaned skb is attached to an UDP (or raw) socket, the lack of wake-up can hang the user-space. Even for TCP sockets not calling the sk destructor could have bad effects on TSQ. Address the issue using skb_orphan to release the sk wmem before setting the new sock_efree destructor. Additionally bundle the whole ownership update in a new helper, so that later other potential users could avoid duplicate code. v1 -> v2: - use skb_orphan() instead of sort of open coding it (Eric) - provide an helper for the ownership change (Eric) Fixes: `f6ba8d33cf` ("netem: fix skb_orphan_partial()") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 13:57:28 -07:00
Yunjian Wang	ae81feb733	sch_htb: fix null pointer dereference on a null new_q sch_htb: fix null pointer dereference on a null new_q Currently if new_q is null, the null new_q pointer will be dereference when 'q->offload' is true. Fix this by adding a braces around htb_parent_to_leaf_offload() to avoid it. Addresses-Coverity: ("Dereference after null check") Fixes: `d03b195b5a` ("sch_htb: Hierarchical QoS hardware offload") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 13:50:03 -07:00
Loic Poulain	8a03dd9257	net: qrtr: Fix memory leak on qrtr_tx_wait failure qrtr_tx_wait does not check for radix_tree_insert failure, causing the 'flow' object to be unreferenced after qrtr_tx_wait return. Fix that by releasing flow on radix_tree_insert failure. Fixes: `5fdeb0d372` ("net: qrtr: Implement outgoing flow control") Reported-by: syzbot+739016799a89c530b32a@syzkaller.appspotmail.com Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 13:48:29 -07:00
Andreas Roeseler	d329ea5bd8	icmp: add response to RFC 8335 PROBE messages Modify the icmp_rcv function to check PROBE messages and call icmp_echo if a PROBE request is detected. Modify the existing icmp_echo function to respond ot both ping and PROBE requests. This was tested using a custom modification to the iputils package and wireshark. It supports IPV4 probing by name, ifindex, and probing by both IPV4 and IPV6 addresses. It currently does not support responding to probes off the proxy node (see RFC 8335 Section 2). The modification to the iputils package is still in development and can be found here: https://github.com/Juniper-Clinic-2020/iputils.git. It supports full sending functionality of PROBE requests, but currently does not parse the response messages, which is why Wireshark is required to verify the sent and recieved PROBE messages. The modification adds the ``-e'' flag to the command which allows the user to specify the interface identifier to query the probed host. An example usage would be <./ping -4 -e 1 [destination]> to send a PROBE request of ifindex 1 to the destination node. Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 13:29:39 -07:00
Andreas Roeseler	504a40113c	ipv6: add ipv6_dev_find to stubs Add ipv6_dev_find to ipv6_stub to allow lookup of net_devices by IPV6 address in net/ipv4/icmp.c. Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 13:29:39 -07:00
Andreas Roeseler	08baf54f01	net: add support for sending RFC 8335 PROBE messages Modify the ping_supported function to support PROBE message types. This allows tools such as the ping command in the iputils package to be modified to send PROBE requests through the existing framework for sending ping requests. Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 13:29:39 -07:00
Andreas Roeseler	f1b8fa9fa5	net: add sysctl for enabling RFC 8335 PROBE messages Section 8 of RFC 8335 specifies potential security concerns of responding to PROBE requests, and states that nodes that support PROBE functionality MUST be able to enable/disable responses and that responses MUST be disabled by default Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 13:29:39 -07:00
Kumar Kartikeya Dwivedi	6855e8213e	net: sched: bump refcount for new action in ACT replace mode Currently, action creation using ACT API in replace mode is buggy. When invoking for non-existent action index 42, tc action replace action bpf obj foo.o sec <xyz> index 42 kernel creates the action, fills up the netlink response, and then just deletes the action after notifying userspace. tc action show action bpf doesn't list the action. This happens due to the following sequence when ovr = 1 (replace mode) is enabled: tcf_idr_check_alloc is used to atomically check and either obtain reference for existing action at index, or reserve the index slot using a dummy entry (ERR_PTR(-EBUSY)). This is necessary as pointers to these actions will be held after dropping the idrinfo lock, so bumping the reference count is necessary as we need to insert the actions, and notify userspace by dumping their attributes. Finally, we drop the reference we took using the tcf_action_put_many call in tcf_action_add. However, for the case where a new action is created due to free index, its refcount remains one. This when paired with the put_many call leads to the kernel setting up the action, notifying userspace of its creation, and then tearing it down. For existing actions, the refcount is still held so they remain unaffected. Fortunately due to rtnl_lock serialization requirement, such an action with refcount == 1 will not be concurrently deleted by anything else, at best CLS API can move its refcount up and down by binding to it after it has been published from tcf_idr_insert_many. Since refcount is atleast one until put_many call, CLS API cannot delete it. Also __tcf_action_put release path already ensures deterministic outcome (either new action will be created or existing action will be reused in case CLS API tries to bind to action concurrently) due to idr lock serialization. We fix this by making refcount of newly created actions as 2 in ACT API replace mode. A relaxed store will suffice as visibility is ensured only after the tcf_idr_insert_many call. Note that in case of creation or overwriting using CLS API only (i.e. bind = 1), overwriting existing action object is not allowed, and any such request is silently ignored (without error). The refcount bump that occurs in tcf_idr_check_alloc call there for existing action will pair with tcf_exts_destroy call made from the owner module for the same action. In case of action creation, there is no existing action, so no tcf_exts_destroy callback happens. This means no code changes for CLS API. Fixes: `cae422f379` ("net: sched: use reference counting action init") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 13:18:02 -07:00
Milton Miller	03cb4d05b4	net/ncsi: Avoid channel_monitor hrtimer deadlock Calling ncsi_stop_channel_monitor from channel_monitor is a guaranteed deadlock on SMP because stop calls del_timer_sync on the timer that invoked channel_monitor as its timer function. Recognise the inherent race of marking the monitor disabled before deleting the timer by just returning if enable was cleared. After a timeout (the default case -- reset to START when response received) just mark the monitor.enabled false. If the channel has an entry on the channel_queue list, or if the state is not ACTIVE or INACTIVE, then warn and mark the timer stopped and don't restart, as the locking is broken somehow. Fixes: `0795fb2021` ("net/ncsi: Stop monitor if channel times out or is inactive") Signed-off-by: Milton Miller <miltonm@us.ibm.com> Signed-off-by: Eddie James <eajames@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 13:16:23 -07:00
Sven Eckelmann	35796c1d34	batman-adv: Fix misspelled "wont" checkpatch started to complain about the mispelling of: CHECK: 'wont' may be misspelled - perhaps 'won't'? #459: FILE: ./net/batman-adv/bat_iv_ogm.c:459: + * - the resulting packet wont be bigger than Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2021-03-30 21:19:50 +02:00
Dmitry Safonov	ef19e11133	xfrm/compat: Cleanup WARN()s that can be user-triggered Replace WARN_ONCE() that can be triggered from userspace with pr_warn_once(). Those still give user a hint what's the issue. I've left WARN()s that are not possible to trigger with current code-base and that would mean that the code has issues: - relying on current compat_msg_min[type] <= xfrm_msg_min[type] - expected 4-byte padding size difference between compat_msg_min[type] and xfrm_msg_min[type] - compat_policy[type].len <= xfrma_policy[type].len (for every type) Reported-by: syzbot+834ffd1afc7212eb8147@syzkaller.appspotmail.com Fixes: `5f3eea6b7e` ("xfrm/compat: Attach xfrm dumps to 64=>32 bit translator") Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2021-03-30 07:29:09 +02:00
Martin KaFai Lau	7aae231ac9	bpf: tcp: Limit calling some tcp cc functions to CONFIG_DYNAMIC_FTRACE pahole currently only generates the btf_id for external function and ftrace-able function. Some functions in the bpf_tcp_ca_kfunc_ids are static (e.g. cubictcp_init). Thus, unless CONFIG_DYNAMIC_FTRACE is set, btf_ids for those functions will not be generated and the compilation fails during resolve_btfids. This patch limits those functions to CONFIG_DYNAMIC_FTRACE. I will address the pahole generation in a followup and then remove the CONFIG_DYNAMIC_FTRACE limitation. Fixes: `e78aea8b21` ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329221357.834438-1-kafai@fb.com	2021-03-29 18:42:43 -07:00
Eric Dumazet	d24f511b04	tcp: fix tcp_min_tso_segs sysctl tcp_min_tso_segs is now stored in u8, so max value is 255. 255 limit is enforced by proc_dou8vec_minmax(). We can therefore remove the gso_max_segs variable. Fixes: 47996b489bdc ("tcp: convert elligible sysctls to u8") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-29 16:33:48 -07:00
Eric Dumazet	6289a98f08	sit: proper dev_{hold\|put} in ndo_[un]init methods After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger a warning [1] Issue here is that: - all dev_put() should be paired with a corresponding prior dev_hold(). - A driver doing a dev_put() in its ndo_uninit() MUST also do a dev_hold() in its ndo_init(), only when ndo_init() is returning 0. Otherwise, register_netdevice() would call ndo_uninit() in its error path and release a refcount too soon. Fixes: `919067cc84` ("net: add CONFIG_PCPU_DEV_REFCNT") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-29 16:31:51 -07:00
Eric Dumazet	40cb881b5a	ip6_vti: proper dev_{hold\|put} in ndo_[un]init methods After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger a warning [1] Issue here is that: - all dev_put() should be paired with a corresponding prior dev_hold(). - A driver doing a dev_put() in its ndo_uninit() MUST also do a dev_hold() in its ndo_init(), only when ndo_init() is returning 0. Otherwise, register_netdevice() would call ndo_uninit() in its error path and release a refcount too soon. Therefore, we need to move dev_hold() call from vti6_tnl_create2() to vti6_dev_init_gen() [1] WARNING: CPU: 0 PID: 15951 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Modules linked in: CPU: 0 PID: 15951 Comm: syz-executor.3 Not tainted 5.12.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Code: 1d 6a 5a e8 09 31 ff 89 de e8 8d 1a ab fd 84 db 75 e0 e8 d4 13 ab fd 48 c7 c7 a0 e1 c1 89 c6 05 4a 5a e8 09 01 e8 2e 36 fb 04 <0f> 0b eb c4 e8 b8 13 ab fd 0f b6 1d 39 5a e8 09 31 ff 89 de e8 58 RSP: 0018:ffffc90001eaef28 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000040000 RSI: ffffffff815c51f5 RDI: fffff520003d5dd7 RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815bdf8e R11: 0000000000000000 R12: ffff88801bb1c568 R13: ffff88801f69e800 R14: 00000000ffffffff R15: ffff888050889d40 FS: 00007fc79314e700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1c1ff47108 CR3: 0000000020fd5000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __refcount_dec include/linux/refcount.h:344 [inline] refcount_dec include/linux/refcount.h:359 [inline] dev_put include/linux/netdevice.h:4135 [inline] vti6_dev_uninit+0x31a/0x360 net/ipv6/ip6_vti.c:297 register_netdevice+0xadf/0x1500 net/core/dev.c:10308 vti6_tnl_create2+0x1b5/0x400 net/ipv6/ip6_vti.c:190 vti6_newlink+0x9d/0xd0 net/ipv6/ip6_vti.c:1020 __rtnl_newlink+0x1062/0x1710 net/core/rtnetlink.c:3443 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3491 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x331/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmmsg+0x195/0x470 net/socket.c:2490 __do_sys_sendmmsg net/socket.c:2519 [inline] __se_sys_sendmmsg net/socket.c:2516 [inline] __x64_sys_sendmmsg+0x99/0x100 net/socket.c:2516 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-29 16:31:51 -07:00
Eric Dumazet	7f700334be	ip6_gre: proper dev_{hold\|put} in ndo_[un]init methods After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger a warning [1] Issue here is that: - all dev_put() should be paired with a corresponding dev_hold(), and vice versa. - A driver doing a dev_put() in its ndo_uninit() MUST also do a dev_hold() in its ndo_init(), only when ndo_init() is returning 0. Otherwise, register_netdevice() would call ndo_uninit() in its error path and release a refcount too soon. ip6_gre for example (among others problematic drivers) has to use dev_hold() in ip6gre_tunnel_init_common() instead of from ip6gre_newlink_common(), covering both ip6gre_tunnel_init() and ip6gre_tap_init()/ Note that ip6gre_tunnel_init_common() is not called from ip6erspan_tap_init() thus we also need to add a dev_hold() there, as ip6erspan_tunnel_uninit() does call dev_put() [1] refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 0 PID: 8422 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Modules linked in: CPU: 1 PID: 8422 Comm: syz-executor854 Not tainted 5.12.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Code: 1d 6a 5a e8 09 31 ff 89 de e8 8d 1a ab fd 84 db 75 e0 e8 d4 13 ab fd 48 c7 c7 a0 e1 c1 89 c6 05 4a 5a e8 09 01 e8 2e 36 fb 04 <0f> 0b eb c4 e8 b8 13 ab fd 0f b6 1d 39 5a e8 09 31 ff 89 de e8 58 RSP: 0018:ffffc900018befd0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88801ef19c40 RSI: ffffffff815c51f5 RDI: fffff52000317dec RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815bdf8e R11: 0000000000000000 R12: ffff888018cf4568 R13: ffff888018cf4c00 R14: ffff8880228f2000 R15: ffffffff8d659b80 FS: 00000000014eb300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055d7bf2b3138 CR3: 0000000014933000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __refcount_dec include/linux/refcount.h:344 [inline] refcount_dec include/linux/refcount.h:359 [inline] dev_put include/linux/netdevice.h:4135 [inline] ip6gre_tunnel_uninit+0x3d7/0x440 net/ipv6/ip6_gre.c:420 register_netdevice+0xadf/0x1500 net/core/dev.c:10308 ip6gre_newlink_common.constprop.0+0x158/0x410 net/ipv6/ip6_gre.c:1984 ip6gre_newlink+0x275/0x7a0 net/ipv6/ip6_gre.c:2017 __rtnl_newlink+0x1062/0x1710 net/core/rtnetlink.c:3443 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3491 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 Fixes: `919067cc84` ("net: add CONFIG_PCPU_DEV_REFCNT") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-29 16:31:51 -07:00
Jon Maloy	02fdc14d9b	tipc: fix htmldoc and smatch warnings We fix a warning from the htmldoc tool and an indentation error reported by smatch. There are no functional changes in this commit. Signed-off-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-29 16:28:50 -07:00
Lv Yunlong	6bf24dc0cc	net:tipc: Fix a double free in tipc_sk_mcast_rcv In the if(skb_peek(arrvq) == skb) branch, it calls __skb_dequeue(arrvq) to get the skb by skb = skb_peek(arrvq). Then __skb_dequeue() unlinks the skb from arrvq and returns the skb which equals to skb_peek(arrvq). After __skb_dequeue(arrvq) finished, the skb is freed by kfree_skb(__skb_dequeue(arrvq)) in the first time. Unfortunately, the same skb is freed in the second time by kfree_skb(skb) after the branch completed. My patch removes kfree_skb() in the if(skb_peek(arrvq) == skb) branch, because this skb will be freed by kfree_skb(skb) finally. Fixes: `cb1b728096` ("tipc: eliminate race condition at multicast reception") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-29 16:21:37 -07:00
Maxim Kochetkov	fb6ec87f72	net: dsa: Fix type was not set for devlink port If PHY is not available on DSA port (described at devicetree but absent or failed to detect) then kernel prints warning after 3700 secs: [ 3707.948771] ------------[ cut here ]------------ [ 3707.948784] Type was not set for devlink port. [ 3707.948894] WARNING: CPU: 1 PID: 17 at net/core/devlink.c:8097 0xc083f9d8 We should unregister the devlink port as a user port and re-register it as an unused port before executing "continue" in case of dsa_port_setup error. Fixes: `86f8b1c01a` ("net: dsa: Do not make user port errors fatal") Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-29 13:49:04 -07:00
Oliver Hartkopp	f522d9559b	can: isotp: fix msg_namelen values depending on CAN_REQUIRED_SIZE Since commit `f5223e9eee` ("can: extend sockaddr_can to include j1939 members") the sockaddr_can has been extended in size and a new CAN_REQUIRED_SIZE macro has been introduced to calculate the protocol specific needed size. The ABI for the msg_name and msg_namelen has not been adapted to the new CAN_REQUIRED_SIZE macro for the other CAN protocols which leads to a problem when an existing binary reads the (increased) struct sockaddr_can in msg_name. Fixes: `e057dd3fc2` ("can: add ISO 15765-2:2016 transport protocol") Reported-by: Richard Weinberger <richard@nod.at> Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Link: https://lore.kernel.org/linux-can/1135648123.112255.1616613706554.JavaMail.zimbra@nod.at/T/#t Link: https://lore.kernel.org/r/20210325125850.1620-2-socketcan@hartkopp.net Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-03-29 09:51:43 +02:00
Oliver Hartkopp	9e9714742f	can: bcm/raw: fix msg_namelen values depending on CAN_REQUIRED_SIZE Since commit `f5223e9eee` ("can: extend sockaddr_can to include j1939 members") the sockaddr_can has been extended in size and a new CAN_REQUIRED_SIZE macro has been introduced to calculate the protocol specific needed size. The ABI for the msg_name and msg_namelen has not been adapted to the new CAN_REQUIRED_SIZE macro for the other CAN protocols which leads to a problem when an existing binary reads the (increased) struct sockaddr_can in msg_name. Fixes: `f5223e9eee` ("can: extend sockaddr_can to include j1939 members") Reported-by: Richard Weinberger <richard@nod.at> Tested-by: Richard Weinberger <richard@nod.at> Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Link: https://lore.kernel.org/linux-can/1135648123.112255.1616613706554.JavaMail.zimbra@nod.at/T/#t Link: https://lore.kernel.org/r/20210325125850.1620-1-socketcan@hartkopp.net Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-03-29 09:51:20 +02:00
Steffen Klassert	c7dbf4c088	xfrm: Provide private skb extensions for segmented and hw offloaded ESP packets Commit `94579ac3f6` ("xfrm: Fix double ESP trailer insertion in IPsec crypto offload.") added a XFRM_XMIT flag to avoid duplicate ESP trailer insertion on HW offload. This flag is set on the secpath that is shared amongst segments. This lead to a situation where some segments are not transformed correctly when segmentation happens at layer 3. Fix this by using private skb extensions for segmented and hw offloaded ESP packets. Fixes: `94579ac3f6` ("xfrm: Fix double ESP trailer insertion in IPsec crypto offload.") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2021-03-29 09:14:12 +02:00
Martin KaFai Lau	21cfd2db9f	bpf: tcp: Fix an error in the bpf_tcp_ca_kfunc_ids list There is a typo in the bbr function, s/even/event/. This patch fixes it. Fixes: `e78aea8b21` ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329003213.2274210-1-kafai@fb.com	2021-03-28 18:17:08 -07:00
kernel test robot	284fda1eff	sit: use min Opportunity for min() Generated by: scripts/coccinelle/misc/minmax.cocci CC: Denis Efremov <efremov@linux.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:58:39 -07:00
Xiongfeng Wang	b6908cf795	NFC: digital: Correct function name in the kerneldoc comments Fix the following W=1 kernel build warning(s): net/nfc/digital_core.c:473: warning: expecting prototype for start_poll operation(). Prototype was for digital_start_poll() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:56:56 -07:00
Xiongfeng Wang	f7b88985a1	ip6_tunnel:: Correct function name parse_tvl_tnl_enc_lim() in the kerneldoc comments Fix the following W=1 kernel build warning(s): net/ipv6/ip6_tunnel.c:401: warning: expecting prototype for parse_tvl_tnl_enc_lim(). Prototype was for ip6_tnl_parse_tlv_enc_lim() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:56:56 -07:00
Xiongfeng Wang	03ff7371cb	net: 9p: Correct function names in the kerneldoc comments Fix the following W=1 kernel build warning(s): net/9p/client.c:133: warning: expecting prototype for parse_options(). Prototype was for parse_opts() instead net/9p/client.c:269: warning: expecting prototype for p9_req_alloc(). Prototype was for p9_tag_alloc() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:56:56 -07:00
Xiongfeng Wang	54e625e3bd	9p/trans_fd: Correct function name p9_mux_destroy() in the kerneldoc Fix the following W=1 kernel build warning(s): net/9p/trans_fd.c:881: warning: expecting prototype for p9_mux_destroy(). Prototype was for p9_conn_destroy() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:56:56 -07:00
Xiongfeng Wang	8bf94a9250	net: 9p: Correct function name errstr2errno() in the kerneldoc comments Fix the following W=1 kernel build warning(s): net/9p/error.c:207: warning: expecting prototype for errstr2errno(). Prototype was for p9_errstr2errno() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:56:56 -07:00
Xiongfeng Wang	bb2882bc6c	net: core: Correct function name netevent_unregister_notifier() in the kerneldoc Fix the following W=1 kernel build warning(s): net/core/netevent.c:45: warning: expecting prototype for netevent_unregister_notifier(). Prototype was for unregister_netevent_notifier() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:56:56 -07:00
Xiongfeng Wang	af82508743	net: core: Correct function name dev_uc_flush() in the kerneldoc Fix the following W=1 kernel build warning(s): net/core/dev_addr_lists.c:732: warning: expecting prototype for dev_uc_flush(). Prototype was for dev_uc_init() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:56:56 -07:00
Xiongfeng Wang	3ba937fb95	netlabel: Correct function name netlbl_mgmt_add() in the kerneldoc comments Fix the following W=1 kernel build warning(s): net/netlabel/netlabel_mgmt.c:78: warning: expecting prototype for netlbl_mgmt_add(). Prototype was for netlbl_mgmt_add_common() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:56:55 -07:00
Xiongfeng Wang	37569287cb	l3mdev: Correct function names in the kerneldoc comments Fix the following W=1 kernel build warning(s): net/l3mdev/l3mdev.c:111: warning: expecting prototype for l3mdev_master_ifindex(). Prototype was for l3mdev_master_ifindex_rcu() instead net/l3mdev/l3mdev.c:145: warning: expecting prototype for l3mdev_master_upper_ifindex_by_index(). Prototype was for l3mdev_master_upper_ifindex_by_index_rcu() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:56:55 -07:00
Petr Machata	de1d1ee3e3	nexthop: Rename artifacts related to legacy multipath nexthop groups After resilient next-hop groups have been added recently, there are two types of multipath next-hop groups: the legacy "mpath", and the new "resilient". Calling the legacy next-hop group type "mpath" is unfortunate, because that describes the fact that a packet could be forwarded in one of several paths, which is also true for the resilient next-hop groups. Therefore, to make the naming clearer, rename various artifacts to reflect the assumptions made. Therefore as of this patch: - The flag for multipath groups is nh_grp_entry::is_multipath. This includes the legacy and resilient groups, as well as any future group types that behave as multipath groups. Functions that assume this have "mpath" in the name. - The flag for legacy multipath groups is nh_grp_entry::hash_threshold. Functions that assume this have "hthr" in the name. - The flag for resilient groups is nh_grp_entry::resilient. Functions that assume this have "res" in the name. Besides the above, struct nh_grp_entry::mpath was renamed to ::hthr as well. UAPI artifacts were obviously left intact. Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:53:39 -07:00
Lu Wei	9195f06b2d	net: vsock: Fix a typo Modify "occured" to "occurred" in net/vmw_vsock/af_vsock.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:52:51 -07:00
Lu Wei	21c00a186f	net: sctp: Fix some typos Modify "unkown" to "unknown" in net/sctp/sm_make_chunk.c and Modify "orginal" to "original" in net/sctp/socket.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:52:50 -07:00
Lu Wei	ebf893958c	net: rds: Fix a typo Modify "beween" to "between" in net/rds/send.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:52:50 -07:00
Bhaskar Chowdhury	a7fd0e6d75	xfrm_user.c: Added a punctuation s/wouldnt/wouldn\'t/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:14 -07:00
Bhaskar Chowdhury	aa8ef1b9ab	xfrm_policy.c : Mundane typo fix s/sucessful/successful/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:14 -07:00
Bhaskar Chowdhury	fb373c8455	sm_statefuns.c: Mundane spello fixes s/simulataneous/simultaneous/ ....in three dirrent places. s/tempory/temporary/ s/interpeter/interpreter/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:14 -07:00
Bhaskar Chowdhury	f2e3093172	reg.c: Fix a spello s/ingoring/ignoring/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:14 -07:00
Bhaskar Chowdhury	0184235ec6	node.c: A typo fix s/synching/syncing/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:14 -07:00
Bhaskar Chowdhury	bcae6d5faf	netfilter: nf_conntrack_acct.c: A typo fix s/Accouting/Accounting/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:14 -07:00
Bhaskar Chowdhury	f60d94f0d7	netfilter: ipvs: A spello fix s/registerd/registered/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	195a8ec403	ncsi: internal.h: Fix a spello s/Firware/Firmware/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	55320b82d6	mptcp: subflow.c: Fix a typo s/concerened/concerned/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	b18dacab6b	mac80211: cfg.c: A typo fix s/assocaited/associated/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	61f8406010	llc: llc_core.c: COuple of typo fixes s/searchs/searches/ ....two different places. Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	71a2fae508	kcm: kcmsock.c: Couple of typo fixes s/synchonization/synchronization/ s/aready/already/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	bf05d48dbd	iucv: af_iucv.c: Couple of typo fixes s/unitialized/uninitialized/ s/notifcations/notifications/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	89e8347f0f	ipv6: route.c: A spello fix s/notfication/notification/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	912b519afc	ipv6: addrconf.c: Fix a typo s/Identifers/Identifiers/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	e5ca43e82d	ipv4: tcp_lp.c: Couple of typo fixes s/resrved/reserved/ s/within/within/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	a66e04ce0e	ipv4: ip_output.c: Couple of typo fixes s/readibility/readability/ s/insufficent/insufficient/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	e919ee389c	bearer.h: Spellos fixed s/initalized/initialized/ ...three different places Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Bhaskar Chowdhury	8406d38fde	af_x25.c: Fix a spello s/facilties/facilities/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-28 17:31:13 -07:00
Atul Gopinathan	7e32a09fdc	bpf: tcp: Remove comma which is causing build error Currently, building the bpf-next source with the CONFIG_BPF_SYSCALL enabled is causing a compilation error: "net/ipv4/bpf_tcp_ca.c:209:28: error: expected identifier or '(' before ',' token" Fix this by removing an unnecessary comma. Fixes: `e78aea8b21` ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc") Reported-by: syzbot+0b74d8ec3bf0cc4e4209@syzkaller.appspotmail.com Signed-off-by: Atul Gopinathan <atulgopinathan@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210328120515.113895-1-atulgopinathan@gmail.com	2021-03-28 11:23:55 -07:00
Martin KaFai Lau	7bd1590d4e	bpf: selftests: Add kfunc_call test This patch adds a few kernel function bpf_kfunc_call_test() for the selftest's test_run purpose. They will be allowed for tc_cls prog. The selftest calling the kernel function bpf_kfunc_call_test() is also added in this patch. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015252.1551395-1-kafai@fb.com	2021-03-26 20:41:52 -07:00
Martin KaFai Lau	e78aea8b21	bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc This patch puts some tcp cong helper functions, tcp_slow_start() and tcp_cong_avoid_ai(), into the allowlist for the bpf-tcp-cc program. A few tcp cc implementation functions are also put into the allowlist. A potential use case is the bpf-tcp-cc implementation may only want to override a subset of a tcp_congestion_ops. For others, the bpf-tcp-cc can directly call the kernel counter parts instead of re-implementing (or copy-and-pasting) them to the bpf program. They will only be available to the bpf-tcp-cc typed program. The allowlist functions are not bounded to a fixed ABI contract. When any of them has changed, the bpf-tcp-cc program has to be changed like any in-tree/out-of-tree kernel tcp-cc implementations do also. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015201.1546345-1-kafai@fb.com	2021-03-26 20:41:51 -07:00
Martin KaFai Lau	d22f6ad187	tcp: Rename bictcp function prefix to cubictcp The cubic functions in tcp_cubic.c are using the bictcp prefix as in tcp_bic.c. This patch gives it the proper name cubictcp because the later patch will allow the bpf prog to directly call the cubictcp implementation. Renaming them will avoid the name collision when trying to find the intended one to call during bpf prog load time. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015155.1545532-1-kafai@fb.com	2021-03-26 20:41:51 -07:00
Yang Yingliang	72e6afe6b4	net: llc: Correct function name llc_pdu_set_pf_bit() in header Fix the following make W=1 kernel build warning: net/llc/llc_pdu.c:36: warning: expecting prototype for pdu_set_pf_bit(). Prototype was for llc_pdu_set_pf_bit() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:24:14 -07:00
Yang Yingliang	8114f099d9	net: llc: Correct function name llc_sap_action_unitdata_ind() in header Fix the following make W=1 kernel build warning: net/llc/llc_s_ac.c:38: warning: expecting prototype for llc_sap_action_unit_data_ind(). Prototype was for llc_sap_action_unitdata_ind() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:24:14 -07:00
Yang Yingliang	26440a63a1	net: llc: Correct some function names in header Fix the following make W=1 kernel build warning: net/llc/llc_c_ev.c:622: warning: expecting prototype for conn_ev_qlfy_last_frame_eq_1(). Prototype was for llc_conn_ev_qlfy_last_frame_eq_1() instead net/llc/llc_c_ev.c:636: warning: expecting prototype for conn_ev_qlfy_last_frame_eq_0(). Prototype was for llc_conn_ev_qlfy_last_frame_eq_0() instead Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:24:14 -07:00
Hoang Le	bc556d3edd	tipc: fix kernel-doc warnings Fix kernel-doc warning introduced in commit `b83e214b2e` ("tipc: add extack messages for bearer/media failure"): net/tipc/bearer.c:248: warning: Function parameter or member 'extack' not described in 'tipc_enable_bearer' Fixes: `b83e214b2e` ("tipc: add extack messages for bearer/media failure") Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:22:29 -07:00
Taehee Yoo	63ed8de4be	mld: add mc_lock for protecting per-interface mld data The purpose of this lock is to avoid a bottleneck in the query/report event handler logic. By previous patches, almost all mld data is protected by RTNL. So, the query and report event handler, which is data path logic acquires RTNL too. Therefore if a lot of query and report events are received, it uses RTNL for a long time. So it makes the control-plane bottleneck because of using RTNL. In order to avoid this bottleneck, mc_lock is added. mc_lock protect only per-interface mld data and per-interface mld data is used in the query/report event handler logic. So, no longer rtnl_lock is needed in the query/report event handler logic. Therefore bottleneck will be disappeared by mc_lock. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:14:56 -07:00
Taehee Yoo	f185de28d9	mld: add new workqueues for process mld events When query/report packets are received, mld module processes them. But they are processed under BH context so it couldn't use sleepable functions. So, in order to switch context, the two workqueues are added which processes query and report event. In the struct inet6_dev, mc_{query \| report}_queue are added so it is per-interface queue. And mc_{query \| report}_work are workqueue structure. When the query or report event is received, skb is queued to proper queue and worker function is scheduled immediately. Workqueues and queues are protected by spinlock, which is mc_{query \| report}_lock, and worker functions are protected by RTNL. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:14:56 -07:00
Taehee Yoo	88e2ca3080	mld: convert ifmcaddr6 to RCU The ifmcaddr6 has been protected by inet6_dev->lock(rwlock) so that the critical section is atomic context. In order to switch this context, changing locking is needed. The ifmcaddr6 actually already protected by RTNL So if it's converted to use RCU, its control path context can be switched to sleepable. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:14:56 -07:00
Taehee Yoo	4b200e3989	mld: convert ip6_sf_list to RCU The ip6_sf_list has been protected by mca_lock(spin_lock) so that the critical section is atomic context. In order to switch this context, changing locking is needed. The ip6_sf_list actually already protected by RTNL So if it's converted to use RCU, its control path context can be switched to sleepable. But It doesn't remove mca_lock yet because ifmcaddr6 isn't converted to RCU yet. So, It's not fully converted to the sleepable context. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:14:56 -07:00
Taehee Yoo	882ba1f73c	mld: convert ipv6_mc_socklist->sflist to RCU The sflist has been protected by rwlock so that the critical section is atomic context. In order to switch this context, changing locking is needed. The sflist actually already protected by RTNL So if it's converted to use RCU, its control path context can be switched to sleepable. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:14:56 -07:00
Taehee Yoo	cf2ce339b4	mld: get rid of inet6_dev->mc_lock The purpose of mc_lock is to protect inet6_dev->mc_tomb. But mc_tomb is already protected by RTNL and all functions, which manipulate mc_tomb are called under RTNL. So, mc_lock is not needed. Furthermore, it is spinlock so the critical section is atomic. In order to reduce atomic context, it should be removed. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:14:55 -07:00
Taehee Yoo	2d9a93b490	mld: convert from timer to delayed work mcast.c has several timers for delaying works. Timer's expire handler is working under atomic context so it can't use sleepable things such as GFP_KERNEL, mutex, etc. In order to use sleepable APIs, it converts from timers to delayed work. But there are some critical sections, which is used by both process and BH context. So that it still uses spin_lock_bh() and rwlock. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:14:55 -07:00
Jakub Kicinski	cf2cc0bf4f	ethtool: fec: fix FEC_NONE check Dan points out we need to use the mask not the bit (which is 0). Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `42ce127d98` ("ethtool: fec: sanitize ethtool_fecparam->fec") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:09:45 -07:00
Geliang Tang	b46a023810	mptcp: rename mptcp_pm_nl_add_addr_send_ack Since mptcp_pm_nl_add_addr_send_ack is now used for both ADD_ADDR and RM_ADDR cases, rename it to mptcp_pm_nl_addr_send_ack. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:05:15 -07:00
Geliang Tang	8dd5efb1f9	mptcp: send ack for rm_addr This patch changes the sending ACK conditions for the ADD_ADDR, send an ACK packet for RM_ADDR too. In mptcp_pm_remove_addr, invoke mptcp_pm_nl_add_addr_send_ack to send the ACK packet. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:05:15 -07:00
Geliang Tang	b65d95adb8	mptcp: drop useless addr_signal clear msk->pm.addr_signal is cleared in mptcp_pm_add_addr_signal, no need to clear it in mptcp_pm_nl_add_addr_send_ack again. Drop it. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:05:15 -07:00
Geliang Tang	557963c383	mptcp: move to next addr when subflow creation fail When an invalid address was announced, the subflow couldn't be created for this address. Therefore mptcp_pm_nl_subflow_established couldn't be invoked. Then the next addresses in the local address list didn't have a chance to be announced. This patch invokes the new function mptcp_pm_add_addr_echoed when the address is echoed. In it, use mptcp_lookup_anno_list_by_saddr to check whether this address is in the anno_list. If it is, PM schedules the status MPTCP_PM_SUBFLOW_ESTABLISHED to invoke mptcp_pm_create_subflow_or_signal_addr to deal with the next address in the local address list. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:05:15 -07:00
Geliang Tang	d88c476f4a	mptcp: export lookup_anno_list_by_saddr This patch exported the static function lookup_anno_list_by_saddr, and renamed it to mptcp_lookup_anno_list_by_saddr. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:05:15 -07:00
Geliang Tang	348d5c1dec	mptcp: move to next addr when timeout This patch called mptcp_pm_subflow_established to move to the next address when an ADD_ADDR has been retransmitted the maximum number of times. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:05:15 -07:00
Geliang Tang	62535200be	mptcp: drop unused subflow in mptcp_pm_subflow_established This patch drops the unused parameter subflow in mptcp_pm_subflow_established(). Fixes: `926bdeab55` ("mptcp: Implement path manager interface commands") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:05:15 -07:00
Geliang Tang	d84ad04941	mptcp: skip connecting the connected address This patch added a new helper named lookup_subflow_by_daddr to find whether the destination address is in the msk's conn_list. In mptcp_pm_nl_add_addr_received, use lookup_subflow_by_daddr to check whether the announced address is already connected. If it is, skip connecting this address and send out the echo. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:05:15 -07:00
Geliang Tang	f7efc7771e	mptcp: drop argument port from mptcp_pm_announce_addr Drop the redundant argument 'port' from mptcp_pm_announce_addr, use the port field of another argument 'addr' instead. Fixes: `0f5c9e3f07` ("mptcp: add port parameter for mptcp_pm_announce_addr") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:05:15 -07:00
Paolo Abeni	2d6f5a2b57	mptcp: clean-up the rtx path After the previous patch we can easily avoid invoking the workqueue to perform the retransmission, if the msk socket lock is held at rtx timer expiration. This also simplifies the relevant code. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-26 15:05:15 -07:00
Marcel Holtmann	d58cf00dce	Bluetooth: Increment management interface revision Increment the mgmt revision due to recent changes. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2021-03-26 14:05:22 -07:00
Marcel Holtmann	3d34a71ff8	Bluetooth: Move the advertisement monitor events to correct list The list of trusted events should contain the advertisement monitor events and not the untrusted one, so move entries to the correct list. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2021-03-26 14:05:22 -07:00
Marcel Holtmann	02431b6cdb	Bluetooth: Add missing entries for PHY configuration commands The list of supported mgmt commands for PHY configuration is missing, so just add them. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2021-03-26 14:05:22 -07:00
Marcel Holtmann	21dd118f8d	Bluetooth: Fix wrong opcode error for read advertising features The read advertising features error handling returns false the opcode for the set advertising command. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2021-03-26 12:58:13 -07:00
Marcel Holtmann	353cac0e10	Bluetooth: Fix mgmt status for LL Privacy experimental feature The return error when trying to change the setting when a controller is powered up, shall be MGMT_STATUS_REJECTED. However instead now the error MGMT_STATUS_NOT_POWERED is used which is exactly the opposite. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2021-03-26 12:58:13 -07:00
Yonghong Song	b910eaaaa4	bpf: Fix NULL pointer dereference in bpf_get_local_storage() helper Jiri Olsa reported a bug ([1]) in kernel where cgroup local storage pointer may be NULL in bpf_get_local_storage() helper. There are two issues uncovered by this bug: (1). kprobe or tracepoint prog incorrectly sets cgroup local storage before prog run, (2). due to change from preempt_disable to migrate_disable, preemption is possible and percpu storage might be overwritten by other tasks. This issue (1) is fixed in [2]. This patch tried to address issue (2). The following shows how things can go wrong: task 1: bpf_cgroup_storage_set() for percpu local storage preemption happens task 2: bpf_cgroup_storage_set() for percpu local storage preemption happens task 1: run bpf program task 1 will effectively use the percpu local storage setting by task 2 which will be either NULL or incorrect ones. Instead of just one common local storage per cpu, this patch fixed the issue by permitting 8 local storages per cpu and each local storage is identified by a task_struct pointer. This way, we allow at most 8 nested preemption between bpf_cgroup_storage_set() and bpf_cgroup_storage_unset(). The percpu local storage slot is released (calling bpf_cgroup_storage_unset()) by the same task after bpf program finished running. bpf_test_run() is also fixed to use the new bpf_cgroup_storage_set() interface. The patch is tested on top of [2] with reproducer in [1]. Without this patch, kernel will emit error in 2-3 minutes. With this patch, after one hour, still no error. [1] https://lore.kernel.org/bpf/CAKH8qBuXCfUz=w8L+Fj74OaUpbosO29niYwTki7e3Ag044_aww@mail.gmail.com/T [2] https://lore.kernel.org/bpf/20210309185028.3763817-1-yhs@fb.com Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Roman Gushchin <guro@fb.com> Link: https://lore.kernel.org/bpf/20210323055146.3334476-1-yhs@fb.com	2021-03-25 18:31:36 -07:00
Eric Dumazet	4ecc1baf36	tcp: convert elligible sysctls to u8 Many tcp sysctls are either bools or small ints that can fit into u8. Reducing space taken by sysctls can save few cache line misses when sending/receiving data while cpu caches are empty, for example after cpu idle period. This is hard to measure with typical network performance tests, but after this patch, struct netns_ipv4 has shrunk by three cache lines. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:39:33 -07:00
Eric Dumazet	2932bcda07	inet: convert tcp_early_demux and udp_early_demux to u8 For these sysctls, their dedicated helpers have to use proc_dou8vec_minmax(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:39:33 -07:00
Eric Dumazet	1c69dedc8f	ipv4: convert ip_forward_update_priority sysctl to u8 This sysctl uses ip_fwd_update_priority() helper, so the conversion needs to change it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:39:33 -07:00
Eric Dumazet	4b6bbf17d4	ipv4: shrink netns_ipv4 with sysctl conversions These sysctls that can fit in one byte instead of one int are converted to save space and thus reduce cache line misses. - icmp_echo_ignore_all, icmp_echo_ignore_broadcasts, - icmp_ignore_bogus_error_responses, icmp_errors_use_inbound_ifaddr - tcp_ecn, tcp_ecn_fallback - ip_default_ttl, ip_no_pmtu_disc, ip_fwd_use_pmtu - ip_nonlocal_bind, ip_autobind_reuse - ip_dynaddr, ip_early_demux, raw_l3mdev_accept - nexthop_compat_mode, fwmark_reflect Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:39:33 -07:00
Dmitry Vyukov	6c996e1994	net: change netdev_unregister_timeout_secs min value to 1 netdev_unregister_timeout_secs=0 can lead to printing the "waiting for dev to become free" message every jiffy. This is too frequent and unnecessary. Set the min value to 1 second. Also fix the merge issue introduced by "net: make unregister netdev warning timeout configurable": it changed "refcnt != 1" to "refcnt". Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Fixes: `5aa3afe107` ("net: make unregister netdev warning timeout configurable") Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:24:06 -07:00
Lu Wei	cbd801b3b0	net: ipv4: Fix some typos Modify "accomodate" to "accommodate" in net/ipv4/esp4.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:05:08 -07:00
Lu Wei	952a67f6f6	net: dsa: Fix a typo in tag_rtl4_a.c Modify "Apparantly" to "Apparently" in net/dsa/tag_rtl4_a.c.. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:05:08 -07:00
Lu Wei	e51443d54b	net: decnet: Fix a typo in dn_nsp_in.c Modify "erronous" to "erroneous" in net/decnet/dn_nsp_in.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:05:07 -07:00
Lu Wei	897b9fae7a	net: core: Fix a typo in dev_addr_lists.c Modify "funciton" to "function" in net/core/dev_addr_lists.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:05:07 -07:00
Lu Wei	3f9143f10c	net: ceph: Fix a typo in osdmap.c Modify "inital" to "initial" in net/ceph/osdmap.c. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:05:07 -07:00
Xiaoming Ni	4b5db93e7f	nfc: Avoid endless loops caused by repeated llcp_sock_connect() When sock_wait_state() returns -EINPROGRESS, "sk->sk_state" is LLCP_CONNECTING. In this case, llcp_sock_connect() is repeatedly invoked, nfc_llcp_sock_link() will add sk to local->connecting_sockets twice. sk->sk_node->next will point to itself, that will make an endless loop and hang-up the system. To fix it, check whether sk->sk_state is LLCP_CONNECTING in llcp_sock_connect() to avoid repeated invoking. Fixes: `b4011239a0` ("NFC: llcp: Fix non blocking sockets connections") Reported-by: "kiyin(尹亮)" <kiyin@tencent.com> Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 Cc: <stable@vger.kernel.org> #v3.11 Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:02:01 -07:00
Xiaoming Ni	7574fcdbdc	nfc: fix memory leak in llcp_sock_connect() In llcp_sock_connect(), use kmemdup to allocate memory for "llcp_sock->service_name". The memory is not released in the sock_unlink label of the subsequent failure branch. As a result, memory leakage occurs. fix CVE-2020-25672 Fixes: `d646960f79` ("NFC: Initial LLCP support") Reported-by: "kiyin(尹亮)" <kiyin@tencent.com> Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 Cc: <stable@vger.kernel.org> #v3.3 Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:02:01 -07:00
Xiaoming Ni	8a4cd82d62	nfc: fix refcount leak in llcp_sock_connect() nfc_llcp_local_get() is invoked in llcp_sock_connect(), but nfc_llcp_local_put() is not invoked in subsequent failure branches. As a result, refcount leakage occurs. To fix it, add calling nfc_llcp_local_put(). fix CVE-2020-25671 Fixes: `c7aa12252f` ("NFC: Take a reference on the LLCP local pointer when creating a socket") Reported-by: "kiyin(尹亮)" <kiyin@tencent.com> Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 Cc: <stable@vger.kernel.org> #v3.6 Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:02:01 -07:00
Xiaoming Ni	c33b1cc62a	nfc: fix refcount leak in llcp_sock_bind() nfc_llcp_local_get() is invoked in llcp_sock_bind(), but nfc_llcp_local_put() is not invoked in subsequent failure branches. As a result, refcount leakage occurs. To fix it, add calling nfc_llcp_local_put(). fix CVE-2020-25670 Fixes: `c7aa12252f` ("NFC: Take a reference on the LLCP local pointer when creating a socket") Reported-by: "kiyin(尹亮)" <kiyin@tencent.com> Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 Cc: <stable@vger.kernel.org> #v3.6 Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 17:02:01 -07:00
Lu Wei	f1dcffcc8a	net: Fix a misspell in socket.c s/addres/address Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 16:56:27 -07:00
Hoang Le	b83e214b2e	tipc: add extack messages for bearer/media failure Add extack error messages for -EINVAL errors when enabling bearer, getting/setting properties for a media/bearer Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 16:54:45 -07:00
Vladimir Oltean	479dc497db	net: dsa: only unset VLAN filtering when last port leaves last VLAN-aware bridge DSA is aware of switches with global VLAN filtering since the blamed commit, but it makes a bad decision when multiple bridges are spanning the same switch: ip link add br0 type bridge vlan_filtering 1 ip link add br1 type bridge vlan_filtering 1 ip link set swp2 master br0 ip link set swp3 master br0 ip link set swp4 master br1 ip link set swp5 master br1 ip link set swp5 nomaster ip link set swp4 nomaster [138665.939930] sja1105 spi0.1: port 3: dsa_core: VLAN filtering is a global setting [138665.947514] DSA: failed to notify DSA_NOTIFIER_BRIDGE_LEAVE When all ports leave br1, DSA blindly attempts to disable VLAN filtering on the switch, ignoring the fact that br0 still exists and is VLAN-aware too. It fails while doing that. This patch checks whether any port exists at all and is under a VLAN-aware bridge. Fixes: `d371b7c92d` ("net: dsa: Unset vlan_filtering when ports leave the bridge") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 16:48:45 -07:00
Jakub Kicinski	42ce127d98	ethtool: fec: sanitize ethtool_fecparam->fec Reject NONE on set, this mode means device does not support FEC so it's a little out of place in the set interface. This should be safe to do - user space ethtool does not allow the use of NONE on set. A few drivers treat it the same as OFF, but none use it instead of OFF. Similarly reject an empty FEC mask. The common user space tool will not send such requests and most drivers correctly reject it already. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 16:46:53 -07:00
Jakub Kicinski	d3b37fc805	ethtool: fec: sanitize ethtool_fecparam->active_fec struct ethtool_fecparam::active_fec is a GET-only field, all in-tree drivers correctly ignore it on SET. Clear the field on SET to avoid any confusion. Again, we can't reject non-zero now since ethtool user space does not zero-init the param correctly. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 16:46:53 -07:00
Jakub Kicinski	240e114411	ethtool: fec: sanitize ethtool_fecparam->reserved struct ethtool_fecparam::reserved is never looked at by the core. Make sure it's actually 0. Unfortunately we can't return an error because old ethtool doesn't zero-initialize the structure for SET. On GET we can be more verbose, there are no in tree (ab)users. Fix up the kdoc on the structure. Remove the mention of FEC bypass. Seems like a niche thing to configure in the first place. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 16:46:53 -07:00
David S. Miller	241949e488	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-03-24 The following pull-request contains BPF updates for your net-next tree. We've added 37 non-merge commits during the last 15 day(s) which contain a total of 65 files changed, 3200 insertions(+), 738 deletions(-). The main changes are: 1) Static linking of multiple BPF ELF files, from Andrii. 2) Move drop error path to devmap for XDP_REDIRECT, from Lorenzo. 3) Spelling fixes from various folks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 16:30:46 -07:00
David S. Miller	efd13b71a3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 15:31:22 -07:00
Bhaskar Chowdhury	5153ceb9e6	Bluetooth: L2CAP: Rudimentary typo fixes s/minium/minimum/ s/procdure/procedure/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-03-25 16:13:28 +01:00
Arnd Bergmann	6ad2dd6c14	ipv6: fix clang Wformat warning When building with 'make W=1', clang warns about a mismatched format string: net/ipv6/ah6.c:710:4: error: format specifies type 'unsigned short' but the argument has type 'int' [-Werror,-Wformat] aalg_desc->uinfo.auth.icv_fullbits/8); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/printk.h:375:34: note: expanded from macro 'pr_info' printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__) ~~~ ^~~~~~~~~~~ net/ipv6/esp6.c:1153:5: error: format specifies type 'unsigned short' but the argument has type 'int' [-Werror,-Wformat] aalg_desc->uinfo.auth.icv_fullbits / 8); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/printk.h:375:34: note: expanded from macro 'pr_info' printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__) ~~~ ^~~~~~~~~~~ Here, the result of dividing a 16-bit number by a 32-bit number produces a 32-bit result, which is printed as a 16-bit integer. Change the %hu format to the normal %u, which has the same effect but avoids the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2021-03-25 09:48:32 +01:00
Linus Torvalds	e138138003	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from David Miller: "Various fixes, all over: 1) Fix overflow in ptp_qoriq_adjfine(), from Yangbo Lu. 2) Always store the rx queue mapping in veth, from Maciej Fijalkowski. 3) Don't allow vmlinux btf in map_create, from Alexei Starovoitov. 4) Fix memory leak in octeontx2-af from Colin Ian King. 5) Use kvalloc in bpf x86 JIT for storing jit'd addresses, from Yonghong Song. 6) Fix tx ptp stats in mlx5, from Aya Levin. 7) Check correct ip version in tun decap, fropm Roi Dayan. 8) Fix rate calculation in mlx5 E-Switch code, from arav Pandit. 9) Work item memork leak in mlx5, from Shay Drory. 10) Fix ip6ip6 tunnel crash with bpf, from Daniel Borkmann. 11) Lack of preemptrion awareness in macvlan, from Eric Dumazet. 12) Fix data race in pxa168_eth, from Pavel Andrianov. 13) Range validate stab in red_check_params(), from Eric Dumazet. 14) Inherit vlan filtering setting properly in b53 driver, from Florian Fainelli. 15) Fix rtnl locking in igc driver, from Sasha Neftin. 16) Pause handling fixes in igc driver, from Muhammad Husaini Zulkifli. 17) Missing rtnl locking in e1000_reset_task, from Vitaly Lifshits. 18) Use after free in qlcnic, from Lv Yunlong. 19) fix crash in fritzpci mISDN, from Tong Zhang. 20) Premature rx buffer reuse in igb, from Li RongQing. 21) Missing termination of ip[a driver message handler arrays, from Alex Elder. 22) Fix race between "x25_close" and "x25_xmit"/"x25_rx" in hdlc_x25 driver, from Xie He. 23) Use after free in c_can_pci_remove(), from Tong Zhang. 24) Uninitialized variable use in nl80211, from Jarod Wilson. 25) Off by one size calc in bpf verifier, from Piotr Krysiuk. 26) Use delayed work instead of deferrable for flowtable GC, from Yinjun Zhang. 27) Fix infinite loop in NPC unmap of octeontx2 driver, from Hariprasad Kelam. 28) Fix being unable to change MTU of dwmac-sun8i devices due to lack of fifo sizes, from Corentin Labbe. 29) DMA use after free in r8169 with WoL, fom Heiner Kallweit. 30) Mismatched prototypes in isdn-capi, from Arnd Bergmann. 31) Fix psample UAPI breakage, from Ido Schimmel" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (171 commits) psample: Fix user API breakage math: Export mul_u64_u64_div_u64 ch_ktls: fix enum-conversion warning octeontx2-af: Fix memory leak of object buf ptp_qoriq: fix overflow in ptp_qoriq_adjfine() u64 calcalation net: bridge: don't notify switchdev for local FDB addresses net/sched: act_ct: clear post_ct if doing ct_clear net: dsa: don't assign an error value to tag_ops isdn: capi: fix mismatched prototypes net/mlx5: SF, do not use ecpu bit for vhca state processing net/mlx5e: Fix division by 0 in mlx5e_select_queue net/mlx5e: Fix error path for ethtool set-priv-flag net/mlx5e: Offload tuple rewrite for non-CT flows net/mlx5e: Allow to match on MPLS parameters only for MPLS over UDP net/mlx5: Add back multicast stats for uplink representor net: ipconfig: ic_dev can be NULL in ic_close_devs MAINTAINERS: Combine "QLOGIC QLGE 10Gb ETHERNET DRIVER" sections into one docs: networking: Fix a typo r8169: fix DMA being used after buffer free if WoL is enabled net: ipa: fix init header command validation ...	2021-03-24 18:16:04 -07:00
Wang Hai	da1da87fa7	6lowpan: Fix some typos in nhc_udp.c s/Orignal/Original/ s/infered/inferred/ Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 17:52:11 -07:00
Wang Hai	0e4161d0ed	net/packet: Fix a typo in af_packet.c s/sequencially/sequentially/ Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 17:52:11 -07:00
Wang Hai	72a0f6d052	net/tls: Fix a typo in tls_device.c s/beggining/beginning/ Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 17:52:11 -07:00
Eric Dumazet	aa6dd211e4	inet: use bigger hash table for IP ID generation In commit `73f156a6e8` ("inetpeer: get rid of ip_id_count") I used a very small hash table that could be abused by patient attackers to reveal sensitive information. Switch to a dynamic sizing, depending on RAM size. Typical big hosts will now use 128x more storage (2 MB) to get a similar increase in security and reduction of hash collisions. As a bonus, use of alloc_large_system_hash() spreads allocated memory among all NUMA nodes. Fixes: `73f156a6e8` ("inetpeer: get rid of ip_id_count") Reported-by: Amit Klein <aksecurity@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 16:45:11 -07:00
Sai Kalyaan Palla	c3dde0ee71	net: decnet: Fixed multiple Coding Style issues Made changes to coding style as suggested by checkpatch.pl changes are of the type: space required before the open parenthesis '(' space required after that ',' Signed-off-by: Sai Kalyaan Palla <saikalyaan63@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 16:25:21 -07:00
Bhaskar Chowdhury	536e11f96b	net: sched: Mundane typo fixes s/procdure/procedure/ s/maintanance/maintenance/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 15:09:11 -07:00
Pablo Neira Ayuso	3fb24a43c9	dsa: slave: add support for TC_SETUP_FT The dsa infrastructure provides a well-defined hierarchy of devices, pass up the call to set up the flow block to the master device. From the software dataplane, the netfilter infrastructure uses the dsa slave devices to refer to the input and output device for the given skbuff. Similarly, the flowtable definition in the ruleset refers to the dsa slave port devices. This patch adds the glue code to call ndo_setup_tc with TC_SETUP_FT with the master device via the dsa slave devices. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso	17e52c0aaa	netfilter: flowtable: support for FLOW_ACTION_PPPOE_PUSH Add a PPPoE push action if layer 2 protocol is ETH_P_PPP_SES to add PPPoE flowtable hardware offload support. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Felix Fietkau	26267bf9bb	netfilter: flowtable: bridge vlan hardware offload and switchdev The switch might have already added the VLAN tag through PVID hardware offload. Keep this extra VLAN in the flowtable but skip it on egress. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso	73f97025a9	netfilter: nft_flow_offload: use direct xmit if hardware offload is enabled If there is a forward path to reach an ethernet device and hardware offload is enabled, then use the direct xmit path. Moreover, store the real device in the direct xmit path info since software datapath uses dev_hard_header() to push the layer encapsulation headers while hardware offload refers to the real device. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso	eeff3000f2	netfilter: flowtable: add offload support for xmit path types When the flow tuple xmit_type is set to FLOW_OFFLOAD_XMIT_DIRECT, the dst_cache pointer is not valid, and the h_source/h_dest/ifidx out fields need to be used. This patch also adds the FLOW_ACTION_VLAN_PUSH action to pass the VLAN tag to the driver. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso	a11e7973cf	netfilter: flowtable: add dsa support Replace the master ethernet device by the dsa slave port. Packets coming in from the software ingress path use the dsa slave port as input device. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso	72efd585f7	netfilter: flowtable: add pppoe support Add the PPPoE protocol and session id to the flow tuple using the encap fields to uniquely identify flows from the receive path. For the transmit path, dev_hard_header() on the vlan device push the headers. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso	e990cef651	netfilter: flowtable: add bridge vlan filtering support Add the vlan tag based when PVID is set on. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso	4cd91f7c29	netfilter: flowtable: add vlan support Add the vlan id and protocol to the flow tuple to uniquely identify flows from the receive path. For the transmit path, dev_hard_header() on the vlan device push the headers. This patch includes support for two vlan headers (QinQ) from the ingress path. Add a generic encap field to the flowtable entry which stores the protocol and the tag id. This allows to reuse these fields in the PPPoE support coming in a later patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso	7a27f6ab41	netfilter: flowtable: use dev_fill_forward_path() to obtain egress device The egress device in the tuple is obtained from route. Use dev_fill_forward_path() instead to provide the real egress device for this flow whenever this is available. The new FLOW_OFFLOAD_XMIT_DIRECT type uses dev_queue_xmit() to transmit ethernet frames. Cache the source and destination hardware address to use dev_queue_xmit() to transfer packets. The FLOW_OFFLOAD_XMIT_DIRECT replaces FLOW_OFFLOAD_XMIT_NEIGH if dev_fill_forward_path() finds a direct transmit path. In case of topology updates, if peer is moved to different bridge port, the connection will time out, reconnect will result in a new entry with the correct path. Snooping fdb updates would allow for cleaning up stale flowtable entries. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso	c63a7cc4d7	netfilter: flowtable: use dev_fill_forward_path() to obtain ingress device Obtain the ingress device in the tuple from the route in the reply direction. Use dev_fill_forward_path() instead to get the real ingress device for this flow. Fall back to use the ingress device that the IP forwarding route provides if: - dev_fill_forward_path() finds no real ingress device. - the ingress device that is obtained is not part of the flowtable devices. - this route has a xfrm policy. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso	5139c0c007	netfilter: flowtable: add xmit path types Add the xmit_type field that defines the two supported xmit paths in the flowtable data plane, which are the neighbour and the xfrm xmit paths. This patch prepares for new flowtable xmit path types to come. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:39 -07:00
Felix Fietkau	0994d492a1	net: dsa: resolve forwarding path for dsa slave ports Add .ndo_fill_forward_path for dsa slave port devices Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:38 -07:00
Felix Fietkau	bcf2766b13	net: bridge: resolve forwarding path for VLAN tag actions in bridge devices Depending on the VLAN settings of the bridge and the port, the bridge can either add or remove a tag. When vlan filtering is enabled, the fdb lookup also needs to know the VLAN tag/proto for the destination address To provide this, keep track of the stack of VLAN tags for the path in the lookup context Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:38 -07:00
Pablo Neira Ayuso	ec9d16bab6	net: bridge: resolve forwarding path for bridge devices Add .ndo_fill_forward_path for bridge devices. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:38 -07:00
Pablo Neira Ayuso	e4417d6950	net: 8021q: resolve forwarding path for vlan devices Add .ndo_fill_forward_path for vlan devices. For instance, assuming the following topology: IP forwarding / \ eth0.100 eth0 \| eth0 . . . ethX ab💿ef🆎cd:ef For packets going through IP forwarding to eth0.100 whose destination MAC address is ab💿ef🆎cd:ef, dev_fill_forward_path() provides the following path: eth0.100 -> eth0 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:38 -07:00
Pablo Neira Ayuso	ddb94eafab	net: resolve forwarding path from virtual netdevice and HW destination address This patch adds dev_fill_forward_path() which resolves the path to reach the real netdevice from the IP forwarding side. This function takes as input the netdevice and the destination hardware address and it walks down the devices calling .ndo_fill_forward_path() for each device until the real device is found. For instance, assuming the following topology: IP forwarding / \ br0 eth0 / \ eth1 eth2 . . . ethX ab💿ef🆎cd:ef where eth1 and eth2 are bridge ports and eth0 provides WAN connectivity. ethX is the interface in another box which is connected to the eth1 bridge port. For packets going through IP forwarding to br0 whose destination MAC address is ab💿ef🆎cd:ef, dev_fill_forward_path() provides the following path: br0 -> eth1 .ndo_fill_forward_path for br0 looks up at the FDB for the bridge port from the destination MAC address to get the bridge port eth1. This information allows to create a fast path that bypasses the classic bridge and IP forwarding paths, so packets go directly from the bridge port eth1 to eth0 (wan interface) and vice versa. fast path .------------------------. / \ \| IP forwarding \| \| / \ \/ \| br0 eth0 . / \ -> eth1 eth2 . . . ethX ab💿ef🆎cd:ef Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:48:38 -07:00
Colin Ian King	ad248f7761	net: bridge: Fix missing return assignment from br_vlan_replay_one call The call to br_vlan_replay_one is returning an error return value but this is not being assigned to err and the following check on err is currently always false because err was initialized to zero. Fix this by assigning err. Addresses-Coverity: ("'Constant' variable guards dead code") Fixes: `22f67cdfae` ("net: bridge: add helper to replay VLANs installed on port") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:45:48 -07:00
Horatiu Vultur	b3cb91b97c	bridge: mrp: Disable roles before deleting the MRP instance When an MRP instance was created, the driver was notified that the instance is created and then in a different callback about role of the instance. But when the instance was deleted the driver was notified only that the MRP instance is deleted and not also that the role is disabled. This patch make sure that the driver is notified that the role is changed to disabled before the MRP instance is deleted to have similar callbacks with the creating of the instance. In this way it would simplify the logic in the drivers. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 12:14:08 -07:00
Xin Long	68dc022d04	xfrm: BEET mode doesn't support fragments for inner packets BEET mode replaces the IP(6) Headers with new IP(6) Headers when sending packets. However, when it's a fragment before the replacement, currently kernel keeps the fragment flag and replace the address field then encaps it with ESP. It would cause in RX side the fragments to get reassembled before decapping with ESP, which is incorrect. In Xiumei's testing, these fragments went over an xfrm interface and got encapped with ESP in the device driver, and the traffic was broken. I don't have a good way to fix it, but only to warn this out in dmesg. Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2021-03-24 09:58:19 +01:00
Meng Yu	0f90d320b4	Bluetooth: Remove trailing semicolon in macros Macros should not use a trailing semicolon. Signed-off-by: Meng Yu <yumeng18@huawei.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-03-24 08:56:24 +01:00
Dmitry Vyukov	5aa3afe107	net: make unregister netdev warning timeout configurable netdev_wait_allrefs() issues a warning if refcount does not drop to 0 after 10 seconds. While 10 second wait generally should not happen under normal workload in normal environment, it seems to fire falsely very often during fuzzing and/or in qemu emulation (~10x slower). At least it's not possible to understand if it's really a false positive or not. Automated testing generally bumps all timeouts to very high values to avoid flake failures. Add net.core.netdev_unregister_timeout_secs sysctl to make the timeout configurable for automated testing systems. Lowering the timeout may also be useful for e.g. manual bisection. The default value matches the current behavior. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=211877 Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 17:22:50 -07:00
Vladimir Oltean	010e269f91	net: dsa: sync up switchdev objects and port attributes when joining the bridge If we join an already-created bridge port, such as a bond master interface, then we can miss the initial switchdev notifications emitted by the bridge for this port, while it wasn't offloaded by anybody. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:49:06 -07:00
Vladimir Oltean	5961d6a12c	net: dsa: inherit the actual bridge port flags at join time DSA currently assumes that the bridge port starts off with this constellation of bridge port flags: - learning on - unicast flooding on - multicast flooding on - broadcast flooding on just by virtue of code copy-pasta from the bridge layer (new_nbp). This was a simple enough strategy thus far, because the 'bridge join' moment always coincided with the 'bridge port creation' moment. But with sandwiched interfaces, such as: br0 \| bond0 \| swp0 it may happen that the user has had time to change the bridge port flags of bond0 before enslaving swp0 to it. In that case, swp0 will falsely assume that the bridge port flags are those determined by new_nbp, when in fact this can happen: ip link add br0 type bridge ip link add bond0 type bond ip link set bond0 master br0 ip link set bond0 type bridge_slave learning off ip link set swp0 master br0 Now swp0 has learning enabled, bond0 has learning disabled. Not nice. Fix this by "dumpster diving" through the actual bridge port flags with br_port_flag_is_set, at bridge join time. We use this opportunity to split dsa_port_change_brport_flags into two distinct functions called dsa_port_inherit_brport_flags and dsa_port_clear_brport_flags, now that the implementation for the two cases is no longer similar. This patch also creates two functions called dsa_port_switchdev_sync and dsa_port_switchdev_unsync which collect what we have so far, even if that's asymmetrical. More is going to be added in the next patch. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:49:06 -07:00
Vladimir Oltean	2afc526ab3	net: dsa: pass extack to dsa_port_{bridge,lag}_join This is a pretty noisy change that was broken out of the larger change for replaying switchdev attributes and objects at bridge join time, which is when these extack objects are actually used. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:49:05 -07:00
Vladimir Oltean	185c9a760a	net: dsa: call dsa_port_bridge_join when joining a LAG that is already in a bridge DSA can properly detect and offload this sequence of operations: ip link add br0 type bridge ip link add bond0 type bond ip link set swp0 master bond0 ip link set bond0 master br0 But not this one: ip link add br0 type bridge ip link add bond0 type bond ip link set bond0 master br0 ip link set swp0 master bond0 Actually the second one is more complicated, due to the elapsed time between the enslavement of bond0 and the offloading of it via swp0, a lot of things could have happened to the bond0 bridge port in terms of switchdev objects (host MDBs, VLANs, altered STP state etc). So this is a bit of a can of worms, and making sure that the DSA port's state is in sync with this already existing bridge port is handled in the next patches. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:49:05 -07:00
Vladimir Oltean	22f67cdfae	net: bridge: add helper to replay VLANs installed on port Currently this simple setup with DSA: ip link add br0 type bridge vlan_filtering 1 ip link add bond0 type bond ip link set bond0 master br0 ip link set swp0 master bond0 will not work because the bridge has created the PVID in br_add_if -> nbp_vlan_init, and it has notified switchdev of the existence of VLAN 1, but that was too early, since swp0 was not yet a lower of bond0, so it had no reason to act upon that notification. We need a helper in the bridge to replay the switchdev VLAN objects that were notified since the bridge port creation, because some of them may have been missed. As opposed to the br_mdb_replay function, the vg->vlan_list write side protection is offered by the rtnl_mutex which is sleepable, so we don't need to queue up the objects in atomic context, we can replay them right away. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:49:05 -07:00
Vladimir Oltean	04846f903b	net: bridge: add helper to replay port and local fdb entries When a switchdev port starts offloading a LAG that is already in a bridge and has an FDB entry pointing to it: ip link set bond0 master br0 bridge fdb add dev bond0 00:01:02:03:04:05 master static ip link set swp0 master bond0 the switchdev driver will have no idea that this FDB entry is there, because it missed the switchdev event emitted at its creation. Ido Schimmel pointed this out during a discussion about challenges with switchdev offloading of stacked interfaces between the physical port and the bridge, and recommended to just catch that condition and deny the CHANGEUPPER event: https://lore.kernel.org/netdev/20210210105949.GB287766@shredder.lan/ But in fact, we might need to deal with the hard thing anyway, which is to replay all FDB addresses relevant to this port, because it isn't just static FDB entries, but also local addresses (ones that are not forwarded but terminated by the bridge). There, we can't just say 'oh yeah, there was an upper already so I'm not joining that'. So, similar to the logic for replaying MDB entries, add a function that must be called by individual switchdev drivers and replays local FDB entries as well as ones pointing towards a bridge port. This time, we use the atomic switchdev notifier block, since that's what FDB entries expect for some reason. Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:49:05 -07:00
Vladimir Oltean	4f2673b3a2	net: bridge: add helper to replay port and host-joined mdb entries I have a system with DSA ports, and udhcpcd is configured to bring interfaces up as soon as they are created. I create a bridge as follows: ip link add br0 type bridge As soon as I create the bridge and udhcpcd brings it up, I also have avahi which automatically starts sending IPv6 packets to advertise some local services, and because of that, the br0 bridge joins the following IPv6 groups due to the code path detailed below: 33:33:ff:6d:c1:9c vid 0 33:33:00:00:00:6a vid 0 33:33:00:00:00:fb vid 0 br_dev_xmit -> br_multicast_rcv -> br_ip6_multicast_add_group -> __br_multicast_add_group -> br_multicast_host_join -> br_mdb_notify This is all fine, but inside br_mdb_notify we have br_mdb_switchdev_host hooked up, and switchdev will attempt to offload the host joined groups to an empty list of ports. Of course nobody offloads them. Then when we add a port to br0: ip link set swp0 master br0 the bridge doesn't replay the host-joined MDB entries from br_add_if, and eventually the host joined addresses expire, and a switchdev notification for deleting it is emitted, but surprise, the original addition was already completely missed. The strategy to address this problem is to replay the MDB entries (both the port ones and the host joined ones) when the new port joins the bridge, similar to what vxlan_fdb_replay does (in that case, its FDB can be populated and only then attached to a bridge that you offload). However there are 2 possibilities: the addresses can be 'pushed' by the bridge into the port, or the port can 'pull' them from the bridge. Considering that in the general case, the new port can be really late to the party, and there may have been many other switchdev ports that already received the initial notification, we would like to avoid delivering duplicate events to them, since they might misbehave. And currently, the bridge calls the entire switchdev notifier chain, whereas for replaying it should just call the notifier block of the new guy. But the bridge doesn't know what is the new guy's notifier block, it just knows where the switchdev notifier chain is. So for simplification, we make this a driver-initiated pull for now, and the notifier block is passed as an argument. To emulate the calling context for mdb objects (deferred and put on the blocking notifier chain), we must iterate under RCU protection through the bridge's mdb entries, queue them, and only call them once we're out of the RCU read-side critical section. There was some opportunity for reuse between br_mdb_switchdev_host_port, br_mdb_notify and the newly added br_mdb_queue_one in how the switchdev mdb object is created, so a helper was created. Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:49:05 -07:00
Vladimir Oltean	f1d42ea100	net: bridge: add helper to retrieve the current ageing time The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME attribute is only emitted from: sysfs/ioctl/netlink -> br_set_ageing_time -> __set_ageing_time therefore not at bridge port creation time, so: (a) switchdev drivers have to hardcode the initial value for the address ageing time, because they didn't get any notification (b) that hardcoded value can be out of sync, if the user changes the ageing time before enslaving the port to the bridge We need a helper in the bridge, such that switchdev drivers can query the current value of the bridge ageing time when they start offloading it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:49:05 -07:00
Vladimir Oltean	c0e715bbd5	net: bridge: add helper for retrieving the current bridge port STP state It may happen that we have the following topology with DSA or any other switchdev driver with LAG offload: ip link add br0 type bridge stp_state 1 ip link add bond0 type bond ip link set bond0 master br0 ip link set swp0 master bond0 ip link set swp1 master bond0 STP decides that it should put bond0 into the BLOCKING state, and that's that. The ports that are actively listening for the switchdev port attributes emitted for the bond0 bridge port (because they are offloading it) and have the honor of seeing that switchdev port attribute can react to it, so we can program swp0 and swp1 into the BLOCKING state. But if then we do: ip link set swp2 master bond0 then as far as the bridge is concerned, nothing has changed: it still has one bridge port. But this new bridge port will not see any STP state change notification and will remain FORWARDING, which is how the standalone code leaves it in. We need a function in the bridge driver which retrieves the current STP state, such that drivers can synchronize to it when they may have missed switchdev events. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:49:05 -07:00
Vladimir Oltean	6ab4c3117a	net: bridge: don't notify switchdev for local FDB addresses As explained in this discussion: https://lore.kernel.org/netdev/20210117193009.io3nungdwuzmo5f7@skbuf/ the switchdev notifiers for FDB entries managed to have a zero-day bug. The bridge would not say that this entry is local: ip link add br0 type bridge ip link set swp0 master br0 bridge fdb add dev swp0 00:01:02:03:04:05 master local and the switchdev driver would be more than happy to offload it as a normal static FDB entry. This is despite the fact that 'local' and non-'local' entries have completely opposite directions: a local entry is locally terminated and not forwarded, whereas a static entry is forwarded and not locally terminated. So, for example, DSA would install this entry on swp0 instead of installing it on the CPU port as it should. There is an even sadder part, which is that the 'local' flag is implicit if 'static' is not specified, meaning that this command produces the same result of adding a 'local' entry: bridge fdb add dev swp0 00:01:02:03:04:05 master I've updated the man pages for 'bridge', and after reading it now, it should be pretty clear to any user that the commands above were broken and should have never resulted in the 00:01:02:03:04:05 address being forwarded (this behavior is coherent with non-switchdev interfaces): https://patchwork.kernel.org/project/netdevbpf/cover/20210211104502.2081443-1-olteanv@gmail.com/ If you're a user reading this and this is what you want, just use: bridge fdb add dev swp0 00:01:02:03:04:05 master static Because switchdev should have given drivers the means from day one to classify FDB entries as local/non-local, but didn't, it means that all drivers are currently broken. So we can just as well omit the switchdev notifications for local FDB entries, which is exactly what this patch does to close the bug in stable trees. For further development work where drivers might want to trap the local FDB entries to the host, we can add a 'bool is_local' to br_switchdev_fdb_call_notifiers(), and selectively make drivers act upon that bit, while all the others ignore those entries if the 'is_local' bit is set. Fixes: `6b26b51b1d` ("net: bridge: Add support for notifying devices about FDB add/del") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:39:41 -07:00
Marcelo Ricardo Leitner	8ca1b090e5	net/sched: act_ct: clear post_ct if doing ct_clear Invalid detection works with two distinct moments: act_ct tries to find a conntrack entry and set post_ct true, indicating that that was attempted. Then, when flow dissector tries to dissect CT info and no entry is there, it knows that it was tried and no entry was found, and synthesizes/sets key->ct_state = TCA_FLOWER_KEY_CT_FLAGS_TRACKED \| TCA_FLOWER_KEY_CT_FLAGS_INVALID; mimicing what OVS does. OVS has this a bit more streamlined, as it recomputes the key after trying to find a conntrack entry for it. Issue here is, when we have 'tc action ct clear', it didn't clear post_ct, causing a subsequent match on 'ct_state -trk' to fail, due to the above. The fix, thus, is to clear it. Reproducer rules: tc filter add dev enp130s0f0np0_0 ingress prio 1 chain 0 \ protocol ip flower ip_proto tcp ct_state -trk \ action ct zone 1 pipe \ action goto chain 2 tc filter add dev enp130s0f0np0_0 ingress prio 1 chain 2 \ protocol ip flower \ action ct clear pipe \ action goto chain 4 tc filter add dev enp130s0f0np0_0 ingress prio 1 chain 4 \ protocol ip flower ct_state -trk \ action mirred egress redirect dev enp130s0f1np1_0 With the fix, the 3rd rule matches, like it does with OVS kernel datapath. Fixes: `7baf2429a1` ("net/sched: cls_flower add CT_FLAGS_INVALID flag support") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:32:26 -07:00
Xie He	65d2dbb300	net: lapb: Make "lapb_t1timer_running" able to detect an already running timer Problem: The "lapb_t1timer_running" function in "lapb_timer.c" is used in only one place: in the "lapb_kick" function in "lapb_out.c". "lapb_kick" calls "lapb_t1timer_running" to check if the timer is already pending, and if it is not, schedule it to run. However, if the timer has already fired and is running, and is waiting to get the "lapb->lock" lock, "lapb_t1timer_running" will not detect this, and "lapb_kick" will then schedule a new timer. The old timer will then abort when it sees a new timer pending. I think this is not right. The purpose of "lapb_kick" should be ensuring that the actual work of the timer function is scheduled to be done. If the timer function is already running but waiting for the lock, "lapb_kick" should not abort and reschedule it. Changes made: I added a new field "t1timer_running" in "struct lapb_cb" for "lapb_t1timer_running" to use. "t1timer_running" will accurately reflect whether the actual work of the timer is pending. If the timer has fired but is still waiting for the lock, "t1timer_running" will still correctly reflect whether the actual work is waiting to be done. The old "t1timer_stop" field, whose only responsibility is to ask a timer (that is already running but waiting for the lock) to abort, is no longer needed, because the new "t1timer_running" field can fully take over its responsibility. Therefore "t1timer_stop" is deleted. "t1timer_running" is not simply a negation of the old "t1timer_stop". At the end of the timer function, if it does not reschedule itself, "t1timer_running" is set to false to indicate that the timer is stopped. For consistency of the code, I also added "t2timer_running" and deleted "t2timer_stop". Signed-off-by: Xie He <xie.he.0141@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-23 14:14:50 -07:00
Sven Eckelmann	5fc087ff96	batman-adv: Drop unused header preempt.h The commit `b1de0f01b0` ("batman-adv: Use netif_rx_any_context().") removed the last user for a function declaration from linux/preempt.h. The include should therefore be cleaned up. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2021-03-23 21:52:28 +01:00
Linus Lüssing	549750babe	batman-adv: Fix order of kernel doc in batadv_priv During the inlining process of kerneldoc in commit `8b84cc4fb5` ("batman-adv: Use inline kernel-doc for enum/struct"), some comments were placed at the wrong struct members. Fixing this by reordering the comments. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2021-03-23 21:49:14 +01:00
Meng Yu	c29fb5f650	Bluetooth: Remove trailing semicolon in macros remove trailing semicolon in macros and coding style fix. Signed-off-by: Meng Yu <yumeng18@huawei.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-03-23 16:05:35 +01:00
Archie Pusaka	3af70b39fa	Bluetooth: check for zapped sk before connecting There is a possibility of receiving a zapped sock on l2cap_sock_connect(). This could lead to interesting crashes, one such case is tearing down an already tore l2cap_sock as is happened with this call trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0xc4/0x118 lib/dump_stack.c:56 register_lock_class kernel/locking/lockdep.c:792 [inline] register_lock_class+0x239/0x6f6 kernel/locking/lockdep.c:742 __lock_acquire+0x209/0x1e27 kernel/locking/lockdep.c:3105 lock_acquire+0x29c/0x2fb kernel/locking/lockdep.c:3599 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:137 [inline] _raw_spin_lock_bh+0x38/0x47 kernel/locking/spinlock.c:175 spin_lock_bh include/linux/spinlock.h:307 [inline] lock_sock_nested+0x44/0xfa net/core/sock.c:2518 l2cap_sock_teardown_cb+0x88/0x2fb net/bluetooth/l2cap_sock.c:1345 l2cap_chan_del+0xa3/0x383 net/bluetooth/l2cap_core.c:598 l2cap_chan_close+0x537/0x5dd net/bluetooth/l2cap_core.c:756 l2cap_chan_timeout+0x104/0x17e net/bluetooth/l2cap_core.c:429 process_one_work+0x7e3/0xcb0 kernel/workqueue.c:2064 worker_thread+0x5a5/0x773 kernel/workqueue.c:2196 kthread+0x291/0x2a6 kernel/kthread.c:211 ret_from_fork+0x4e/0x80 arch/x86/entry/entry_64.S:604 Signed-off-by: Archie Pusaka <apusaka@chromium.org> Reported-by: syzbot+abfc0f5e668d4099af73@syzkaller.appspotmail.com Reviewed-by: Alain Michaud <alainm@chromium.org> Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2021-03-23 16:03:48 +01:00

... 5 6 7 8 9 ...

65058 Commits