OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Sunil Goutham	e8e095b3b3	octeontx2-af: cn10k: Bandwidth profiles config support CN10K silicon supports hierarchical ingress packet ratelimiting. Three levels of profilers are supported: leaf, mid and top. Ratelimiting is done after the packet forwarding decision is taken and a NIXLF's RQ is identified to DMA the packet. The RQ's context points to a leaf bandwidth profile, which can be configured to achieve the desired ratelimit. This patch adds logic for management of these bandwidth profiles, i.e. profile alloc, free, context update etc. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 11:11:05 -07:00
David S. Miller	ad5645d7b9	Merge branch 'pci200syn-cleanups' Peng Li says: ==================== net: pci200syn: clean up some code style issues This patchset cleans up some code style issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 11:03:17 -07:00
Peng Li	6855d301e9	net: pci200syn: fix the comments style issue Networking block comments don't use an empty /* line, use /* Comment... This patch fixes the comments style issues. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 11:03:17 -07:00
Peng Li	8e7680c102	net: pci200syn: add necessary () to macro argument Macro argument 'card' may be better as '(card)' to avoid precedence issues. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 11:03:17 -07:00
Peng Li	2b63744668	net: pci200syn: add some required spaces Add spaces required after that close brace '}'. Add spaces required before the open parenthesis '('. Add spaces required after that ','. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 11:03:17 -07:00
Peng Li	b9282333ef	net: pci200syn: replace comparison to NULL with "!card" According to checkpatch.pl, comparison to NULL could be written "!card". Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 11:03:17 -07:00
Peng Li	f9a03eae28	net: pci200syn: add blank line after declarations This patch fixes the checkpatch error about missing a blank line after declarations. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 11:03:17 -07:00
Peng Li	bbcb2840b0	net: pci200syn: remove redundant blank lines This patch removes some redundant blank lines. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 11:03:16 -07:00
David S. Miller	5938b227ca	Merge branch 'z85230-cleanups' Peng Li says: ==================== net: z85230: clean up some code style issues This patchset cleans up some code style issues. --- Change Log: V1 -> V2: 1, fix the comments from Andrew, add commit message to [patch 04/11] about removing volatile. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:55:18 -07:00
Peng Li	2b28b711ac	net: z85230: remove unnecessary out of memory message This patch removes unnecessary out of memory message, to fix the following checkpatch.pl warning: "WARNING: Possible unnecessary 'out of memory' message" Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:55:18 -07:00
Peng Li	00a580db9e	net: z85230: fix the code style issue about open brace { This patch fixes the code style issue according to checkpatch.pl error: "ERROR: that open brace { should be on the previous line". Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:55:18 -07:00
Peng Li	b87a5cf656	net: z85230: add some required spaces Add space required before the open parenthesis '(' and '{'. Add space required after that close brace '}' and ','. Add spaces required around that '=', '&', '*', '|', '+', '/' and '-'. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:55:18 -07:00
Peng Li	a04544ffe8	net: z85230: remove trailing whitespaces This patch removes trailing whitespaces. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:55:18 -07:00
Peng Li	57b6de35cf	net: z85230: fix the code style issue about "if..else.." According to checkpatch.pl, else should follow close brace '}', and braces {} should be used on all arms of this statement. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:55:18 -07:00
Peng Li	c6c3ba4578	net: z85230: fix the comments style issue Networking block comments don't use an empty /* line, use /* Comment... Block comments use * on subsequent lines. Block comments use a trailing */ on a separate line. This patch fixes the comments style issues. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:55:18 -07:00
Peng Li	b55932bcfa	net: z85230: replace comparison to NULL with "!skb" According to checkpatch.pl, comparison to NULL could be written "!skb". Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:55:18 -07:00
Peng Li	e07a1f9cbd	net: z85230: fix the code style issue about EXPORT_SYMBOL(foo) According to checkpatch.pl, EXPORT_SYMBOL(foo); should immediately follow its function/variable. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:55:18 -07:00
Peng Li	61312d78e1	net: z85230: add blank line after declarations This patch fixes the checkpatch error about missing a blank line after declarations. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:55:18 -07:00
Peng Li	336bac5eda	net: z85230: remove redundant blank lines This patch removes some redundant blank lines. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:55:18 -07:00
Boris Sukholitko	0dca2c7404	net/sched: cls_flower: Remove match on n_proto The following flower filters fail to match packets: tc filter add dev eth0 ingress protocol 0x8864 flower \ action simple sdata hi64 tc filter add dev eth0 ingress protocol 802.1q flower \ vlan_ethtype 0x8864 action simple sdata "hi vlan" The protocol 0x8864 (ETH_P_PPP_SES) is a tunnel protocol. As such, it is being dissected by __skb_flow_dissect and its internal protocol is being set as key->basic.n_proto. IOW, the existence of ETH_P_PPP_SES tunnel is transparent to the callers of __skb_flow_dissect. OTOH, in the filters above, cls_flower configures its key->basic.n_proto to the ETH_P_PPP_SES value configured by the user. Matching on this key fails because of __skb_flow_dissect "transparency" mentioned above. In the following, I would argue that the problem lies with cls_flower, unnecessarily attempting key->basic.n_proto match. There are 3 close places in fl_set_key in cls_flower setting up mask->basic.n_proto. They are (in reverse order of appearance in the code) due to: (a) No vlan is given: use TCA_FLOWER_KEY_ETH_TYPE parameter (b) One vlan tag is given: use TCA_FLOWER_KEY_VLAN_ETH_TYPE (c) Two vlans are given: use TCA_FLOWER_KEY_CVLAN_ETH_TYPE The match in case (a) is unneeded because flower does not have its own eth_type parameter. It was removed by Jamal Hadi Salim in commit 488b41d020fb06428b90289f70a41210718f52b7 in iproute2. For TCA_FLOWER_KEY_ETH_TYPE the userspace uses the generic tc filter protocol field. Therefore the match for the case (a) is done by tc itself. The matches in cases (b), (c) are unneeded because the protocol will appear in and will be matched by flow_dissector_key_vlan.vlan_tpid. Therefore in the best case, key->basic.n_proto will try to repeat vlan key match again. The below patch removes mask->basic.n_proto setting and resets it to 0 in case (c). Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:26:51 -07:00
Matteo Croce	a955318fe6	stmmac: align RX buffers On RX an SKB is allocated and the received buffer is copied into it. But on some architectures, the memcpy() needs the source and destination buffers to have the same alignment to be efficient. This is not our case, because SKB data pointer is misaligned by two bytes to compensate the ethernet header. Align the RX buffer the same way as the SKB one, so the copy is faster. An iperf3 RX test gives a decent improvement on a RISC-V machine: before: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 733 MBytes 615 Mbits/sec 88 sender [ 5] 0.00-10.01 sec 730 MBytes 612 Mbits/sec receiver after: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 1.10 GBytes 942 Mbits/sec 0 sender [ 5] 0.00-10.00 sec 1.09 GBytes 940 Mbits/sec receiver And the memcpy() overhead during the RX drops dramatically. before: Overhead Shared O Symbol 43.35% [kernel] [k] memcpy 33.77% [kernel] [k] __asm_copy_to_user 3.64% [kernel] [k] sifive_l2_flush64_range after: Overhead Shared O Symbol 45.40% [kernel] [k] __asm_copy_to_user 28.09% [kernel] [k] memcpy 4.27% [kernel] [k] sifive_l2_flush64_range Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 10:25:18 -07:00
Daniel Borkmann	1f26622b79	Merge branch 'bpf-sock-migration' Kuniyuki Iwashima says: ==================== The SO_REUSEPORT option allows sockets to listen on the same port and to accept connections evenly. However, there is a defect in the current implementation [1]. When a SYN packet is received, the connection is tied to a listening socket. Accordingly, when the listener is closed, in-flight requests during the three-way handshake and child sockets in the accept queue are dropped even if other listeners on the same port could accept such connections. This situation can happen when various server management tools restart server (such as nginx) processes. For instance, when we change nginx configurations and restart it, it spins up new workers that respect the new configuration and closes all listeners on the old workers, resulting in the in-flight ACK of 3WHS is responded by RST. To avoid such a situation, users have to know deeply how the kernel handles SYN packets and implement connection draining by eBPF [2]: 1. Stop routing SYN packets to the listener by eBPF. 2. Wait for all timers to expire to complete requests 3. Accept connections until EAGAIN, then close the listener. or 1. Start counting SYN packets and accept syscalls using the eBPF map. 2. Stop routing SYN packets. 3. Accept connections up to the count, then close the listener. In either way, we cannot close a listener immediately. However, ideally, the application need not drain the not yet accepted sockets because 3WHS and tying a connection to a listener are just the kernel behaviour. The root cause is within the kernel, so the issue should be addressed in kernel space and should not be visible to user space. This patchset fixes it so that users need not take care of kernel implementation and connection draining. With this patchset, the kernel redistributes requests and connections from a listener to the others in the same reuseport group at/after close or shutdown syscalls. 
Although some software does connection draining, there are still merits in migration. For some security reasons, such as replacing TLS certificates, we may want to apply new settings as soon as possible and/or we may not be able to wait for connection draining. The sockets in the accept queue have not started application sessions yet. So, if we do not drain such sockets, they can be handled by the newer listeners and could have a longer lifetime. It is difficult to drain all connections in every case, but we can decrease such aborted connections by migration. In that sense, migration is always better than draining. Moreover, auto-migration simplifies user space logic and also works well in a case where we cannot modify and build a server program to implement the workaround. Note that the source and destination listeners MUST have the same settings at the socket API level; otherwise, applications may face inconsistency and cause errors. In such a case, we have to use the eBPF program to select a specific listener or to cancel migration. Special thanks to Martin KaFai Lau for bouncing ideas and exchanging code snippets along the way. Link: [1] The SO_REUSEPORT socket option https://lwn.net/Articles/542629/ [2] Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode https://lore.kernel.org/netdev/1458828813.10868.65.camel@edumazet-glaptop3.roam.corp.google.com/ Changelog: v8: * Make reuse const in reuseport_sock_index() * Don't use __reuseport_add_sock() in reuseport_alloc() * Change the arg of the second memcpy() in reuseport_grow() * Fix coding style to use goto in reuseport_alloc() * Keep sk_refcnt uninitialized in inet_reqsk_clone() * Initialize ireq_opt and ipv6_opt separately in reqsk_migrate_reset() [ This series does not include a stats patch suggested by Yuchung Cheng not to drop Acked-by/Reviewed-by tags and save reviewer's time. I will post the patch as a follow up after this series is merged. ]
v7: https://lore.kernel.org/bpf/20210521182104.18273-1-kuniyu@amazon.co.jp/ * Prevent attaching/detaching a bpf prog via shutdowned socket * Fix typo in commit messages * Split selftest into subtests v6: https://lore.kernel.org/bpf/20210517002258.75019-1-kuniyu@amazon.co.jp/ * Change description in ip-sysctl.rst * Test IPPROTO_TCP before reading tfo_listener * Move reqsk_clone() to inet_connection_sock.c and rename to inet_reqsk_clone() * Pass req->rsk_listener to inet_csk_reqsk_queue_drop() and reqsk_queue_removed() in the migration path of receiving ACK * s/ARG_PTR_TO_SOCKET/PTR_TO_SOCKET/ in sk_reuseport_is_valid_access() * In selftest, use atomic ops to increment global vars, drop ACK by XDP, enable force fastopen, use "skel->bss" instead of "skel->data" v5: https://lore.kernel.org/bpf/20210510034433.52818-1-kuniyu@amazon.co.jp/ * Move initialization of sk_node from 6th to 5th patch * Initialize sk_refcnt in reqsk_clone() * Modify some definitions in reqsk_timer_handler() * Validate in which path/state migration happens in selftest v4: https://lore.kernel.org/bpf/20210427034623.46528-1-kuniyu@amazon.co.jp/ * Make some functions and variables 'static' in selftest * Remove 'scalability' from the cover letter v3: https://lore.kernel.org/bpf/20210420154140.80034-1-kuniyu@amazon.co.jp/ * Add sysctl back for reuseport_grow() * Add helper functions to manage socks[] * Separate migration related logic into functions: reuseport_resurrect(), reuseport_stop_listen_sock(), reuseport_migrate_sock() * Clone request_sock to be migrated * Migrate request one by one * Pass child socket to eBPF prog v2: https://lore.kernel.org/netdev/20201207132456.65472-1-kuniyu@amazon.co.jp/ * Do not save closed sockets in socks[] * Revert `607904c357` * Extract inet_csk_reqsk_queue_migrate() into a single patch * Change the spin_lock order to avoid lockdep warning * Add static to __reuseport_select_sock * Use refcount_inc_not_zero() in reuseport_select_migrated_sock() * Set the default attach type in bpf_prog_load_check_attach() * Define new proto of BPF_FUNC_get_socket_cookie * Fix test to be compiled successfully * Update commit messages v1: https://lore.kernel.org/netdev/20201201144418.35045-1-kuniyu@amazon.co.jp/ * Remove the sysctl option * Enable migration if eBPF program is not attached * Add expected_attach_type to check if eBPF program can migrate sockets * Add a field to tell migration type to eBPF program * Support BPF_FUNC_get_socket_cookie to get the cookie of sk * Allocate an empty skb if skb is NULL * Pass req_to_sk(req)->sk_hash because listener's hash is zero * Update commit messages and cover letter RFC: https://lore.kernel.org/netdev/20201117094023.3685-1-kuniyu@amazon.co.jp/ ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2021-06-15 18:01:14 +02:00
Kuniyuki Iwashima	c9d0bdef89	bpf: Test BPF_SK_REUSEPORT_SELECT_OR_MIGRATE. This patch adds a test for BPF_SK_REUSEPORT_SELECT_OR_MIGRATE and removes 'static' from settimeo() in network_helpers.c. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-12-kuniyu@amazon.co.jp	2021-06-15 18:01:06 +02:00
Kuniyuki Iwashima	50501271e7	libbpf: Set expected_attach_type for BPF_PROG_TYPE_SK_REUSEPORT. This commit introduces a new section (sk_reuseport/migrate) and sets expected_attach_type for each of the two sections of BPF_PROG_TYPE_SK_REUSEPORT programs. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-11-kuniyu@amazon.co.jp	2021-06-15 18:01:06 +02:00
Kuniyuki Iwashima	d5e4ddaeb6	bpf: Support socket migration by eBPF. This patch introduces a new bpf_attach_type for BPF_PROG_TYPE_SK_REUSEPORT to check if the attached eBPF program is capable of migrating sockets. When the eBPF program is attached, we run it for socket migration if the expected_attach_type is BPF_SK_REUSEPORT_SELECT_OR_MIGRATE or net.ipv4.tcp_migrate_req is enabled. Currently, the expected_attach_type is not enforced for the BPF_PROG_TYPE_SK_REUSEPORT type of program. Thus, this commit follows the earlier idea in the commit `aac3fc320d` ("bpf: Post-hooks for sys_bind") to fix up the zero expected_attach_type in bpf_prog_load_fixup_attach_type(). Moreover, this patch adds a new field (migrating_sk) to sk_reuseport_md to select a new listener based on the child socket. migrating_sk varies depending on whether it is migrating a request in the accept queue or during 3WHS. - accept_queue : sock (ESTABLISHED/SYN_RECV) - 3WHS : request_sock (NEW_SYN_RECV) In the eBPF program, we can select a new listener by BPF_FUNC_sk_select_reuseport(). Also, we can cancel migration by returning SK_DROP. This feature is useful when listeners have different settings at the socket API level or when we want to free resources as soon as possible. - SK_PASS with selected_sk, select it as a new listener - SK_PASS with selected_sk NULL, falls back to the random selection - SK_DROP, cancel the migration. There is a noteworthy point. We select a listening socket in three places, but we do not have struct skb at closing a listener or retransmitting a SYN+ACK. On the other hand, some helper functions do not expect skb is NULL (e.g. skb_header_pointer() in BPF_FUNC_skb_load_bytes(), skb_tail_pointer() in BPF_FUNC_skb_load_bytes_relative()). So we allocate an empty skb temporarily before running the eBPF program. 
Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/netdev/20201123003828.xjpjdtk4ygl6tg6h@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/netdev/20201203042402.6cskdlit5f3mw4ru@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/netdev/20201209030903.hhow5r53l6fmozjn@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/bpf/20210612123224.12525-10-kuniyu@amazon.co.jp	2021-06-15 18:01:06 +02:00
Kuniyuki Iwashima	e061047684	bpf: Support BPF_FUNC_get_socket_cookie() for BPF_PROG_TYPE_SK_REUSEPORT. We will call sock_reuseport.prog for socket migration in the next commit, so the eBPF program has to know which listener is closing to select a new listener. We can currently get a unique ID of each listener in the userspace by calling bpf_map_lookup_elem() for BPF_MAP_TYPE_REUSEPORT_SOCKARRAY map. This patch makes the pointer of sk available in sk_reuseport_md so that we can get the ID by BPF_FUNC_get_socket_cookie() in the eBPF program. Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/netdev/20201119001154.kapwihc2plp4f7zc@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/bpf/20210612123224.12525-9-kuniyu@amazon.co.jp	2021-06-15 18:01:06 +02:00
Kuniyuki Iwashima	d4f2c86b2b	tcp: Migrate TCP_NEW_SYN_RECV requests at receiving the final ACK. This patch also changes the code to call reuseport_migrate_sock() and inet_reqsk_clone(), but unlike the other cases, we do not call inet_reqsk_clone() right after reuseport_migrate_sock(). Currently, in the receive path for TCP_NEW_SYN_RECV sockets, its listener has three kinds of refcnt: (A) for listener itself (B) carried by request_sock (C) sock_hold() in tcp_v[46]_rcv() While processing the req, (A) may disappear by close(listener). Also, (B) can disappear by accept(listener) once we put the req into the accept queue. So, we have to hold another refcnt (C) for the listener to prevent use-after-free. For socket migration, we call reuseport_migrate_sock() to select a listener with (A) and to increment the new listener's refcnt in tcp_v[46]_rcv(). This refcnt corresponds to (C) and is cleaned up later in tcp_v[46]_rcv(). Thus we have to take another refcnt (B) for the newly cloned request_sock. In inet_csk_complete_hashdance(), we hold the count (B), clone the req, and try to put the new req into the accept queue. By migrating req after winning the "own_req" race, we can avoid a worst case such as: CPU 1 looks up req1 CPU 2 looks up req1, unhashes it, then CPU 1 loses the race CPU 3 looks up req2, unhashes it, then CPU 2 loses the race ... Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-8-kuniyu@amazon.co.jp	2021-06-15 18:01:06 +02:00
Kuniyuki Iwashima	c905dee622	tcp: Migrate TCP_NEW_SYN_RECV requests at retransmitting SYN+ACKs. As with the preceding patch, this patch changes reqsk_timer_handler() to call reuseport_migrate_sock() and inet_reqsk_clone() to migrate in-flight requests at retransmitting SYN+ACKs. If we can select a new listener and clone the request, we resume setting the SYN+ACK timer for the new req. If we can set the timer, we call inet_ehash_insert() to unhash the old req and put the new req into ehash. The noteworthy point here is that by unhashing the old req, another CPU processing it may lose the "own_req" race in tcp_v[46]_syn_recv_sock() and drop the final ACK packet. However, the new timer will recover this situation. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-7-kuniyu@amazon.co.jp	2021-06-15 18:01:06 +02:00
Kuniyuki Iwashima	54b92e8419	tcp: Migrate TCP_ESTABLISHED/TCP_SYN_RECV sockets in accept queues. When we call close() or shutdown() for listening sockets, each child socket in the accept queue is freed at inet_csk_listen_stop(). If we can get a new listener by reuseport_migrate_sock() and clone the request by inet_reqsk_clone(), we try to add it into the new listener's accept queue by inet_csk_reqsk_queue_add(). If it fails, we have to call __reqsk_free() to call sock_put() for its listener and free the cloned request. After putting the full socket into ehash, tcp_v[46]_syn_recv_sock() sets NULL to ireq_opt/pktopts in struct inet_request_sock, but ipv6_opt can be non-NULL. So, we have to set NULL to ipv6_opt of the old request to avoid double free. Note that we do not update req->rsk_listener and instead clone the req to migrate because another path may reference the original request. If we protected it by RCU, we would need to add rcu_read_lock() in many places. Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/netdev/20201209030903.hhow5r53l6fmozjn@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/bpf/20210612123224.12525-6-kuniyu@amazon.co.jp	2021-06-15 18:01:05 +02:00
Kuniyuki Iwashima	1cd62c2157	tcp: Add reuseport_migrate_sock() to select a new listener. reuseport_migrate_sock() does the same check done in reuseport_listen_stop_sock(). If the reuseport group is capable of migration, reuseport_migrate_sock() selects a new listener by the child socket hash and increments the listener's sk_refcnt beforehand. Thus, if we fail in the migration, we have to decrement it later. We will support migration by eBPF in the later commits. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-5-kuniyu@amazon.co.jp	2021-06-15 18:01:05 +02:00
Kuniyuki Iwashima	333bb73f62	tcp: Keep TCP_CLOSE sockets in the reuseport group. When we close a listening socket, to migrate its connections to another listener in the same reuseport group, we have to handle two kinds of child sockets. One is that a listening socket has a reference to, and the other is not. The former is the TCP_ESTABLISHED/TCP_SYN_RECV sockets, and they are in the accept queue of their listening socket. So we can pop them out and push them into another listener's queue at close() or shutdown() syscalls. On the other hand, the latter, the TCP_NEW_SYN_RECV socket is during the three-way handshake and not in the accept queue. Thus, we cannot access such sockets at close() or shutdown() syscalls. Accordingly, we have to migrate immature sockets after their listening socket has been closed. Currently, if their listening socket has been closed, TCP_NEW_SYN_RECV sockets are freed at receiving the final ACK or retransmitting SYN+ACKs. At that time, if we could select a new listener from the same reuseport group, no connection would be aborted. However, we cannot do that because reuseport_detach_sock() sets NULL to sk_reuseport_cb and forbids access to the reuseport group from closed sockets. This patch allows TCP_CLOSE sockets to remain in the reuseport group and access it while any child socket references them. The point is that reuseport_detach_sock() was called twice from inet_unhash() and sk_destruct(). This patch replaces the first reuseport_detach_sock() with reuseport_stop_listen_sock(), which checks if the reuseport group is capable of migration. If capable, it decrements num_socks, moves the socket backwards in socks[] and increments num_closed_socks. When all connections are migrated, sk_destruct() calls reuseport_detach_sock() to remove the socket from socks[], decrement num_closed_socks, and set NULL to sk_reuseport_cb. By this change, closed or shutdowned sockets can keep sk_reuseport_cb. 
Consequently, calling listen() after shutdown() can cause EADDRINUSE or EBUSY in inet_csk_bind_conflict() or reuseport_add_sock() which expects such sockets not to have the reuseport group. Therefore, this patch also loosens such validation rules so that a socket can listen again if it has a reuseport group with num_closed_socks more than 0. When such sockets listen again, we handle them in reuseport_resurrect(). If there is an existing reuseport group (reuseport_add_sock() path), we move the socket from the old group to the new one and free the old one if necessary. If there is no existing group (reuseport_alloc() path), we allocate a new reuseport group, detach sk from the old one, and free it if necessary, not to break the current shutdown behaviour: - we cannot carry over the eBPF prog of shutdowned sockets - we cannot attach/detach an eBPF prog to/from listening sockets via shutdowned sockets Note that when the number of sockets gets over U16_MAX, we try to detach a closed socket randomly to make room for the new listening socket in reuseport_grow(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-4-kuniyu@amazon.co.jp	2021-06-15 18:01:05 +02:00
Kuniyuki Iwashima	5c040eaf5d	tcp: Add num_closed_socks to struct sock_reuseport. As noted in the following commit, a closed listener has to hold the reference to the reuseport group for socket migration. This patch adds a field (num_closed_socks) to struct sock_reuseport to manage closed sockets within the same reuseport group. Moreover, this and the following commits introduce some helper functions to split socks[] into two sections and keep TCP_LISTEN and TCP_CLOSE sockets in each section. Like a double-ended queue, we will place TCP_LISTEN sockets from the front and TCP_CLOSE sockets from the end. TCP_LISTEN----------> <-------TCP_CLOSE +---+---+ --- +---+ --- +---+ --- +---+ \| 0 \| 1 \| ... \| i \| ... \| j \| ... \| k \| +---+---+ --- +---+ --- +---+ --- +---+ i = num_socks - 1 j = max_socks - num_closed_socks k = max_socks - 1 This patch also extends reuseport_add_sock() and reuseport_grow() to support num_closed_socks. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-3-kuniyu@amazon.co.jp	2021-06-15 18:01:05 +02:00
Kuniyuki Iwashima	f9ac779f88	net: Introduce net.ipv4.tcp_migrate_req. This commit adds a new sysctl option: net.ipv4.tcp_migrate_req. If this option is enabled or eBPF program is attached, we will be able to migrate child sockets from a listener to another in the same reuseport group after close() or shutdown() syscalls. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-2-kuniyu@amazon.co.jp	2021-06-15 18:01:05 +02:00
Kalle Valo	f39c2d1a18	Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git ath.git patches for v5.14. Major changes: ath11k * support for WCN6855 PCI hardware wcn36xx * WoWLAN support with magic packets and GTK rekeying	2021-06-15 18:47:30 +03:00
Johannes Berg	8f78caa226	wil6210: remove erroneous wiphy locking We already hold the wiphy lock in all cases when we get here, so this would deadlock, remove the erroneous locking. Fixes: `a05829a722` ("cfg80211: avoid holding the RTNL when calling the driver") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210426212929.83f1de07c2cd.I630a2a00eff185ba0452324b3d3f645e01128a95@changeid	2021-06-15 17:06:40 +03:00
Jiapeng Chong	75596eabd6	ath6kl: Fix inconsistent indenting Eliminate the following smatch warning: drivers/net/wireless/ath/ath6kl/cfg80211.c:3308 ath6kl_cfg80211_sscan_start() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1622026376-68524-1-git-send-email-jiapeng.chong@linux.alibaba.com	2021-06-15 17:06:15 +03:00
Seevalamuthu Mariappan	979ebc54cf	ath11k: send beacon template after vdev_start/restart during csa Firmware has added assert if beacon template is received after vdev_down. Firmware expects beacon template after vdev_start and before vdev_up. This change is needed to support MBSSID EMA cases in firmware. Hence, Change the sequence in ath11k as expected from firmware. This new change is not causing any issues with older firmware. Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.5.0.1.r3-00011-QCAHKSWPL_SILICONZ-1 Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.5.0.1.r4-00008-QCAHKSWPL_SILICONZ-1 Fixes: `d5c65159f2` ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Signed-off-by: Seevalamuthu Mariappan <seevalam@codeaurora.org> [sven@narfation.org: added tested-on/fixes information] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210525133028.2805615-1-sven@narfation.org	2021-06-15 17:05:19 +03:00
Yang Yingliang	ea1c2023ef	ath10k: Use devm_platform_get_and_ioremap_resource() Use devm_platform_get_and_ioremap_resource() to simplify code. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210605110227.2429420-1-yangyingliang@huawei.com	2021-06-15 17:04:51 +03:00
Shaokun Zhang	a8b1de7f4f	ath10k: remove the repeated declaration Functions 'ath10k_pci_free_pipes' and 'ath10k_wmi_alloc_skb' are declared twice in their header file, so remove the repeated declaration. Cc: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1622448459-50805-1-git-send-email-zhangshaokun@hisilicon.com	2021-06-15 17:03:51 +03:00
Yang Li	e9ca70c735	ath10k: Fix an error code in ath10k_add_interface() When the code execute this if statement, the value of ret is 0. However, we can see from the ath10k_warn() log that the value of ret should be -EINVAL. Clean up smatch warning: drivers/net/wireless/ath/ath10k/mac.c:5596 ath10k_add_interface() warn: missing error code 'ret' Reported-by: Abaci Robot <abaci@linux.alibaba.com> Fixes: `ccec9038c7` ("ath10k: enable raw encap mode and software crypto engine") Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1621939577-62218-1-git-send-email-yang.lee@linux.alibaba.com	2021-06-15 17:03:21 +03:00
Christophe JAILLET	515bda1d1e	ath11k: Fix an error handling path in ath11k_core_fetch_board_data_api_n() All error paths but this one 'goto err' in order to release some resources. Fix this. Fixes: `d5c65159f2` ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/e959eb544f3cb04258507d8e25a6f12eab126bde.1621676864.git.christophe.jaillet@wanadoo.fr	2021-06-15 17:02:03 +03:00
Yang Shen	9d1bb2289b	wil6210: Fix wrong function name in comments Fixes the following W=1 kernel build warning(s): drivers/net/wireless/ath/wil6210/interrupt.c:28: warning: expecting prototype for Theory of operation(). Prototype was for WIL6210_IRQ_DISABLE() instead drivers/net/wireless/ath/wil6210/wmi.c:227: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst drivers/net/wireless/ath/wil6210/wmi.c:245: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst drivers/net/wireless/ath/wil6210/wmi.c:263: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Cc: Maya Erez <merez@codeaurora.org> Signed-off-by: Yang Shen <shenyang39@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210517050141.61488-4-shenyang39@huawei.com	2021-06-15 17:01:25 +03:00
Yang Shen	2d1f8673ad	ath: Fix wrong function name in comments Fixes the following W=1 kernel build warning(s): drivers/net/wireless/ath/hw.c:119: warning: expecting prototype for ath_hw_set_bssid_mask(). Prototype was for ath_hw_setbssidmask() instead Signed-off-by: Yang Shen <shenyang39@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210517050141.61488-3-shenyang39@huawei.com	2021-06-15 17:01:24 +03:00
Yang Shen	3b0c7b2415	ath5k: Fix wrong function name in comments Fixes the following W=1 kernel build warning(s): drivers/net/wireless/ath/ath5k/pcu.c:865: warning: expecting prototype for at5k_hw_stop_rx_pcu(). Prototype was for ath5k_hw_stop_rx_pcu() instead Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Nick Kossifidis <mickflemm@gmail.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Yang Shen <shenyang39@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210517050141.61488-2-shenyang39@huawei.com	2021-06-15 17:01:24 +03:00
Martin Fuzzey	314538041b	rsi: fix AP mode with WPA failure due to encrypted EAPOL In AP mode WPA2-PSK connections were not established. The reason was that the AP was sending the first message of the 4 way handshake encrypted, even though no pairwise key had (correctly) yet been set. Encryption was enabled if the "security_enable" driver flag was set and encryption was not explicitly disabled by IEEE80211_TX_INTFL_DONT_ENCRYPT. However security_enable was set when any key, including the AP GTK key, had been set which was causing unwanted encryption even if no key was available for the unicast packet to be sent. Fix this by adding a check that we have a key and drop the old security_enable driver flag which is insufficient and redundant. The Redpine downstream out of tree driver does it this way too. Regarding the Fixes tag the actual code being modified was introduced earlier, with the original driver submission, in `dad0d04fa7` ("rsi: Add RS9113 wireless driver"), however at that time AP mode was not yet supported so there was no bug at that point. So I have tagged the introduction of AP support instead which was part of the patch set "rsi: support for AP mode" [1] It is not clear whether AP WPA has ever worked, I can see nothing on the kernel side that broke it afterwards yet the AP support patch series says "Tests are performed to confirm aggregation, connections in WEP and WPA/WPA2 security." One possibility is that the initial tests were done with a modified userspace (hostapd). [1] https://www.spinics.net/lists/linux-wireless/msg165302.html Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group> Fixes: `38ef62353a` ("rsi: security enhancements for AP mode") CC: stable@vger.kernel.org Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1622564459-24430-1-git-send-email-martin.fuzzey@flowbird.group	2021-06-15 16:42:18 +03:00
YueHaibing	8667ab49a6	libertas: use DEVICE_ATTR_RW macro Use DEVICE_ATTR_RW helper instead of plain DEVICE_ATTR, which makes the code a bit shorter and easier to read. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210523040339.2724-1-yuehaibing@huawei.com	2021-06-15 16:41:44 +03:00
Hang Zhang	3f60f46856	cw1200: Revert unnecessary patches that fix unreal use-after-free bugs A previous commit `4f68ef64cd` ("cw1200: Fix concurrency use-after-free bugs in cw1200_hw_scan()") tried to fix a seemingly use-after-free bug between cw1200_bss_info_changed() and cw1200_hw_scan(), where the former frees a sk_buff pointed to by frame.skb, and the latter accesses the sk_buff pointed to by frame.skb. However, this issue should be a false alarm because: (1) "frame.skb" is not a shared variable between the above two functions, because "frame" is a local function variable, each of the two functions has its own local "frame" - they just happen to have the same variable name. (2) the sk_buff(s) pointed to by these two "frame.skb" are also two different object instances, they are individually allocated by different dev_alloc_skb() within the two above functions. To free one object instance will not invalidate the access of another different one. Based on these facts, the previous commit should be unnecessary. Moreover, it also introduced a missing unlock which was addressed in a subsequent commit `51c8d24101` ("cw1200: fix missing unlock on error in cw1200_hw_scan()"). Now that the original use-after-free is unreal, these two commits should be reverted. This patch performs the reversion. Fixes: `4f68ef64cd` ("cw1200: Fix concurrency use-after-free bugs in cw1200_hw_scan()") Fixes: `51c8d24101` ("cw1200: fix missing unlock on error in cw1200_hw_scan()") Signed-off-by: Hang Zhang <zh.nvgt@gmail.com> Acked-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210521223238.25020-1-zh.nvgt@gmail.com	2021-06-15 16:41:22 +03:00
Ding Senjie	03611cc526	rtlwifi: Fix spelling of 'download' downlaod -> download Signed-off-by: Ding Senjie <dingsenjie@yulong.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210521062734.21284-1-dingsenjie@163.com	2021-06-15 16:40:54 +03:00
Yang Li	a99086057e	rtlwifi: Remove redundant assignments to ul_enc_algo Variable ul_enc_algo is being initialized with a value that is never read, it is being set again in the following switch statements in all of the case and default paths. Hence the initialization is redundant and can be removed. Clean up clang warning: drivers/net/wireless/realtek/rtlwifi/cam.c:170:6: warning: Value stored to 'ul_enc_algo' during its initialization is never read [clang-analyzer-deadcode.DeadStores] Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1621303199-1542-1-git-send-email-yang.lee@linux.alibaba.com	2021-06-15 16:39:55 +03:00
Colin Ian King	03a1b938cf	rtlwifi: rtl8723ae: remove redundant initialization of variable rtstatus The variable rtstatus is being initialized with a value that is never read, it is being updated later on. The assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210513122410.59204-1-colin.king@canonical.com	2021-06-15 16:39:37 +03:00
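
The socks[] layout described in commit 5c040eaf5d above (TCP_LISTEN entries filled from the front, TCP_CLOSE entries filled from the end) can be illustrated with a small userspace sketch. This is not the kernel code: the struct name, the helper names, the fixed capacity, and the use of plain integers in place of socket pointers are all assumptions made for illustration. Only the fields max_socks, num_socks, num_closed_socks and the index formulas come from the commit message.

```c
/* Hypothetical userspace model of the two-ended socks[] array. */
#define SKETCH_MAX_SOCKS 8

struct reuseport_sketch {
	int socks[SKETCH_MAX_SOCKS];   /* stand-in for struct sock pointers */
	unsigned int max_socks;        /* capacity of socks[] */
	unsigned int num_socks;        /* listening entries, at the front */
	unsigned int num_closed_socks; /* closed entries, at the end */
};

/* Returns nonzero when the front and back sections have met. */
static int sketch_full(const struct reuseport_sketch *r)
{
	return r->num_socks + r->num_closed_socks == r->max_socks;
}

/* Place a listening entry at the next free slot from the front. */
static int sketch_add_listener(struct reuseport_sketch *r, int id)
{
	if (sketch_full(r))
		return -1;
	r->socks[r->num_socks++] = id; /* last listener: i = num_socks - 1 */
	return 0;
}

/* Place a closed entry at the next free slot from the end. */
static int sketch_add_closed(struct reuseport_sketch *r, int id)
{
	if (sketch_full(r))
		return -1;
	r->num_closed_socks++;
	/* first closed entry: j = max_socks - num_closed_socks */
	r->socks[r->max_socks - r->num_closed_socks] = id;
	return 0;
}
```

With max_socks set to 8, adding two listeners and then one closed entry leaves the listeners at indices 0 and 1 and the closed entry at index 7, which matches i = num_socks - 1, j = max_socks - num_closed_socks and k = max_socks - 1 from the commit message. In this sketch a full array simply rejects the insertion, whereas the commit message says the kernel extends reuseport_grow() to handle num_closed_socks.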


1015991 Commits All Branches Search
