OpenCloudOS-Kernel/net/dccp
luoxuanqiang fdae4d139f Fix race for duplicate reqsk on identical SYN
[ Upstream commit ff46e3b4421923937b7f6e44ffcd3549a074f321 ]

When bonding is configured in BOND_MODE_BROADCAST mode, if two identical
SYN packets are received at the same time and processed on different CPUs,
it can potentially create the same sk (sock) but two different reqsk
(request_sock) in tcp_conn_request().

These two different reqsk will respond with two SYNACK packets, and since
the generation of the seq (ISN) incorporates a timestamp, the final two
SYNACK packets will have different seq values.

The consequence is that when the Client receives and replies with an ACK
to the earlier SYNACK packet, we will reset(RST) it.

========================================================================

This behavior is consistently reproducible in my local setup,
which comprises:

                  | NETA1 ------ NETB1 |
PC_A --- bond --- |                    | --- bond --- PC_B
                  | NETA2 ------ NETB2 |

- PC_A is the Server and has two network cards, NETA1 and NETA2. I have
  bonded these two cards using BOND_MODE_BROADCAST mode and configured
  them to be handled by different CPU.

- PC_B is the Client, also equipped with two network cards, NETB1 and
  NETB2, which are also bonded and configured in BOND_MODE_BROADCAST mode.

If the client attempts a TCP connection to the server, it might encounter
a failure. Capturing packets from the server side reveals:

10.10.10.10.45182 > localhost: Flags [S], seq 320236027,
10.10.10.10.45182 > localhost: Flags [S], seq 320236027,
localhost > 10.10.10.10.45182: Flags [S.], seq 2967855116,
localhost > 10.10.10.10.45182: Flags [S.], seq 2967855123, <==
10.10.10.10.45182 > localhost: Flags [.], ack 4294967290,
10.10.10.10.45182 > localhost: Flags [.], ack 4294967290,
localhost > 10.10.10.10.45182: Flags [R], seq 2967855117, <==
localhost > 10.10.10.10.45182: Flags [R], seq 2967855117,

Two SYNACKs with different seq numbers are sent by localhost,
resulting in an anomaly.

========================================================================

The attempted solution is as follows:
Add a return value to inet_csk_reqsk_queue_hash_add() to confirm if the
ehash insertion is successful (Up to now, the reason for unsuccessful
insertion is that a reqsk for the same connection has already been
inserted). If the insertion fails, release the reqsk.

Due to the refcnt, Kuniyuki suggests also adding a return value check
for the DCCP module; if ehash insertion fails, indicating a successful
insertion of the same connection, simply release the reqsk as well.

Simultaneously, In the reqsk_queue_hash_req(), the start of the
req->rsk_timer is adjusted to be after successful insertion.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: luoxuanqiang <luoxuanqiang@kylinos.cn>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240621013929.1386815-1-luoxuanqiang@kylinos.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05 09:33:48 +02:00
..
ccids dccp: tfrc: fix doc warnings in tfrc_equation.c 2021-06-10 14:08:49 -07:00
Kconfig dccp: Replace HTTP links with HTTPS ones 2020-07-13 11:54:07 -07:00
Makefile net: dccp: Remove dccpprobe module 2018-01-02 14:27:30 -05:00
ackvec.c net: dccp: Fix most of the kerneldoc warnings 2020-10-30 12:08:54 -07:00
ackvec.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
ccid.c net: dccp: Add __printf() markup to fix -Wsuggest-attribute=format 2020-10-30 11:31:46 -07:00
ccid.h net: dccp: Replace zero-length array with flexible-array member 2020-02-28 12:08:37 -08:00
dccp.h net: ioctl: Use kernel memory on protocol ioctl callbacks 2023-06-15 22:33:26 -07:00
diag.c inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->data 2020-02-27 18:50:19 -08:00
feat.c dccp: Return the correct errno code 2021-02-06 11:15:28 -08:00
feat.h dccp: Remove unused declaration dccp_feat_initialise_sysctls() 2023-07-27 17:16:26 -07:00
input.c net: dccp: Convert to use the preferred fallthrough macro 2020-08-22 12:38:34 -07:00
ipv4.c Fix race for duplicate reqsk on identical SYN 2024-07-05 09:33:48 +02:00
ipv6.c Fix race for duplicate reqsk on identical SYN 2024-07-05 09:33:48 +02:00
ipv6.h ipv6: remove hard coded limitation on ipv6_pinfo 2023-07-24 09:39:31 +01:00
minisocks.c tcp: allocate tcp_death_row outside of struct netns_ipv4 2022-01-26 19:00:31 -08:00
options.c net: dccp: Convert to use the preferred fallthrough macro 2020-08-22 12:38:34 -07:00
output.c dccp: fix data-race around dp->dccps_mss_cache 2023-08-04 18:27:58 -07:00
proto.c dccp: annotate data-races in dccp_poll() 2023-08-18 19:30:24 -07:00
qpolicy.c net: dccp: Fix most of the kerneldoc warnings 2020-10-30 12:08:54 -07:00
sysctl.c proc/sysctl: add shared variables for range check 2019-07-18 17:08:07 -07:00
timer.c dccp: annotate lockless accesses to sk->sk_err_soft 2023-03-17 08:25:05 +00:00
trace.h net: dccp: Use memset_startat() for TP zeroing 2021-11-19 11:22:49 +00:00