OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Vladimir Oltean	aac4daa894	net/sched: query offload capabilities through ndo_setup_tc() When adding optional new features to Qdisc offloads, existing drivers must reject the new configuration until they are coded up to act on it. Since modifying all drivers in lockstep with the changes in the Qdisc can create problems of its own, it would be nice if there existed an automatic opt-in mechanism for offloading optional features. Jakub proposes that we multiplex one more kind of call through ndo_setup_tc(): one where the driver populates a Qdisc-specific capability structure. First user will be taprio in further changes. Here we are introducing the definitions for the base functionality. Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220923163310.3192733-3-vladimir.oltean@nxp.com/ Suggested-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-29 18:52:01 -07:00
Yuan Can	8af535b6b1	net/tipc: Remove unused struct distr_queue_item After commit 09b5678c778f("tipc: remove dead code in tipc_net and relatives"), struct distr_queue_item is not used any more and can be removed as well. Signed-off-by: Yuan Can <yuancan@huawei.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Link: https://lore.kernel.org/r/20220928085636.71749-1-yuancan@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-29 18:48:32 -07:00
Paolo Abeni	dbae2b0628	net: skb: introduce and use a single page frag cache After commit `3226b158e6` ("net: avoid 32 x truesize under-estimation for tiny skbs") we are observing 10-20% regressions in performance tests with small packets. The perf trace points to high pressure on the slab allocator. This change tries to improve the allocation schema for small packets using an idea originally suggested by Eric: a new per CPU page frag is introduced and used in __napi_alloc_skb to cope with small allocation requests. To ensure that the above does not lead to excessive truesize underestimation, the frag size for small allocation is inflated to 1K and all the above is restricted to build with 4K page size. Note that we need to update accordingly the run-time check introduced with commit `fd9ea57f4e` ("net: add napi_get_frags_check() helper"). Alex suggested a smart page refcount schema to reduce the number of atomic operations and deal properly with pfmemalloc pages. Under small packet UDP flood, I measure a 15% peak tput increases. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Suggested-by: Alexander H Duyck <alexanderduyck@fb.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/6b6f65957c59f86a353fc09a5127e83a32ab5999.1664350652.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-29 18:48:15 -07:00
Kees Cook	7cba18332e	net: sched: cls_u32: Avoid memcpy() false-positive warning To work around a misbehavior of the compiler's ability to see into composite flexible array structs (as detailed in the coming memcpy() hardening series[1]), use unsafe_memcpy(), as the sizing, bounds-checking, and allocation are all very tightly coupled here. This silences the false-positive reported by syzbot: memcpy: detected field-spanning write (size 80) of single field "&n->sel" at net/sched/cls_u32.c:1043 (size 16) [1] https://lore.kernel.org/linux-hardening/20220901065914.1417829-2-keescook@chromium.org Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Reported-by: syzbot+a2c4601efc75848ba321@syzkaller.appspotmail.com Link: https://lore.kernel.org/lkml/000000000000a96c0b05e97f0444@google.com/ Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20220927153700.3071688-1-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-29 18:44:07 -07:00
Marek Vasut	5361660af6	dt-bindings: net: snps,dwmac: Document stmmac-axi-config subnode The stmmac-axi-config subnode is present in multiple dwmac instance DTs, document its content per snps,axi-config property description which is a phandle to this subnode. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220927012449.698915-1-marex@denx.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-29 18:43:51 -07:00
Jakub Kicinski	5493a2ad0d	docs: netlink: clarify the historical baggage of Netlink flags nlmsg_flags are full of historical baggage, inconsistencies and strangeness. Try to document it more thoroughly. Explain the meaning of the ECHO flag (and while at it clarify the comment in the uAPI). Handwave a little about the NEW request flags and how they make sense on the surface but cater to really old paradigm before commands were a thing. I will add more notes on how to make use of ECHO and discouragement for reuse of flags to the kernel-side documentation. Link: https://lore.kernel.org/r/20220927212306.823862-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-29 18:39:39 -07:00
Jakub Kicinski	accc3b4a57	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-29 14:30:51 -07:00
Linus Torvalds	511cce163b	Networking fixes for 6.0-rc8, including fixes from wifi and can. Current release - regressions: - phy: don't WARN for PHY_UP state in mdio_bus_phy_resume() - wifi: fix locking in mac80211 mlme - eth: - revert "net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()" - mlxbf_gige: fix an IS_ERR() vs NULL bug in mlxbf_gige_mdio_probe Previous releases - regressions: - wifi: fix regression with non-QoS drivers Previous releases - always broken: - mptcp: fix unreleased socket in accept queue - wifi: - don't start TX with fq->lock to fix deadlock - fix memory corruption in minstrel_ht_update_rates() - eth: - macb: fix ZynqMP SGMII non-wakeup source resume failure - mt7531: only do PLL once after the reset - usbnet: fix memory leak in usbnet_disconnect() Misc: - usb: qmi_wwan: add new usb-id for Dell branded EM7455 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Merge tag 'net-6.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from wifi and can. Current release - regressions: - phy: don't WARN for PHY_UP state in mdio_bus_phy_resume() - wifi: fix locking in mac80211 mlme - eth: - revert "net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()" - mlxbf_gige: fix an IS_ERR() vs NULL bug in mlxbf_gige_mdio_probe Previous releases - regressions: - wifi: fix regression with non-QoS drivers Previous releases - always broken: - mptcp: fix unreleased socket in accept queue - wifi: - don't start TX with fq->lock to fix deadlock - fix memory corruption in minstrel_ht_update_rates() - eth: - macb: fix ZynqMP SGMII non-wakeup source resume failure - mt7531: only do PLL once after the reset - usbnet: fix memory leak in usbnet_disconnect() Misc: - usb: qmi_wwan: add new usb-id for Dell branded EM7455" * tag 'net-6.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (30 commits) mptcp: fix unreleased socket in accept queue mptcp: factor out __mptcp_close() without socket lock net: ethernet: mtk_eth_soc: fix mask of RX_DMA_GET_SPORT{,_V2} net: mscc: ocelot: fix tagged VLAN refusal while under a VLAN-unaware bridge can: c_can: don't cache TX messages for C_CAN cores ice: xsk: drop power of 2 ring size restriction for AF_XDP ice: xsk: change batched Tx descriptor cleaning net: usb: qmi_wwan: Add new usb-id for Dell branded EM7455 selftests: Fix the if conditions of in test_extra_filter() net: phy: Don't WARN for PHY_UP state in mdio_bus_phy_resume() net: stmmac: power up/down serdes in stmmac_open/release wifi: mac80211: mlme: Fix double unlock on assoc success handling wifi: mac80211: mlme: Fix missing unlock on beacon RX wifi: mac80211: fix memory corruption in minstrel_ht_update_rates() wifi: mac80211: fix regression with non-QoS drivers wifi: mac80211: ensure vif queues are operational after start wifi: mac80211: don't start TX with fq->lock to fix deadlock wifi: cfg80211: fix MCS divisor value net: hippi: Add missing pci_disable_device() in rr_init_one() net/mlxbf_gige: Fix an IS_ERR() vs NULL bug in mlxbf_gige_mdio_probe ...	2022-09-29 08:32:53 -07:00
Linus Torvalds	da9eede6b2	Input updates for v6.0-rc7 - small fixes for iqs62x-keys and melfas_mip4 drivers - corrected register address in snvs_pwrkey driver - Synaptic driver will stop trying to use intertouch (native) mode on some Lenovo AMD devices Merge tag 'input-for-v6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - small fixes for iqs62x-keys and melfas_mip4 drivers - corrected register address in snvs_pwrkey driver - Synaptic driver will stop trying to use intertouch (native) mode on some Lenovo AMD devices * tag 'input-for-v6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: snvs_pwrkey - fix SNVS_HPVIDR1 register address Input: synaptics - disable Intertouch for Lenovo T14 and P14s AMD G1 Input: iqs62x-keys - drop unused device node references Input: melfas_mip4 - fix return value check in mip4_probe()	2022-09-29 08:22:53 -07:00
Linus Torvalds	71f1875705	ATA fixes for 6.0-rc7 Three late patches to fix problems discovered recently. * Add a horkage to disable link power management by default for the Pioneer BDR-207M and BDR-205 DVD drives (from Niklas). * 2 patches to fix setting the maximum queue depth of libsas owned ATA devices (from me). Merge tag 'ata-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ATA fixes from Damien Le Moal: "Three late patches to fix problems discovered recently: - Add a horkage to disable link power management by default for the Pioneer BDR-207M and BDR-205 DVD drives (from Niklas) - Two patches to fix setting the maximum queue depth of libsas owned ATA devices (from me)" * tag 'ata-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata-sata: Fix device queue depth control ata: libata-scsi: Fix initialization of device queue depth libata: add ATA_HORKAGE_NOLPM for Pioneer BDR-207M and BDR-205	2022-09-29 05:40:59 -07:00
Linus Torvalds	81bcd4b522	LoongArch fixes for v6.0-final Merge tag 'loongarch-fixes-6.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Some trivial fixes and cleanup" * tag 'loongarch-fixes-6.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: Clean up loongson3_smp_ops declaration LoongArch: Fix and cleanup csr_era handling in do_ri() LoongArch: Align the address of kernel_entry to 4KB	2022-09-29 05:35:32 -07:00
ruanjinjie	510bbf82f8	net: cpmac: Add __init/__exit annotations to module init/exit funcs Add __init/__exit annotations to module init/exit funcs Signed-off-by: ruanjinjie <ruanjinjie@huawei.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220928031708.89120-1-ruanjinjie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-29 13:39:58 +02:00
Yang Yingliang	0e9804cff1	ethernet: 8390: remove unnecessary check of mem The 'mem' returned by platform_get_resource() has been checked in the probe function, so there is no need to do this check in the remove function. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220927151406.797800-1-yangyingliang@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-29 11:05:23 +02:00
Shang XiaoJing	d49e265b66	nfp: Use skb_put_data() instead of skb_put/memcpy pair Use skb_put_data() instead of skb_put() and memcpy(), which is clear. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com> Link: https://lore.kernel.org/r/20220927141835.19221-1-shangxiaojing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-29 10:46:42 +02:00
Yuan Can	01c617d73f	net: liquidio: Remove unused struct lio_trusted_vf_ctx After commit 6870957ed5bc ("liquidio: make soft command calls synchronous"), no one uses struct lio_trusted_vf_ctx, so remove it. Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20220927133940.104181-1-yuancan@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-29 10:14:05 +02:00
Liu Shixin	1a0c667ea8	net: ethernet: mtk_eth_soc: use DEFINE_SHOW_ATTRIBUTE to simplify code Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code. No functional change. Signed-off-by: Liu Shixin <liushixin2@huawei.com> Link: https://lore.kernel.org/r/20220927111925.2424100-1-liushixin2@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-29 09:57:23 +02:00
Shang XiaoJing	6db239f01a	wwan_hwsim: Use skb_put_data() instead of skb_put/memcpy pair Use skb_put_data() instead of skb_put() and memcpy(), which is clear. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Link: https://lore.kernel.org/r/20220927024511.14665-1-shangxiaojing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-29 09:37:40 +02:00
Shang XiaoJing	85e69a7dd6	net: ax88796c: Use skb_put_data() instead of skb_put/memcpy pair Use skb_put_data() instead of skb_put() and memcpy(), which is clear. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Link: https://lore.kernel.org/r/20220927023043.17769-1-shangxiaojing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-29 09:37:29 +02:00
Shang XiaoJing	1469327bb3	ethernet: s2io: Use skb_put_data() instead of skb_put/memcpy pair Use skb_put_data() instead of skb_put() and memcpy(), which is shorter and clear. Drop the tmp variable that is not needed any more. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Link: https://lore.kernel.org/r/20220927022802.16050-1-shangxiaojing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-29 09:19:09 +02:00
Jakub Kicinski	ceed40d799	Merge tag 'mlx5-updates-2022-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2022-09-27 This is Part #1 of 4 parts series to align mlx5's implementation of XSK (AF_XDP) RX-Qs indexing and management with other vendors: Maxim Says: =========== xsk: Bug fixes for frame mapping on striding RQ Striding RQ relies on the driver mapping RX buffers into the NIC's virtual memory space. Currently, regardless of the XSK frame size, mlx5e maps them using MTT, and each mapping's length is PAGE_SIZE. As the result, the stride size used by striding RQ is also equal to PAGE_SIZE. This decision has the following issues: 1. In the XSK aligned mode with frame size smaller than PAGE_SIZE, it's suboptimal. Using 2K strides and 2K pages allows to post twice as fewer WQEs. 2. MTT is not suitable for unaligned frames, as it requires natural alignment theoretically, in practice at least 8-byte alignment. 3. Using mapping and stride bigger than the frame has risk of writing over the bounds of the XSK frame upon receiving packets bigger than MTU, which is possible in some specific configurations. This series addresses issues 1 and 2 and alleviates issue 3. Where possible, page and stride size will match the XSK frame size (firmware upgrade may be needed to have effect for 2K frames). Unaligned mode will use KSM instead of MTT, which allows to drop the partial workaround [1]. [1]: https://lore.kernel.org/netdev/YufYFQ6JN91lQbso@boxer/T/ ==================== Link: https://lore.kernel.org/r/20220927203611.244301-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:37:23 -07:00
Maxim Mikityanskiy	997ce6affe	net/mlx5e: Use runtime values of striding RQ parameters in datapath Some of the parameters of striding RQ are compile-time constants, but they are going to become dynamically calculated at runtime in a following commit. This commit prepares the datapath to take cached runtime parameters, prefilled at queue creation. New fields added to struct mlx5e_rq fit into an existing 7-byte hole. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:39 -07:00
Maxim Mikityanskiy	258e655c00	net/mlx5e: Make dma_info array dynamic in struct mlx5e_mpw_info This commit moves the dma_info array to the end of struct mlx5e_mpw_info to make it a flexible array. It also removes the intermediate struct mlx5e_umr_dma_info, which used to contain only this array. The flexibility of dma_info will allow to choose its size dynamically in a following commit. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:39 -07:00
Maxim Mikityanskiy	3904d2afad	net/mlx5e: Improve the MTU change shortcut Normally, the MTU change requires reopening the channels, but it can be skipped if the new MTU doesn't change any of the queue parameters and if MTU is not used in the data path. The shortcut is applicable to the non-linear mode of striding RQ, because the only thing affected by MTU is the queue length. As ethtool sets the queue length in packets, but striding RQ length is defined in strides or bytes, we estimate the RQ length to be at least as big as the requested number of MTU-sized packets, that's why it depends on MTU. Improve the shortcut by actually checking whether the RQ length stayed the same, instead of an intermediate step in the calculation. As MTU also affects the SHAMPO parameters, skip the shortcut if SHAMPO is in use. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:38 -07:00
Maxim Mikityanskiy	411295fbe6	net/mlx5e: xsk: Fix SKB headroom calculation in validation In a typical scenario, if an XSK socket is opened first, then an XDP program is attached, mlx5e_validate_xsk_param will be called twice: first on XSK bind, second on channel restart caused by enabling XDP. The validation includes a call to mlx5e_rx_is_linear_skb, which checks the presence of the XDP program. The above means that mlx5e_rx_is_linear_skb might return true the first time, but false the second time, as mlx5e_rx_get_linear_sz_skb's return value will increase, because of a different headroom used with XDP. As XSK RQs never exist without XDP, it would make sense to trick mlx5e_rx_get_linear_sz_skb into thinking XDP is enabled at the first check as well. This way, if MTU is too big, it would be detected on XSK bind, without giving false hope to the userspace application. However, it turns out that this check is too restrictive in the first place. SKBs created on XDP_PASS on XSK RQs don't have any headroom. That means that big MTUs filtered out on the first and the second checks might actually work. So, address this issue in the proper way, but taking into account the absence of the SKB headroom on XSK RQs, when calculating the buffer size. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:38 -07:00
Maxim Mikityanskiy	8c654a1bb6	net/mlx5e: xsk: Remove dead code in validation One of the checks in mlx5e_rx_is_linear_skb verifies that the RX buffer fits into the XSK frame size. Remove the duplicating check from mlx5e_validate_xsk_param. It allows to make mlx5e_rx_get_min_frag_sz static. Remove mlx5e_rx_is_xdp altogether, as its only usage is located in a branch where xsk == NULL. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:38 -07:00
Maxim Mikityanskiy	ddbef36560	net/mlx5e: Simplify stride size calculation for linear RQ Linear RX buffers must be big enough to fit the MTU-sized packet along with the headroom. On the other hand, they must be small enough to fit into a page (or into an XSK frame). A straightforward way to check whether the linear mode is possible would be comparing the required buffer size to PAGE_SIZE or XSK frame size. Stride size in the linear mode is defined by the following constraints: 1. A stride is at least as big as the buffer size, and it's a power of two. 2. If non-XSK XDP is enabled, the stride size is PAGE_SIZE, because mlx5e requires each packet to be in its own page when XDP is in use. The previous constraint is automatically fulfilled, because buffer size can't be bigger than PAGE_SIZE. 3. XSK uses stride size equal to PAGE_SIZE, but the following commits will allow it to use roundup_pow_of_two(XSK frame size), by allowing the NIC's MMU to use page sizes not equal to the CPU page size. This commit puts the above requirements and constraints straight to the code in an attempt to simplify it and to prepare it for changes made in the next patches. For the reference, the old code uses an equivalent, but trickier calculation (high-level simplified pseudocode): if XDP or XSK: mlx5e_rx_get_linear_frag_sz := max(buffer size, PAGE_SIZE) else: mlx5e_rx_get_linear_frag_sz := buffer size mlx5e_rx_is_linear_skb := mlx5e_rx_get_linear_frag_sz <= PAGE_SIZE stride size := roundup_pow_of_two(mlx5e_rx_get_linear_frag_sz) The new code effectively removes mlx5e_rx_get_linear_frag_sz that used to return either buffer size or stride size, depending on the situation, making it hard to work with and to make changes: if XDP or XSK: mlx5e_rx_get_linear_stride_sz := PAGE_SIZE else mlx5e_rx_get_linear_stride_sz := roundup_pow_of_two(buffer size) mlx5e_rx_is_linear_skb := buffer size <= (PAGE_SIZE or XSK frame sz) stride size := mlx5e_rx_get_linear_stride_sz Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:37 -07:00
Maxim Mikityanskiy	4c78782e2e	net/mlx5e: kTLS, Check ICOSQ WQE size in advance Instead of WARNing in runtime when TLS offload WQEs posted to ICOSQ are over the hardware limit, check their size before enabling TLS RX offload, and block the offload if the condition fails. It also allows to drop a u16 field from struct mlx5e_icosq. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:37 -07:00
Maxim Mikityanskiy	21a0502d59	net/mlx5e: Use the aligned max TX MPWQE size TX MPWQE size is limited to the cacheline-aligned maximum. Use the same value for the stop room and the capability check. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:36 -07:00
Maxim Mikityanskiy	e3c4c496dc	net/mlx5e: Fix a typo in mlx5e_xdp_mpwqe_is_full Fix a typo in the function name: mpqwe -> mpwqe (stands for multi-packet work queue element). Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:36 -07:00
Maxim Mikityanskiy	527918e9cc	net/mlx5e: Use mlx5e_stop_room_for_max_wqe where appropriate mlx5e_alloc_xdpsq calculates sq->stop_room internally, but there is already a function for that: mlx5e_stop_room_for_max_wqe. This commit makes use of this function. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:36 -07:00
Maxim Mikityanskiy	ed5c92ff0f	net/mlx5e: Let mlx5e_get_sw_max_sq_mpw_wqebbs accept mdev To shorten and simplify code, let mlx5e_get_sw_max_sq_mpw_wqebbs accept mdev and derive max SQ WQEBBs from it. Also rename the function to a more generic name mlx5e_get_max_sq_aligned_wqebbs, because the following patches will use it in non-MPWQE contexts. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:35 -07:00
Maxim Mikityanskiy	44f4fd03b5	net/mlx5e: Validate striding RQ before enabling XDP Currently, the driver can silently fall back to legacy RQ after enabling XDP, even if striding RQ was active before. It happens when PAGE_SIZE is bigger than the maximum supported stride size. This commit changes this behavior to more straightforward: if an operation (enabling XDP) doesn't support the current parameters (striding RQ mode), it fails. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:35 -07:00
Maxim Mikityanskiy	7e49abb1e3	net/mlx5e: Make mlx5e_verify_rx_mpwqe_strides static mlx5e_verify_rx_mpwqe_strides is only used in en/params.c, so it can be made static and removed from en/params.h. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:35 -07:00
Maxim Mikityanskiy	665f29de4c	net/mlx5e: Remove unused fields from datapath structs No need to keep max_sq_wqebbs in mlx5e_txqsq and mlx5e_xdpsq, as it's only used when allocating the queues. Removing an extra field reduces the struct size. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:34 -07:00
Maxim Mikityanskiy	f060ccc2af	net/mlx5e: Convert mlx5e_get_max_sq_wqebbs to u8 The return value of mlx5e_get_max_sq_wqebbs is clamped down to MLX5_SEND_WQE_MAX_WQEBBS = 16, which fits into u8. This commit changes the return type of this function to u8 for stricter type safety. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:34 -07:00
Maxim Mikityanskiy	40b72108f9	net/mlx5: Add the log_min_mkey_entity_size capability Add the capability that will allow the driver to determine the minimal MTT page size to be able to map the smallest possible pages in XSK. The older firmwares that don't have this capability default to 12 (i.e. 4096-byte pages). Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:36:33 -07:00
Jakub Kicinski	0d5bfebf74	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== updates from mlx5-next 2022-09-24 Updates from mlx5-next including[1]: 1) HW definitions and support for NPPS clock settings. 2) various cleanups 3) Enable hash mode by default for all NICs 4) page tracker and advanced virtualization HW definitions for vfio [1] https://lore.kernel.org/netdev/20220907233636.388475-1-saeed@kernel.org/ * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Remove from FPGA IFC file not-needed definitions net/mlx5: Remove unused structs net/mlx5: Remove unused functions net/mlx5: detect and enable bypass port select flow table net/mlx5: Lag, enable hash mode by default for all NICs net/mlx5: Lag, set active ports if support bypass port select flow table RDMA/mlx5: Don't set tx affinity when lag is in hash mode net/mlx5: add IFC bits for bypassing port select flow table net/mlx5: Add support for NPPS with real time mode net/mlx5: Expose NPPS related registers net/mlx5: Query ADV_VIRTUALIZATION capabilities net/mlx5: Introduce ifc bits for page tracker RDMA/mlx5: Move function mlx5_core_query_ib_ppcnt() to mlx5_ib ==================== Link: https://lore.kernel.org/all/20220927201906.234015-1-saeed@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:20:49 -07:00
Sean Anderson	d4ddeefa64	net: sunhme: Fix undersized zeroing of quattro->happy_meals Just use kzalloc instead. Fixes: `d6f1e89bdb` ("sunhme: Return an ERR_PTR from quattro_pci_find") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Sean Anderson <seanga2@gmail.com> Link: https://lore.kernel.org/r/20220928004157.279731-1-seanga2@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:18:42 -07:00
Shang XiaoJing	f45892f750	net: wwan: iosm: Use skb_put_data() instead of skb_put/memcpy pair Use skb_put_data() instead of skb_put() and memcpy(), which is clearer. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Reviewed-by: M Chetan Kumar <m.chetan.kumar@intel.com> Link: https://lore.kernel.org/r/20220927023254.30342-1-shangxiaojing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:18:03 -07:00
Jakub Kicinski	8278ddb161	Merge branch 'rework-resource-allocation-in-felix-dsa-driver' Vladimir Oltean says: ==================== Rework resource allocation in Felix DSA driver The Felix DSA driver controls NXP variations of Microchip switches. Colin Foster is trying to add support in this driver for "genuine" Microchip hardware, but some of the NXP-isms in this driver need to go away before that happens cleanly. https://patchwork.kernel.org/project/netdevbpf/cover/20220926002928.2744638-1-colin.foster@in-advantage.com/ The starting point was Colin's patch 08/14 "net: dsa: felix: update init_regmap to be string-based", and this continues to be the central theme here, but things are done differently. In short (full explanations are in patches), the goal is for MFD-based switches like Colin's SPI-controlled VSC7512 to be able to request a regmap that was created 100% externally (by drivers/mfd/ocelot-core.c) in a very simple way, that does not create dependencies on other modules. That is dev_get_regmap(), and as input it wants a string, for the resource name. So we rework the resource allocation in this driver to be based on string names provided by the specific instantiation (in Colin's case, ocelot_ext.c). Patch set was boot-tested on NXP LS1028A. ==================== Link: https://lore.kernel.org/r/20220927191521.1578084-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:15:30 -07:00
Vladimir Oltean	1109b97b61	net: dsa: felix: update regmap requests to be string-based Existing felix DSA drivers (vsc9959, vsc9953) are all switches that were integrated in NXP SoCs, which makes them a bit unusual compared to the usual Microchip branded Ocelot switches. To be precise, looking at Documentation/devicetree/bindings/net/mscc,vsc7514-switch.yaml, one can see 21 memory regions for the "switch" node, and these correspond to the "targets" of the switch IP, which are spread throughout the guts of that SoC's memory space. In NXP integrations, those targets still exist, but they were condensed within a single memory region, with no other peripheral in between them, so it made more sense for the driver to ioremap the entire memory space of the switch, and then find the targets within that memory space via some offsets hardcoded in the driver. The effect of this design decision is that now, the felix driver expects hardware instantiations to provide their own resource definitions, which is kind of odd when considering a typical device (those are retrieved from 'reg' properties in the device tree, using platform_get_resource() or similar). Allow other hardware instantiations that share the felix driver to not provide a hardcoded array of resources in the future. Instead, make the common denominator based on which regmaps are created be just the resource "names". Each instantiation comes with its own array of names that are mandatory for it, and with an optional array of resources. So we split the resources in 2 arrays, one is what's requested and the other is what's provided. There is one pool of provided resources, in felix->info->resources (of length felix->info->num_resources). There are 2 different ways of requesting a resource. One is by enum ocelot_target (this handles the global regmaps), and one is by int port (this handles the per-port ones). 
For the existing vsc9959 and vsc9953, it would be a bit stupid to request something that's not provided, given that the 2 arrays are both defined in the same place. The advantage is that we can now modify felix_request_regmap_by_name() to make felix->info->resources[] optional, and if absent, the implementation can call dev_get_regmap() and this is something that is compatible with MFD. Co-developed-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:15:02 -07:00
Yanteng Si	4f196cb64b	LoongArch: Clean up loongson3_smp_ops declaration Since loongson3_smp_ops is not used in LoongArch anymore, let's remove it for cleanup. Fixes: `f2ac457a61` ("LoongArch: Add CPU definition headers") Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2022-09-29 10:15:00 +08:00
Huacai Chen	06e76acec7	LoongArch: Fix and cleanup csr_era handling in do_ri() We don't emulate reserved instructions and just send a signal to the current process now. So we don't need to call compute_return_era() to add 4 (point to the next instruction) to csr_era in pt_regs. RA/ERA's backup/restore is cleaned up as well. Signed-off-by: Jun Yi <yijun@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2022-09-29 10:15:00 +08:00
Huacai Chen	2938431e93	LoongArch: Align the address of kernel_entry to 4KB Align the address of kernel_entry to 4KB, to avoid an early TLB miss exception in case the entry code crosses a page boundary. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2022-09-29 10:15:00 +08:00
Vladimir Oltean	044d447a80	net: dsa: felix: use DEFINE_RES_MEM_NAMED for resources Use less verbose resource definitions in vsc9959 and vsc9953. This also sets IORESOURCE_MEM in the constant array of resources, so we don't have to do this from felix_init_structs() - in fact, in the future, we may even support IORESOURCE_REG resources. Note that this macro takes start and length as arguments, and we had start and end before. So transform end into length. While at it, sort the resources according to their offset. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:14:56 -07:00
Vladimir Oltean	8f66c64bfc	net: dsa: felix: remove felix_info :: init_regmap It turns out that the idea of having a customizable implementation of a regmap creation from a resource is not exactly useful. The idea was for the new MFD-based VSC7512 driver to use something that creates a SPI regmap from a resource. But there are problems in actually getting those resources (it involves getting them from MFD). To avoid all that, we'll be getting resources by name, so this custom init_regmap() method won't be needed. Remove it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:14:56 -07:00
Vladimir Oltean	1382ba68a0	net: dsa: felix: remove felix_info :: imdio_base This address is only relevant for the vsc9959, which is a PCIe device that holds its switch registers in a different PCIe BAR compared to the registers for the internal MDIO controller. Hide this aspect from the common felix driver and move the pci_resource_start() call to the only place that needs it, which is in vsc9959_mdio_bus_alloc(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:14:55 -07:00
Vladimir Oltean	5fc080de89	net: dsa: felix: remove felix_info :: imdio_res The imdio_res is used only by vsc9959, which references its own vsc9959_imdio_res through the common felix_info->imdio_res pointer. Since the common code doesn't care about this resource (and it can't be part of the common array of resources, either, because it belongs in a different PCI BAR), just reference it directly. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:14:55 -07:00
Jakub Kicinski	3b04cba7ad	Merge branch 'mptcp-properly-clean-up-unaccepted-subflows' Mat Martineau says: ==================== mptcp: Properly clean up unaccepted subflows Patch 1 factors out part of the mptcp_close() function for use by a caller that already owns the socket lock. This is a prerequisite for patch 2. Patch 2 is the fix that fully cleans up the unaccepted subflow sockets. ==================== Link: https://lore.kernel.org/r/20220927193158.195729-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:05:40 -07:00
Menglong Dong	30e51b923e	mptcp: fix unreleased socket in accept queue The mptcp socket and its subflow sockets in the accept queue can't be released after the process exits. When an mptcp socket in the listening state is released, the corresponding tcp socket is released too. Meanwhile, the tcp sockets in the unaccept queue are released too. However, only the initial subflow is in the unaccept queue; the joined subflow is not, so the joined subflow is never released, and therefore the corresponding unaccepted mptcp socket is not released either. This can be reproduced easily with the following steps: 1. Create 2 namespaces and a veth pair: $ ip netns add mptcp-client $ ip netns add mptcp-server $ sysctl -w net.ipv4.conf.all.rp_filter=0 $ ip netns exec mptcp-client sysctl -w net.mptcp.enabled=1 $ ip netns exec mptcp-server sysctl -w net.mptcp.enabled=1 $ ip link add red-client netns mptcp-client type veth peer red-server netns mptcp-server $ ip -n mptcp-server address add 10.0.0.1/24 dev red-server $ ip -n mptcp-server address add 192.168.0.1/24 dev red-server $ ip -n mptcp-client address add 10.0.0.2/24 dev red-client $ ip -n mptcp-client address add 192.168.0.2/24 dev red-client $ ip -n mptcp-server link set red-server up $ ip -n mptcp-client link set red-client up 2. Configure the endpoints and limits for client and server: $ ip -n mptcp-server mptcp endpoint flush $ ip -n mptcp-server mptcp limits set subflow 2 add_addr_accepted 2 $ ip -n mptcp-client mptcp endpoint flush $ ip -n mptcp-client mptcp limits set subflow 2 add_addr_accepted 2 $ ip -n mptcp-client mptcp endpoint add 192.168.0.2 dev red-client id 1 subflow 3. Listen and accept on a port, such as 9999. The nc command used here is modified to use the mptcp protocol by default. $ ip netns exec mptcp-server nc -l -k -p 9999 4. Open another two terminals and use each of them to connect to the server with the following command: $ ip netns exec mptcp-client nc 10.0.0.1 9999 Input something after connecting to trigger the connection of the second subflow, so that there are two established mptcp connections, with the second one still unaccepted. 5. Exit all the nc commands and check the tcp sockets in the server namespace. You will find one tcp socket stuck in the CLOSE_WAIT state that is never released. Fix this by closing all of the unaccepted mptcp sockets in mptcp_subflow_queue_clean() with __mptcp_close(). Now, we can ensure that all unaccepted mptcp sockets will be cleaned by __mptcp_close() before they are released, so mptcp_sock_destruct(), which is used to clean the unaccepted mptcp socket, is not needed anymore. The mptcp selftests were run for this commit, with no new failures. Fixes: `f296234c98` ("mptcp: Add handling of incoming MP_JOIN requests") Fixes: `6aeed90450` ("mptcp: fix race on unaccepted mptcp sockets") Cc: stable@vger.kernel.org Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Mengen Sun <mengensun@tencent.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-28 19:05:21 -07:00
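
Commit 044d447a80 above notes that DEFINE_RES_MEM_NAMED takes a start and a length, whereas the old resource definitions carried a start and an end, so each end had to be transformed into a length. The following is a minimal userspace sketch of that conversion, not the driver's code: `struct example_res`, `EXAMPLE_RES_MEM_NAMED`, `example_res_len()`, the resource name and the addresses are all hypothetical stand-ins, and the sketch assumes the end address is inclusive.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for a memory resource description. */
struct example_res {
	uint64_t start;   /* first address of the region */
	uint64_t end;     /* last address of the region (assumed inclusive) */
	const char *name; /* name used to look the region up */
};

/*
 * Hypothetical macro that takes (start, size, name), mirroring the argument
 * order the commit message describes for DEFINE_RES_MEM_NAMED.
 */
#define EXAMPLE_RES_MEM_NAMED(_start, _size, _name) \
	{ .start = (_start), .end = (_start) + (_size) - 1, .name = (_name) }

/* Convert an inclusive [start, end] pair into a length. */
static inline uint64_t example_res_len(uint64_t start, uint64_t end)
{
	return end - start + 1;
}

/* Old style: explicit start and end (made-up addresses). */
static const struct example_res old_style = {
	.start = 0x0010000, .end = 0x001ffff, .name = "example",
};

/* New style: the same region expressed as start and length. */
static const struct example_res new_style =
	EXAMPLE_RES_MEM_NAMED(0x0010000, 0x0010000, "example");
```

Under these assumptions both definitions describe the same region: `example_res_len(old_style.start, old_style.end)` yields the length passed to the macro, and `new_style.end` equals `old_style.end`.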
