OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Sasha Neftin	9c99482e45	igc: Remove unused local receiver mask Local receiver mask SR_1000T_LOCAL_RX_STATUS not in use in i225 device and could be removed Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-03 15:20:32 -08:00
Sasha Neftin	ed443cdf67	igc: Prefer strscpy over strlcpy Use the strscpy method instead of strlcpy method. See: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-03 15:20:32 -08:00
Sasha Neftin	94f794d15a	igc: Expose the gPHY firmware version Extend reporting of NVM image version to include the gPHY (i225 PHY) firmware version. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-03 15:20:32 -08:00
Sasha Neftin	01bb6129c6	igc: Expose the NVM version Expose the NVM map version via drvinfo in ethtool NVM image version is reported as firmware version for i225 device Minor typo fix - remove space Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-03 15:20:32 -08:00
Sasha Neftin	e65299444e	igc: Add Host Good Packets Transmitted Count This counter counts the number of good (non-erred) packets sent by the host. A good transmit packet is one that is 64 or more bytes in length (from <Destination Address> through <CRC>, inclusive). Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-03 15:20:32 -08:00
Sasha Neftin	e96c5b46bd	igc: Remove MULR mask define Multiple Tx Data Read Requests is hardware pipeline feature and is not controlled by software Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-03 15:20:32 -08:00
Sasha Neftin	4d59f52ba7	igc: Remove igc_set_fw_version comment i225 device not supported and do not plan to support configuration of fw version string for ethtool Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-03 15:20:32 -08:00
Sasha Neftin	63532ced07	igc: Clean up nvm_operations structure valid_led_default function pointer not in use and can be removed from nvm_operations structure. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-03 15:20:32 -08:00
Geert Uytterhoeven	32d1bbb1d6	net: fec: Silence M5272 build warnings If CONFIG_M5272=y: drivers/net/ethernet/freescale/fec_main.c: In function ‘fec_restart’: drivers/net/ethernet/freescale/fec_main.c:948:6: warning: unused variable ‘val’ [-Wunused-variable] 948 | u32 val; | ^~~ drivers/net/ethernet/freescale/fec_main.c: In function ‘fec_get_mac’: drivers/net/ethernet/freescale/fec_main.c:1667:28: warning: unused variable ‘pdata’ [-Wunused-variable] 1667 | struct fec_platform_data *pdata = dev_get_platdata(&fep->pdev->dev); | ^~~~~ Fix this by moving the variable declarations inside the existing #ifdef blocks. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210202130650.865023-1-geert@linux-m68k.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 19:02:26 -08:00
Eric Dumazet	fca23f37f3	inet: do not export inet_gro_{receive|complete} inet_gro_receive() and inet_gro_complete() are part of GRO engine which can not be modular. Similarly, inet_gso_segment() does not need to be exported, being part of GSO stack. In other words, net/ipv6/ip6_offload.o is part of vmlinux, regardless of CONFIG_IPV6. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20210202154145.1568451-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:41:10 -08:00
Jakub Kicinski	0256317a61	Merge tag 'mac80211-next-for-net-next-2021-02-02' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== This time, only RTNL locking reduction fallout. 
- cfg80211_dev_rename() requires RTNL - cfg80211_change_iface() and cfg80211_set_encryption() require wiphy mutex (was missing in wireless extensions) - cfg80211_destroy_ifaces() requires wiphy mutex - netdev registration can fail due to notifiers, and then notifiers are "unrolled", need to handle this properly * tag 'mac80211-next-for-net-next-2021-02-02' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next: cfg80211: fix netdev registration deadlock cfg80211: call cfg80211_destroy_ifaces() with wiphy lock held wext: call cfg80211_set_encryption() with wiphy lock held wext: call cfg80211_change_iface() with wiphy lock held nl80211: call cfg80211_dev_rename() under RTNL ==================== Link: https://lore.kernel.org/r/20210202144106.38207-1-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:40:42 -08:00
Jakub Kicinski	390d9b565e	Merge tag 'mlx5-updates-2021-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2021-02-01 mlx5 netdev updates: 1) Trivial refactoring ahead of the upcoming uplink representor series. 2) Increased RSS table size to 256, for better results 3) Misc. 
Cleanup and very trivial improvements * tag 'mlx5-updates-2021-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: DR, Avoid unnecessary csum recalculation on supporting devices net/mlx5e: CT: remove useless conversion to PTR_ERR then ERR_PTR net/mlx5e: accel, remove redundant space net/mlx5e: kTLS, Improve TLS RX workqueue scope net/mlx5e: remove h from printk format specifier net/mlx5e: Increase indirection RQ table size to 256 net/mlx5e: Enable napi in channel's activation stage net/mlx5e: Move representor neigh init into profile enable net/mlx5e: Avoid false lock depenency warning on tc_ht net/mlx5e: Move set vxlan nic info to profile init net/mlx5e: Move netif_carrier_off() out of mlx5e_priv_init() net/mlx5e: Refactor mlx5e_netdev_init/cleanup to mlx5e_priv_init/cleanup net/mxl5e: Add change profile method net/mlx5e: Separate between netdev objects and mlx5e profiles initialization ==================== Link: https://lore.kernel.org/r/20210202065457.613312-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:38:54 -08:00
Jakub Kicinski	a1a809c489	Merge branch 'mptcp-add_addr-enhancements' Mat Martineau says: ==================== mptcp: ADD_ADDR enhancements This patch series from the MPTCP tree contains enhancements and associated tests for the ADD_ADDR ("add address") MPTCP option. This option allows already-connected MPTCP peers to share additional IP addresses with each other, which can then be used to create additional subflows within those MPTCP connections. Patches 1 & 2 remove duplicated data in the per-connection path manager structure. Patches 3-6 initiate additional subflows when an address is added using the netlink path manager interface and improve ADD_ADDR signaling reliability, subject to configured limits. Self tests are also updated. Patches 7-15 add new support for optional port numbers in ADD_ADDR. This includes creating an additional in-kernel TCP listening socket for the requested port number, validating the port number when processing incoming subflow connections, including the port number in netlink interfaces, and adding some new MIBs. New self test cases are added for subflows connecting with alternate port numbers. ==================== Link: https://lore.kernel.org/r/20210201230920.66027-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:22 -08:00
Geliang Tang	8a127bf68a	selftests: mptcp: add testcases for ADD_ADDR with port This patch adds testcases for ADD_ADDR with port and the related MIB counters check in chk_add_nr. The output looks like this: 24 signal address with port syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] - pt [ ok ] syn[ ok ] - synack[ ok ] - ack[ ok ] syn[ ok ] - ack [ ok ] 25 subflow and signal with port syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] - pt [ ok ] syn[ ok ] - synack[ ok ] - ack[ ok ] syn[ ok ] - ack [ ok ] 26 remove single address with port syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] - pt [ ok ] syn[ ok ] - synack[ ok ] - ack[ ok ] syn[ ok ] - ack [ ok ] rm [ ok ] - sf [ ok ] Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:20 -08:00
Geliang Tang	2fbdd9eaf1	mptcp: add the mibs for ADD_ADDR with port This patch adds the mibs for ADD_ADDR with port: MPTCP_MIB_PORTADD for received ADD_ADDR suboption with a port number. MPTCP_MIB_PORTSYNRX, MPTCP_MIB_PORTSYNACKRX, MPTCP_MIB_PORTACKRX, for received MP_JOIN's SYN or SYN/ACK or ACK with a port number which is different from the msk's port number. MPTCP_MIB_MISMATCHPORTSYNRX and MPTCP_MIB_MISMATCHPORTACKRX, for received SYN or ACK MP_JOIN with a mismatched port-number. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:20 -08:00
Geliang Tang	d4a7726a79	selftests: mptcp: add port argument for pm_nl_ctl This patch adds a new argument for pm_nl_ctl tool. We can use it like this: # pm_nl_ctl add 10.0.2.1 flags signal port 10100 # pm_nl_ctl dump id 1 flags signal 10.0.2.1 10100 Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:19 -08:00
Geliang Tang	a77e9179c7	mptcp: deal with MPTCP_PM_ADDR_ATTR_PORT in PM netlink This patch adds MPTCP_PM_ADDR_ATTR_PORT filling and parsing in PM netlink. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:19 -08:00
Geliang Tang	60b57bf76c	mptcp: enable use_port when invoke addresses_equal When dealing with the addresses list local_addr_list or anno_list, we should enable the function addresses_equal's parameter use_port. And enable it in address_zero too. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:19 -08:00
Geliang Tang	5bc56388c7	mptcp: add port number check for MP_JOIN This patch adds two new helpers, subflow_use_different_sport and subflow_use_different_dport, to check whether the subflow's source or destination port number is different from the msk's port number. When receiving the MP_JOIN's SYN/SYNACK/ACK, we do these port number checks and print out the different port numbers. And furthermore, when receiving the MP_JOIN's SYN/ACK, we also use a new helper mptcp_pm_sport_in_anno_list to check whether this port number is announced. If it isn't, we need to abort this connection. This patch also populates the local address's port field in local_address. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:19 -08:00
Geliang Tang	ec20e14396	mptcp: add a new helper subflow_req_create_thmac This patch adds a new helper named subflow_req_create_thmac, which is extracted from subflow_token_join_request. It initializes subflow_req's local_nonce and thmac fields, those are the more expensive to populate. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:19 -08:00
Geliang Tang	b5e2e42fe5	mptcp: drop unused skb in subflow_token_join_request This patch drops the unused parameter skb in subflow_token_join_request. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:19 -08:00
Geliang Tang	1729cf186d	mptcp: create the listening socket for new port This patch creates a listening socket when an address with a port-number is added by PM netlink. Then binds the new port to the socket, and listens for new connections. When the address is removed or the addresses are flushed by PM netlink, release the listening socket. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:19 -08:00
Geliang Tang	6208fd822a	selftests: mptcp: add testcases for newly added addresses This patch adds testcases to create subflows or signal addresses for the newly added IPv4 or IPv6 addresses. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:18 -08:00
Geliang Tang	2e8cbf45cf	selftests: mptcp: use minus values for removing address numbers This patch changes the removing addresses numbers to minus values, left the plus values for the adding addresses numbers. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:18 -08:00
Geliang Tang	b5a7acd3bd	mptcp: send ack for every add_addr This patch changes the sending ACK conditions for the ADD_ADDR, send an ACK packet for any ADD_ADDR, not just when ipv6 addresses or port numbers are included. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/139 Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:18 -08:00
Geliang Tang	875b76718f	mptcp: create subflow or signal addr for newly added address Currently, when a new MPTCP endpoint is added, the existing MPTCP sockets are not affected. This patch implements a new function mptcp_nl_add_subflow_or_signal_addr, invoked when an address is added from PM netlink. This function traverses the MPTCP sockets list and invokes mptcp_pm_create_subflow_or_signal_addr to try to create a subflow or signal an address for the newly added address, if local constraint allows that. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/19 Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:18 -08:00
Geliang Tang	a914e58668	mptcp: drop _max fields in mptcp_pm_data This patch drops the per-msk values add_addr_signal_max, add_addr_accept_max, local_addr_max and subflows_max fields in struct mptcp_pm_data, uses the pernet _max values instead. And adds four new helpers to get the pernet *_max values separately. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:18 -08:00
Geliang Tang	72603d207d	mptcp: use WRITE_ONCE for the pernet *_max This patch uses WRITE_ONCE() for all the pernet add_addr_signal_max, add_addr_accept_max, local_addr_max and subflows_max fields in struct pm_nl_pernet to avoid concurrency issues. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:37:18 -08:00
Kai-Heng Feng	e6d6ca6e12	r8169: Add support for another RTL8168FP According to the vendor driver, the new chip with XID 0x54b is essentially the same as the one with XID 0x54a, but it doesn't need the firmware. So add support accordingly. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/20210202044813.1304266-1-kai.heng.feng@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 18:02:55 -08:00
Jakub Kicinski	389cb1ecc8	Merge branch 'add-notifications-when-route-hardware-flags-change' Ido Schimmel says: ==================== Add notifications when route hardware flags change Routes installed to the kernel can be programmed to capable devices, in which case they are marked with one of two flags. RTM_F_OFFLOAD for routes that offload traffic from the kernel and RTM_F_TRAP for routes that trap packets to the kernel for processing (e.g., host routes). These flags are of interest to routing daemons since they would like to delay advertisement of routes until they are installed in hardware. This allows them to avoid packet loss or misrouted packets. Currently, routing daemons do not receive any notifications when these flags are changed, requiring them to poll the kernel tables for changes which is inefficient. This series addresses the issue by having the kernel emit RTM_NEWROUTE notifications whenever these flags change. The behavior is controlled by two sysctls (net.ipv4.fib_notify_on_flag_change and net.ipv6.fib_notify_on_flag_change) that default to 0 (no notifications). Note that even if route installation in hardware is improved to be more synchronous, these notifications are still of interest. For example, a multipath route can change from RTM_F_OFFLOAD to RTM_F_TRAP if its neighbours become invalid. A routing daemon can choose to withdraw / replace the route in that case. In addition, the deletion of a route from the kernel can prompt the installation of an identical route (already in kernel, with a higher metric) to hardware. For testing purposes, netdevsim is aligned to simulate a "real" driver that programs routes to hardware. 
Series overview: Patches #1-#2 align netdevsim to perform route programming in a non-atomic context Patches #3-#5 add sysctl to control IPv4 notifications Patches #6-#8 add sysctl to control IPv6 notifications Patch #9 extends existing fib tests to set sysctls before running tests Patch #10 adds test for fib notifications over netdevsim ==================== Link: https://lore.kernel.org/r/20210201194757.3463461-1-idosch@idosch.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:46:02 -08:00
Amit Cohen	19d36d2971	selftests: netdevsim: Add fib_notifications test Add test to check fib notifications behavior. The test checks route addition, route deletion and route replacement for both IPv4 and IPv6. When fib_notify_on_flag_change=0, expect single notification for route addition/deletion/replacement. When fib_notify_on_flag_change=1, expect: - two notification for route addition/replacement, first without RTM_F_TRAP and second with RTM_F_TRAP. - single notification for route deletion. $ ./fib_notifications.sh TEST: IPv4 route addition [ OK ] TEST: IPv4 route deletion [ OK ] TEST: IPv4 route replacement [ OK ] TEST: IPv6 route addition [ OK ] TEST: IPv6 route deletion [ OK ] TEST: IPv6 route replacement [ OK ] Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:59 -08:00
Amit Cohen	d1a7a48928	selftests: Extend fib tests to run with and without flags notifications Run the test cases with both `fib_notify_on_flag_change` sysctls set to '1', and then with both sysctls set to '0' to verify there are no regressions in the test when notifications are added. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:59 -08:00
Amit Cohen	907eea4868	net: ipv6: Emit notification when fib hardware flags are changed After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware. It is also possible for a route already installed in hardware to change its action and therefore its flags. For example, a host route that is trapping packets can be "promoted" to perform decapsulation following the installation of an IPinIP/VXLAN tunnel. Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed. The aim is to provide an indication to user-space (e.g., routing daemons) about the state of the route in hardware. Introduce a sysctl that controls this behavior. Keep the default value at 0 (i.e., do not emit notifications) for several reasons: - Multiple RTM_NEWROUTE notification per-route might confuse existing routing daemons. - Convergence reasons in routing daemons. - The extra notifications will negatively impact the insertion rate. - Not all users are interested in these notifications. Move fib6_info_hw_flags_set() to C file because it is no longer a short function. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:59 -08:00
Amit Cohen	efc42879ec	net: Do not call fib6_info_hw_flags_set() when IPv6 is disabled With the next patch mlxsw and netdevsim will fail in compilation if CONFIG_IPV6 is disabled. Do not call fib6_info_hw_flags_set() when IPv6 is disabled. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:59 -08:00
Amit Cohen	fbaca8f895	net: Pass 'net' struct as first argument to fib6_info_hw_flags_set() The next patch will emit notification when hardware flags are changed, in case that fib_notify_on_flag_change sysctl is set to 1. To know sysctl values, net struct is needed. This change is consistent with the IPv4 version, which gets 'net' struct as its first argument. Currently, the only callers of this function are mlxsw and netdevsim. Patch the callers to pass net. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:59 -08:00
Amit Cohen	680aea08e7	net: ipv4: Emit notification when fib hardware flags are changed After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware. It is also possible for a route already installed in hardware to change its action and therefore its flags. For example, a host route that is trapping packets can be "promoted" to perform decapsulation following the installation of an IPinIP/VXLAN tunnel. Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed. The aim is to provide an indication to user-space (e.g., routing daemons) about the state of the route in hardware. Introduce a sysctl that controls this behavior. Keep the default value at 0 (i.e., do not emit notifications) for several reasons: - Multiple RTM_NEWROUTE notification per-route might confuse existing routing daemons. - Convergence reasons in routing daemons. - The extra notifications will negatively impact the insertion rate. - Not all users are interested in these notifications. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Acked-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:59 -08:00
Amit Cohen	1e7bdec6bb	net: ipv4: Publish fib_nlmsg_size() Publish fib_nlmsg_size() to allow it to be used later on from fib_alias_hw_flags_set(). Remove the inline keyword since it shouldn't be used inside C files. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:58 -08:00
Amit Cohen	085547891d	net: ipv4: Pass fib_rt_info as const to fib_dump_info() fib_dump_info() does not change 'fri', so pass it as 'const'. It will later allow us to invoke fib_dump_info() from fib_alias_hw_flags_set(). Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:58 -08:00
Amit Cohen	0ae3eb7b46	netdevsim: fib: Perform the route programming in a non-atomic context Currently, netdevsim implements dummy FIB offload and marks notified routes with RTM_F_TRAP flag. netdevsim does not defer route notifications to a work queue because it does not need to program any hardware. Given that netdevsim's purpose is to both give an example implementation and allow developers to test their code, align netdevsim to a "real" hardware device driver like mlxsw and have it also perform the route "programming" in a non-atomic context. It will be used to test route flags notifications which will be added in the next patches. The following changes are needed when route handling is performed in WQ: - Handle the accounting in the main context, to be able to return an error for adding route when all the routes are used. For FIB_EVENT_ENTRY_REPLACE increase the counter before scheduling the delayed work, and in case that this event replaces an existing route, decrease the counter as part of the delayed work. - For IPv6, cannot use fen6_info->rt->fib6_siblings list because it might be changed during handling the delayed work. Save an array with the nexthops as part of fib6_event struct, and take a reference for each nexthop to prevent them from being freed while event is queued. - Change GFP_ATOMIC allocations to GFP_KERNEL. - Use single work item that is handling a list of ordered routes. Handling routes must be processed in the order they were submitted to avoid logical errors that could lead to unexpected failures. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:58 -08:00
Amit Cohen	9e635a21ca	netdevsim: fib: Convert the current occupancy to an atomic variable When route is added/deleted, the appropriate counter is increased/decreased to maintain number of routes. User can limit the number of routes and then according to the appropriate counter, adding more routes than the limitation is forbidden. Currently, there is one lock which protects hashtable, list and accounting. Handling the counters will be performed from both atomic context and non-atomic context, while the hashtable and the list will be used only from non-atomic context and therefore will be protected by a separate lock. Protect accounting by using an atomic variable, so lock is not needed. v2: * Use atomic64_sub() in nsim_nexthop_account()'s error path Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:58 -08:00
Jakub Kicinski	64b268e12f	Merge branch 'net-ipa-don-t-disable-napi-in-suspend' Alex Elder says: ==================== net: ipa: don't disable NAPI in suspend This is version 2 of a series that reworks the order in which things happen during channel stop and suspend (and start and resume), in order to address a hang that has been observed during suspend. The introductory message on the first version of the series gave some history which is omitted here. The end result of this series is that we only enable NAPI and the I/O completion interrupt on a channel when we start the channel for the first time. And we only disable them when stopping the channel "for good." In other words, NAPI and the completion interrupt remain enabled while a channel is stopped for suspend. One comment on version 1 of the series suggested not returning early on success in a function, instead having both success and error paths return from the same point at the end of the function block. This has been addressed in this version. In addition, this version consolidates things a little bit, but the net result of the series is exactly the same as version 1 (with the exception of the return fix mentioned above). First, patch 6 in the first version was a small step to make patch 7 easier to understand. The two have been combined now. Second, previous version moved (and for suspend/resume, eliminated) I/O completion interrupt and NAPI disable/enable control in separate steps (patches). Now both are moved around together in patch 5 and 6, which eliminates the need for the final (NAPI-only) patch. I won't repeat the patch summaries provided in v1: https://lore.kernel.org/netdev/20210129202019.2099259-1-elder@linaro.org/ Many thanks to Willem de Bruijn for his thoughtful input. ==================== Link: https://lore.kernel.org/r/20210201172850.2221624-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:42:38 -08:00
Alex Elder	e63169208b	net: ipa: expand last transaction check Transactions to send data for a network device can be allocated at any time up until the point the TX queue is stopped. It is possible for ipa_start_xmit() to be called in one context just before the transmit queue is stopped in another. Update gsi_channel_trans_last() so that for TX channels the allocated and pending transaction lists are checked--in addition to the completed and polled lists--to determine the "last" transaction. This means any transaction that has been allocated before the TX queue is stopped will be allowed to complete before we conclude the channel is quiesced. Rework the function a bit to use a list pointer and gotos. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:42:36 -08:00
Alex Elder	a65c0288b3	net: ipa: don't disable interrupt on suspend No completion interrupts will occur while an endpoint is suspended, nor when a channel has been stopped for suspend. So there's no need to disable the interrupt during suspend and re-enable it when resuming. Without any interrupts occurring, there is no need to disable/re-enable NAPI for channel suspend/resume either. We'll only enable NAPI and the interrupt when we first start the channel, and disable it again only when it's "really" stopped. To accomplish this, move the enable/disable calls out of __gsi_channel_start() and __gsi_channel_stop(), and into gsi_channel_start() and gsi_channel_stop() instead. Add a call to napi_synchronize() to gsi_channel_suspend(), to ensure NAPI polling is done before moving on. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:42:36 -08:00
Alex Elder	4fef691c9b	net: ipa: disable interrupt and NAPI after channel stop Disable both the I/O completion interrupt and NAPI polling on a channel after we successfully stop it rather than before. This ensures a completion occurring just before the channel is stopped gets processed. Enable NAPI polling and the interrupt before starting a channel rather than after, to be symmetric. A stopped channel won't generate any completion interrupts anyway. Enable NAPI before the interrupt and disable it afterward. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:42:36 -08:00
Alex Elder	bd1ea1e464	net: ipa: kill gsi_channel_freeze() and gsi_channel_thaw() Open-code gsi_channel_freeze() and gsi_channel_thaw() in all callers and get rid of these two functions. This is part of reworking the sequence of things done during channel suspend/resume and start/stop. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:42:35 -08:00
Alex Elder	893b838e73	net: ipa: introduce __gsi_channel_start() Create a new function that does most of the work of starting a channel. What's different is that it takes a flag indicating whether the channel should really be started or not. Create another new function __gsi_channel_stop() that behaves similarly. IPA v3.5.1 implements suspend using a special SUSPEND endpoint setting. If the endpoint is suspended when an I/O completes on the underlying GSI channel, a SUSPEND interrupt is generated. Newer versions of IPA do not implement the SUSPEND endpoint mode. Instead, endpoint suspend is implemented by simply stopping the underlying GSI channel. In this case, a completing I/O on a stopped channel causes the SUSPEND interrupt condition. These new functions put all activity related to starting or stopping a channel (including "thawing/freezing" the channel) in one place, whether or not the channel is actually started or stopped. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:42:35 -08:00
Alex Elder	697e834e14	net: ipa: introduce gsi_channel_stop_retry() Create a new helper function that encapsulates issuing a set of channel stop commands, retrying if appropriate, with a short delay between attempts. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:42:35 -08:00
Alex Elder	6b00a76a1d	net: ipa: don't thaw channel if error starting If an error occurs starting a channel, don't "thaw" it. We should assume the channel remains in a non-started state. Update the comment in gsi_channel_stop(); calls to this function are no longer retried. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:42:35 -08:00
Marco Elver	097b9146c0	net: fix up truesize of cloned skb in skb_prepare_for_shift() Avoid the assumption that ksize(kmalloc(S)) == ksize(kmalloc(S)): when cloning an skb, save and restore truesize after pskb_expand_head(). This can occur if the allocator decides to service an allocation of the same size differently (e.g. use a different size class, or pass the allocation on to KFENCE). Because truesize is used for bookkeeping (such as sk_wmem_queued), a modified truesize of a cloned skb may result in corrupt bookkeeping and relevant warnings (such as in sk_stream_kill_queues()). Link: https://lkml.kernel.org/r/X9JR/J6dMMOy1obu@elver.google.com Reported-by: syzbot+7b99aafdcc2eedea6178@syzkaller.appspotmail.com Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20210201160420.2826895-1-elver@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:36:12 -08:00
Davide Caratti	ec99a470c7	mptcp: fix length of MP_PRIO suboption With version 0 of the protocol it was legal to encode the 'Subflow Id' in the MP_PRIO suboption, to specify which subflow would change its 'Backup' flag. This has been removed from v1 specification: thus, according to RFC 8684 §3.3.8, the resulting 'Length' for MP_PRIO changed from 4 to 3 bytes. Current Linux generates / parses MP_PRIO according to the old spec, using 'Length' equal to 4, and hardcoding 1 as 'Subflow Id'; RFC compliance can improve if we change 'Length' in order to become 3, leaving a 'Nop' after the MP_PRIO suboption. In this way the kernel will emit and accept only MP_PRIO suboptions that are compliant to version 1 of the MPTCP protocol. unpatched 5.11-rc kernel: [root@bottarga ~]# tcpdump -tnnr unpatched.pcap | grep prio reading from file unpatched.pcap, link-type LINUX_SLL (Linux cooked v1) dropped privs to tcpdump IP 10.0.3.2.48433 > 10.0.1.1.10006: Flags [.], ack 1, win 502, options [nop,nop,TS val 4032325513 ecr 1876514270,mptcp prio non-backup id 1,mptcp dss ack 14084896651682217737], length 0 patched 5.11-rc kernel: [root@bottarga ~]# tcpdump -tnnr patched.pcap | grep prio reading from file patched.pcap, link-type LINUX_SLL (Linux cooked v1) dropped privs to tcpdump IP 10.0.3.2.49735 > 10.0.1.1.10006: Flags [.], ack 1, win 502, options [nop,nop,TS val 1276737699 ecr 2686399734,mptcp prio non-backup,nop,mptcp dss ack 18433038869082491686], length 0 Changes since v2: - when accounting for option space, don't increment 'TCPOLEN_MPTCP_PRIO' and use 'TCPOLEN_MPTCP_PRIO_ALIGN' instead, thanks to Matthieu Baerts. Changes since v1: - refactor patch to avoid using 'TCPOLEN_MPTCP_PRIO' with its old value, thanks to Geliang Tang.
Fixes: `067065422f` ("mptcp: add the outgoing MP_PRIO support") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Matteo Croce <mcroce@linux.microsoft.com> Link: https://lore.kernel.org/r/846cdd41e6ad6ec88ef23fee1552ab39c2f5a3d1.1612184361.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:34:02 -08:00
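
The MP_PRIO change in ec99a470c7 can be illustrated with a small user-space sketch. It is not the kernel's code: the helper names encode_mp_prio_v1() and parse_mp_prio_v1() are hypothetical, and the byte layout (kind 30, subtype 0x5 in the upper nibble, 'Backup' flag in the lowest bit) is taken from RFC 8684 §3.3.8. The two TCPOLEN_* names are borrowed from the commit message; their values here follow the lengths it describes (3 on the wire, 4 once the trailing 'Nop' is counted).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Option kinds and subtype as assigned by the TCP and MPTCP specifications. */
#define TCPOPT_NOP               1    /* TCP no-operation option */
#define TCPOPT_MPTCP             30   /* MPTCP option kind */
#define MPTCP_SUBTYPE_MP_PRIO    0x5  /* MP_PRIO suboption subtype */

/* Names taken from the commit message; values follow its description. */
#define TCPOLEN_MPTCP_PRIO       3    /* on-wire 'Length' in MPTCP v1 */
#define TCPOLEN_MPTCP_PRIO_ALIGN 4    /* option space used, NOP included */

/* Hypothetical helper: write a v1 MP_PRIO suboption followed by a NOP.
 * Returns the number of bytes of option space consumed. */
static size_t encode_mp_prio_v1(uint8_t *out, bool backup)
{
    assert(out != NULL);
    out[0] = TCPOPT_MPTCP;
    out[1] = TCPOLEN_MPTCP_PRIO;
    out[2] = (uint8_t)((MPTCP_SUBTYPE_MP_PRIO << 4) | (backup ? 1 : 0));
    out[3] = TCPOPT_NOP;              /* pad to a 4-byte boundary */
    return TCPOLEN_MPTCP_PRIO_ALIGN;
}

/* Hypothetical helper: accept only the v1 form (Length == 3).
 * A v0-style option with Length == 4 and a 'Subflow Id' is rejected. */
static bool parse_mp_prio_v1(const uint8_t *opt, size_t len, bool *backup)
{
    if (opt == NULL || backup == NULL || len < TCPOLEN_MPTCP_PRIO)
        return false;
    if (opt[0] != TCPOPT_MPTCP || opt[1] != TCPOLEN_MPTCP_PRIO)
        return false;
    if ((opt[2] >> 4) != MPTCP_SUBTYPE_MP_PRIO)
        return false;
    *backup = (opt[2] & 0x1) != 0;
    return true;
}
```

Under these assumptions, encoding with the 'Backup' flag set yields the bytes 30, 3, 0x51, 1, which matches the "mptcp prio ...,nop" ordering in the patched tcpdump output quoted in the commit message.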



984641 Commits
