Prevent "classic" and light skeleton generation rules from stomping on
each other's toes due to the use of the same <obj>.linked{1,2,3}.o
naming pattern. There is no coordination and synchronizataion between
.skel.h and .lskel.h rules, so they can easily overwrite each other's
intermediate object files, leading to errors like:
/bin/sh: line 1: 170928 Bus error (core dumped)
/data/users/andriin/linux/tools/testing/selftests/bpf/tools/sbin/bpftool gen skeleton
/data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.linked3.o
name test_ksyms_weak
> /data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.skel.h
make: *** [Makefile:507: /data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.skel.h] Error 135
make: *** Deleting file '/data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.skel.h'
Fix this by using a different suffix for the light skeleton rule.
Fixes: c48e51c8b0 ("bpf: selftests: Add selftests for module kfunc support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-2-andrii@kernel.org
Andrii Nakryiko says:
====================
Add a bpf_map__set_autocreate() API, which is a BPF map counterpart of
bpf_program__set_autoload() and serves the similar goal of allowing more
flexible CO-RE applications to be built. See patch #3 for example scenarios
in which the need for such an API came up previously.
Patch #1 is a follow-up to the previous patch set, which added verifier log
fixup logic, making sure bpf_core_format_spec()'s return result is used for
something useful.
Patch #2 is a small refactoring to avoid unnecessarily verbose memory
management around the obj->maps array.
Patch #3 adds an API and the corresponding BPF verifier log fixup logic to
provide a human-comprehensible error message with useful details.
Patch #4 adds a simple selftest validating both the API itself and libbpf's
log fixup logic for it.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a subtest that exercises the bpf_map__set_autocreate() API and
validates that libbpf properly fixes up the BPF verifier log with correct
map information.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220428041523.4089853-5-andrii@kernel.org
Add a bpf_map__set_autocreate() API that allows the user to opt out of
libbpf automatically creating a BPF map during BPF object load.
This is a useful feature when building a CO-RE-enabled BPF application
that takes advantage of some new-ish BPF map type (e.g., socket-local
storage) if the kernel supports it, but otherwise uses some alternative
(e.g., an extra HASH map). In such a case, being able to disable the
creation of a map the kernel doesn't support allows the BPF object file,
with all its other maps and programs, to be created and loaded
successfully.
It's still up to the user to make sure that no "live" code in any of their
BPF programs references such a map instance, which can be achieved by
guarding such code with a CO-RE relocation check or by using .rodata
global variables.
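For illustration, a minimal sketch of the intended usage (the object file
name, map name, and feature probe below are hypothetical):

  struct bpf_object *obj = bpf_object__open_file("my_app.bpf.o", NULL);
  struct bpf_map *map = bpf_object__find_map_by_name(obj, "optional_map");

  /* hypothetical probe for whether the kernel supports the map type */
  if (map && !kernel_supports_optional_map())
          bpf_map__set_autocreate(map, false);

  /* load now succeeds even though the unsupported map is never created */
  int err = bpf_object__load(obj);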
If the user fails to properly guard such code to turn it into "dead code",
libbpf will helpfully post-process the BPF verifier log and provide a
more meaningful error with the name of the map that needs to be guarded
properly. As such, instead of:
; value = bpf_map_lookup_elem(&missing_map, &zero);
4: (85) call unknown#2001000000
invalid func unknown#2001000000
... user will see:
; value = bpf_map_lookup_elem(&missing_map, &zero);
4: <invalid BPF map reference>
BPF map 'missing_map' is referenced but wasn't created
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220428041523.4089853-4-andrii@kernel.org
Reuse libbpf_mem_ensure() when adding a new map to the list of maps
inside bpf_object. It takes care of properly resizing and reallocating the
maps array and zeroing out newly allocated memory.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220428041523.4089853-3-andrii@kernel.org
Detect CO-RE spec truncation and append "..." to make the user aware that
there was supposed to be more of the spec there.
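Assuming the formatter follows the usual snprintf()-style convention of
returning the would-be length, the check could look roughly like this
sketch (buffer names are illustrative):

  len = bpf_core_format_spec(buf, buf_sz, spec);
  if (len >= buf_sz && buf_sz >= sizeof("..."))
          /* output was cut short; overwrite the tail with an ellipsis */
          snprintf(buf + buf_sz - sizeof("..."), sizeof("..."), "...");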
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220428041523.4089853-2-andrii@kernel.org
Add new or modify existing SEC() definitions to be target-less and
validate that libbpf handles such program definitions correctly.
For kprobe/kretprobe we also add an explicit test that generic
bpf_program__attach() works in cases when the kprobe definition contains
a proper target. It wasn't previously tested, as selftests code always
explicitly specified the target regardless.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220428185349.3799599-4-andrii@kernel.org
Similar to the previous patch, support target-less definitions like
SEC("fentry"), SEC("freplace"), etc. For such BTF-backed program types
it is expected that the user will specify the BTF target programmatically
at runtime using bpf_program__set_attach_target() *before* the load phase.
If not, libbpf will report this as an error.
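A minimal sketch of that flow (the traced kernel function name is
illustrative):

  /* BPF side: only the program type is declared, no target */
  SEC("fentry")
  int BPF_PROG(handle_fentry)
  {
          return 0;
  }

  /* user-space side: pick the BTF target at runtime, before load */
  struct bpf_program *prog;

  prog = bpf_object__find_program_by_name(obj, "handle_fentry");
  bpf_program__set_attach_target(prog, 0 /* vmlinux BTF */, "inet_listen");
  int err = bpf_object__load(obj);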
Also use the SEC_ATTACH_BTF flag instead of explicitly listing the set of
types that are expected to require attach_btf_id. This was an accidental
omission during the custom SEC() support refactoring.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220428185349.3799599-3-andrii@kernel.org
In a lot of cases the target of a kprobe/kretprobe, tracepoint, raw
tracepoint, etc. BPF program might not be known at compilation time
and will be discovered at runtime. This has always been a case supported
by libbpf, with APIs like bpf_program__attach_{kprobe,tracepoint,etc}()
accepting a full target definition, regardless of what was defined in the
SEC() definition in the BPF source code.
Unfortunately, up till now libbpf still forced users to specify at
least something as a fake target, e.g., SEC("kprobe/whatever"), which
is cumbersome and somewhat misleading.
This patch allows target-less SEC() definitions for basic tracing BPF
program types:
- kprobe/kretprobe;
- multi-kprobe/multi-kretprobe;
- tracepoints;
- raw tracepoints.
Such target-less SEC() definitions are meant to declaratively specify
the proper BPF program type only. Attaching them has to be handled
programmatically using the correct APIs. As such, skeleton auto-attachment
of such BPF programs is skipped, and generic bpf_program__attach() will
fail, if attempted, due to the lack of sufficient target information.
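For example, a target-less kprobe might look like this minimal sketch (the
traced kernel function is illustrative):

  /* BPF side: only the program type is declared */
  SEC("kprobe")
  int BPF_PROG(handle_open)
  {
          return 0;
  }

  /* user-space side: the target is supplied at attach time */
  struct bpf_link *link;

  link = bpf_program__attach_kprobe(prog, false /* !retprobe */,
                                    "do_sys_openat2");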
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220428185349.3799599-2-andrii@kernel.org
skb_to_sgvec() fails only when the number of frag_list entries and frags
exceeds MAX_MSG_FRAGS. Therefore, we can call skb_linearize() only
when the conversion fails.
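The resulting control flow is roughly this sketch (variable names are
illustrative, not the exact diff):

  num_sge = skb_to_sgvec(skb, msg->sg.data, off, len);
  if (unlikely(num_sge < 0)) {
          /* too many frags: flatten the skb, then retry the conversion */
          if (skb_linearize(skb))
                  return -EAGAIN;
          num_sge = skb_to_sgvec(skb, msg->sg.data, off, len);
  }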
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220427115150.210213-1-liujian56@huawei.com
According to include/uapi/linux/bpf.h:
#define BPF_FROM_LE BPF_TO_LE
#define BPF_FROM_BE BPF_TO_BE
BPF_FROM_BE exists as an alias for BPF_TO_BE, not for BPF_TO_LE.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1651139754-4838-3-git-send-email-yangtiezhu@loongson.cn
When the xdp_rxq_info_user program exits unexpectedly, it doesn't detach
the XDP program from the device, and no other XDP program can then be
attached to the device. So call init_exit() to detach the XDP program
when the program exits unexpectedly.
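The shape of the fix is roughly this sketch (the globals and the detach
call are illustrative; the handler name follows the commit text):

  static int ifindex;      /* device the XDP prog was attached to */
  static __u32 xdp_flags;

  static void init_exit(int sig)
  {
          /* drop the attached XDP program before exiting */
          bpf_xdp_detach(ifindex, xdp_flags, NULL);
          exit(0);
  }

  /* register for the signals that would otherwise kill us silently */
  signal(SIGINT, init_exit);
  signal(SIGTERM, init_exit);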
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220427062338.80173-1-shaozhengchao@huawei.com
Implement per-subtest log collection for both parallel and sequential
test execution. This allows granular per-subtest error output in the
'All error logs' section.
Add subtest log transfer to the protocol used during parallel test
execution.
Move all test log printing logic into the dump_test_log function. One
exception is the output of test names when verbose printing is enabled.
Move test name/result printing into separate functions to avoid
repetition.
Print all successful subtest results in the log. When a test has no
subtests, print only failed test logs; when it has subtests, print only
the failed subtests' logs.
Disable 'All error logs' output when verbose mode is enabled. This
functionality was already broken and was causing confusion.
Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220427041353.246007-1-mykolal@fb.com
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-04-27
We've added 85 non-merge commits during the last 18 day(s) which contain
a total of 163 files changed, 4499 insertions(+), 1521 deletions(-).
The main changes are:
1) Teach libbpf to enhance BPF verifier log with human-readable and relevant
information about failed CO-RE relocations, from Andrii Nakryiko.
2) Add typed pointer support in BPF maps and enable it for unreferenced pointers
(via probe read) and referenced ones that can be passed to in-kernel helpers,
from Kumar Kartikeya Dwivedi.
3) Improve xsk to break NAPI loop when rx queue gets full to allow for forward
progress to consume descriptors, from Maciej Fijalkowski & Björn Töpel.
4) Fix a small RCU read-side race in BPF_PROG_RUN routines which dereferenced
the effective prog array before the rcu_read_lock, from Stanislav Fomichev.
5) Implement BPF atomic operations for RV64 JIT, and add libbpf parsing logic
for USDT arguments under riscv{32,64}, from Pu Lehui.
6) Implement libbpf parsing of USDT arguments under aarch64, from Alan Maguire.
7) Enable bpftool build for musl and remove nftw with FTW_ACTIONRETVAL usage
so it can be shipped under Alpine which is musl-based, from Dominique Martinet.
8) Clean up {sk,task,inode} local storage trace RCU handling as they do not
need to use call_rcu_tasks_trace() barrier, from KP Singh.
9) Improve libbpf API documentation and fix error return handling of various
API functions, from Grant Seltzer.
10) Enlarge offset check for bpf_skb_{load,store}_bytes() helpers given data
length of frags + frag_list may surpass old offset limit, from Liu Jian.
11) Various improvements to prog_tests in area of logging, test execution
and by-name subtest selection, from Mykola Lysenko.
12) Simplify map_btf_id generation for all map types by moving this process
to build time with help of resolve_btfids infra, from Menglong Dong.
13) Fix a libbpf bug in probing when falling back to legacy bpf_probe_read*()
helpers; the probing always caused the old helpers to be used, from Runqing Yang.
14) Add support for ARCompact and ARCv2 platforms for libbpf's PT_REGS
tracing macros, from Vladimir Isaev.
15) Clean up BPF selftests to remove old & unneeded rlimit code given the kernel
switched to memcg-based memory accounting a while ago, from Yafang Shao.
16) Refactor of BPF sysctl handlers to move them to BPF core, from Yan Zhu.
17) Fix BPF selftests on two occasions to work around regressions caused by the
latest LLVM to unblock CI until their fixes are worked out, from Yonghong Song.
18) Misc cleanups all over the place, from various others.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
selftests/bpf: Add libbpf's log fixup logic selftests
libbpf: Fix up verifier log for unguarded failed CO-RE relos
libbpf: Simplify bpf_core_parse_spec() signature
libbpf: Refactor CO-RE relo human description formatting routine
libbpf: Record subprog-resolved CO-RE relocations unconditionally
selftests/bpf: Add CO-RE relos and SEC("?...") to linked_funcs selftests
libbpf: Avoid joining .BTF.ext data with BPF programs by section name
libbpf: Fix logic for finding matching program for CO-RE relocation
libbpf: Drop unhelpful "program too large" guess
libbpf: Fix anonymous type check in CO-RE logic
bpf: Compute map_btf_id during build time
selftests/bpf: Add test for strict BTF type check
selftests/bpf: Add verifier tests for kptr
selftests/bpf: Add C tests for kptr
libbpf: Add kptr type tag macros to bpf_helpers.h
bpf: Make BTF type match stricter for release arguments
bpf: Teach verifier about kptr_get kfunc helpers
bpf: Wire up freeing of referenced kptr
bpf: Populate pairs of btf_id and destructor kfunc in btf
bpf: Adapt copy_map_value for multiple offset case
...
====================
Link: https://lore.kernel.org/r/20220427224758.20976-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The MIB counters for the ksz9477 are the same for the ksz9477 switch and
the LAN937x switch. Hence, move them to the ksz_common.c file in order to
have a generic function. The DSA hook get_stats64 can now call
ksz_get_stats64.
Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220426091048.9311-1-arun.ramadoss@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski says:
====================
net: remove non-Ethernet drivers using virt_to_bus()
Networking is currently the main offender in using virt_to_bus().
Frankly all the drivers which use it are super old and unlikely
to be used today. They are just an ongoing maintenance burden.
In other words this series is using virt_to_bus() as an excuse
to shed some old stuff. Having done the tree-wide dev_addr_set()
conversion recently I have limited sympathy for carrying dead
code.
Obviously, please scream if any of these drivers _is_ in fact
still being used. Otherwise let's take the chance; we can always
apologize and revert if users show up later.
Also I should say thanks to everyone who contributed to this code!
The work continues to be appreciated although realistically in more
of a "history book" fashion...
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Another incarnation of the Z8530, it looks like? Again, no real changes
in the git history, and it needs VIRT_TO_BUS. Unlikely to have
users; let's spend less time refactoring dead code...
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Looks like all the changes to this driver have been automated
churn since the git era began. The driver is using virt_to_bus(),
so it's just a maintenance burden unlikely to have any users.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Looks like all the changes to this driver have been automated
churn since the git era began. The driver is using virt_to_bus(),
so it should either be updated to the proper DMA API or removed. Given
the latest "news" entry on the website is from 1999, I'm opting
for the latter.
I'm marking the allocated char device major number as [REMOVED];
I reckon we can't reuse it in case some SW out there assumes it's
COSA?
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver has received nothing but automated fixes in the last 15 years.
Since it's using virt_to_bus, it's unlikely to be used on any modern
platform.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver has received nothing but automated fixes since the git era
began. Since it's using virt_to_bus, it's unlikely to be used on any
modern platform.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver has received nothing but automated fixes (mostly spelling
and compiler warnings) since the git era began. Since it's using
virt_to_bus, it's unlikely to be used on any modern platform.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur says:
====================
net: lan966x: Add support for PTP programmable pins
Lan966x has 8 PTP programmable pins. The last pin is hardcoded to be used
by PHC0 and all the rest are shareable between the PHCs. The PTP pins can
implement both extts and perout functions.
v1->v2:
- use ptp_find_pin_unlocked instead of ptp_find_pin inside the irq handler.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend the PTP programmable pins to also implement the PTP_PF_EXTTS
function. The PTP pin can be configured to capture only on the rising
edge of the PPS signal. Once an event is seen, an interrupt is
generated and the local time counter is saved.
The interrupt is shared between all the pins.
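From user space such a pin is exercised through the standard PTP character
device, roughly like this sketch (device path and pin index are
illustrative):

  struct ptp_pin_desc pin = { .index = 2, .func = PTP_PF_EXTTS, .chan = 0 };
  struct ptp_extts_request req = {
          .index = 0,
          .flags = PTP_ENABLE_FEATURE | PTP_RISING_EDGE,
  };
  struct ptp_extts_event ev;
  int fd = open("/dev/ptp1", O_RDWR);

  ioctl(fd, PTP_PIN_SETFUNC, &pin);   /* assign extts to the pin */
  ioctl(fd, PTP_EXTTS_REQUEST, &req); /* arm rising-edge capture */
  read(fd, &ev, sizeof(ev));          /* blocks until an edge fires */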
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lan966x has 8 PTP programmable pins, where the last pin is hardcoded to
be used by PHC0, which does the frame timestamping. All the rest of the
PTP pins can be shared between the PHCs and can have different functions
like perout or extts. For now, add support for PTP_PF_PEROUT.
The HW is not able to support an absolute start time, but it can use the
nsec part for phase adjustment when generating PPS.
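A matching user-space sketch for requesting the PPS output (again with
illustrative paths and indices); note the start time carries only the nsec
phase, per the HW limitation above:

  struct ptp_pin_desc pin = { .index = 3, .func = PTP_PF_PEROUT, .chan = 0 };
  struct ptp_perout_request req = {
          .index = 0,
          .start = { .sec = 0, .nsec = 500000000 }, /* phase only: 0.5s */
          .period = { .sec = 1, .nsec = 0 },        /* 1PPS */
  };
  int fd = open("/dev/ptp1", O_RDWR);

  ioctl(fd, PTP_PIN_SETFUNC, &pin);    /* assign perout to the pin */
  ioctl(fd, PTP_PEROUT_REQUEST, &req); /* start generating the signal */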
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add registers that are used to configure the PTP pins. These registers
are used to enable the interrupts per PTP pin and to set the waveform
generated by the pin.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To read/write a value to a PHC, it is required to use a PTP pin.
Currently pin 5 is used; change it to pin 7, as that is the last pin.
All the other pins will have different functions.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend the dt-bindings for lan966x with the ptp external interrupt. This
is generated when an external 1PPS signal is received on the ptp pin.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mat Martineau says:
====================
mptcp: Timeout for MP_FAIL response
When one peer sends an infinite mapping to coordinate fallback from
MPTCP to regular TCP, the other peer is expected to send a packet with
the MPTCP MP_FAIL option to acknowledge the infinite mapping. Rather
than leave the connection in some half-fallback state, this series adds
a timeout after which the infinite mapping sender will reset the
connection.
Patch 1 adds a fallback self test.
Patches 2-5 make use of the MPTCP socket's retransmit timer to reset the
MPTCP connection if no MP_FAIL was received.
Patches 6 and 7 extend the self test to check MP_FAIL-related MIBs.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When multiple checksum errors occur in chk_csum_nr(), print the
number of errors as an extra message.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch extends chk_fail_nr to check the MP_FAIL response MIBs.
Add a new argument, invert, to chk_fail_nr to allow it to check the
MP_FAIL TX and RX MIBs from the opposite direction.
When the infinite map is received before the MP_FAIL response, the
response will be lost. A '-' can be added to fail_tx or fail_rx to
indicate that the MP_FAIL response TX or RX can be lost when doing the
checks.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new msk->flags bit, MPTCP_FAIL_NO_RESPONSE, then reuses
sk_timer to trigger a check if we have not received a response from the
peer after sending MP_FAIL. If the peer doesn't respond properly, reset
the subflow.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new struct member, mp_fail_response_expect, in struct
mptcp_subflow_context to support the MP_FAIL response. In the special case
of a single subflow with a checksum error and contiguous data, an MP_FAIL
is sent in response to another MP_FAIL.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mptcp_data_lock() needs to be held when manipulating the msk
retransmit_timer or the sk sk_timer. This patch adds the data
lock for both timers.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the helper mptcp_stop_timer() instead of using sk_stop_timer() to
stop icsk_retransmit_timer directly.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the single subflow test case for MP_FAIL, to test the infinite
mapping case. Use the test_linkfail value to make 128KB test files.
Add a new function, reset_with_fail(), which uses 'iptables' and 'tc
action pedit' rules to produce the bit flips that trigger the checksum
failures. Set validate_checksum to enable checksums for the MP_FAIL
tests without passing the '-C' argument. Set the check_invert flag to
enable the inverted-bytes check for the output data in check_transfer().
Instead of the file mismatch error, this test prints out the inverted
bytes.
Add a new function, pedit_action_pkts(), to get the number of packets
edited by the tc pedit actions. Print these numbers in the output.
Also add the needed kernel config options to the selftests config file.
Suggested-by: Davide Caratti <dcaratti@redhat.com>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only remaining definitions of __SLOW_DOWN_IO (for alpha and ia64) do
nothing, and the only mentions in networking are in comments. Remove these
mentions.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
atp.h is included only by atp.c, which does not use eeprom_delay(). Remove
the unused definition.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the NIC takes care of crypto (or the record has already
been decrypted), we forget to update darg->async. ->async
is supposed to mean whether the record is async-capable on
input and whether the record has been queued for async crypto
on output.
Reported-by: Gal Pressman <gal@nvidia.com>
Fixes: 3547a1f9d9 ("tls: rx: use async as an in-out argument")
Tested-by: Gal Pressman <gal@nvidia.com>
Link: https://lore.kernel.org/r/20220425233309.344858-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Logic added in commit f35f821935 ("tcp: defer skb freeing after socket
lock is released") helped bulk TCP flows to move the cost of skb
frees outside of the critical section where the socket lock was held.
But for RPC traffic, or hosts with RFS enabled, the solution is far from
ideal.
For RPC traffic, recvmsg() has to return to user space right after the
skb payload has been consumed, meaning that the BH handler has no chance
to pick the skb before the recvmsg() thread. This issue is more visible
with BIG TCP, as more RPCs fit in one skb.
For RFS, even if the BH handler picks the skbs, they are still picked
from the cpu on which the user thread is running.
Ideally, it is better to free the skbs (and associated page frags)
on the cpu that originally allocated them.
This patch removes the per-socket anchor (sk->defer_list) and
instead uses a per-cpu list, which will hold more skbs per round.
This new per-cpu list is drained at the end of net_rx_action(),
after incoming packets have been processed, to lower latencies.
In normal conditions, skbs are added to the per-cpu list with
no further action. In the (unlikely) cases where the cpu does not
run the net_rx_action() handler fast enough, we use an IPI to raise
NET_RX_SOFTIRQ on the remote cpu.
Also, we do not bother draining the per-cpu list from dev_cpu_dead().
This is because skbs in this list have no requirement on how fast
they should be freed.
Note that we can add in the future a small per-cpu cache
if we see any contention on sd->defer_lock.
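Conceptually, the per-cpu anchor and producer side look like this sketch
(field names mirror the sd->defer_lock mentioned above; the rest is
illustrative, not the exact kernel layout):

  /* one instance per cpu, embedded in per-cpu softnet state */
  struct defer_anchor {
          spinlock_t      defer_lock;
          struct sk_buff  *defer_list;  /* singly linked via skb->next */
          int             defer_count;
  };

  /* producer: queue an skb to be freed by its allocating cpu */
  static void defer_skb(struct defer_anchor *sd, struct sk_buff *skb)
  {
          spin_lock(&sd->defer_lock);
          skb->next = sd->defer_list;
          WRITE_ONCE(sd->defer_list, skb);
          sd->defer_count++;
          spin_unlock(&sd->defer_lock);
          /* if the owner cpu isn't draining fast enough, an IPI
           * raising NET_RX_SOFTIRQ would be sent here (omitted) */
  }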
Tested on a pair of hosts with 100Gbit NIC, RFS enabled,
and /proc/sys/net/ipv4/tcp_rmem[2] tuned to 16MB to work around
page recycling strategy used by NIC driver (its page pool capacity
being too small compared to number of skbs/pages held in sockets
receive queues)
Note that this tuning was only done to demonstrate worse
conditions for skb freeing for this particular test.
These conditions can happen in more general production workloads.
10 runs of one TCP_STREAM flow
Before:
Average throughput: 49685 Mbit.
Kernel profiles on the cpu running the user thread recvmsg() show a high
cost for skb-freeing-related functions (*)
57.81% [kernel] [k] copy_user_enhanced_fast_string
(*) 12.87% [kernel] [k] skb_release_data
(*) 4.25% [kernel] [k] __free_one_page
(*) 3.57% [kernel] [k] __list_del_entry_valid
1.85% [kernel] [k] __netif_receive_skb_core
1.60% [kernel] [k] __skb_datagram_iter
(*) 1.59% [kernel] [k] free_unref_page_commit
(*) 1.16% [kernel] [k] __slab_free
1.16% [kernel] [k] _copy_to_iter
(*) 1.01% [kernel] [k] kfree
(*) 0.88% [kernel] [k] free_unref_page
0.57% [kernel] [k] ip6_rcv_core
0.55% [kernel] [k] ip6t_do_table
0.54% [kernel] [k] flush_smp_call_function_queue
(*) 0.54% [kernel] [k] free_pcppages_bulk
0.51% [kernel] [k] llist_reverse_order
0.38% [kernel] [k] process_backlog
(*) 0.38% [kernel] [k] free_pcp_prepare
0.37% [kernel] [k] tcp_recvmsg_locked
(*) 0.37% [kernel] [k] __list_add_valid
0.34% [kernel] [k] sock_rfree
0.34% [kernel] [k] _raw_spin_lock_irq
(*) 0.33% [kernel] [k] __page_cache_release
0.33% [kernel] [k] tcp_v6_rcv
(*) 0.33% [kernel] [k] __put_page
(*) 0.29% [kernel] [k] __mod_zone_page_state
0.27% [kernel] [k] _raw_spin_lock
After patch:
Average throughput: 73076 Mbit.
Kernel profiles on the cpu running the user thread recvmsg() look better:
81.35% [kernel] [k] copy_user_enhanced_fast_string
1.95% [kernel] [k] _copy_to_iter
1.95% [kernel] [k] __skb_datagram_iter
1.27% [kernel] [k] __netif_receive_skb_core
1.03% [kernel] [k] ip6t_do_table
0.60% [kernel] [k] sock_rfree
0.50% [kernel] [k] tcp_v6_rcv
0.47% [kernel] [k] ip6_rcv_core
0.45% [kernel] [k] read_tsc
0.44% [kernel] [k] _raw_spin_lock_irqsave
0.37% [kernel] [k] _raw_spin_lock
0.37% [kernel] [k] native_irq_return_iret
0.33% [kernel] [k] __inet6_lookup_established
0.31% [kernel] [k] ip6_protocol_deliver_rcu
0.29% [kernel] [k] tcp_rcv_established
0.29% [kernel] [k] llist_reverse_order
v2: kdoc issue (kernel bots)
do not defer if (alloc_cpu == smp_processor_id()) (Paolo)
replace the sk_buff_head with a single-linked list (Jakub)
add a READ_ONCE()/WRITE_ONCE() for the lockless read of sd->defer_list
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20220422201237.416238-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Andrii Nakryiko says:
====================
This patch set teaches libbpf to enhance the BPF verifier log with
human-readable and relevant information about failed CO-RE relocations.
Patch #9 is the main one with the new logic. See the relevant commit
messages for some more details.
All the other patches either fix various bugs detected while working on
this feature, most prominently a bug with libbpf not handling CO-RE
relocations for SEC("?...") programs, or refactor libbpf internals to
allow for easier reuse of CO-RE relo lookup and formatting logic.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add tests validating that libbpf is indeed patching up the BPF verifier
log with CO-RE relocation details. Also test partial and full truncation
scenarios.
This test might be a bit fragile due to the changing BPF verifier log
format. If that proves to be frequently breaking, we can simplify the
tests or remove the truncation subtests. But for now it seems useful to
test it under those conditions, which otherwise rarely occur in practice.
Also test CO-RE relo failure in a subprog, as that exercises the
subprogram CO-RE relocation mapping logic, which doesn't work out of the
box without the extra relo storage previously done only for the
gen_loader case.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-11-andrii@kernel.org
Teach libbpf to post-process the BPF verifier log on BPF program load
failure and detect known error patterns to provide the user with more
context.
Currently there is one such common situation: an "unguarded" failed BPF
CO-RE relocation. While a failing CO-RE relocation is expected, it is
expected to be properly guarded in BPF code such that the BPF verifier
always eliminates the BPF instructions corresponding to such failed CO-RE
relos as dead code. In cases when the user failed to take such
precautions, the BPF verifier provides the best log it can:
123: (85) call unknown#195896080
invalid func unknown#195896080
Such an incomprehensible log error is due to libbpf "poisoning" the BPF
instruction that corresponds to the failed CO-RE relocation by replacing
it with an invalid `call 0xbad2310` instruction (195896080 == 0xbad2310
reads "bad relo" if you squint hard enough).
Luckily, libbpf has all the necessary information to look up the CO-RE
relocation that failed and provide a more human-readable description of
what's going on:
5: <invalid CO-RE relocation>
failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8)
This hopefully makes it much easier to understand what's wrong with the
user's BPF program without googling magic constants.
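For reference, the kind of guard expected around such an access might look
like this minimal sketch (the ___bad type flavor and fake_field mirror the
example above and are hypothetical; assumes vmlinux.h, bpf_helpers.h and
bpf_core_read.h):

  struct task_struct___bad {
          int fake_field;
  } __attribute__((preserve_access_index));

  SEC("?tp/syscalls/sys_enter_getpid")
  int guarded(void *ctx)
  {
          struct task_struct___bad *t = (void *)bpf_get_current_task();

          /* if the field doesn't exist in the running kernel, this
           * branch (and its failed relocation) becomes dead code */
          if (bpf_core_field_exists(t->fake_field))
                  return BPF_CORE_READ(t, fake_field);
          return 0;
  }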
This BPF verifier log fixup is set up to be extensible and is going to be
used for at least one other upcoming libbpf feature in follow-up patches.
Libbpf parses the lines of the BPF verifier log starting from the very
end. Currently it processes up to 10 lines of output looking for familiar
patterns. This avoids wasting lots of CPU processing huge verifier logs
(especially at the log_level=2 verbosity level). The actual verification
error should normally be found in the last few lines, so this should work
reliably.
If libbpf needs to expand the log beyond the available log_buf_size, it
truncates the end of the verifier log. Given the verifier log normally
ends with something like:
processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
... truncating this on program load error isn't too bad (the end user can
always increase the log size if they need the complete log).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-10-andrii@kernel.org
Simplify the bpf_core_parse_spec() signature to take struct bpf_core_relo
as input instead of requiring callers to decompose it into type_id, relo,
spec_str, etc. This makes using and reusing this helper easier.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-9-andrii@kernel.org
Refactor how a CO-RE relocation is formatted: dump the human-readable
representation, currently used by libbpf in either debug or error message
output during the CO-RE relocation resolution process, into a provided
buffer. This approach allows for better reuse of this functionality
outside of CO-RE relocation resolution, which we'll use in the next patch
to provide a better error message when the BPF verifier rejects a BPF
program due to an unguarded failed CO-RE relocation.
It also gets rid of the annoying "stitching" of libbpf_print() calls,
which was the only place where we did that.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-8-andrii@kernel.org