Since both glibc and musl provide RB_ flags via <sys/reboot.h>, and we
just add RB_ flags for nolibc, let's use RB_ flags instead of
LINUX_REBOOT_ flags and only reserve the required <sys/reboot.h> header.
This allows compile libc-test for musl libc without the linux headers.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Both glibc and musl provide RB_ flags via <sys/reboot.h> for reboot(),
they don't need to include <linux/reboot.h>, let nolibc provide RB_
flags too.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
musl limits the fast signed int in 32bit, but glibc and nolibc don't, to
let such test cases work on musl, let's provide the type based
SINT_MAX_OF_TYPE(type) and SINT_MIN_OF_TYPE(type).
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/bc635c4f-67fe-4e86-bfdf-bcb4879b928d@t-8ch.de/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
_GNU_SOURCE Implies _LARGEFILE64_SOURCE in glibc, but in musl, the
default configuration doesn't enable _LARGEFILE64_SOURCE.
>From include/dirent.h of musl, getdents64 is provided as getdents when
_LARGEFILE64_SOURCE is defined.
#if defined(_LARGEFILE64_SOURCE)
...
#define getdents64 getdents
#endif
Let's define _LARGEFILE64_SOURCE to fix up this compile error:
tools/testing/selftests/nolibc/nolibc-test.c: In function ‘test_getdents64’:
tools/testing/selftests/nolibc/nolibc-test.c:453:8: warning: implicit declaration of function ‘getdents64’; did you mean ‘getdents’? [-Wimplicit-function-declaration]
453 | ret = getdents64(fd, (void *)buffer, sizeof(buffer));
| ^~~~~~~~~~
| getdents
/usr/bin/ld: /tmp/ccKILm5u.o: in function `test_getdents64':
nolibc-test.c:(.text+0xe3e): undefined reference to `getdents64'
collect2: error: ld returned 1 exit status
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As the gettid manpage [1] shows, glibc 2.30 has gettid support, so,
let's enable the test for glibc >= 2.30.
gettid works on musl too.
[1]: https://man7.org/linux/man-pages/man2/gettid.2.html
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
mmap() a file with a good offset and then munmap() it. a non-zero offset
is passed to test the 6th argument of my_syscall6().
Note, it is not easy to find a unique file for mmap() in different
scenes, so, a file list is used to search the right one:
- /dev/zero: is commonly used to allocate anonymous memory and is likely
present and readable
- /proc/1/exe: for 'run' and 'run-user' target, 'run-user' can not find
'/proc/self/exe'
- /proc/self/exe: for 'libc-test' target, normal program 'libc-test' has
no permission to access '/proc/1/exe'
- argv0: the path of the program itself, let it pass even with worst
case scene: no procfs and no /dev/zero
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702193306.GK16233@1wt.eu/
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/bff82ea6-610b-4471-a28b-6c76c28604a6@t-8ch.de/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The addr argument of munmap() must be a multiple of the page size,
passing invalid (void *)1 addr expects failure with -EINVAL.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The length argument of mmap() must be greater than 0, passing a zero
length argument expects failure with -EINVAL.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
>From musl 0.9.14 (to the latest version 1.2.3), both sbrk() and brk()
have almost been disabled for they conflict with malloc, only sbrk(0) is
still permitted as a way to get the current location of the program
break, let's support such case.
EXPECT_PTRNE() is used to expect sbrk() always successfully getting the
current break.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The syscalls like sbrk() and mmap() return pointers, to test them, more
pointer compare test macros are required, add them:
- EXPECT_PTREQ() expects two equal pointers.
- EXPECT_PTRNE() expects two non-equal pointers.
- EXPECT_PTRER() expects failure with a specified errno.
- EXPECT_PTRER2() expects failure with one of two specified errnos.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
/dev/zero is commonly used to allocate anonymous memory, it is a very
good file for tests, let's prepare it.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702193306.GK16233@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
argv0 is the path to nolibc-test program itself, which is a very good
always existing readable file for some tests, let's export it.
Note, the path may be absolute or relative, please make sure the tests
work with both of them. If it is relative, we must make sure the current
path is the one specified by the PWD environment variable.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/ZKKbS3cwKcHgnGwu@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Fix up the error reported by scripts/checkpatch.pl:
ERROR: do not use assignment in if condition
#95: FILE: tools/include/nolibc/sys.h:95:
+ if ((ret = sys_brk(0)) && (sys_brk(ret + inc) == ret + inc))
Apply the new generic __sysret() to merge the SET_ERRNO() and return
lines.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Do several cleanups together:
- Since all supported architectures have my_syscall6() now, remove the
#ifdef check.
- Move the mmap() related macros to tools/include/nolibc/types.h and
reuse most of them from <linux/mman.h>
- Apply the new generic __sysret() to convert the calling of sys_map()
to oneline code
Note, since MAP_FAILED is -1 on Linux, so we can use the generic
__sysret() which returns -1 upon error and still satisfy user land that
checks for MAP_FAILED.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702192347.GJ16233@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
No official reference states the errno range, here aligns with musl and
glibc and uses [-MAX_ERRNO, -1] instead of all negative ones.
- musl: src/internal/syscall_ret.c
- glibc: sysdeps/unix/sysv/linux/sysdep.h
The MAX_ERRNO used by musl and glibc is 4095, just like the one nolibc
defined in tools/include/nolibc/errno.h.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/ZKKdD%2Fp4UkEavru6@1wt.eu/
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Link: https://lore.kernel.org/linux-riscv/94dd5170929f454fbc0a10a2eb3b108d@AcuMS.aculab.com/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
It is able to pass the 6th argument like the 5th argument via the stack
for mips, let's add a new my_syscall6() now, see [1] for details:
The mips/o32 system call convention passes arguments 5 through 8 on
the user stack.
Both mmap() and pselect6() require my_syscall6().
[1]: https://man7.org/linux/man-pages/man2/syscall.2.html
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
my_syscall<N> share the same long clobber list, define a macro for them.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
my_syscall<N> share the same long clobber list, define a macro for them.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
replace "__asm__ volatile" with "__asm__ volatile" and insert necessary
whitespace before "\" to make sure the lines are aligned.
$ sed -i -e 's/__asm__ volatile ( /__asm__ volatile ( /g' tools/include/nolibc/*.h
Note, arch-s390.h uses post-tab instead of post-whitespaces, must avoid
insert whitespace just before the tabs:
$ sed -i -e 's/__asm__ volatile (\t/__asm__ volatile (\t/g' tools/include/nolibc/arch-*.h
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
More than 8 whitespaces of the code indent are replaced with "tab +
whitespaces" to fix up such errors reported by scripts/checkpatch.pl:
ERROR: code indent should use tabs where possible
#64: FILE: tools/include/nolibc/arch-mips.h:64:
+^I \$
ERROR: code indent should use tabs where possible
#72: FILE: tools/include/nolibc/arch-mips.h:72:
+^I "t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9" \$
This command is used:
$ sed -i -e '/^\t* /{s/ /\t/g}' tools/include/nolibc/arch-*.h
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Since commit 53fcfafa8c ("tools/nolibc/unistd: add syscall()") nolibc
has support for syscall(2).
Use it to get rid of some ifdef-ery.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Add test cases for accessing the data structure fields using BTF info.
This includes the field access from parameters and retval, and accessing
string information.
Link: https://lore.kernel.org/all/169272161265.160970.14048619786574971276.stgit@devnote2/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Assume the fprobe event is a return event if there is $retval is
used in the probe's argument without %return. e.g.
echo 'f:myevent vfs_read $retval' >> dynamic_events
then 'myevent' is a return probe event.
Link: https://lore.kernel.org/all/169272160261.160970.13613040161560998787.stgit@devnote2/
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
I hit a memory leak when testing bpf_program__set_attach_target().
Basically, set_attach_target() may allocate btf_vmlinux, for example,
when setting attach target for bpf_iter programs. But btf_vmlinux
is freed only in bpf_object_load(), which means if we only open
bpf object but not load it, setting attach target may leak
btf_vmlinux.
So let's free btf_vmlinux in bpf_object__close() anyway.
Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230822193840.1509809-1-haoluo@google.com
Add a selftest for the fix provided in the previous commit. Without the
fix, the selftest passes the verifier while it should fail. The special
logic for detecting graph root or node for reg->off and bypassing
reg->off == 0 guarantee for release helpers/kfuncs has been dropped.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230822175140.1317749-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
For a bpf_kptr_xchg() with local kptr, if the map value kptr type and
allocated local obj type does not match, with the previous patch,
the below verifier error message will be logged:
R2 is of type <allocated local obj type> but <map value kptr type> is expected
Without the previous patch, the test will have unexpected success.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230822050058.2887354-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Before adding a port to bond, it need to be set down first. In the
lacpdu test the author set the port down specifically. But commit
a4abfa627c ("net: rtnetlink: Enslave device before bringing it up")
changed the operation order, the kernel will set the port down _after_
adding to bond. So all the ports will be down at last and the test failed.
In fact, the veth interfaces are already inactive when added. This
means there's no need to set them down again before adding to the bond.
Let's just remove the link down operation.
Fixes: a4abfa627c ("net: rtnetlink: Enslave device before bringing it up")
Reported-by: Zhengchao Shao <shaozhengchao@huawei.com>
Closes: https://lore.kernel.org/netdev/a0ef07c7-91b0-94bd-240d-944a330fcabd@huawei.com/
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20230817082459.1685972-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Attaching extra program to same functions system wide for api
and link tests.
This way we can test the pid filter works properly when there's
extra system wide consumer on the same uprobe that will trigger
the original uprobe handler.
We expect to have the same counts as before.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-29-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Running api and link tests also with pid filter and checking
the probe gets executed only for specific pid.
Spawning extra process to trigger attached uprobes and checking
we get correct counts from executed programs.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-28-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding test for cookies setup/retrieval in uprobe_link uprobes
and making sure bpf_get_attach_cookie works properly.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-27-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding test that attaches 50k usdt probes in usdt_multi binary.
After the attach is done we run the binary and make sure we get
proper amount of hits.
With current uprobes:
# perf stat --null ./test_progs -n 254/6
#254/6 uprobe_multi_test/bench_usdt:OK
#254 uprobe_multi_test:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Performance counter stats for './test_progs -n 254/6':
1353.659680562 seconds time elapsed
With uprobe_multi link:
# perf stat --null ./test_progs -n 254/6
#254/6 uprobe_multi_test/bench_usdt:OK
#254 uprobe_multi_test:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Performance counter stats for './test_progs -n 254/6':
0.322046364 seconds time elapsed
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-26-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding code in uprobe_multi test binary that defines 50k usdts
and will serve as attach point for uprobe_multi usdt bench test
in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-25-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding test that attaches 50k uprobes in uprobe_multi binary.
After the attach is done we run the binary and make sure we
get proper amount of hits.
The resulting attach/detach times on my setup:
test_bench_attach_uprobe:PASS:uprobe_multi__open 0 nsec
test_bench_attach_uprobe:PASS:uprobe_multi__attach 0 nsec
test_bench_attach_uprobe:PASS:uprobes_count 0 nsec
test_bench_attach_uprobe: attached in 0.346s
test_bench_attach_uprobe: detached in 0.419s
#262/5 uprobe_multi_test/bench_uprobe:OK
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-24-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding uprobe_multi test program that defines 50k uprobe_multi_func_*
functions and will serve as attach point for uprobe_multi bench test
in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-23-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding uprobe_multi test for bpf_link_create attach function.
Testing attachment using the struct bpf_link_create_opts.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-22-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding uprobe_multi test for bpf_program__attach_uprobe_multi
attach function.
Testing attachment using glob patterns and via bpf_uprobe_multi_opts
paths/syms fields.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-21-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding uprobe_multi test for skeleton load/attach functions,
to test skeleton auto attach for uprobe_multi link.
Test that bpf_get_func_ip works properly for uprobe_multi
attachment.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-20-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
We'd like to have single copy of get_time_ns used b bench and test_progs,
but we can't just include bench.h, because of conflicting 'struct env'
objects.
Moving get_time_ns to testing_helpers.h which is being included by both
bench and test_progs objects.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-19-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding support for usdt_manager_attach_usdt to use uprobe_multi
link to attach to usdt probes.
The uprobe_multi support is detected before the usdt program is
loaded and its expected_attach_type is set accordingly.
If uprobe_multi support is detected the usdt_manager_attach_usdt
gathers uprobes info and calls bpf_program__attach_uprobe to
create all needed uprobes.
If uprobe_multi support is not detected the old behaviour stays.
Also adding usdt.s program section for sleepable usdt probes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-18-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding uprobe-multi link detection. It will be used later in
bpf_program__attach_usdt function to check and use uprobe_multi
link over standard uprobe links.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-17-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding support for several uprobe_multi program sections
to allow auto attach of multi_uprobe programs.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-16-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding bpf_program__attach_uprobe_multi function that
allows to attach multiple uprobes with uprobe_multi link.
The user can specify uprobes with direct arguments:
binary_path/func_pattern/pid
or with struct bpf_uprobe_multi_opts opts argument fields:
const char **syms;
const unsigned long *offsets;
const unsigned long *ref_ctr_offsets;
const __u64 *cookies;
User can specify 2 mutually exclusive set of inputs:
1) use only path/func_pattern/pid arguments
2) use path/pid with allowed combinations of:
syms/offsets/ref_ctr_offsets/cookies/cnt
- syms and offsets are mutually exclusive
- ref_ctr_offsets and cookies are optional
Any other usage results in error.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-15-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding new uprobe_multi struct to bpf_link_create_opts object
to pass multiple uprobe data to link_create attr uapi.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-14-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding elf_resolve_pattern_offsets function that looks up
offsets for symbols specified by pattern argument.
The 'pattern' argument allows wildcards (*?' supported).
Offsets are returned in allocated array together with its
size and needs to be released by the caller.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-13-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding elf_resolve_syms_offsets function that looks up
offsets for symbols specified in syms array argument.
Offsets are returned in allocated array with the 'cnt' size,
that needs to be released by the caller.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-12-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding elf symbol iterator object (and some functions) that follow
open-coded iterator pattern and some functions to ease up iterating
elf object symbols.
The idea is to iterate single symbol section with:
struct elf_sym_iter iter;
struct elf_sym *sym;
if (elf_sym_iter_new(&iter, elf, binary_path, SHT_DYNSYM))
goto error;
while ((sym = elf_sym_iter_next(&iter))) {
...
}
I considered opening the elf inside the iterator and iterate all symbol
sections, but then it gets more complicated wrt user checks for when
the next section is processed.
Plus side is the we don't need 'exit' function, because caller/user is
in charge of that.
The returned iterated symbol object from elf_sym_iter_next function
is placed inside the struct elf_sym_iter, so no extra allocation or
argument is needed.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-11-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding elf_open/elf_close functions and using it in
elf_find_func_offset_from_file function. It will be
used in following changes to save some common code.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-10-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding new elf object that will contain elf related functions.
There's no functional change.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-9-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding new uprobe_multi attach type and link names,
so the functions can resolve the new values.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-8-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding support to specify pid for uprobe_multi link and the uprobes
are created only for task with given pid value.
Using the consumer.filter filter callback for that, so the task gets
filtered during the uprobe installation.
We still need to check the task during runtime in the uprobe handler,
because the handler could get executed if there's another system
wide consumer on the same uprobe (thanks Oleg for the insight).
Cc: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230809083440.3209381-6-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding support to specify cookies array for uprobe_multi link.
The cookies array share indexes and length with other uprobe_multi
arrays (offsets/ref_ctr_offsets).
The cookies[i] value defines cookie for i-the uprobe and will be
returned by bpf_get_attach_cookie helper when called from ebpf
program hooked to that specific uprobe.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230809083440.3209381-5-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding new multi uprobe link that allows to attach bpf program
to multiple uprobes.
Uprobes to attach are specified via new link_create uprobe_multi
union:
struct {
__aligned_u64 path;
__aligned_u64 offsets;
__aligned_u64 ref_ctr_offsets;
__u32 cnt;
__u32 flags;
} uprobe_multi;
Uprobes are defined for single binary specified in path and multiple
calling sites specified in offsets array with optional reference
counters specified in ref_ctr_offsets array. All specified arrays
have length of 'cnt'.
The 'flags' supports single bit for now that marks the uprobe as
return probe.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230809083440.3209381-4-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Switching BPF_F_KPROBE_MULTI_RETURN macro to anonymous enum,
so it'd show up in vmlinux.h. There's not functional change
compared to having this as macro.
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230809083440.3209381-2-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Let's test whether merging and unmerging in PROT_NONE areas works as
expected.
Pass a page protection to mmap_and_merge_range(), which will trigger
an mprotect() after writing to the pages, but before enabling merging.
Make sure that unsharing works as expected, by performing a ptrace write
(using /proc/self/mem) and by setting MADV_UNMERGEABLE.
Note that this implicitly tests that ptrace writes in an inaccessible
(PROT_NONE) mapping work as expected.
[david@redhat.com: use sizeof(i) in test_prot_none(), per Peter]
Link: https://lkml.kernel.org/r/e9cdb144-70c7-6596-2377-e675635c94e0@redhat.com
Link: https://lkml.kernel.org/r/20230803143208.383663-8-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: liubo <liubo254@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Let's extend mmap_and_merge_range() to test if anything in the current
process was merged. range_maps_duplicates() is too unreliable for that
use case, so instead look at KSM stats.
Trigger a complete unmerge first, to cleanup the stable tree and
stabilize accounting of merged pages.
Note that we're using /proc/self/ksm_merging_pages instead of
/proc/self/ksm_stat, because that one is available in more existing
kernels.
If /proc/self/ksm_merging_pages can't be opened, we can't perform any
checks and simply skip them.
We have to special-case the shared zeropage for now. But the only user
-- test_unmerge_zero_pages() -- performs its own merge checks.
Link: https://lkml.kernel.org/r/20230803143208.383663-7-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: liubo <liubo254@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
There is only one Kconfig user of CONFIG_EMBEDDED and it can be switched
to EXPERT or "if !ARCH_MULTIPLATFORM" (suggested by Arnd).
Link: https://lkml.kernel.org/r/20230816055010.31534-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com> [RISC-V]
Acked-by: Greg Ungerer <gerg@linux-m68k.org>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Cc: Russell King <linux@armlinux.org.uk>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Extract from current /proc/self/smaps output:
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
THPeligible: 0
ProtectionKey: 0
That's not the alignment shown in Documentation/filesystems/proc.rst: it's
an ugly artifact from missing out the %8 other fields are using; but
there's even one selftest which expects it to look that way. Hoping no
other smaps parsers depend on THPeligible to look so ugly, fix these.
Link: https://lkml.kernel.org/r/cfb81f7a-f448-5bc2-b0e1-8136fcd1dd8c@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This adds proper tests for the nesting functionality of vm.memfd_noexec as
well as some minor cleanups to spawn_*_thread().
Link: https://lkml.kernel.org/r/20230814-memfd-vm-noexec-uapi-fixes-v2-5-7ff9e3e10ba6@cyphar.com
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Daniel Verkamp <dverkamp@chromium.org>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Jeff Xu <jeffxu@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Given the difficulty of auditing all of userspace to figure out whether
every memfd_create() user has switched to passing MFD_EXEC and
MFD_NOEXEC_SEAL flags, it seems far less distruptive to make it possible
for older programs that don't make use of executable memfds to run under
vm.memfd_noexec=2. Otherwise, a small dependency change can result in
spurious errors. For programs that don't use executable memfds, passing
MFD_NOEXEC_SEAL is functionally a no-op and thus having the same
In addition, every failure under vm.memfd_noexec=2 needs to print to the
kernel log so that userspace can figure out where the error came from.
The concerns about pr_warn_ratelimited() spam that caused the switch to
pr_warn_once()[1,2] do not apply to the vm.memfd_noexec=2 case.
This is a user-visible API change, but as it allows programs to do
something that would be blocked before, and the sysctl itself was broken
and recently released, it seems unlikely this will cause any issues.
[1]: https://lore.kernel.org/Y5yS8wCnuYGLHMj4@x1n/
[2]: https://lore.kernel.org/202212161233.85C9783FB@keescook/
Link: https://lkml.kernel.org/r/20230814-memfd-vm-noexec-uapi-fixes-v2-2-7ff9e3e10ba6@cyphar.com
Fixes: 105ff5339f ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Daniel Verkamp <dverkamp@chromium.org>
Cc: Jeff Xu <jeffxu@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "memfd: cleanups for vm.memfd_noexec", v2.
The most critical issue with vm.memfd_noexec=2 (the fact that passing
MFD_EXEC would bypass it entirely[1]) has been fixed in Andrew's
tree[2], but there are still some outstanding issues that need to be
addressed:
* vm.memfd_noexec=2 shouldn't reject old-style memfd_create(2) syscalls
because it will make it far to difficult to ever migrate. Instead it
should imply MFD_EXEC.
* The dmesg warnings are pr_warn_once(), which on most systems means
that they will be used up by systemd or some other boot process and
userspace developers will never see it.
- For the !(flags & (MFD_EXEC | MFD_NOEXEC_SEAL)) case, outputting a
rate-limited message to the kernel log is necessary to tell
userspace that they should add the new flags.
Arguably the most ideal way to deal with the spam concern[3,4]
while still prompting userspace to switch to the new flags would be
to only log the warning once per task or something similar.
However, adding something to task_struct for tracking this would be
needless bloat for a single pr_warn_ratelimited().
So just switch to pr_info_ratelimited() to avoid spamming the log
with something that isn't a real warning. There's lots of
info-level stuff in dmesg, it seems really unlikely that this
should be an actual problem. Most programs are already switching to
the new flags anyway.
- For the vm.memfd_noexec=2 case, we need to log a warning for every
failure because otherwise userspace will have no idea why their
previously working program started returning -EACCES (previously
-EINVAL) from memfd_create(2). pr_warn_once() is simply wrong here.
* The racheting mechanism for vm.memfd_noexec makes it incredibly
unappealing for most users to enable the sysctl because enabling it
on &init_pid_ns means you need a system reboot to unset it. Given the
actual security threat being protected against, CAP_SYS_ADMIN users
being restricted in this way makes little sense.
The argument for this ratcheting by the original author was that it
allows you to have a hierarchical setting that cannot be unset by
child pidnses, but this is not accurate -- changing the parent
pidns's vm.memfd_noexec setting to be more restrictive didn't affect
children.
Instead, switch the vm.memfd_noexec sysctl to be properly
hierarchical and allow CAP_SYS_ADMIN users (in the pidns's owning
userns) to lower the setting as long as it is not lower than the
parent's effective setting. This change also makes it so that
changing a parent pidns's vm.memfd_noexec will affect all
descendants, providing a properly hierarchical setting. The
performance impact of this is incredibly minimal since the maximum
depth of pidns is 32 and it is only checked during memfd_create(2)
and unshare(CLONE_NEWPID).
* The memfd selftests would not exit with a non-zero error code when
certain tests that ran in a forked process (specifically the ones
related to MFD_EXEC and MFD_NOEXEC_SEAL) failed.
[1]: https://lore.kernel.org/all/ZJwcsU0vI-nzgOB_@codewreck.org/
[2]: https://lore.kernel.org/all/20230705063315.3680666-1-jeffxu@google.com/
[3]: https://lore.kernel.org/Y5yS8wCnuYGLHMj4@x1n/
[4]: https://lore.kernel.org/f185bb42-b29c-977e-312e-3349eea15383@linuxfoundation.org/
This patch (of 5):
Before this change, a test runner using this self test would see a return
code of 0 when the tests using a child process (namely the MFD_NOEXEC_SEAL
and MFD_EXEC tests) failed, masking test failures.
Link: https://lkml.kernel.org/r/20230814-memfd-vm-noexec-uapi-fixes-v2-0-7ff9e3e10ba6@cyphar.com
Link: https://lkml.kernel.org/r/20230814-memfd-vm-noexec-uapi-fixes-v2-1-7ff9e3e10ba6@cyphar.com
Fixes: 11f75a0144 ("selftests/memfd: add tests for MFD_NOEXEC_SEAL MFD_EXEC")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Reviewed-by: Jeff Xu <jeffxu@google.com>
Cc: "Christian Brauner (Microsoft)" <brauner@kernel.org>
Cc: Daniel Verkamp <dverkamp@chromium.org>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
commit 686a8bb72349("selftests/mm: split uffd tests into uffd-stress and
uffd-unit-tests") split uffd tests into uffd-stress and uffd-unit-tests,
obviously we need to modify the help information synchronously.
Also modify code indentation.
Link: https://lkml.kernel.org/r/tencent_64FC724AC5F05568F41BD1C68058E83CEB05@qq.com
Signed-off-by: Rong Tao <rongtao@cestc.cn>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Test existence of files and validity of input keyword for DAMON monitoring
target based DAMOS filter on DAMON sysfs interface.
Link: https://lkml.kernel.org/r/20230802214312.110532-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a selftest for checking existence of addr_{start,end} files under
DAMOS filter directory, and 'addr' damos filter type input of DAMON sysfs
interface.
Link: https://lkml.kernel.org/r/20230802214312.110532-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Update sysfs.sh DAMON selftest for checking existence of 'total_bytes'
file under the 'tried_regions' directory of DAMON sysfs interface.
Link: https://lkml.kernel.org/r/20230802213222.109841-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add KSM_MERGE_TIME and KSM_MERGE_TIME_HUGE_PAGES tests with
size of 100.
./run_vmtests.sh -t ksm
-----------------------------
running ./ksm_tests -H -s 100
-----------------------------
Number of normal pages: 0
Number of huge pages: 50
Total size: 100 MiB
Total time: 0.399844662 s
Average speed: 250.097 MiB/s
[PASS]
-----------------------------
running ./ksm_tests -P -s 100
-----------------------------
Total size: 100 MiB
Total time: 0.451931496 s
Average speed: 221.272 MiB/s
[PASS]
Link: https://lkml.kernel.org/r/20230728164102.4655-1-ayush.jain3@amd.com
Signed-off-by: Ayush Jain <ayush.jain3@amd.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Stefan Roesch <shr@devkernel.io>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
After commit 2c2241081f ("mm/gup: move private gup FOLL_ flags to
internal.h") FOLL_LONGTERM flag value got updated from 0x10000 to 0x100 at
include/linux/mm_types.h.
As hmm.hmm_device_private.hmm_gup_test uses FOLL_LONGTERM Updating same
here as well.
Before this change test goes in an infinite assert loop in
hmm.hmm_device_private.hmm_gup_test
==========================================================
RUN hmm.hmm_device_private.hmm_gup_test ...
hmm-tests.c:1962:hmm_gup_test:Expected HMM_DMIRROR_PROT_WRITE..
..(2) == m[2] (34)
hmm-tests.c:157:hmm_gup_test:Expected ret (-1) == 0 (0)
hmm-tests.c:157:hmm_gup_test:Expected ret (-1) == 0 (0)
...
==========================================================
Call Trace:
<TASK>
? sched_clock+0xd/0x20
? __lock_acquire.constprop.0+0x120/0x6c0
? ktime_get+0x2c/0xd0
? sched_clock+0xd/0x20
? local_clock+0x12/0xd0
? lock_release+0x26e/0x3b0
pin_user_pages_fast+0x4c/0x70
gup_test_ioctl+0x4ff/0xbb0
? gup_test_ioctl+0x68c/0xbb0
__x64_sys_ioctl+0x99/0xd0
do_syscall_64+0x60/0x90
? syscall_exit_to_user_mode+0x2a/0x50
? do_syscall_64+0x6d/0x90
? syscall_exit_to_user_mode+0x2a/0x50
? do_syscall_64+0x6d/0x90
? irqentry_exit_to_user_mode+0xd/0x20
? irqentry_exit+0x3f/0x50
? exc_page_fault+0x96/0x200
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f6aaa31aaff
After this change test is able to pass successfully.
Link: https://lkml.kernel.org/r/20230808124347.79163-1-ayush.jain3@amd.com
Fixes: 2c2241081f ("mm/gup: move private gup FOLL_ flags to internal.h")
Signed-off-by: Ayush Jain <ayush.jain3@amd.com>
Reviewed-by: Raghavendra K T <raghavendra.kt@amd.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
test_kmem_basic creates 100,000 negative dentries, with each one mapping
to a slab object. After memory.high is set, these are reclaimed through
the shrink_slab function call which reclaims all 100,000 entries. The
test passes the majority of the time because when slab1 or current is
calculated, it is often above 0, however, 0 is also an acceptable value.
Link: https://lkml.kernel.org/r/7d6gcuyzdjcice6qbphrmpmv5skr5jtglg375unnjxqhstvhxc@qkn6dw6bao6v
Signed-off-by: Lucas Karpinski <lkarpins@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a test case for IPv6 source address delete.
As David suggested, add tests:
- Single device using src address
- Two devices with the same source address
- VRF with single device using src address
- VRF with two devices using src address
As Ido points out, in IPv6, the preferred source address is looked up in
the same VRF as the first nexthop device. This will give us similar results
to IPv4 if the route is installed in the same VRF as the nexthop device, but
not when the nexthop device is enslaved to a different VRF. So add tests:
- src address and nexthop dev in same VR
- src address and nexthop device in different VRF
The link local address delete logic is different from the global address.
It should only affect the associate device it bonds to. So add tests cases
for link local address testing.
Here is the test result:
IPv6 delete address route tests
Single device using src address
TEST: Prefsrc removed when src address removed on other device [ OK ]
Two devices with the same source address
TEST: Prefsrc not removed when src address exist on other device [ OK ]
TEST: Prefsrc removed when src address removed on all devices [ OK ]
VRF with single device using src address
TEST: Prefsrc removed when src address removed on other device [ OK ]
VRF with two devices using src address
TEST: Prefsrc not removed when src address exist on other device [ OK ]
TEST: Prefsrc removed when src address removed on all devices [ OK ]
src address and nexthop dev in same VRF
TEST: Prefsrc removed from VRF when source address deleted [ OK ]
TEST: Prefsrc in default VRF not removed [ OK ]
TEST: Prefsrc not removed from VRF when source address exist [ OK ]
TEST: Prefsrc in default VRF removed [ OK ]
src address and nexthop device in different VRF
TEST: Prefsrc not removed from VRF when nexthop dev in diff VRF [ OK ]
TEST: Prefsrc not removed in default VRF [ OK ]
TEST: Prefsrc removed from VRF when nexthop dev in diff VRF [ OK ]
TEST: Prefsrc removed in default VRF [ OK ]
Table ID 0
TEST: Prefsrc removed from default VRF when source address deleted [ OK ]
Link local source route
TEST: Prefsrc not removed when delete ll addr from other dev [ OK ]
TEST: Prefsrc removed when delete ll addr [ OK ]
TEST: Prefsrc not removed when delete ll addr from other dev [ OK ]
TEST: Prefsrc removed even ll addr still exist on other dev [ OK ]
Tests passed: 19
Tests failed: 0
Suggested-by: Ido Schimmel <idosch@idosch.org>
Suggested-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the initial commit 1a01727676 ("selftests: Add VRF route leaking
tests") said, the IPv6 MTU test fails as source address selection
picking ::1. Every time we run the selftest this one report failed.
There seems not much meaning to keep reporting a failure for 3 years
that no one plan to fix/update. Let't just skip this one first. We can
add it back when the issue fixed.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update .gitignore to untrack tools directory and log.txt. "tools" is
generated in "selftests/net/Makefile" and log.txt is generated in
"selftests/net/gro.sh" when executing run_all_tests.
Signed-off-by: Anh Tuan Phan <tuananhlfc@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently any error during render leads to output an empty file.
That is quite annoying when using tools/net/ynl/ynl-regen.sh
which git greps files with content of "YNL-GEN.." and therefore ignores
empty files. So once you fail to regen, you have to checkout the file.
Avoid that by rendering to a temporary file first, only at the end
copy the content to the actual destination.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
as latter clobbers flags which interferes with fastop emulation in
KVM, leading to guests freezing during boot
- A fix for the DIV(0) quotient data leak on Zen1 to clear the divider
buffers at the right time
- Disable the SRSO mitigation on unaffected configurations as it got
enabled there unnecessarily
- Change .text section name to fix CONFIG_LTO_CLANG builds
- Improve the optprobe indirect jmp check so that certain configurations
can still be able to use optprobes at all
- A serious and good scrubbing of the untraining routines by PeterZ:
- Add proper speculation stopping traps so that objtool is happy
- Adjust objtool to handle the new thunks
- Make the thunk pointer assignable to the different untraining
sequences at runtime, thus avoiding the alternative at the return
thunk. It simplifies the code a bit too.
- Add a entry_untrain_ret() main entry point which selects the
respective untraining sequence
- Rename things so that they're more clear
- Fix stack validation with FRAME_POINTER=y builds
- Fix static call patching to handle when a JMP to the return thunk is
the last insn on the very last module memory page
- Add more documentation about what each untraining routine does and
why
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmTge4wACgkQEsHwGGHe
VUpgwRAAgP1dAq4c/DuLQh/+Mao/pM+EiNxwoDTNJ27ZoRfXG5vLXF3++TRkmFKB
ua+jEhkNTAH1xyF+um4exjUD2UC62UfNo4wBZPjl+jVmguHqpsNOsZj7M3+GRD+3
vRWspaOnNPKOIVdtvftaS6J3YavFUolwZSRC9HCFQiriX5zV4BlMZEJxkWw6LNW6
LeJt4qmbDXCIzmCRqEmtNBOhuWuMvhwWg9G1Aq4MLcHf+gHSEGNnY8Otl7YPPeqr
ys9vE5hQ3NiUmBkGnhw+Mj3gGFCL2fzWF0XqY8VCTPcYTVRFen7BmelhJVm7RhAr
wpXdyCU+bV4qrn2uRpBSbzH/DfxfQA2xbRtBR+L7x5ZbHamFyi17fN94AQv2WUXz
7TUdooWPuJLPQ2CHAgSChTEF/CZBl6pYHEorHkzA1GqV0omMT7hg8GEHn17JGI5v
FDPGYHuznsu59DhGNh7Wx4hLO10slvkSHly+se7eCaDr1hDIpJtiZLxn6n+SphZh
qzYc+Pxa3UcgNSxqqfOBqDWQQNdoYqx1ONao8nWgjj+/y0eIEf27uqIDT/o5tb7E
YejDq7xO00CartGm2g/0S0OvDvRTWbU0LoGMKNxo/HTD+goM8pa7vdE77g5NNSCy
wG0BnFWni53p66JJzzxxgPG39OYu9NR6ilcOTYT9jlPT3ZMySYg=
=ndko
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v6.5_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
"Extraordinary embargoed times call for extraordinary measures. That's
why this week's x86/urgent branch is larger than usual, containing all
the known fallout fixes after the SRSO mitigation got merged.
I know, it is a bit late in the game but everyone who has reported a
bug stemming from the SRSO pile, has tested that branch and has
confirmed that it fixes their bug.
Also, I've run it on every possible hardware I have and it is looking
good. It is running on this very machine while I'm typing, for 2 days
now without an issue. Famous last words...
- Use LEA ...%rsp instead of ADD %rsp in the Zen1/2 SRSO return
sequence as latter clobbers flags which interferes with fastop
emulation in KVM, leading to guests freezing during boot
- A fix for the DIV(0) quotient data leak on Zen1 to clear the
divider buffers at the right time
- Disable the SRSO mitigation on unaffected configurations as it got
enabled there unnecessarily
- Change .text section name to fix CONFIG_LTO_CLANG builds
- Improve the optprobe indirect jmp check so that certain
configurations can still be able to use optprobes at all
- A serious and good scrubbing of the untraining routines by PeterZ:
- Add proper speculation stopping traps so that objtool is happy
- Adjust objtool to handle the new thunks
- Make the thunk pointer assignable to the different untraining
sequences at runtime, thus avoiding the alternative at the
return thunk. It simplifies the code a bit too.
- Add a entry_untrain_ret() main entry point which selects the
respective untraining sequence
- Rename things so that they're more clear
- Fix stack validation with FRAME_POINTER=y builds
- Fix static call patching to handle when a JMP to the return thunk
is the last insn on the very last module memory page
- Add more documentation about what each untraining routine does and
why"
* tag 'x86_urgent_for_v6.5_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/srso: Correct the mitigation status when SMT is disabled
x86/static_call: Fix __static_call_fixup()
objtool/x86: Fixup frame-pointer vs rethunk
x86/srso: Explain the untraining sequences a bit more
x86/cpu/kvm: Provide UNTRAIN_RET_VM
x86/cpu: Cleanup the untrain mess
x86/cpu: Rename srso_(.*)_alias to srso_alias_\1
x86/cpu: Rename original retbleed methods
x86/cpu: Clean up SRSO return thunk mess
x86/alternative: Make custom return thunk unconditional
objtool/x86: Fix SRSO mess
x86/cpu: Fix up srso_safe_ret() and __x86_return_thunk()
x86/cpu: Fix __x86_return_thunk symbol type
x86/retpoline,kprobes: Skip optprobe check for indirect jumps with retpolines and IBT
x86/retpoline,kprobes: Fix position of thunk sections with CONFIG_LTO_CLANG
x86/srso: Disable the mitigation on unaffected configurations
x86/CPU/AMD: Fix the DIV(0) initial fix attempt
x86/retpoline: Don't clobber RFLAGS during srso_safe_ret()
Remove assumptions about shared buffer cell size and instead query the
cell size from devlink. Adjust the test to send small packets that fit
inside a single cell.
Tested on Spectrum-{1,2,3,4}.
Fixes: 4735402173 ("mlxsw: spectrum: Extend to support Spectrum-4 ASIC")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/f7dfbf3c4d1cb23838d9eb99bab09afaa320c4ca.1692268427.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/sfc/tc.c
fa165e1949 ("sfc: don't unregister flow_indr if it was never registered")
3bf969e88a ("sfc: add MAE table machinery for conntrack table")
https://lore.kernel.org/all/20230818112159.7430e9b4@canb.auug.org.au/
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When building the kernel and selftest with clang compiler (llvm17 or llvm18),
I hit the following compilation failure:
In file included from progs/test_lwt_redirect.c:3:
In file included from /usr/include/linux/ip.h:21:
In file included from /usr/include/asm/byteorder.h:5:
In file included from /usr/include/linux/byteorder/little_endian.h:13:
/usr/include/linux/swab.h:136:8: error: unknown type name '__always_inline'
136 | static __always_inline unsigned long __swab(const unsigned long y)
| ^
/usr/include/linux/swab.h:171:8: error: unknown type name '__always_inline'
171 | static __always_inline __u16 __swab16p(const __u16 *p)
...
bpf_helpers.h file provided a definition for __always_inline.
Putting 'ip.h' after 'bpf_helpers.h' fixed the issue.
Fixes: 43a7c3ef8a ("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT")
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230818174312.1883381-1-yonghong.song@linux.dev
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
This test is arch specific, requires "munmap everything" primitive.
Link: https://lkml.kernel.org/r/20230630183434.17434-2-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Unmap everything starting from 4GB length until it unmaps, otherwise test
has to detect which virtual memory split kernel is using.
Link: https://lkml.kernel.org/r/20230630183434.17434-1-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Since the mas_preallocate() calculation has been updated to be more
precise, the testing must also be updated to check for what is expected.
Link: https://lkml.kernel.org/r/20230724183157.3939892-13-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The current preallocation strategy is to preallocate the absolute
worst-case allocation for a tree modification. The entry (or NULL) is
needed to know how many nodes are needed to write to the tree. Start by
adding the argument to the mas_preallocate() definition.
Link: https://lkml.kernel.org/r/20230724183157.3939892-8-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
It is very unclear to me how one is supposed to run all the mm selftests
consistently and get clear results.
Most of the test programs are launched by both run_vmtests.sh and
run_kselftest.sh:
hugepage-mmap
hugepage-shm
map_hugetlb
hugepage-mremap
hugepage-vmemmap
hugetlb-madvise
map_fixed_noreplace
gup_test
gup_longterm
uffd-unit-tests
uffd-stress
compaction_test
on-fault-limit
map_populate
mlock-random-test
mlock2-tests
mrelease_test
mremap_test
thuge-gen
virtual_address_range
va_high_addr_switch
mremap_dontunmap
hmm-tests
madv_populate
memfd_secret
ksm_tests
ksm_functional_tests
soft-dirty
cow
However, of this set, when launched by run_vmtests.sh, some of the
programs are invoked multiple times with different arguments. When
invoked by run_kselftest.sh, they are invoked without arguments (and as
a consequence, some fail immediately).
Some test programs are only launched by run_vmtests.sh:
test_vmalloc.sh
And some test programs and only launched by run_kselftest.sh:
khugepaged
migration
mkdirty
transhuge-stress
split_huge_page_test
mdwe_test
write_to_hugetlbfs
Furthermore, run_vmtests.sh is invoked by run_kselftest.sh, so in this
case all the test programs invoked by both scripts are run twice!
Needless to say, this is a bit of a mess. In the absence of fully
understanding the history here, it looks to me like the best solution is
to launch ALL test programs from run_vmtests.sh, and ONLY invoke
run_vmtests.sh from run_kselftest.sh. This way, we get full control over
the parameters, each program is only invoked the intended number of
times, and regardless of which script is used, the same tests get run in
the same way.
The only drawback is that if using run_kselftest.sh, it's top-level tap
result reporting reports only a single test and it fails if any of the
contained tests fail. I don't see this as a big deal though since we
still see all the nested reporting from multiple layers. The other issue
with this is that all of run_vmtests.sh must execute within a single
kselftest timeout period, so let's increase that to something more
suitable.
In the Makefile, TEST_GEN_PROGS will compile and install the tests and
will add them to the list of tests that run_kselftest.sh will run.
TEST_GEN_FILES will compile and install the tests but will not add them
to the test list. So let's move all the programs from TEST_GEN_PROGS to
TEST_GEN_FILES so that they are built but not executed by
run_kselftest.sh. Note that run_vmtests.sh is added to TEST_PROGS, which
means it ends up in the test list. (the lack of "_GEN" means it won't be
compiled, but simply copied).
Link: https://lkml.kernel.org/r/20230724082522.1202616-9-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Until now, transhuge-stress runs until its explicitly killed, so when
invoked by run_kselftest.sh, it would run until the test timeout, then it
would be killed and the test would be marked as failed.
Add a new, optional command line parameter that allows the user to specify
the duration in seconds that the program should run. The program exits
after this duration with a success (0) exit code. If the argument is
omitted the old behacvior remains.
On it's own, this doesn't quite solve our problem because run_kselftest.sh
does not allow passing parameters to the program under test. But we will
shortly move this to run_vmtests.sh, which does allow parameter passing.
Link: https://lkml.kernel.org/r/20230724082522.1202616-8-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The `migration` test currently has a number of robustness problems that
cause it to hang and leak resources.
Timeout: There are 3 tests, which each previously ran for 60 seconds.
However, the timeout in mm/settings for a single test binary was set to 45
seconds. So when run using run_kselftest.sh, the top level timeout would
trigger before the test binary was finished. Solve this by meeting in the
middle; each of the 3 tests now runs for 20 seconds (for a total of 60),
and the top level timeout is set to 90 seconds.
Leaking child processes: the `shared_anon` test fork()s some children but
then an ASSERT() fires before the test kills those children. The assert
causes immediate exit of the parent and leaking of the children.
Furthermore, if run using the run_kselftest.sh wrapper, the wrapper would
get stuck waiting for those children to exit, which never happens. Solve
this by setting the "parent death signal" to SIGHUP in the child, so that
the child is killed automatically if the parent dies.
With these changes, the test binary now runs to completion on arm64, with
2 tests passing and the `shared_anon` test failing.
Link: https://lkml.kernel.org/r/20230724082522.1202616-7-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
va_high_addr_switch has a mechanism to determine if the tests should be
run or skipped (supported_arch()). This currently returns unconditionally
true for arm64. However, va_high_addr_switch also requires a large
virtual address space for the tests to run, otherwise they spuriously
fail.
Since arm64 can only support VA > 48 bits when the page size is 64K, let's
decide whether we should skip the test suite based on the page size. This
reduces noise when running on 4K and 16K kernels.
Link: https://lkml.kernel.org/r/20230724082522.1202616-6-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
thuge-gen was previously only munmapping part of the mmapped buffer, which
caused us to run out of 1G huge pages for a later part of the test. Fix
this by munmapping the whole buffer. Based on the code, it looks like a
typo rather than an intention to keep some of the buffer mapped.
thuge-gen was also calling mmap with SHM_HUGETLB flag (bit 11 set), which
is actually MAP_DENYWRITE in mmap context. The man page says this flag is
ignored in modern kernels. I'm pretty sure from the context that the
author intended to pass the MAP_HUGETLB flag so I've fixed that up too.
Link: https://lkml.kernel.org/r/20230724082522.1202616-5-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mrelease_test defaults to defining __NR_pidfd_open and
__NR_process_mrelease syscall numbers to -1, if they are not defined
anywhere else, and the suite would then be marked as skipped as a result.
arm64 (at least the stock debian toolchain that I'm using) requires
including <sys/syscall.h> to pull in the defines for these syscalls. So
let's add this header. With this in place, the test is passing on arm64.
Link: https://lkml.kernel.org/r/20230724082522.1202616-4-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
arm64 does not support the soft-dirty PTE bit. However, the `soft-dirty`
test suite is currently run unconditionally and therefore generates
spurious test failures on arm64. There are also some tests in
`madv_populate` which assume it is supported.
For `soft-dirty` lets disable the whole suite for arm64; it is no longer
built and run_vmtests.sh will skip it if its not present.
For `madv_populate`, we need a runtime mechanism so that the remaining
tests continue to be run. Unfortunately, the only way to determine if the
soft-dirty dirty bit is supported is to write to a page, then see if the
bit is set in /proc/self/pagemap. But the tests that we want to
conditionally execute are testing precicesly this. So if we introduced
this feature check, we could accedentally turn a real failure (on a system
that claims to support soft-dirty) into a skip. So instead, do the check
based on architecture; for arm64, we report that soft-dirty is not
supported.
Link: https://lkml.kernel.org/r/20230724082522.1202616-3-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "selftests/mm fixes for arm64", v3.
Given my on-going work on large anon folios and contpte mappings, I
decided it would be a good idea to start running mm selftests to help
guard against regressions. However, it soon became clear that I
couldn't get the suite to run cleanly on arm64 with a vanilla v6.5-rc1
kernel (perhaps I'm just doing it wrong??), so got stuck in a rabbit
hole trying to debug and fix all the issues. Some were down to
misconfigurations, but I also found a number of issues with the tests
and even a couple of issues with the kernel.
This patch (of 8):
The selftests runner pipes the test program's stdout to tap_prefix. The
presence of the pipe means that the test program sets its stdout to be
fully buffered (as aposed to line buffered when directly connected to the
terminal). The block buffering means that there is often content in the
buffer at fork() time, which causes the output to end up duplicated. This
was causing problems for mm:cow where test results were duplicated 20-30x.
Solve this by using `stdbuf`, when available to force the test program to
use line buffered mode. This means previously printf'ed results are
flushed out of the program before any fork().
Additionally, explicitly set line buffer mode in ksft_print_header(),
which means that all test programs that use the ksft framework will
benefit even if stdbuf is not present on the system.
[ryan.roberts@arm.com: add setvbuf() to set buffering mode]
Link: https://lkml.kernel.org/r/20230726070655.2713530-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20230724082522.1202616-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20230724082522.1202616-2-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add tests for the improvement made to read operation on HWPOISON
hugetlb page with different read granularities. For each chunk size,
three read scenarios are tested:
1. Simple regression test on read without HWPOISON.
2. Sequential read page by page should succeed until encounters the 1st
raw HWPOISON subpage.
3. After skip a raw HWPOISON subpage by lseek, read()s always succeed.
Link: https://lkml.kernel.org/r/20230713001833.3778937-5-jiaqiyan@google.com
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The test is pretty basic, and exercises UFFDIO_POISON straightforwardly.
We register a region with userfaultfd, in missing fault mode. For each
fault, we either UFFDIO_COPY a zeroed page (odd pages) or UFFDIO_POISON
(even pages). We do this mix to test "something like a real use case",
where guest memory would be some mix of poisoned and non-poisoned pages.
We read each page in the region, and assert that the odd pages are zeroed
as expected, and the even pages yield a SIGBUS as expected.
Why UFFDIO_COPY instead of UFFDIO_ZEROPAGE? Because hugetlb doesn't
support UFFDIO_ZEROPAGE, and we don't want to have special case code.
Link: https://lkml.kernel.org/r/20230707215540.2324998-9-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Cc: Jiaqi Yan <jiaqiyan@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nadav Amit <namit@vmware.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: T.J. Alumbaugh <talumbau@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: ZhangPeng <zhangpeng362@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Previously, we had "one fault handler to rule them all", which used
several branches to deal with all of the scenarios required by all of the
various tests.
In upcoming patches, I plan to add a new test, which has its own slightly
different fault handling logic. Instead of continuing to add cruft to the
existing fault handler, let's allow tests to define custom ones, separate
from other tests.
Link: https://lkml.kernel.org/r/20230707215540.2324998-8-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Cc: Jiaqi Yan <jiaqiyan@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nadav Amit <namit@vmware.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: T.J. Alumbaugh <talumbau@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: ZhangPeng <zhangpeng362@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a function test_unmerge_zero_page() to test the functionality on
unsharing and counting ksm-placed zero pages and counting of this patch
series.
test_unmerge_zero_page() actually contains four subjct test objects:
(1) whether the count of ksm zero pages can update correctly after merging;
(2) whether the count of ksm zero pages can update correctly after
unmerging by madvise(...MADV_UNMERGEABLE);
(3) whether the count of ksm zero pages can update correctly after
unmerging by triggering write fault.
(4) whether ksm zero pages are really unmerged.
Link: https://lkml.kernel.org/r/20230613030947.186089-1-yang.yang29@zte.com.cn
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Xiaokai Ran <ran.xiaokai@zte.com.cn>
Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Xuexin Jiang <jiang.xuexin@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a test to verify that when a memcg hits its limit in zswap, it doesn't
trigger an unwanted writeback that would result in pages not owned by that
memcg to be sent to disk, even if zswap isn't full. This was fixed by
commit 0bdf0efa180a("zswap: do not shrink if cgroup may not zswap").
Link: https://lkml.kernel.org/r/20230621153548.428093-4-cerasuolodomenico@gmail.com
Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a cgroup selftest that verifies memcg charging in zswap. The original
issue was that kmem bypass was applied to pages swapped out to zswap by
kswapd, resulting in zswapped memory not being charged. It was fixed by
commit cd08d80ecdac("mm: correctly charge compressed memory to its
memcg").
Link: https://lkml.kernel.org/r/20230621153548.428093-3-cerasuolodomenico@gmail.com
Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "selftests: cgroup: add zswap test program".
This series adds 2 zswap related selftests that verify known and fixed
issues. A new dedicated test program (test_zswap) is proposed since the
test cases are specific to zswap and hosts specific helpers.
The first patch adds the (empty) test program, while the other 2 add an
actual test function each.
This patch (of 3):
Add empty cgroup-zswap self test scaffold program, test functions to be
added in the next commits.
Link: https://lkml.kernel.org/r/20230621153548.428093-1-cerasuolodomenico@gmail.com
Link: https://lkml.kernel.org/r/20230621153548.428093-2-cerasuolodomenico@gmail.com
Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add test for expanding range in RCU mode. If we use the fast path of the
slot store to expand range in RCU mode, this test will fail.
Link: https://lkml.kernel.org/r/20230628073657.75314-3-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a matrix for testing gup based on the current gup_test. Only run the
matrix when -a is specified because it's a bit slow.
It covers:
- Different types of huge pages: thp, hugetlb, or no huge page
- Permissions: Write / Read-only
- Fast-gup, with/without
- Types of the GUP: pin / gup / longterm pins
- Shared / Private memories
- GUP size: 1 / 512 / random page sizes
Link: https://lkml.kernel.org/r/20230628215310.73782-9-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kirill A . Shutemov <kirill@shutemov.name>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Allows to specify optional tests in run_vmtests.sh, where we can run time
consuming test matrix only when user specified "-a".
Link: https://lkml.kernel.org/r/20230628215310.73782-8-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kirill A . Shutemov <kirill@shutemov.name>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Just one partial revert for a commit from the merge window
that caused annoying behavior when building old kernels on
arm64 hosts.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmTe0Z8ACgkQYKtH/8kJ
UidZGQ/9GFtIHh9QK6XPlAlx3oC6HdFUrOqDwoueNd4sFk9nNAuW+Z/a0YX8DtnQ
/SrJPiCtoPPNCMNlxk7ZDx+ra0Tu+iC7rAvRnogKSV8qB35jNLIqhEde2opCa2Dx
sfmvtP0hr5c+b8UP9IN2U11WvJhv1yjldHtCKAjSf9zMtwPQjQvAIx1kzSlC57kE
zjsKVTB3yXUddhYGR7639RT7/1K25BVcG6kfKIEN/uPm+M/ljk58EJhUhSeEEfBf
NmtlhxWzJBySOTk5mjEVUsTAPrAckPI3Vh4sf0qd4rXe4Q+XX9Jvur9kPOoF8lQF
GO+WXrmooaEir3+1+w4PB9xXgZD26Iz018xdhcigLgEGefieM1JpvwJa8NarK8UK
iLg2+sBvCAJPJQ44E4r1hWf8ZFolEhwzg703/fGIjDPMQnhHNay4i73NyCw4ii2c
F+iCFhea5AKz+d5qQuaZmg7klY/4cBIKDVY/EsO4lmK0187eq3zZKJSMLD21+NLy
kr63OAV8yXbduLsIOjvR3KrbSvMq+9vasEX4zUGlzA+VR1l+MqtwJI13EPyqhubJ
kQQi6PUe/353ECdzwk5IAqMbGP7qhkOuv89+Ff9mPTlKAwDPwdZjbnsz5I5peskB
zVBO366Z0w3o2ENv9i4UEa+TtaCMIbLWdXJxPufwoqPHX3v7nUM=
=tTzL
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-fix-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic regression fix from Arnd Bergmann:
"Just one partial revert for a commit from the merge window that caused
annoying behavior when building old kernels on arm64 hosts"
* tag 'asm-generic-fix-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: partially revert "Unify uapi bitsperlong.h for arm64, riscv and loongarch"
This patch adds selftests that exercise kfunc flavor relocation
functionality added in the previous patch. The actual kfunc defined
in kernel/bpf/helpers.c is:
struct task_struct *bpf_task_acquire(struct task_struct *p)
The following relocation behaviors are checked:
struct task_struct *bpf_task_acquire___one(struct task_struct *name)
* Should succeed despite differing param name
struct task_struct *bpf_task_acquire___two(struct task_struct *p, void *ctx)
* Should fail because there is no two-param bpf_task_acquire
struct task_struct *bpf_task_acquire___three(void *ctx)
* Should fail because, despite vmlinux's bpf_task_acquire having one param,
the types don't match
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230817225353.2570845-2-davemarchevsky@fb.com
The function signature of kfuncs can change at any time due to their
intentional lack of stability guarantees. As kfuncs become more widely
used, BPF program writers will need facilities to support calling
different versions of a kfunc from a single BPF object. Consider this
simplified example based on a real scenario we ran into at Meta:
/* initial kfunc signature */
int some_kfunc(void *ptr)
/* Oops, we need to add some flag to modify behavior. No problem,
change the kfunc. flags = 0 retains original behavior */
int some_kfunc(void *ptr, long flags)
If the initial version of the kfunc is deployed on some portion of the
fleet and the new version on the rest, a fleetwide service that uses
some_kfunc will currently need to load different BPF programs depending
on which some_kfunc is available.
Luckily CO-RE provides a facility to solve a very similar problem,
struct definition changes, by allowing program writers to declare
my_struct___old and my_struct___new, with ___suffix being considered a
'flavor' of the non-suffixed name and being ignored by
bpf_core_type_exists and similar calls.
This patch extends the 'flavor' facility to the kfunc extern
relocation process. BPF program writers can now declare
extern int some_kfunc___old(void *ptr)
extern int some_kfunc___new(void *ptr, int flags)
then test which version of the kfunc exists with bpf_ksym_exists.
Relocation and verifier's dead code elimination will work in concert as
expected, allowing this pattern:
if (bpf_ksym_exists(some_kfunc___old))
some_kfunc___old(ptr);
else
some_kfunc___new(ptr, 0);
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230817225353.2570845-1-davemarchevsky@fb.com
The hwcaps selftest currently relies on the assembler being able to
assemble the crc32w instruction but this is not in the base v8.0 so is not
accepted by the standard GCC configurations used by many distributions.
Switch to manually encoding to fix the build.
Fixes: 09d2e95a04 ("kselftest/arm64: add crc32 feature to hwcap test")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230816-arm64-fix-crc32-build-v1-1-40165c1290f2@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Add a mock_domain_hw_info function and an iommu_test_hw_info data
structure. This allows to test the IOMMU_GET_HW_INFO ioctl passing the
test_reg value for the mock_dev.
Link: https://lore.kernel.org/r/20230818101033.4100-5-yi.l.liu@intel.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
There is no lwt test case for BPF_REROUTE yet. Add test cases for both
normal and abnormal situations. The abnormal situation is set up with an
fq qdisc on the reroute target device. Without proper fixes, overflow
this qdisc queue limit (to trigger a drop) would panic the kernel.
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/62c8ddc1e924269dcf80d2e8af1a1e632cee0b3a.1692326837.git.yan@cloudflare.com
There is no lwt_xmit test case for BPF_REDIRECT yet. Add test cases for
both normal and abnormal situations. For abnormal test cases, devices
are set down or have its carrier set down. Without proper fixes,
BPF_REDIRECT to either ingress or egress of such device would panic the
kernel.
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/96bf435243641939d9c9da329fab29cb45f7df22.1692326837.git.yan@cloudflare.com
No known outstanding regressions.
Fixes to fixes:
- virtio-net: set queues after driver_ok, avoid a potential race
added by recent fix
- Revert "vlan: Fix VLAN 0 memory leak", it may lead to a warning
when VLAN 0 is registered explicitly
- nf_tables:
- fix false-positive lockdep splat in recent fixes
- don't fail inserts if duplicate has expired (fix test failures)
- fix races between garbage collection and netns dismantle
Current release - new code bugs:
- mlx5: Fix mlx5_cmd_update_root_ft() error flow
Previous releases - regressions:
- phy: fix IRQ-based wake-on-lan over hibernate / power off
Previous releases - always broken:
- sock: fix misuse of sk_under_memory_pressure() preventing system
from exiting global TCP memory pressure if a single cgroup is under
pressure
- fix the RTO timer retransmitting skb every 1ms if linear option
is enabled
- af_key: fix sadb_x_filter validation, amment netlink policy
- ipsec: fix slab-use-after-free in decode_session6()
- macb: in ZynqMP resume always configure PS GTR for non-wakeup source
Misc:
- netfilter: set default timeout to 3 secs for sctp shutdown send and
recv state (from 300ms), align with protocol timers
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmTemA4ACgkQMUZtbf5S
IrtCThAAj+t35QM5BgGZowmrx9U4yF+kacDkdPztxlT8a/b+famrTtnZJ8USW+PF
VCk3Eu8JXheuyAOMArHyM84/crS6wim6mzGcXaucusA3981PFzoqdgCLLf9emAJ2
j9vzKrnHBtdd5fj8Exwq70KN4CzXyrzRgqwr2EXBK9lH59HjX0+J7o+trbDxNmFK
RZJE2oDCqf939iRGG3PhJryKYBmrQaMtdonNpSU5PiiRT0HnVYcEtdWcOXK7d53D
onpoaPdawcsqsns5c5Qj01E1OdyM8X54BEGkl/S4FmSw5jF9Bp6btmTcxYYtdb7E
M3CeYROZ0Kt8KcKKje/o1AzdGqWq8Hnxfwy+2WulZAHMucshg0JPm6Ev74WRondw
NGYriKJSdORSO8idK9K/i7pnjZXYr9gU50lpPUFU+QzSdd+zv+U11arjAodwI9Wi
pW+dFi3UR7J01LidaxclvHmWnZ7d5sSzE2khpqb0xd0+PagRGesl8qnKyoDJNS1P
IHsOrRh9aXLzEZjud/rVG+sUobQvc1oiHW+hvbJ04GLKoli9U5poGT2fcaa4O67M
T7JcN5oGDF+PIHJKgTEN7pfX2epY33gmofKUhbt/OPOqnvZOVbTu7/ojjuJZ8Lc5
SF8AvTe+lECcX8Htjq30PoVfai+FT6AhnZzK0H9K4HMfUB9O32Q=
=Ze13
-----END PGP SIGNATURE-----
Merge tag 'net-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from ipsec and netfilter.
No known outstanding regressions.
Fixes to fixes:
- virtio-net: set queues after driver_ok, avoid a potential race
added by recent fix
- Revert "vlan: Fix VLAN 0 memory leak", it may lead to a warning
when VLAN 0 is registered explicitly
- nf_tables:
- fix false-positive lockdep splat in recent fixes
- don't fail inserts if duplicate has expired (fix test failures)
- fix races between garbage collection and netns dismantle
Current release - new code bugs:
- mlx5: Fix mlx5_cmd_update_root_ft() error flow
Previous releases - regressions:
- phy: fix IRQ-based wake-on-lan over hibernate / power off
Previous releases - always broken:
- sock: fix misuse of sk_under_memory_pressure() preventing system
from exiting global TCP memory pressure if a single cgroup is under
pressure
- fix the RTO timer retransmitting skb every 1ms if linear option is
enabled
- af_key: fix sadb_x_filter validation, amment netlink policy
- ipsec: fix slab-use-after-free in decode_session6()
- macb: in ZynqMP resume always configure PS GTR for non-wakeup
source
Misc:
- netfilter: set default timeout to 3 secs for sctp shutdown send and
recv state (from 300ms), align with protocol timers"
* tag 'net-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
ice: Block switchdev mode when ADQ is active and vice versa
qede: fix firmware halt over suspend and resume
net: do not allow gso_size to be set to GSO_BY_FRAGS
sock: Fix misuse of sk_under_memory_pressure()
sfc: don't fail probe if MAE/TC setup fails
sfc: don't unregister flow_indr if it was never registered
net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset
net/mlx5: Fix mlx5_cmd_update_root_ft() error flow
net/mlx5e: XDP, Fix fifo overrun on XDP_REDIRECT
i40e: fix misleading debug logs
iavf: fix FDIR rule fields masks validation
ipv6: fix indentation of a config attribute
mailmap: add entries for Simon Horman
broadcom: b44: Use b44_writephy() return value
net: openvswitch: reject negative ifindex
team: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves
net: phy: broadcom: stub c45 read/write for 54810
netfilter: nft_dynset: disallow object maps
netfilter: nf_tables: GC transaction race with netns dismantle
netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path
...
Commit 4680b734e7 ("cpupower: Add Georgian translation") added
new language support. This change didn't add "ka" to Makefile
LANGUAGES variable. Add it now.
Reported-by: Temuri Doghonadze <temuri.doghonadze@gmail.com>
Reported-by: Zurab Kargareteli <zuraxt@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Unifying the asm-generic headers across 32-bit and 64-bit architectures
based on the compiler provided macros was a good idea and appears to work
with all user space, but it caused a regression when building old kernels
on systems that have the new headers installed in /usr/include, as this
combination trips an inconsistency in the kernel's own tools/include
headers that are a mix of userspace and kernel-internal headers.
This affects kernel builds on arm64, riscv64 and loongarch64 systems that
might end up using the "#define __BITS_PER_LONG 32" default from the old
tools headers. Backporting the commit into stable kernels would address
this, but it would still break building kernels without that backport,
and waste time for developers trying to understand the problem.
arm64 build machines are rather common, and on riscv64 this can also
happen in practice, but loongarch64 is probably new enough to not
be used much for building old kernels, so only revert the bits
for arm64 and riscv.
Link: https://lore.kernel.org/all/20230731160402.GB1823389@dev-arch.thelio-3990X/
Reported-by: Nathan Chancellor <nathan@kernel.org>
Fixes: 8386f58f8d ("asm-generic: Unify uapi bitsperlong.h for arm64, riscv and loongarch")
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZN0eNgAKCRDbK58LschI
gwhhAQCwbrEgA3LslDlk22eqyfRH04D+9d7Kc3ISQssyjlr9swD+NfwfDvYqopwj
Dp67QkHdluixf2/NMPTEvg/CA4mlmww=
=4BwF
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-08-16
We've added 17 non-merge commits during the last 6 day(s) which contain
a total of 20 files changed, 1179 insertions(+), 37 deletions(-).
The main changes are:
1) Add a BPF hook in sys_socket() to change the protocol ID
from IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy
applications, from Geliang Tang.
2) Follow-up/fallout fix from the SO_REUSEPORT + bpf_sk_assign work
to fix a splat on non-fullsock sks in inet[6]_steal_sock,
from Lorenz Bauer.
3) Improvements to struct_ops links to avoid forcing presence of
update/validate callbacks. Also add bpf_struct_ops fields documentation,
from David Vernet.
4) Ensure libbpf sets close-on-exec flag on gzopen, from Marco Vedovati.
5) Several new tcx selftest additions and bpftool link show support for
tcx and xdp links, from Daniel Borkmann.
6) Fix a smatch warning on uninitialized symbol in
bpf_perf_link_fill_kprobe, from Yafang Shao.
7) BPF selftest fixes e.g. misplaced break in kfunc_call test,
from Yipeng Zou.
8) Small cleanup to remove unused declaration bpf_link_new_file,
from Yue Haibing.
9) Small typo fix to bpftool's perf help message, from Daniel T. Lee.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
selftests/bpf: Add mptcpify test
selftests/bpf: Fix error checks of mptcp open_and_load
selftests/bpf: Add two mptcp netns helpers
bpf: Add update_socket_protocol hook
bpftool: Implement link show support for xdp
bpftool: Implement link show support for tcx
selftests/bpf: Add selftest for fill_link_info
bpf: Fix uninitialized symbol in bpf_perf_link_fill_kprobe()
net: Fix slab-out-of-bounds in inet[6]_steal_sock
bpf: Document struct bpf_struct_ops fields
bpf: Support default .validate() and .update() behavior for struct_ops links
selftests/bpf: Add various more tcx test cases
selftests/bpf: Clean up fmod_ret in bench_rename test script
selftests/bpf: Fix repeat option when kfunc_call verification fails
libbpf: Set close-on-exec flag on gzopen
bpftool: fix perf help message
bpf: Remove unused declaration bpf_link_new_file()
====================
Link: https://lore.kernel.org/r/20230816212840.1539-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For stack-validation of a frame-pointer build, objtool validates that
every CALL instruction is preceded by a frame-setup. The new SRSO
return thunks violate this with their RSB stuffing trickery.
Extend the __fentry__ exception to also cover the embedded_insn case
used for this. This cures:
vmlinux.o: warning: objtool: srso_untrain_ret+0xd: call without frame pointer save/setup
Fixes: 4ae68b26c3 ("objtool/x86: Fix SRSO mess")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/20230816115921.GH980931@hirez.programming.kicks-ass.net
Rename the original retbleed return thunk and untrain_ret to
retbleed_return_thunk() and retbleed_untrain_ret().
No functional changes.
Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.909378169@infradead.org
Use the existing configurable return thunk. There is absolute no
justification for having created this __x86_return_thunk alternative.
To clarify, the whole thing looks like:
Zen3/4 does:
srso_alias_untrain_ret:
nop2
lfence
jmp srso_alias_return_thunk
int3
srso_alias_safe_ret: // aliasses srso_alias_untrain_ret just so
add $8, %rsp
ret
int3
srso_alias_return_thunk:
call srso_alias_safe_ret
ud2
While Zen1/2 does:
srso_untrain_ret:
movabs $foo, %rax
lfence
call srso_safe_ret (jmp srso_return_thunk ?)
int3
srso_safe_ret: // embedded in movabs instruction
add $8,%rsp
ret
int3
srso_return_thunk:
call srso_safe_ret
ud2
While retbleed does:
zen_untrain_ret:
test $0xcc, %bl
lfence
jmp zen_return_thunk
int3
zen_return_thunk: // embedded in the test instruction
ret
int3
Where Zen1/2 flush the BTB entry using the instruction decoder trick
(test,movabs) Zen3/4 use BTB aliasing. SRSO adds a return sequence
(srso_safe_ret()) which forces the function return instruction to
speculate into a trap (UD2). This RET will then mispredict and
execution will continue at the return site read from the top of the
stack.
Pick one of three options at boot (evey function can only ever return
once).
[ bp: Fixup commit message uarch details and add them in a comment in
the code too. Add a comment about the srso_select_mitigation()
dependency on retbleed_select_mitigation(). Add moar ifdeffery for
32-bit builds. Add a dummy srso_untrain_ret_alias() definition for
32-bit alternatives needing the symbol. ]
Fixes: fb3bd914b3 ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.842775684@infradead.org
This testcase is constrived to reproduce a problem that the cpu buffers
become unavailable which is due to 'record_disabled' of array_buffer and
max_buffer being messed up.
Local test result after bugfix:
# ./ftracetest test.d/00basic/snapshot1.tc
=== Ftrace unit tests ===
[1] Snapshot and tracing_cpumask [PASS]
[2] (instance) Snapshot and tracing_cpumask [PASS]
# of passed: 2
# of failed: 0
# of unresolved: 0
# of untested: 0
# of unsupported: 0
# of xfailed: 0
# of undefined(test bug): 0
Link: https://lkml.kernel.org/r/20230805033816.3284594-3-zhengyejian1@huawei.com
Cc: <mhiramat@kernel.org>
Cc: <vnagarnaik@google.com>
Cc: <shuah@kernel.org>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Implement a new test program mptcpify: if the family is AF_INET or
AF_INET6, the type is SOCK_STREAM, and the protocol ID is 0 or
IPPROTO_TCP, set it to IPPROTO_MPTCP. It will be hooked in
update_socket_protocol().
Extend the MPTCP test base, add a selftest test_mptcpify() for the
mptcpify case. Open and load the mptcpify test prog to mptcpify the
TCP sockets dynamically, then use start_server() and connect_to_fd()
to create a TCP socket, but actually what's created is an MPTCP
socket, which can be verified through 'getsockopt(SOL_PROTOCOL)'
and 'getsockopt(MPTCP_INFO)'.
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/364e72f307e7bb38382ec7442c182d76298a9c41.1692147782.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Return libbpf_get_error(), instead of -EIO, for the error from
mptcp_sock__open_and_load().
Load success means prog_fd and map_fd are always valid. So drop these
unneeded ASSERT_GE checks for them in mptcp run_test().
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/db5fcb93293df9ab173edcbaf8252465b80da6f2.1692147782.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add two netns helpers for mptcp tests: create_netns() and
cleanup_netns(). Use them in test_base().
These new helpers will be re-used in the following commits
introducing new tests.
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/7506371fb6c417b401cc9d7365fe455754f4ba3f.1692147782.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add support to dump XDP link information to bpftool. This reuses the
recently added show_link_ifindex_{plain,json}(). The XDP link info only
exposes the ifindex.
Below shows an example link dump output, and a cgroup link is included
for comparison, too:
# bpftool link
[...]
10: cgroup prog 2466
cgroup_id 1 attach_type cgroup_inet6_post_bind
[...]
16: xdp prog 2477
ifindex enp5s0(3)
[...]
Equivalent json output:
# bpftool link --json
[...]
{
"id": 10,
"type": "cgroup",
"prog_id": 2466,
"cgroup_id": 1,
"attach_type": "cgroup_inet6_post_bind"
},
[...]
{
"id": 16,
"type": "xdp",
"prog_id": 2477,
"devname": "enp5s0",
"ifindex": 3
}
[...]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230816095651.10014-2-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add support to dump tcx link information to bpftool. This adds a
common helper show_link_ifindex_{plain,json}() which can be reused
also for other link types. The plain text and json device output is
the same format as in bpftool net dump.
Below shows an example link dump output along with a cgroup link
for comparison:
# bpftool link
[...]
10: cgroup prog 1977
cgroup_id 1 attach_type cgroup_inet6_post_bind
[...]
13: tcx prog 2053
ifindex enp5s0(3) attach_type tcx_ingress
14: tcx prog 2080
ifindex enp5s0(3) attach_type tcx_egress
[...]
Equivalent json output:
# bpftool link --json
[...]
{
"id": 10,
"type": "cgroup",
"prog_id": 1977,
"cgroup_id": 1,
"attach_type": "cgroup_inet6_post_bind"
},
[...]
{
"id": 13,
"type": "tcx",
"prog_id": 2053,
"devname": "enp5s0",
"ifindex": 3,
"attach_type": "tcx_ingress"
},
{
"id": 14,
"type": "tcx",
"prog_id": 2080,
"devname": "enp5s0",
"ifindex": 3,
"attach_type": "tcx_egress"
}
[...]
Suggested-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230816095651.10014-1-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
At the moment the cachestat syscall number is hard coded into the test
source code.
Remove that and replace it with the proper __NR_cachestat macro.
That ensures compatibility should other architectures pick a different
number.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Libraries should be listed last on the compiler's command line, so that
the linker can look for and find still unresolved symbols. The librt
library, required for the shm_* functions, was announced using CFLAGS,
which puts the library *before* the source files, and fails compilation
on my system:
======================
gcc -isystem /src/linux-selftests/usr/include -Wall -lrt test_cachestat.c
-o /src/linux-selftests/kselftest/cachestat/test_cachestat
/usr/bin/ld: /tmp/cceQWO3u.o: in function `test_cachestat_shmem':
test_cachestat.c:(.text+0x890): undefined reference to `shm_open'
/usr/bin/ld: test_cachestat.c:(.text+0x99c): undefined reference to `shm_unlink'
collect2: error: ld returned 1 exit status
make[4]: *** [../lib.mk:181: /src/linux-selftests/kselftest/cachestat/test_cachestat] Error 1
======================
Announce the library using the LDLIBS variable, which ensures the proper
ordering on the command line.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Observed occassional failures in the futex_wait_timeout test:
ok 1 futex_wait relative succeeds
ok 2 futex_wait_bitset realtime succeeds
ok 3 futex_wait_bitset monotonic succeeds
ok 4 futex_wait_requeue_pi realtime succeeds
ok 5 futex_wait_requeue_pi monotonic succeeds
not ok 6 futex_lock_pi realtime returned 0
......
The test expects the child thread to complete some steps before
the parent thread gets to run. There is an implicit expectation
of the order of invocation of futex_lock_pi between the child thread
and the parent thread. Make this order explicit. If the order is
not met, the futex_lock_pi call in the parent thread succeeds and
will not timeout.
Fixes: f4addd54b1 ("selftests: futex: Expand timeout test")
Signed-off-by: Nysal Jan K.A <nysal@linux.ibm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
We have some dmabuf-heaps and perf_events tests but they are not hooked
up to the kselftest build infrastructure which is a bit of an obstacle
to running them in systems with generic infrastructure for selftests.
Add them to the top level kselftest Makefile so they get built as
standard.
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The user_events selftests were removed from the standard set of
selftests due to the uapi header it relies on having been temporarily
removed. That header is now reinstated so we can reenable the tests.
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
In busybox, the mktemp requires that the generated filename be
suffixed with at least six consecutive 'X' characters. Otherwise,
it will return an "Invalid argument" error.
Signed-off-by: Hui Min Mina Chou <minachou@andestech.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Add selftest for the fill_link_info of uprobe, kprobe and tracepoint.
The result:
$ tools/testing/selftests/bpf/test_progs --name=fill_link_info
#79/1 fill_link_info/kprobe_link_info:OK
#79/2 fill_link_info/kretprobe_link_info:OK
#79/3 fill_link_info/kprobe_invalid_ubuff:OK
#79/4 fill_link_info/tracepoint_link_info:OK
#79/5 fill_link_info/uprobe_link_info:OK
#79/6 fill_link_info/uretprobe_link_info:OK
#79/7 fill_link_info/kprobe_multi_link_info:OK
#79/8 fill_link_info/kretprobe_multi_link_info:OK
#79/9 fill_link_info/kprobe_multi_invalid_ubuff:OK
#79 fill_link_info:OK
Summary: 1/9 PASSED, 0 SKIPPED, 0 FAILED
The test case for kprobe_multi won't be run on aarch64, as it is not
supported.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230813141900.1268-3-laoar.shao@gmail.com
riscv now supports mmaping hardware counters to userspace so adapt the test
to run on this architecture.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Ian Rogers <irogers@google.com>
riscv now supports mmaping hardware counters so add what's needed to
take advantage of that in libperf.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Add the jscvt feature check in the set of hwcap tests.
Due to the requirement of jscvt feature, a compiler configuration
of v8.3 or above is needed to support assembly. Therefore, hand
encode is used here instead.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230815040915.3966955-5-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Now that ptrace and perf are no longer exclusive, update the
test to exercise interesting interactions.
An assembly file is used for the children to allow precise instruction
choice and addresses, while avoiding any compiler quirks.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-7-bgray@linux.ibm.com
Commit ddb5cdbafa ("kbuild: generate KSYMTAB entries by modpost")
deprecated <asm/export.h>, which is now a wrapper of <linux/export.h>.
Replace #include <asm/export.h> with #include <linux/export.h>.
After all the <asm/export.h> lines are converted, <asm/export.h> and
<asm-generic/export.h> will be removed.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
[mpe: Fixup selftests that stub asm/export.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230806150954.394189-2-masahiroy@kernel.org
The arm64 BTI selftests are currently built in the source directory,
then the generated binaries are copied to the output directory.
This leaves the object files around in a potentially otherwise pristine
source tree, tainting it for out-of-tree kernel builds.
Prepend $(OUTPUT) to every reference to an object file in the Makefile,
and remove the extra handling and copying. This puts all generated files
under the output directory.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230815145931.2522557-1-andre.przywara@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Add 1000 IPv6 routes with expiration time (w/ and w/o additional 5000
permanet routes in the background.) Wait for a few seconds to make sure
they are removed correctly.
The expected output of the test looks like the following example.
> Fib6 garbage collection test
> TEST: ipv6 route garbage collection [ OK ]
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Objtool --rethunk does two things:
- it collects all (tail) call's of __x86_return_thunk and places them
into .return_sites. These are typically compiler generated, but
RET also emits this same.
- it fudges the validation of the __x86_return_thunk symbol; because
this symbol is inside another instruction, it can't actually find
the instruction pointed to by the symbol offset and gets upset.
Because these two things pertained to the same symbol, there was no
pressing need to separate these two separate things.
However, alas, along comes SRSO and more crazy things to deal with
appeared.
The SRSO patch itself added the following symbol names to identify as
rethunk:
'srso_untrain_ret', 'srso_safe_ret' and '__ret'
Where '__ret' is the old retbleed return thunk, 'srso_safe_ret' is a
new similarly embedded return thunk, and 'srso_untrain_ret' is
completely unrelated to anything the above does (and was only included
because of that INT3 vs UD2 issue fixed previous).
Clear things up by adding a second category for the embedded instruction
thing.
Fixes: fb3bd914b3 ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.704502245@infradead.org
When run command "ip netns delete client", device link1_1 has been
deleted. So, it is no need to delete link1_1 again. Remove it.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When developing specs its useful to know which attr space
YNL was trying to find an attribute in on key error.
Instead of printing:
KeyError: 0
add info about the space:
Exception: Space 'vport' has no attribute with value '0'
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230814205627.2914583-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This selftest is designed for testing the support of NEXT-C-SID flavor
for SRv6 End.X behavior. It instantiates a virtual network composed of
several nodes: hosts and SRv6 routers. Each node is realized using a
network namespace that is properly interconnected to others through veth
pairs, according to the topology depicted in the selftest script file.
The test considers SRv6 routers implementing IPv4/IPv6 L3 VPNs leveraged
by hosts for communicating with each other. Such routers i) apply
different SRv6 Policies to the traffic received from connected hosts,
considering the IPv4 or IPv6 protocols; ii) use the NEXT-C-SID
compression mechanism for encoding several SRv6 segments within a single
128-bit SID address, referred to as a Compressed SID (C-SID) container.
The NEXT-C-SID is provided as a "flavor" of the SRv6 End.X behavior,
enabling it to properly process the C-SID containers. The correct
execution of the enabled NEXT-C-SID SRv6 End.X behavior is verified
through reachability tests carried out between hosts belonging to the
same VPN.
Signed-off-by: Paolo Lungaroni <paolo.lungaroni@uniroma2.it>
Co-developed-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230812180926.16689-3-andrea.mayer@uniroma2.it
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmTZISMeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGP+kH/RJWesO8dQ1b2jRh
v1dexbytGUykROpmHBnJKDznwsSBnhDlI9Tu62dumWKRrCzwZto8Hag1QC2jYrra
x7f3W087HdTSh3j5B92kGK/ZXgm4NwjVI078ujSv/e+qJMB3behpdL7uUkFUeeVV
OaDhlSL4ILlyVOYPX3sHMiPutmZcXxe8/25o4aylpBrzlClKen7OODRz6gIwyVOR
Nufgi/H5bkB4rDLOVI87HrxQMSpCtyGJtjTB78e/aRvIwYhJq16iuq+uBqOxQqgr
anlg1nJ3r6/LphiT9H63xNFwIJDxtL7I1V8CQ9Jyvf/O4MNGSaM7sHw2l8ujTxU9
hf4GYyY=
=loC2
-----END PGP SIGNATURE-----
Merge tag 'v6.5-rc6' into iommufd for-next
Required for following patches.
Resolve merge conflict by using the hunk from the for-next branch and
shifting the iommufd_object_deref_user() into iommufd_hw_pagetable_put()
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Allow user to pass port index for health reporter dump request.
Re-generate the related code.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-14-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend per-instance dump command definitions to accept instance
attributes. Allow parsing of devlink handle attributes so they could
be used for instance selection.
Re-generate the related code.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-12-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the definitions for the commands that do per-instance dump
and re-generate the related code.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-8-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Running the bench_rename test script, the following error occurs:
# ./benchs/run_bench_rename.sh
base : 0.819 ± 0.012M/s
kprobe : 0.538 ± 0.009M/s
kretprobe : 0.503 ± 0.004M/s
rawtp : 0.779 ± 0.020M/s
fentry : 0.726 ± 0.007M/s
fexit : 0.691 ± 0.007M/s
benchmark 'rename-fmodret' not found
The bench_rename_fmodret has been removed in commit b000def2e0
("selftests: Remove fmod_ret from test_overhead"), thus remove it
from the runners in the test script.
Fixes: b000def2e0 ("selftests: Remove fmod_ret from test_overhead")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230814030727.3010390-1-zouyipeng@huawei.com
There is no way where topts.repeat can be set to 1 when tc_test fails.
Fix the typo where the break statement slipped by one line.
Fixes: fb66223a24 ("selftests/bpf: add test for accessing ctx from syscall program type")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Li Zetao <lizetao1@huawei.com>
Link: https://lore.kernel.org/bpf/20230814031434.3077944-1-zouyipeng@huawei.com
Enable the close-on-exec flag when using gzopen. This is especially important
for multithreaded programs making use of libbpf, where a fork + exec could
race with libbpf library calls, potentially resulting in a file descriptor
leaked to the new process. This got missed in 59842c5451 ("libbpf: Ensure
libbpf always opens files with O_CLOEXEC").
Fixes: 59842c5451 ("libbpf: Ensure libbpf always opens files with O_CLOEXEC")
Signed-off-by: Marco Vedovati <marco.vedovati@crowdstrike.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230810214350.106301-1-martin.kelly@crowdstrike.com
This test verifies whether the encapsulated packets have the correct
configured TTL. It does so by sending ICMP packets through the test
topology and mirroring them to a gretap netdevice. On a busy host
however, more than just the test ICMP packets may end up flowing
through the topology, get mirrored, and counted. This leads to
potential spurious failures as the test observes much more mirrored
packets than the sent test packets, and assumes a bug.
Fix this by tightening up the mirror action match. Change it from
matchall to a flower classifier matching on ICMP packets specifically.
Fixes: 45315673e0 ("selftests: forwarding: Test changes in mirror-to-gretap")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kprobes optimization check can_optimize() calls
insn_is_indirect_jump() to detect indirect jump instructions in
a target function. If any is found, creating an optprobe is disallowed
in the function because the jump could be from a jump table and could
potentially land in the middle of the target optprobe.
With retpolines, insn_is_indirect_jump() additionally looks for calls to
indirect thunks which the compiler potentially used to replace original
jumps. This extra check is however unnecessary because jump tables are
disabled when the kernel is built with retpolines. The same is currently
the case with IBT.
Based on this observation, remove the logic to look for calls to
indirect thunks and skip the check for indirect jumps altogether if the
kernel is built with retpolines or IBT. Remove subsequently the symbols
__indirect_thunk_start and __indirect_thunk_end which are no longer
needed.
Dropping this logic indirectly fixes a problem where the range
[__indirect_thunk_start, __indirect_thunk_end] wrongly included also the
return thunk. It caused that machines which used the return thunk as
a mitigation and didn't have it patched by any alternative ended up not
being able to use optprobes in any regular function.
Fixes: 0b53c374b9 ("x86/retpoline: Use -mfunction-return")
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20230711091952.27944-3-petr.pavlu@suse.com
The linker script arch/x86/kernel/vmlinux.lds.S matches the thunk
sections ".text.__x86.*" from arch/x86/lib/retpoline.S as follows:
.text {
[...]
TEXT_TEXT
[...]
__indirect_thunk_start = .;
*(.text.__x86.*)
__indirect_thunk_end = .;
[...]
}
Macro TEXT_TEXT references TEXT_MAIN which normally expands to only
".text". However, with CONFIG_LTO_CLANG, TEXT_MAIN becomes
".text .text.[0-9a-zA-Z_]*" which wrongly matches also the thunk
sections. The output layout is then different than expected. For
instance, the currently defined range [__indirect_thunk_start,
__indirect_thunk_end] becomes empty.
Prevent the problem by using ".." as the first separator, for example,
".text..__x86.indirect_thunk". This pattern is utilized by other
explicit section names which start with one of the standard prefixes,
such as ".text" or ".data", and that need to be individually selected in
the linker script.
[ nathan: Fix conflicts with SRSO and fold in fix issue brought up by
Andrew Cooper in post-review:
https://lore.kernel.org/20230803230323.1478869-1-andrew.cooper3@citrix.com ]
Fixes: dc5723b02e ("kbuild: add support for Clang LTO")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230711091952.27944-2-petr.pavlu@suse.com
Check that traffic can be redirected from a locked bridge port and that
it does not create locked FDB entries.
Cc: Hans J. Schultz <netdev@kapio-technology.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test explicit drops generate the right drop reason. Also, verify that
the kernel rejects flows with actions following an explicit drop.
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test if the correct drop reason is reported when OVS drops a packet due
to an explicit flow.
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Eric Garver <eric@garver.life>
This adds an explicit drop action. This is used by OVS to drop packets
for which it cannot determine what to do. An explicit action in the
kernel allows passing the reason _why_ the packet is being dropped or
zero to indicate no particular error happened (i.e: OVS intentionally
dropped the packet).
Since the error codes coming from userspace mean nothing for the kernel,
we squash all of them into only two drop reasons:
- OVS_DROP_EXPLICIT_WITH_ERROR to indicate a non-zero value was passed
- OVS_DROP_EXPLICIT to indicate a zero value was passed (no error)
e.g. trace all OVS dropped skbs
# perf trace -e skb:kfree_skb --filter="reason >= 0x30000"
[..]
106.023 ping/2465 skb:kfree_skb(skbaddr: 0xffffa0e8765f2000, \
location:0xffffffffc0d9b462, protocol: 2048, reason: 196611)
reason: 196611 --> 0x30003 (OVS_DROP_EXPLICIT)
Also, this patch allows ovs-dpctl.py to add explicit drop actions as:
"drop" -> implicit empty-action drop
"drop(0)" -> explicit non-error action drop
"drop(42)" -> explicit error action drop
Signed-off-by: Eric Garver <eric@garver.life>
Co-developed-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here are some small char/misc driver fixes for 6.5-rc6 that resolve some
reported issues. Included in here are:
- bunch of iio driver fixes for reported problems
- interconnect driver fixes
- counter driver build fix
- cardreader driver fixes
- binder driver fixes
- other tiny driver fixes
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZNdWFQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yl9TACeOsUiS31nWuv+JS6Kljj5qZ0p+V0An0kmBw/+
FOQVIk0niv+Hnnb4WwvF
=ZtpJ
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc driver fixes from Greg KH:
"Here are some small char/misc driver fixes for 6.5-rc6 that resolve
some reported issues. Included in here are:
- bunch of iio driver fixes for reported problems
- interconnect driver fixes
- counter driver build fix
- cardreader driver fixes
- binder driver fixes
- other tiny driver fixes
All of these have been in linux-next for a while with no reported
problems"
* tag 'char-misc-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits)
misc: tps6594-esm: Disable ESM for rev 1 PMIC
misc: rtsx: judge ASPM Mode to set PETXCFG Reg
binder: fix memory leak in binder_init()
iio: cros_ec: Fix the allocation size for cros_ec_command
tools/counter: Makefile: Replace rmdir by rm to avoid make,clean failure
iio: imu: lsm6dsx: Fix mount matrix retrieval
iio: adc: meson: fix core clock enable/disable moment
iio: core: Prevent invalid memory access when there is no parent
iio: frequency: admv1013: propagate errors from regulator_get_voltage()
counter: Fix menuconfig "Counter support" submenu entries disappearance
dt-bindings: iio: adi,ad74115: remove ref from -nanoamp
iio: adc: ina2xx: avoid NULL pointer dereference on OF device match
iio: light: bu27008: Fix intensity data type
iio: light: bu27008: Fix scale format
iio: light: bu27034: Fix scale format
iio: adc: ad7192: Fix ac excitation feature
interconnect: qcom: sa8775p: add enable_mask for bcm nodes
interconnect: qcom: sm8550: add enable_mask for bcm nodes
interconnect: qcom: sm8450: add enable_mask for bcm nodes
interconnect: qcom: Add support for mask-based BCMs
...
Currently, bpftool perf subcommand has typo with the help message.
$ tools/bpf/bpftool/bpftool perf help
Usage: bpftool perf { show | list }
bpftool perf help }
Since this bpftool perf subcommand help message has the extra bracket,
this commit fix the typo by removing the extra bracket.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230811121603.17429-1-danieltimlee@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
issues, or are not considered suitable for -stable backporting.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZNad/gAKCRDdBJ7gKXxA
jmw6AP9u6k8XcS8ec3/u0IUEuh7ckHx5Vvjfmo5YgWlIJDeWegD9G2fh3ZJgcjMO
jMssklfXmP+QSijCIxUva1TlzwtPDAQ=
=MqiN
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2023-08-11-13-44' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"14 hotfixes. 11 of these are cc:stable and the remainder address
post-6.4 issues, or are not considered suitable for -stable
backporting"
* tag 'mm-hotfixes-stable-2023-08-11-13-44' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/damon/core: initialize damo_filter->list from damos_new_filter()
nilfs2: fix use-after-free of nilfs_root in dirtying inodes via iput
selftests: cgroup: fix test_kmem_basic false positives
fs/proc/kcore: reinstate bounce buffer for KCORE_TEXT regions
MAINTAINERS: add maple tree mailing list
mm: compaction: fix endless looping over same migrate block
selftests: mm: ksm: fix incorrect evaluation of parameter
hugetlb: do not clear hugetlb dtor until allocating vmemmap
mm: memory-failure: avoid false hwpoison page mapped error info
mm: memory-failure: fix potential unexpected return value from unpoison_memory()
mm/swapfile: fix wrong swap entry type for hwpoisoned swapcache page
radix tree test suite: fix incorrect allocation size for pthreads
crypto, cifs: fix error handling in extract_iter_to_sg()
zsmalloc: fix races between modifications of fullness and isolated
New device support
* adi,ad8366
- Add support for the HMC792 digital attenuator (mostly chip specific data)
* alwinner,sun20i-gpadc
- New driver for the integrated ADC on a number of allwinner SoCs
including dt-binding documentation.
* microchip,mcp4728
- New driver for this quad channel DAC. Includes dt-bindings.
* miramems, da280
- Add ID for DA217 accelerometer which is compatible with the da280.
* murata,irs-d200
- New driver for this passive infrared sensor typically used for human
detection. Includes bindings and a few pieces of new ABI to
cover a case of needing to count a number of repeats of an event
before reporting it.
* rohm,bu27008
- Add initial support for the BU27010 RGB + flickering sensor to this
driver. Substantial refactoring was needed to enable this.
Features
* adi,admv8818
- Add mode that bypasses the input and output filters.
* amlogic,meson
- Support control of the MUX on channel 7, exposed as multiple channels.
- Support channel labels.
* sensirion,scd4x
- Add pressure compensation. Controlled via an 'output' pressure channel.
* ti,lmp92040
- Add IIO buffered supported (read via chrdev).
* vishay,vcnl4000
- Add proximity interrupt support for vcnl4200.
- Add proximity integration time control for vcnl4200.
- Add illuminance integration time control for vcnl4040 and vcnl4200.
- Add calibration bias, proximity and illuminance event period, and
oversampling ratio control for vcnl4040 and vncl4200.
Cleanup and minor fixes
* core
- Tidy up handling of set_trigger_state() callback return values
to consistently assume no positive return values.
- Use min() rather than min_t() in a case where types were clearly
the same.
- Drop some else statements that follow continue with a loop or
a returns.
- White space and comment format cleanup.
- Use sysfs_match_string() helper to improve readability.
- Use krealloc_array() to make it explicit a krealloc is for an array
of structures, not just one.
* tools
- Tidy up potential overflow in array index.
* tree wide
- Fix up includes for DT related headers.
- Drop some error prints in places where as similar error message
is printed by the function being called.
- Tidy up handling of return value from platform_get_irq() to no longer
take into account 0 as a value that might be returned. Similar for
fwnode_irq_get().
* adi,ad7192
- Add missing error check and improved debug logging.
- Use sysfs_emit_at() rather than open coded variant.
* adi,adis16475
- Drop unused scan element enum entries.
- Specify that a few more devices support burst32 mode.
* adi,admv1013
- Enable all required regulators and document as required in the
dt-binding.
* adi,admv1014
- Make all regulators required in the dt-binding as the device needs
them all enabled.
* adi,adxl313
- Fix wrong enum values being used in the i2c_device_id table.
- Use i2c_get_match_data() to reduce open coded handling of the
various id tables.
* allwinner,gpadc
- Make the kconfig text more specific to make space for separate drivers
for other Allwinner devices.
* amlogic,meson
- Drop unused timestamp channels as no buffer support.
- Various minor reorganizations to enable addition of support channel 7
MUX.
- Initialize some default values to account for potential previous user
since reboot.
* qcom,spmi-adc5
- Add ADC5_GPIO2_100K_PU support to driver to line up with bindings.
* qcom,spmi-adc7
- Use predefined channel ID definitions rather than values.
* invensense, common
- Factor out the timestamp handling to a module used by both mpu6050 and
icm42600.
* invensense,mpu6050
- Read as many FIFO elements as possible in one bus access.
* men,s188
- Drop redundant initialization of driver owner field.
* microchip,mcp4018 and mcp4531
- Use i2c_get_match_data() instead of open coding. Includes making the
data format the same for the i2c_device_id and firmware match
tables.
* semtech,sx9310
- dt-bindings: Add reference to IIO schema to provide the label property.
* semtech,sx9324
- dt-bindings: Add reference to IIO schema to provide the label property.
* st,stm32-adc
- Use devm_platform_get_and_ioremap_resource() instead of open coded
version.
* st,stm-lptimer-trigger
- Drop setting platform drvdata as it wasn't then used.
* ti,ads1015
- Fix wrong dt binding description of ti,datarate for some devices.
* vishay,vcnl4200
- Move to switch statements for channel type checking to make later
additions simpler.
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAmTTydERHGppYzIzQGtl
cm5lbC5vcmcACgkQVIU0mcT0Foj9sw//QDch/sLMb1bEiJTHDwECuSgm+jWth8gi
CdG3GQerBU2U+eQkf2jgG73HhfHccanKWGAHfkdOLSblcFWJL6Q+MSJCNu4J3kGY
2XUkdazXMXRbfqoq3vqnC4u6mEMXGTpq8JZoAz7XXCs8GcW/loyqV2E54cbQJxCJ
KSqCWIhoWAU7/PQkgOJLBnDirMt+rEwUObifG4PJcOYIPq17qS+XhpMVPydNfYgg
3PJ+jK4Gg6cZsI3lW7vaBoLvzYRn+Ri2vVc2ZnwyLI/y9rEJPjt4aXt9FZbw10Eg
S/7twD1Ns5n8X6IS3VFA4A4ITu0VFX3pjOg9WLEpkqRJqLt1n2R9pcnTkD6x4a0r
6LZvmLij9R6ozPZKlalncyOPZamnEcIxyqyuqcEEFDAF6/Mf+VznwBQM8HDhQ/CO
3OAGkNprtXIvKAz/xiGx9r8z8d5Gt1lD+vclcDcf6U8YwEkMWrc95EVIumcNOkG2
qNESgzDvOh92aHpKfwrsHproNKcKbtm9he1BrN5u28G+UBA1jsPg15hmYVkv/Hd3
v6tgvkxVSn2xS94K2BhqjLzOgXgl4EmbSewg5AfPVNYM1RZrMLJvBky1/81JJmsD
565dcF/vE3/hguFX5lCOKGTxT88pwFNHth26ypAOXuwhUNz+jxp9NJ0Adn/IR8Kp
hA/SfjALiDA=
=hTrR
-----END PGP SIGNATURE-----
Merge tag 'iio-for-6.6a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next
Jonathan writes:
1st set of IIO new device support, features and cleanup for 6.6
New device support
* adi,ad8366
- Add support for the HMC792 digital attenuator (mostly chip specific data)
* alwinner,sun20i-gpadc
- New driver for the integrated ADC on a number of allwinner SoCs
including dt-binding documentation.
* microchip,mcp4728
- New driver for this quad channel DAC. Includes dt-bindings.
* miramems, da280
- Add ID for DA217 accelerometer which is compatible with the da280.
* murata,irs-d200
- New driver for this passive infrared sensor typically used for human
detection. Includes bindings and a few pieces of new ABI to
cover a case of needing to count a number of repeats of an event
before reporting it.
* rohm,bu27008
- Add initial support for the BU27010 RGB + flickering sensor to this
driver. Substantial refactoring was needed to enable this.
Features
* adi,admv8818
- Add mode that bypasses the input and output filters.
* amlogic,meson
- Support control of the MUX on channel 7, exposed as multiple channels.
- Support channel labels.
* sensirion,scd4x
- Add pressure compensation. Controlled via an 'output' pressure channel.
* ti,lmp92040
- Add IIO buffered supported (read via chrdev).
* vishay,vcnl4000
- Add proximity interrupt support for vcnl4200.
- Add proximity integration time control for vcnl4200.
- Add illuminance integration time control for vcnl4040 and vcnl4200.
- Add calibration bias, proximity and illuminance event period, and
oversampling ratio control for vcnl4040 and vncl4200.
Cleanup and minor fixes
* core
- Tidy up handling of set_trigger_state() callback return values
to consistently assume no positive return values.
- Use min() rather than min_t() in a case where types were clearly
the same.
- Drop some else statements that follow continue with a loop or
a returns.
- White space and comment format cleanup.
- Use sysfs_match_string() helper to improve readability.
- Use krealloc_array() to make it explicit a krealloc is for an array
of structures, not just one.
* tools
- Tidy up potential overflow in array index.
* tree wide
- Fix up includes for DT related headers.
- Drop some error prints in places where as similar error message
is printed by the function being called.
- Tidy up handling of return value from platform_get_irq() to no longer
take into account 0 as a value that might be returned. Similar for
fwnode_irq_get().
* adi,ad7192
- Add missing error check and improved debug logging.
- Use sysfs_emit_at() rather than open coded variant.
* adi,adis16475
- Drop unused scan element enum entries.
- Specify that a few more devices support burst32 mode.
* adi,admv1013
- Enable all required regulators and document as required in the
dt-binding.
* adi,admv1014
- Make all regulators required in the dt-binding as the device needs
them all enabled.
* adi,adxl313
- Fix wrong enum values being used in the i2c_device_id table.
- Use i2c_get_match_data() to reduce open coded handling of the
various id tables.
* allwinner,gpadc
- Make the kconfig text more specific to make space for separate drivers
for other Allwinner devices.
* amlogic,meson
- Drop unused timestamp channels as no buffer support.
- Various minor reorganizations to enable addition of support channel 7
MUX.
- Initialize some default values to account for potential previous user
since reboot.
* qcom,spmi-adc5
- Add ADC5_GPIO2_100K_PU support to driver to line up with bindings.
* qcom,spmi-adc7
- Use predefined channel ID definitions rather than values.
* invensense, common
- Factor out the timestamp handling to a module used by both mpu6050 and
icm42600.
* invensense,mpu6050
- Read as many FIFO elements as possible in one bus access.
* men,s188
- Drop redundant initialization of driver owner field.
* microchip,mcp4018 and mcp4531
- Use i2c_get_match_data() instead of open coding. Includes making the
data format the same for the i2c_device_id and firmware match
tables.
* semtech,sx9310
- dt-bindings: Add reference to IIO schema to provide the label property.
* semtech,sx9324
- dt-bindings: Add reference to IIO schema to provide the label property.
* st,stm32-adc
- Use devm_platform_get_and_ioremap_resource() instead of open coded
version.
* st,stm-lptimer-trigger
- Drop setting platform drvdata as it wasn't then used.
* ti,ads1015
- Fix wrong dt binding description of ti,datarate for some devices.
* vishay,vcnl4200
- Move to switch statements for channel type checking to make later
additions simpler.
* tag 'iio-for-6.6a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (73 commits)
Documentation: ABI: testing: admv8818: add bypass
drivers: iio: filter: admv8818: add bypass mode
iio: light: bd27008: Support BD27010 RGB
iio: light: bu27008: add chip info
dt-bindings: iio: ROHM BU27010 RGBC + flickering sensor
iio: add MCP4728 I2C DAC driver
dt-bindings: iio: dac: add mcp4728.yaml
drivers: iio: admv1013: add vcc regulators
dt-bindings: iio: admv1013: add vcc regulators
iio: trigger: stm32-lptimer-trigger: remove unneeded platform_set_drvdata()
iio: adc: men_z188_adc: Remove redundant initialization owner in men_z188_driver
dt-bindings: iio: admv1014: make all regs required
iio: cdc: ad7150: relax return value check for IRQ get
iio: mb1232: relax return value check for IRQ get
iio: adc: fix the return value handle for platform_get_irq()
tools: iio: iio_generic_buffer: Fix some integer type and calculation
iio: potentiometer: mcp4531: Use i2c_get_match_data()
iio: potentiometer: mcp4018: Use i2c_get_match_data()
iio: core: Fix issues and style of the comments
iio: core: Switch to krealloc_array()
...
Our ABI opts to provide future proofing by defining a much larger
SVE_VQ_MAX than the architecture actually supports. Since we use
this define to control the size of our vector data buffers this results
in a lot of overhead when we initialise which can be a very noticable
problem in emulation, we fill buffers that are orders of magnitude
larger than we will ever actually use even with virtual platforms that
provide the full range of architecturally supported vector lengths.
Define and use the actual architecture maximum to mitigate this.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230810-arm64-syscall-abi-perf-v1-1-6a0d7656359c@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Add the LSE and various features check in the set of hwcap tests.
As stated in the ARM manual, the LSE2 feature allows for atomic access
to unaligned memory. Therefore, for processors that only have the LSE
feature, we register .sigbus_fn to test their ability to perform
unaligned access.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230808134036.668954-6-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Some enhanced features, such as the LSE2 feature, do not result in
SILLILL if LSE2 is missing and LSE is present, but will generate a
SIGBUS exception when atomic access unaligned.
Therefore, we add test item to test this type of features.
Notice that testing for SIGBUS only makes sense after make sure that
the instruction does not cause a SIGILL signal.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230808134036.668954-5-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Add macro definition functions DEF_SIGHANDLER_FUNC() and
DEF_INST_RAISE_SIG() helpers.
Furthermore, there is no need to modify the default SIGILL handling
function throughout the entire testing lifecycle in the main() function.
It is reasonable to narrow the scope to the context of the sig_fn
function only.
This is a pre-patch for the subsequent SIGBUS handler patch.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230808134036.668954-4-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Tests that were expecting a signal were not correctly checking for a
SKIP condition. Move the check before the signal checking when
processing test result.
Cc: Shuah Khan <shuah@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: linux-kselftest@vger.kernel.org
Fixes: 9847d24af9 ("selftests/harness: Refactor XFAIL into SKIP")
Signed-off-by: Kees Cook <keescook@chromium.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRdM/uy1Ege0+EN1fNar9k/UBDW4wUCZNRx8QAKCRBar9k/UBDW
46MBAQC3YDFsEfPzX4P7ZnlM5Lf1NynjNbso5bYW0TF/dp/Y+gD+M8wdM5Vj2Mb0
Zr56TnwCJei0kGBemiel4sStt3e4qwY=
=+0u+
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Martin KaFai Lau says:
====================
pull-request: bpf-next 2023-08-09
We've added 19 non-merge commits during the last 6 day(s) which contain
a total of 25 files changed, 369 insertions(+), 141 deletions(-).
The main changes are:
1) Fix array-index-out-of-bounds access when detaching from an
already empty mprog entry from Daniel Borkmann.
2) Adjust bpf selftest because of a recent llvm change
related to the cpu-v4 ISA from Eduard Zingerman.
3) Add uprobe support for the bpf_get_func_ip helper from Jiri Olsa.
4) Fix a KASAN splat due to the kernel incorrectly accepted
an invalid program using the recent cpu-v4 instruction from
Yonghong Song.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
bpf: btf: Remove two unused function declarations
bpf: lru: Remove unused declaration bpf_lru_promote()
selftests/bpf: relax expected log messages to allow emitting BPF_ST
selftests/bpf: remove duplicated functions
bpf, docs: Fix small typo and define semantics of sign extension
selftests/bpf: Add bpf_get_func_ip test for uprobe inside function
selftests/bpf: Add bpf_get_func_ip tests for uprobe on function entry
bpf: Add support for bpf_get_func_ip helper for uprobe program
selftests/bpf: Add a movsx selftest for sign-extension of R10
bpf: Fix an incorrect verification success with movsx insn
bpf, docs: Formalize type notation and function semantics in ISA standard
bpf: change bpf_alu_sign_string and bpf_movsx_string to static
libbpf: Use local includes inside the library
bpf: fix bpf_dynptr_slice() to stop return an ERR_PTR.
bpf: fix inconsistent return types of bpf_xdp_copy_buf().
selftests/bpf: fix the incorrect verification of port numbers.
selftests/bpf: Add test for detachment on empty mprog entry
bpf: Fix mprog detachment for empty mprog entry
bpf: bpf_struct_ops: Remove unnecessary initial values of variables
====================
Link: https://lore.kernel.org/r/20230810055123.109578-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It turns out arm32 doesn't handle syscall -1 gracefully, so skip testing
for that. Additionally skip tests that depend on clone3 when it is not
available (for example when building the seccomp selftests on an old arm
image without clone3 headers). And improve error reporting for when
nanosleep fails, as seen on arm32 since v5.15.
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Still trending up in size but the good news is that the "current"
regressions are resolved, AFAIK.
We're getting weirdly many fixes for Wake-on-LAN and suspend/resume
handling on embedded this week (most not merged yet), not sure why.
But those are all for older bugs.
Current release - regressions:
- tls: set MSG_SPLICE_PAGES consistently when handing encrypted
data over to TCP
Current release - new code bugs:
- eth: mlx5: correct IDs on VFs internal to the device (IPU)
Previous releases - regressions:
- phy: at803x: fix WoL support / reporting on AR8032
- bonding: fix incorrect deletion of ETH_P_8021AD protocol VID
from slaves, leading to BUG_ON()
- tun: prevent tun_build_skb() from exceeding the packet size limit
- wifi: rtw89: fix 8852AE disconnection caused by RX full flags
- eth/PCI: enetc: fix probing after 6fffbc7ae1 ("PCI: Honor
firmware's device disabled status"), keep PCI devices around
even if they are disabled / not going to be probed to be
able to apply quirks on them
- eth: prestera: fix handling IPv4 routes with nexthop IDs
Previous releases - always broken:
- netfilter: re-work garbage collection to avoid races between
user-facing API and timeouts
- tunnels: fix generating ipv4 PMTU error on non-linear skbs
- nexthop: fix infinite nexthop bucket dump when using maximum
nexthop ID
- wifi: nl80211: fix integer overflow in nl80211_parse_mbssid_elems()
Misc:
- unix: use consistent error code in SO_PEERPIDFD
- ipv6: adjust ndisc_is_useropt() to include PREFIX_INFO,
in prep for upcoming IETF RFC
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmTVMSsACgkQMUZtbf5S
Irul3g//RlSANV/MWkiDmHIS5IhqkVWbvjGhFXFfdqZPH4gfgcX9VrsMuxgNM1Xu
YXGx+rIu408qNNkVG2hpFMxPerRiqVB/XsH1TxRr0Mi6AMFoKGXS+cGwzSOaoMQj
FYlcC6j2SnQ9N4I0qQuKOSOffyvyxrx/l9ozpVXsbGsOic1k6j1Ipwtf3+WP7dEe
kkAPUlsQPdCIhMyQdK3X4xI1PGLtAXFgY3VV9bZ7u99l7QBwmconkl3GHq/xnPa8
Uyll005ThyYce0c4EPVcrY1YBXyY0LjOBIRtiTFAk6CMWc0Su8Ug/i4+K2KTq0eh
yjqqHkpR//ruLgtAXBLLE9mxma8448vmmex/cSLIBaMAttlnj9n2LvCqvbzNfTZA
ssnKO4D3HhoQvHqbeOOW6VzVX7XyhomOvQXihfdLUs9u2tKE3nQoU+QCnrnIUXZO
VF5/ubCERRdZDPQ1SSAktimlC0R1qVL7JPMRaQF0aW5xByabbEWwMaNiwkYQOh2o
w2KsbhM/vWyd+5JB412LrNsEgK1BV6WjgwzC+27YQ7QD/JxUZBUghL0ps2jgU2Lu
d4YdbBOgYz+xyUBPByeYzcac0SIeMkB/UEcaO54ySWU8GcWYLt4KXwydUq/cXlw0
rUDCO5bikMxmygLKtnTSwmwvGbGByEXbGvVKwUwNPqTnR+vPIbM=
=NZgp
-----END PGP SIGNATURE-----
Merge tag 'net-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, wireless and bpf.
Still trending up in size but the good news is that the "current"
regressions are resolved, AFAIK.
We're getting weirdly many fixes for Wake-on-LAN and suspend/resume
handling on embedded this week (most not merged yet), not sure why.
But those are all for older bugs.
Current release - regressions:
- tls: set MSG_SPLICE_PAGES consistently when handing encrypted data
over to TCP
Current release - new code bugs:
- eth: mlx5: correct IDs on VFs internal to the device (IPU)
Previous releases - regressions:
- phy: at803x: fix WoL support / reporting on AR8032
- bonding: fix incorrect deletion of ETH_P_8021AD protocol VID from
slaves, leading to BUG_ON()
- tun: prevent tun_build_skb() from exceeding the packet size limit
- wifi: rtw89: fix 8852AE disconnection caused by RX full flags
- eth/PCI: enetc: fix probing after 6fffbc7ae1 ("PCI: Honor
firmware's device disabled status"), keep PCI devices around even
if they are disabled / not going to be probed to be able to apply
quirks on them
- eth: prestera: fix handling IPv4 routes with nexthop IDs
Previous releases - always broken:
- netfilter: re-work garbage collection to avoid races between
user-facing API and timeouts
- tunnels: fix generating ipv4 PMTU error on non-linear skbs
- nexthop: fix infinite nexthop bucket dump when using maximum
nexthop ID
- wifi: nl80211: fix integer overflow in nl80211_parse_mbssid_elems()
Misc:
- unix: use consistent error code in SO_PEERPIDFD
- ipv6: adjust ndisc_is_useropt() to include PREFIX_INFO, in prep for
upcoming IETF RFC"
* tag 'net-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
net: hns3: fix strscpy causing content truncation issue
net: tls: set MSG_SPLICE_PAGES consistently
ibmvnic: Ensure login failure recovery is safe from other resets
ibmvnic: Do partial reset on login failure
ibmvnic: Handle DMA unmapping of login buffs in release functions
ibmvnic: Unmap DMA login rsp buffer on send login fail
ibmvnic: Enforce stronger sanity checks on login response
net: mana: Fix MANA VF unload when hardware is unresponsive
netfilter: nf_tables: remove busy mark and gc batch API
netfilter: nft_set_hash: mark set element as dead when deleting from packet path
netfilter: nf_tables: adapt set backend to use GC transaction API
netfilter: nf_tables: GC transaction API to avoid race with control plane
selftests/bpf: Add sockmap test for redirecting partial skb data
selftests/bpf: fix a CI failure caused by vsock sockmap test
bpf, sockmap: Fix bug that strp_done cannot be called
bpf, sockmap: Fix map type error in sock_map_del_link
xsk: fix refcount underflow in error path
ipv6: adjust ndisc_is_useropt() to also return true for PIO
selftests: forwarding: bridge_mdb: Make test more robust
selftests: forwarding: bridge_mdb_max: Fix failing test with old libnet
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRdM/uy1Ege0+EN1fNar9k/UBDW4wUCZNRuIQAKCRBar9k/UBDW
4++9AP9ymOcPOKTKdQwZ6cnq3vkmvN37H6teufTyM8vsCha9NAD+OQE+vg1304RM
aETtG6d5Nb+byIHZGJrdUyT7g9jRzgw=
=qr/C
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Martin KaFai Lau says:
====================
pull-request: bpf 2023-08-09
We've added 5 non-merge commits during the last 7 day(s) which contain
a total of 6 files changed, 102 insertions(+), 8 deletions(-).
The main changes are:
1) A bpf sockmap memleak fix and a fix in accessing the programs of
a sockmap under the incorrect map type from Xu Kuohai.
2) A refcount underflow fix in xsk from Magnus Karlsson.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Add sockmap test for redirecting partial skb data
selftests/bpf: fix a CI failure caused by vsock sockmap test
bpf, sockmap: Fix bug that strp_done cannot be called
bpf, sockmap: Fix map type error in sock_map_del_link
xsk: fix refcount underflow in error path
====================
Link: https://lore.kernel.org/r/20230810055303.120917-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Revert a patch that unconditionally resolved addresses to inlines in
callchains, something that was done before when DWARF mode was asked
for, but could as well be done when just frame pointers (the default)
was selected. This enriches the callchains with inlines but the way to
resolve it is gross right now, relying on addr2line, and even if we come
up with an efficient way of processing all the associated DWARF info for
a big file as vmlinux is, this has to be something people opt-in, as it
will still result in overheads, so revert it until we get this done in a
saner way.
- Update the x86 msr-index.h header with the kernel original, no change
in tooling output, just addresses a tools/perf build warning.
- Resolve a regression where special "tool events", such as
"duration_time" were being presented for all CPUs, when it only makes
sense to show it for the workload, that is, just once.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZNP/OAAKCRCyPKLppCJ+
J7cGAQDgNpsAqGk+/Xkk7lPcp8aJ7q+5oaxv8iaGhdblq7V52gD+L2t8sNPQYWE3
sy2QQ+9tsZiONfpdxknsduxoyfE+Vgs=
=CRYB
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v6.5-3-2023-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Revert a patch that unconditionally resolved addresses to inlines in
callchains, something that was done before when DWARF mode was asked
for, but could as well be done when just frame pointers (the default)
was selected.
This enriches the callchains with inlines but the way to resolve it
is gross right now, relying on addr2line, and even if we come up with
an efficient way of processing all the associated DWARF info for a
big file as vmlinux is, this has to be something people opt-in, as it
will still result in overheads, so revert it until we get this done
in a saner way.
- Update the x86 msr-index.h header with the kernel original, no change
in tooling output, just addresses a tools/perf build warning.
- Resolve a regression where special "tool events", such as
"duration_time" were being presented for all CPUs, when it only
makes sense to show it for the workload, that is, just once.
* tag 'perf-tools-fixes-for-v6.5-3-2023-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf stat: Don't display zero tool counts
tools arch x86: Sync the msr-index.h copy with the kernel sources
Revert "perf report: Append inlines to non-DWARF callchains"
Add a test case to check whether sockmap redirection works correctly
when data length returned by stream_parser is less than skb->len.
In addition, this test checks whether strp_done is called correctly.
The reason is that we returns skb->len - 1 from the stream_parser, so
the last byte in the skb will be held by strp->skb_head. Therefore,
if strp_done is not called to free strp->skb_head, we'll get a memleak
warning.
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20230804073740.194770-5-xukuohai@huaweicloud.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
BPF CI has reported the following failure:
Error: #200/79 sockmap_listen/sockmap VSOCK test_vsock_redir
Error: #200/79 sockmap_listen/sockmap VSOCK test_vsock_redir
./test_progs:vsock_unix_redir_connectible:1506: egress: write: Transport endpoint is not connected
vsock_unix_redir_connectible:FAIL:1506
./test_progs:vsock_unix_redir_connectible:1506: ingress: write: Transport endpoint is not connected
vsock_unix_redir_connectible:FAIL:1506
./test_progs:vsock_unix_redir_connectible:1506: egress: write: Transport endpoint is not connected
vsock_unix_redir_connectible:FAIL:1506
./test_progs:vsock_unix_redir_connectible:1514: ingress: recv() err, errno=11
vsock_unix_redir_connectible:FAIL:1514
./test_progs:vsock_unix_redir_connectible:1518: ingress: vsock socket map failed, a != b
vsock_unix_redir_connectible:FAIL:1518
./test_progs:vsock_unix_redir_connectible:1525: ingress: want pass count 1, have 0
It’s because the recv(... MSG_DONTWAIT) syscall in the test case is
called before the queued work sk_psock_backlog() in the kernel finishes
executing. So the data to be read is still queued in psock->ingress_skb
and cannot be read by the user program. Therefore, the non-blocking
recv() reads nothing and reports an EAGAIN error.
So replace recv(... MSG_DONTWAIT) with xrecv_nonblock(), which calls
select() to wait for data to be readable or timeout before calls recv().
Fixes: d61bd8c1fd ("selftests/bpf: add a test case for vsock sockmap")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20230804073740.194770-4-xukuohai@huaweicloud.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
The reason behind commit af7b29b1de ("Revert "net/sched: taprio: make
qdisc_leaf() see the per-netdev-queue pfifo child qdiscs"") was that the
patch it reverted caused a crash when attaching a CBS shaper to one of
the taprio classes. Prevent that from happening again by adding a test
case for it, which now passes correctly in both offload and software
modes.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20230807193324.4128292-12-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Check that the "Can only be attached as root qdisc" error message from
taprio is effective by attempting to attach it to a class of another
taprio qdisc. That operation should fail.
In the bug that was squashed by change "net/sched: taprio: try again to
report q->qdiscs[] to qdisc_leaf()", grafting a child taprio to a root
software taprio would be misinterpreted as a change() to the root
taprio. Catch this by looking at whether the base-time of the root
taprio has changed to follow the base-time of the child taprio,
something which should have absolutely never happened assuming correct
semantics.
Vinicius points out that looking at "base_time" in the tc qdisc show
output is unreliable because user space is in a race with the kernel
applying the setting. So we create a helper bash script which waits
while there is any pending schedule.
Link: https://lore.kernel.org/netdev/87il9w0xx7.fsf@intel.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20230807193324.4128292-11-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For offloaded tc-taprio testing with netdevsim, the mock-up PHC driver
is used.
Suggested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20230807193324.4128292-10-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This makes a difference for the software scheduling mode, where
dev_queue->qdisc_sleeping is the same as the taprio root Qdisc itself,
but when we're talking about what Qdisc and stats get reported for a
traffic class, the root taprio isn't what comes to mind, but q->qdiscs[]
is.
To understand the difference, I've attempted to send 100 packets in
software mode through class 8001:5, and recorded the stats before and
after the change.
Here is before:
$ tc -s class show dev eth0
class taprio 8001:1 root leaf 8001:
Sent 9400 bytes 100 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:2 root leaf 8001:
Sent 9400 bytes 100 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:3 root leaf 8001:
Sent 9400 bytes 100 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:4 root leaf 8001:
Sent 9400 bytes 100 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:5 root leaf 8001:
Sent 9400 bytes 100 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:6 root leaf 8001:
Sent 9400 bytes 100 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:7 root leaf 8001:
Sent 9400 bytes 100 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:8 root leaf 8001:
Sent 9400 bytes 100 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
and here is after:
class taprio 8001:1 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:2 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:3 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:4 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:5 root
Sent 9400 bytes 100 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:6 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:7 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
class taprio 8001:8 root leaf 800d:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
window_drops 0
The most glaring (and expected) difference is that before, all class
stats reported the global stats, whereas now, they really report just
the counters for that traffic class.
Finally, Pedro Tammela points out that there is a tc selftest which
checks specifically which handle do the child Qdiscs corresponding to
each class have. That's changing here - taprio no longer reports
tcm->tcm_info as the same handle "1:" as itself (the root Qdisc), but 0
(the handle of the default pfifo child Qdiscs). Since iproute2 does not
print a child Qdisc handle of 0, adjust the test's expected output.
Link: https://lore.kernel.org/netdev/3b83fcf6-a5e8-26fb-8c8a-ec34ec4c3342@mojatatu.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20230807193324.4128292-6-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some test cases check that the group timer is (or isn't) 0. Instead of
grepping for "0.00" grep for " 0.00" as the former can also match
"260.00" which is the default group membership interval.
Fixes: b6d00da086 ("selftests: forwarding: Add bridge MDB test")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-18-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As explained in commit 8bcfb4ae4d ("selftests: forwarding: Fix failing
tests with old libnet"), old versions of libnet (used by mausezahn) do
not use the "SO_BINDTODEVICE" socket option. For IP unicast packets,
this can be solved by prefixing mausezahn invocations with "ip vrf
exec". However, IP multicast packets do not perform routing and simply
egress the bound device, which does not exist in this case.
Fix by specifying the source and destination MAC of the packet which
will cause mausezahn to use a packet socket instead of an IP socket.
Fixes: 3446dcd7df ("selftests: forwarding: bridge_mdb_max: Add a new selftest")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-17-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As explained in commit 8bcfb4ae4d ("selftests: forwarding: Fix failing
tests with old libnet"), old versions of libnet (used by mausezahn) do
not use the "SO_BINDTODEVICE" socket option. For IP unicast packets,
this can be solved by prefixing mausezahn invocations with "ip vrf
exec". However, IP multicast packets do not perform routing and simply
egress the bound device, which does not exist in this case.
Fix by specifying the source and destination MAC of the packet which
will cause mausezahn to use a packet socket instead of an IP socket.
Fixes: b6d00da086 ("selftests: forwarding: Add bridge MDB test")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-16-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As explained in commit 8bcfb4ae4d ("selftests: forwarding: Fix failing
tests with old libnet"), old versions of libnet (used by mausezahn) do
not use the "SO_BINDTODEVICE" socket option. For IP unicast packets,
this can be solved by prefixing mausezahn invocations with "ip vrf
exec". However, IP multicast packets do not perform routing and simply
egress the bound device, which does not exist in this case.
Fix by specifying the source and destination MAC of the packet which
will cause mausezahn to use a packet socket instead of an IP socket.
Fixes: 8c33266ae2 ("selftests: forwarding: Add layer 2 miss test cases")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-15-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The test installs filters that match on various IP fragments (e.g., no
fragment, first fragment) and expects a certain amount of packets to hit
each filter. This is problematic as the filters are not specific enough
and can match IP packets (e.g., IGMP) generated by the stack, resulting
in failures [1].
Fix by making the filters more specific and match on more fields in the
IP header: Source IP, destination IP and protocol.
[1]
# timeout set to 0
# selftests: net/forwarding: tc_tunnel_key.sh
# TEST: tunnel_key nofrag (skip_hw) [FAIL]
# packet smaller than MTU was not tunneled
# INFO: Could not test offloaded functionality
not ok 89 selftests: net/forwarding: tc_tunnel_key.sh # exit=1
Fixes: 533a89b194 ("selftests: forwarding: add tunnel_key "nofrag" test case")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Acked-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-14-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The test checks that filters that match on source or destination MAC
were only hit once. A host can send more than one packet with a given
source or destination MAC, resulting in failures.
Fix by relaxing the success criterion and instead check that the filters
were not hit zero times. Using tc_check_at_least_x_packets() is also an
option, but it is not available in older kernels.
Fixes: 07e5c75184 ("selftests: forwarding: Introduce tc flower matching tests")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-13-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The test relies on 'nc' being the netcat version from the nmap project.
While this seems to be the case on Fedora, it is not the case on Ubuntu,
resulting in failures such as [1].
Fix by explicitly using the 'ncat' utility from the nmap project and the
skip the test in case it is not installed.
[1]
# timeout set to 0
# selftests: net/forwarding: tc_actions.sh
# TEST: gact drop and ok (skip_hw) [ OK ]
# TEST: mirred egress flower redirect (skip_hw) [ OK ]
# TEST: mirred egress flower mirror (skip_hw) [ OK ]
# TEST: mirred egress matchall mirror (skip_hw) [ OK ]
# TEST: mirred_egress_to_ingress (skip_hw) [ OK ]
# nc: invalid option -- '-'
# usage: nc [-46CDdFhklNnrStUuvZz] [-I length] [-i interval] [-M ttl]
# [-m minttl] [-O length] [-P proxy_username] [-p source_port]
# [-q seconds] [-s sourceaddr] [-T keyword] [-V rtable] [-W recvlimit]
# [-w timeout] [-X proxy_protocol] [-x proxy_address[:port]]
# [destination] [port]
# nc: invalid option -- '-'
# usage: nc [-46CDdFhklNnrStUuvZz] [-I length] [-i interval] [-M ttl]
# [-m minttl] [-O length] [-P proxy_username] [-p source_port]
# [-q seconds] [-s sourceaddr] [-T keyword] [-V rtable] [-W recvlimit]
# [-w timeout] [-X proxy_protocol] [-x proxy_address[:port]]
# [destination] [port]
# TEST: mirred_egress_to_ingress_tcp (skip_hw) [FAIL]
# server output check failed
# INFO: Could not test offloaded functionality
not ok 80 selftests: net/forwarding: tc_actions.sh # exit=1
Fixes: ca22da2fbd ("act_mirred: use the backlog for nested calls to mirred ingress")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-12-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
MAC Merge cannot be tested with veth pairs, resulting in failures:
# ./ethtool_mm.sh
[...]
TEST: Manual configuration with verification: swp1 to swp2 [FAIL]
Verification did not succeed
Fix by skipping the test when the interfaces do not support MAC Merge.
Fixes: e6991384ac ("selftests: forwarding: add a test for MAC Merge layer")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-11-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Layer 3 hardware stats cannot be used when the underlying interfaces are
veth pairs, resulting in failures:
# ./hw_stats_l3_gre.sh
TEST: ping gre flat [ OK ]
TEST: Test rx packets: [FAIL]
Traffic not reflected in the counter: 0 -> 0
TEST: Test tx packets: [FAIL]
Traffic not reflected in the counter: 0 -> 0
Fix by skipping the test when used with veth pairs.
Fixes: 813f97a268 ("selftests: forwarding: Add a tunnel-based test for L3 HW stats")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-10-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ethtool extended state cannot be tested with veth pairs, resulting in
failures:
# ./ethtool_extended_state.sh
TEST: Autoneg, No partner detected [FAIL]
Expected "Autoneg", got "Link detected: no"
[...]
Fix by skipping the test when used with veth pairs.
Fixes: 7d10bcce98 ("selftests: forwarding: Add tests for ethtool extended state")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-9-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Auto-negotiation cannot be tested with veth pairs, resulting in
failures:
# ./ethtool.sh
TEST: force of same speed autoneg off [FAIL]
error in configuration. swp1 speed Not autoneg off
[...]
Fix by skipping the test when used with veth pairs.
Fixes: 64916b57c0 ("selftests: forwarding: Add speed and auto-negotiation test")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-8-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A handful of tests require physical loopbacks to be used instead of veth
pairs. Add a helper that these tests will invoke in order to be skipped
when executed with veth pairs.
Fixes: 64916b57c0 ("selftests: forwarding: Add speed and auto-negotiation test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230808141503.4060661-7-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>