Handling TIP.PGD for an address filter for a guest kernel is the same as a
host kernel, but user space decoding, and hence address filters, are not
supported.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The guest kernel can be found from any guest thread belonging to the guest
machine. The guest machine is associated with the current host process pid.
An idle thread (pid=tid=0) is created as a vehicle from which to find the
guest kernel map.
Decoding guest user space is not supported.
Synthesized samples just need the cpumode set for the guest.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out machine__idle_thread() so it can be re-used for guest machines.
A thread is needed to find executable code, even for the guest kernel. To
avoid possible future pid number conflicts, the idle thread can be used.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out machines__find_guest() so it can be re-used.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The PIP packet NR (non-root) flag indicates whether or not a virtual
machine is being traced (NR=1 => VM). Add support for tracking its value.
In particular note that the PIP packet (outside of PSB+) will be
associated with a TIP packet from which address the NR value takes
effect. At that point, there is a branch from_ip, to_ip with
corresponding from_nr and to_nr.
In the event of VM-Entry failure, there should still PIP and TIP packets
that can be followed in the same way.
Also note that this assumes that a host VMM is not employing VMX controls
that affect Intel PT, e.g. to hide the host from a guest using Intel PT.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Retain the PIP packet payload as is, instead of just the CR3, because it
contains also the VMX NR flag which is needed to track VM-Entry.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In preparation to support Intel PT decoding of virtual machine traces, add
vmlaunch and vmresume as branch instructions.
Note, sample flags will show "VMentry" even if the VM-Entry fails.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In preparation to support Intel PT decoding of virtual machine traces, add
branch types for VM-Entry and VM-Exit.
Note they are both treated as "calls" because the VM-Exit transfers control
to a different address.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
aux-output events need to have an AUX area event as the group leader.
However, grouping events does not allow the AUX area event to be given
an address filter because the --filter option must come after the event,
which conflicts with the grouping syntax.
To allow filtering in that case, automatically create a group since that
is the requirement anyway.
Example: (requires Intel Tremont)
perf record -c 500 -e 'intel_pt//u' --filter 'filter main @ /bin/ls' -e 'cycles/aux-output/pp' ls
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210121140418.14705-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ubsan reported the following error. It was because sample's raw
data missed u32 padding at the end. So it broke the alignment of the
array after it.
The raw data contains an u32 size prefix so the data size should have
an u32 padding after 8-byte aligned data.
27: Sample parsing :util/synthetic-events.c:1539:4:
runtime error: store to misaligned address 0x62100006b9bc for type
'__u64' (aka 'unsigned long long'), which requires 8 byte alignment
0x62100006b9bc: note: pointer points here
00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^
#0 0x561532a9fc96 in perf_event__synthesize_sample util/synthetic-events.c:1539:13
#1 0x5615327f4a4f in do_test tests/sample-parsing.c:284:8
#2 0x5615327f3f50 in test__sample_parsing tests/sample-parsing.c:381:9
#3 0x56153279d3a1 in run_test tests/builtin-test.c:424:9
#4 0x56153279c836 in test_and_print tests/builtin-test.c:454:9
#5 0x56153279b7eb in __cmd_test tests/builtin-test.c:675:4
#6 0x56153279abf0 in cmd_test tests/builtin-test.c:821:9
#7 0x56153264e796 in run_builtin perf.c:312:11
#8 0x56153264cf03 in handle_internal_command perf.c:364:8
#9 0x56153264e47d in run_argv perf.c:408:2
#10 0x56153264c9a9 in main perf.c:538:3
#11 0x7f137ab6fbbc in __libc_start_main (/lib64/libc.so.6+0x38bbc)
#12 0x561532596828 in _start ...
SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use
util/synthetic-events.c:1539:4 in
Fixes: 045f8cd854 ("perf tests: Add a sample parsing test")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210214091638.519643-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For X86, the var2_w field of PERF_SAMPLE_WEIGHT_STRUCT stands for the
instruction latency. Current perf forces the var2_w to the data->ins_lat
in the generic code. It works well for now because X86 is the only
architecture that supports the PERF_SAMPLE_WEIGHT_STRUCT, but it may
bring problems once other architectures support the sample type. For
example, the var2_w may be used to capture something else on PowerPC.
Create two architecture specific functions to parse and synthesize the
weight related samples. Move the X86 specific codes to the X86 version
functions. Other architectures can implement their own functions later
separately.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/1612540912-6562-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The code assumed every CYC-eligible packet has a CYC packet, which is not
the case when CYC thresholds are used. Fix by checking if a CYC packet is
actually present in that case.
Fixes: 5b1dc0fd1d ("perf intel-pt: Add support for samples to contain IPC ratio")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210205175350.23817-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The code assumed a change in cycle count means accurate IPC. That is not
correct, for example when sampling both branches and instructions, or at
a FUP packet (which is not CYC-eligible) address. Fix by using an explicit
flag to indicate when IPC can be sampled.
Fixes: 5b1dc0fd1d ("perf intel-pt: Add support for samples to contain IPC ratio")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/20210205175350.23817-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add missing CYC packet processing when walking through PSB+. This
improves the accuracy of timestamps that follow PSB+, until the next
MTC.
Fixes: 3d49807870 ("perf tools: Add new Intel PT packet definitions")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210205175350.23817-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When locating the DWARF module for a given address, __find_debuginfo()
requires a 'struct dso' passed via the userdata argument.
However, this field is only set in __report_module() if the module is
found in via dwfl_addrmodule(), not if it is found later via
dwfl_report_elf().
Set userdata irrespective of how the DWARF module was found, as long as
we found a module.
Fixes: bf53fc6b5f ("perf unwind: Fix separate debug info files when using elfutils' libdw's unwinder")
Signed-off-by: Dave Rigby <d.rigby@me.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211801
Acked-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/linux-perf-users/20210218165654.36604-1-d.rigby@me.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit da231338ec ("perf record: Use an eventfd to wakeup when
done") uses eventfd() to solve a rare race where the setting and
checking of 'done' which add done_fd to pollfd. When draining buffer,
revents of done_fd is 0 and evlist__filter_pollfd function returns a
non-zero value. As a result, perf record does not stop profiling.
The following simple scenarios can trigger this condition:
# sleep 10 &
# perf record -p $!
After the sleep process exits, perf record should stop profiling and exit.
However, perf record keeps running.
If pollfd revents contains only POLLERR or POLLHUP, perf record
indicates that buffer is draining and need to stop profiling. Use
fdarray_flag__nonfilterable() to set done eventfd to nonfilterable
objects, so that evlist__filter_pollfd() does not filter and check done
eventfd.
Fixes: da231338ec ("perf record: Use an eventfd to wakeup when done")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: zhangjinhao2@huawei.com
Link: http://lore.kernel.org/lkml/20210205065001.23252-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmAppPgeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGeXYH/imZPBd4A1jIMehN
5HV2A53Z+MXmmaMuGj9X1KV6vsf55/xB+IhOoFdtRAIsO8c2yYSCO8i4+4R0XfYA
+/YFJeq672rojQnmh6XbpR8dugaAV7CUHy6n7KDsyvtT6EOCpwFSwkOb4X3tBRX6
TlYgm2d/xgV/wRHSgLVugK0MdFCLMAnyb7mkPfar9QrMgG1BiDKLq07xmwnS23On
TkqpJ9yZ/rJpUrrUqQYPShSO/FmA+fSfWs0CDv7EIrJ40LUScD6PZxSHWTIHtjLk
E4jFda6wuqLRVWsBwaBzUIdD0zk7X5quHRzEpbC5ga16SK6yrWvE5YJJXCguIEuZ
f3FMRYs=
=CAjn
-----END PGP SIGNATURE-----
Merge tag 'v5.11' into rdma.git for-next
Linux 5.11
Merged to resolve conflicts with RDMA rc commits
- drivers/infiniband/sw/rxe/rxe_net.c
The final logic is to call rxe_get_dev_from_net() again with the master
netdev if the packet was rx'd on a vlan. To keep the elimination of the
local variables requires a trivial edit to the code in -rc
Link: https://lore.kernel.org/r/20210210131542.215ea67c@canb.auug.org.au
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Fix the following coccicheck warnings:
./tools/perf/util/header.c:3809:18-20: WARNING !A || A && B is
equivalent to !A || B.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/1612497255-87189-1-git-send-email-jiapeng.chong@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Do not jump when 'k' is pressed, the cursor show stay where it is.
Right now, it jumps to the currently selected hot instruction.
Signed-off-by: Martin Liška <mliska@suse.cz>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lore.kernel.org/lkml/65416cff-4eb6-713c-a174-2aa43fa64332@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Eliminate the following coccicheck warning:
./tools/perf/util/metricgroup.c:382:3-4: Unneeded semicolon
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yang li <yang.lee@linux.alibaba.com>
Link: http://lore.kernel.org/lkml/1612165277-95878-1-git-send-email-yang.lee@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Detect symbols generated by the OCaml compiler based on their prefix.
Demangle OCaml symbols, returning a newly allocated string (like the
existing Java demangling functionality).
Move a helper function (hex) from tests/code-reading.c to util/string.c
To test:
echo 'Printf.printf "%d\n" (Random.int 42)' > test.ml
perf record ocamlopt.opt test.ml
perf report -d ocamlopt.opt
Signed-off-by: Fabian Hemmer <copy@copy.sh>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LPU-Reference: 20210203211537.b25ytjb6dq5jfbwx@nyu
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently it parses the /proc file everytime it opens a file in the
cgroupfs. Save the last result to avoid it (assuming it won't be
changed between the accesses).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20201216090556.813996-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reduce the number of buffers and hopefully make it more efficient. :)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20201216090556.813996-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The cgroupfs_find_mountpoint() looks up the /proc/mounts file to find
a directory for the given cgroup subsystem. It keeps both cgroup v1
and v2 path since there's a possibility of the mixed hierarchly.
But we can simply use v1 path if it's found as it will override the v2
hierarchy.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20201216090556.813996-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When exporting static_call_key; with EXPORT_STATIC_CALL*(), the module
can use static_call_update() to change the function called. This is
not desirable in general.
Not exporting static_call_key however also disallows usage of
static_call(), since objtool needs the key to construct the
static_call_site.
Solve this by allowing objtool to create the static_call_site using
the trampoline address when it builds a module and cannot find the
static_call_key symbol. The module loader will then try and map the
trampole back to a key before it constructs the normal sites list.
Doing this requires a trampoline -> key associsation, so add another
magic section that keeps those.
Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210127231837.ifddpn7rhwdaepiu@treble
Some static call declarations are going to be needed on low level header
files. Move the necessary material to the dedicated static call types
header to avoid inclusion dependency hell.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-4-frederic@kernel.org
I've always been bothered by the endless (fragile) boilerplate for
rbtree, and I recently wrote some rbtree helpers for objtool and
figured I should lift them into the kernel and use them more widely.
Provide:
partial-order; less() based:
- rb_add(): add a new entry to the rbtree
- rb_add_cached(): like rb_add(), but for a rb_root_cached
total-order; cmp() based:
- rb_find(): find an entry in an rbtree
- rb_find_add(): find an entry, and add if not found
- rb_find_first(): find the first (leftmost) matching entry
- rb_next_match(): continue from rb_find_first()
- rb_for_each(): iterate a sub-tree using the previous two
Inlining and constant propagation should see the compiler inline the
whole thing, including the various compare functions.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
With LTO, there are symbols like these:
/usr/lib/debug/usr/lib64/libantlr4-runtime.so.4.8-4.8-1.4.x86_64.debug
10305: 0000000000955fa4 0 NOTYPE LOCAL DEFAULT 29 Predicate.cpp.2bc410e7
This comes from a runtime/debug split done by the standard way:
objcopy --only-keep-debug $runtime $debug
objcopy --add-gnu-debuglink=$debugfn -R .comment -R .GCC.command.line --strip-all $runtime
perf currently cannot resolve such symbols (relicts of LTO), as section
29 exists only in the debug file (29 is .debug_info). And perf resolves
symbols only against runtime file. This results in all symbols from such
a library being unresolved:
0.38% main2 libantlr4-runtime.so.4.8 [.] 0x00000000000671e0
So try resolving against the debug file first. And only if it fails (the
section has NOBITS set), try runtime file. We can do this, as "objcopy
--only-keep-debug" per documentation preserves all sections, but clears
data of some of them (the runtime ones) and marks them as NOBITS.
The correct result is now:
0.38% main2 libantlr4-runtime.so.4.8 [.] antlr4::IntStream::~IntStream
Note that these LTO symbols are properly skipped anyway as they belong
neither to *text* nor to *data* (is_label && !elf_sec__filter(&shdr,
secstrs) is true).
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210217122125.26416-1-jslaby@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-02-16
The following pull-request contains BPF updates for your *net-next* tree.
There's a small merge conflict between 7eeba1706e ("tcp: Add receive timestamp
support for receive zerocopy.") from net-next tree and 9cacf81f81 ("bpf: Remove
extra lock_sock for TCP_ZEROCOPY_RECEIVE") from bpf-next tree. Resolve as follows:
[...]
lock_sock(sk);
err = tcp_zerocopy_receive(sk, &zc, &tss);
err = BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sk, level, optname,
&zc, &len, err);
release_sock(sk);
[...]
We've added 116 non-merge commits during the last 27 day(s) which contain
a total of 156 files changed, 5662 insertions(+), 1489 deletions(-).
The main changes are:
1) Adds support of pointers to types with known size among global function
args to overcome the limit on max # of allowed args, from Dmitrii Banshchikov.
2) Add bpf_iter for task_vma which can be used to generate information similar
to /proc/pid/maps, from Song Liu.
3) Enable bpf_{g,s}etsockopt() from all sock_addr related program hooks. Allow
rewriting bind user ports from BPF side below the ip_unprivileged_port_start
range, both from Stanislav Fomichev.
4) Prevent recursion on fentry/fexit & sleepable programs and allow map-in-map
as well as per-cpu maps for the latter, from Alexei Starovoitov.
5) Add selftest script to run BPF CI locally. Also enable BPF ringbuffer
for sleepable programs, both from KP Singh.
6) Extend verifier to enable variable offset read/write access to the BPF
program stack, from Andrei Matei.
7) Improve tc & XDP MTU handling and add a new bpf_check_mtu() helper to
query device MTU from programs, from Jesper Dangaard Brouer.
8) Allow bpf_get_socket_cookie() helper also be called from [sleepable] BPF
tracing programs, from Florent Revest.
9) Extend x86 JIT to pad JMPs with NOPs for helping image to converge when
otherwise too many passes are required, from Gary Lin.
10) Verifier fixes on atomics with BPF_FETCH as well as function-by-function
verification both related to zero-extension handling, from Ilya Leoshkevich.
11) Better kernel build integration of resolve_btfids tool, from Jiri Olsa.
12) Batch of AF_XDP selftest cleanups and small performance improvement
for libbpf's xsk map redirect for newer kernels, from Björn Töpel.
13) Follow-up BPF doc and verifier improvements around atomics with
BPF_FETCH, from Brendan Jackman.
14) Permit zero-sized data sections e.g. if ELF .rodata section contains
read-only data from local variables, from Yonghong Song.
15) veth driver skb bulk-allocation for ndo_xdp_xmit, from Lorenzo Bianconi.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The sample structure contains the field 'data_src' which is used to
tell the data operation attributions, e.g. operation type is loading or
storing, cache level, it's snooping or remote accessing, etc. At the
end, the 'data_src' will be parsed by perf mem/c2c tools to display
human readable strings.
This patch is to fill the 'data_src' field in the synthesized samples
base on different types. Currently perf tool can display statistics for
L1/L2/L3 caches but it doesn't support the 'last level cache'. To fit
to current implementation, 'data_src' field uses L3 cache for last level
cache.
Before this commit, perf mem report looks like this:
# Samples: 75K of event 'l1d-miss'
# Total weight : 75951
# Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead Samples Local Weight Memory access Symbol Shared Object Data Symbol Data Object Snoop TLB access
# ........ ....... ............ ............. ...................... ............. ...................... ........... ..... ..........
#
81.56% 61945 0 N/A [.] 0x00000000000009d8 serial_c [.] 0000000000000000 [unknown] N/A N/A
18.44% 14003 0 N/A [.] 0x0000000000000828 serial_c [.] 0000000000000000 [unknown] N/A N/A
Now on a system with Arm SPE, addresses and access types are displayed:
# Samples: 75K of event 'l1d-miss'
# Total weight : 75951
# Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead Samples Local Weight Memory access Symbol Shared Object Data Symbol Data Object Snoop TLB access
# ........ ....... ............ ............. ...................... ............. ...................... ........... ..... ..........
#
0.43% 324 0 L1 miss [.] 0x00000000000009d8 serial_c [.] 0x0000ffff80794e00 anon N/A Walker hit
0.42% 322 0 L1 miss [.] 0x00000000000009d8 serial_c [.] 0x0000ffff80794580 anon N/A Walker hit
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20210211133856.2137-6-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The memory event can deliver two benefits:
- The first benefit is the memory event can give out global view for
memory accessing, rather than organizing events with scatter mode
(e.g. uses separate event for L1 cache, last level cache, etc) which
which can only display a event for single memory type, memory events
include all memory accessing so it can display the data accessing
cross memory levels in the same view;
- The second benefit is the sample generation might introduce a big
overhead and need to wait for long time for Perf reporting, we can
specify itrace option '--itrace=M' to filter out other events and only
output memory events, this can significantly reduce the overhead
caused by generating samples.
This patch is to enable memory event for Arm SPE.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20210211133856.2137-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To properly handle memory and branch samples, this patch divides into
two functions for generating samples: arm_spe__synth_mem_sample() is for
synthesizing memory and TLB samples; arm_spe__synth_branch_sample() is
to synthesize branch samples.
Arm SPE backend decoder has passed virtual and physical address through
packets, the address info is stored into the synthesize samples in the
function arm_spe__synth_mem_sample().
Committer notes:
Fixed this:
36 46.77 fedora:27 : FAIL clang version 5.0.2 (tags/RELEASE_502/final)
util/arm-spe.c:269:34: error: missing field 'pid' initializer [-Werror,-Wmissing-field-initializers]
struct perf_sample sample = { 0 };
^
util/arm-spe.c:288:34: error: missing field 'pid' initializer [-Werror,-Wmissing-field-initializers]
struct perf_sample sample = { 0 };
By using = { .ip = 0, };
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20210211133856.2137-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Userspace has discovered the functionality offered by SYS_kcmp and has
started to depend upon it. In particular, Mesa uses SYS_kcmp for
os_same_file_description() in order to identify when two fd (e.g. device
or dmabuf) point to the same struct file. Since they depend on it for
core functionality, lift SYS_kcmp out of the non-default
CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
Rasmus Villemoes also pointed out that systemd uses SYS_kcmp to
deduplicate the per-service file descriptor store.
Note that some distributions such as Ubuntu are already enabling
CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.
References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> # DRM depends on kcmp
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> # systemd uses kcmp
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210205220012.1983-1-chris@chris-wilson.co.uk
The variable in practice will never be uninitialized, because the
loop will always go through at least one iteration.
In case it would not, make vcpu_get_cpuid report an assertion
failure.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This test launches 512 VMs in serial and kills them after a random
amount of time.
The test was original written to exercise KVM user notifiers in
the context of1650b4ebc99d:
- KVM: Disable irq while unregistering user notifier
- https://lore.kernel.org/kvm/CACXrx53vkO=HKfwWwk+fVpvxcNjPrYmtDZ10qWxFvVX_PTGp3g@mail.gmail.com/
Recently, this test piqued my interest because it proved useful to
for AMD SNP in exercising the "in-use" pages, described in APM section
15.36.12, "Running SNP-Active Virtual Machines".
Signed-off-by: Ignacio Alvarado <ikalvarado@google.com>
Signed-off-by: Marc Orr <marcorr@google.com>
Message-Id: <20210213001452.1719001-1-marcorr@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
GPIO CDEV is now optional and required for the selftests so add it to
the config.
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Add a port to the GPIO uAPI v2 interface and make it the default.
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
gpio-mockup-chardev helper has been obsoleted and removed, so also remove
the tools/gpio code that it, and nothing else, was using.
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
GPIO Makefile has been greatly simplified so remove references to lines
which no longer exist.
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Build restrictions related to the gpio-mockup-chardev helper are
no longer relevant so remove them.
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
GPIO selftests have changed to new gpio-mockup-cdev helper, so remove
old gpio-mockup-chardev helper.
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
The GPIO mockup selftests are overly complicated with separate
implementations of the tests for sysfs and cdev uAPI, and with the cdev
implementation being dependent on tools/gpio and libmount.
Rework the test implementation to provide a common test suite with a
simplified pluggable uAPI interface. The cdev implementation utilises
the GPIO uAPI directly to remove the dependence on tools/gpio.
The simplified uAPI interface removes the need for any file system mount
checks in C, and so removes the dependence on libmount.
The rework also fixes the sysfs test implementation which has been broken
since the device created in the multiple gpiochip case was split into
separate devices.
Fixes: 8a39f597bc ("gpio: mockup: rework device probing")
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Update the kselftest framework to allow client drivers to
specify that some tests were skipped.
Signed-off-by: Timur Tabi <timur@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210214161348.369023-3-timur@kernel.org
test_global_func9 - check valid pointer's scenarios
test_global_func10 - check that a smaller type cannot be passed as a
larger one
test_global_func11 - check that CTX pointer cannot be passed
test_global_func12 - check access to a null pointer
test_global_func13 - check access to an arbitrary pointer value
test_global_func14 - check that an opaque pointer cannot be passed
test_global_func15 - check that a variable has an unknown value after
it was passed to a global function by pointer
test_global_func16 - check access to uninitialized stack memory
test_global_func_args - check read and write operations through a pointer
Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210212205642.620788-5-me@ubique.spb.ru
Add tests in tc_flower.sh for generic matching on MPLS Label Stack
Entries. The label, tc, bos and ttl fields are tested for the first
and second labels. For each field, the minimal and maximal values are
tested (the former at depth 1 and the later at depth 2).
There are also tests for matching the presence of a label stack entry
at a given depth.
In order to reduce the amount of code, all "lse" subcommands are tested
in match_mpls_lse_test(). Action "continue" is used, so that test
packets are evaluated by all filters. Then, we can verify if each
filter matched the expected number of packets.
Some versions of tc-flower produced invalid json output when dumping
MPLS filters with depth > 1. Skip the test if tc isn't recent enough.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add tests in tc_flower.sh for mpls_label, mpls_tc, mpls_bos and
mpls_ttl. For each keyword, test the minimal and maximal values.
Selectively skip these new mpls tests for tc versions that don't
support them.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
the following command:
# tc filter add dev $h2 ingress protocol ip pref 1 handle 101 flower \
$tcflags dst_ip 192.0.2.2 ip_ttl 63 action drop
doesn't drop all IPv4 packets that match the configured TTL / destination
address. In particular, if "fragment offset" or "more fragments" have non
zero value in the IPv4 header, setting of FLOW_DISSECTOR_KEY_IP is simply
ignored. Fix this dissecting IPv4 TTL and TOS before fragment info; while
at it, add a selftest for tc flower's match on 'ip_ttl' that verifies the
correct behavior.
Fixes: 518d8a2e9b ("net/flow_dissector: add support for dissection of misc ip header fields")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we receive less MPCapable SYN or 3rd ACK than expected, we now mark
the test as failed.
On the other hand, if we receive more, we keep the warning but we add a
hint that it is probably due to retransmissions and that's why we don't
mark the test as failed.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/148
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Info from received MPCapable SYN were printed instead of the ones from
received MPCapable 3rd ACK.
Fixes: fed61c4b58 ("selftests: mptcp: make 2nd net namespace use tcp syn cookies unconditionally")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even if that may sound completely unlikely, the mptcp implementation
is not perfect, yet.
When the self-tests report an error we usually need more information
of what the scripts currently report. iproute allow provides
some additional goodies since a few releases, let's dump them.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding selftest for BPF-helper bpf_check_mtu(). Making sure
it can be used from both XDP and TC.
V16:
- Fix 'void' function definition
V11:
- Addresse nitpicks from Andrii Nakryiko
V10:
- Remove errno non-zero test in CHECK_ATTR()
- Addresse comments from Andrii Nakryiko
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/161287791989.790810.13612620012522164562.stgit@firesoul
This demonstrate how bpf_check_mtu() helper can easily be used together
with bpf_skb_adjust_room() helper, prior to doing size adjustment, as
delta argument is already setup.
Hint: This specific test can be selected like this:
./test_progs -t cls_redirect
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287791481.790810.4444271170546646080.stgit@firesoul
This BPF-helper bpf_check_mtu() works for both XDP and TC-BPF programs.
The SKB object is complex and the skb->len value (accessible from
BPF-prog) also include the length of any extra GRO/GSO segments, but
without taking into account that these GRO/GSO segments get added
transport (L4) and network (L3) headers before being transmitted. Thus,
this BPF-helper is created such that the BPF-programmer don't need to
handle these details in the BPF-prog.
The API is designed to help the BPF-programmer, that want to do packet
context size changes, which involves other helpers. These other helpers
usually does a delta size adjustment. This helper also support a delta
size (len_diff), which allow BPF-programmer to reuse arguments needed by
these other helpers, and perform the MTU check prior to doing any actual
size adjustment of the packet context.
It is on purpose, that we allow the len adjustment to become a negative
result, that will pass the MTU check. This might seem weird, but it's not
this helpers responsibility to "catch" wrong len_diff adjustments. Other
helpers will take care of these checks, if BPF-programmer chooses to do
actual size adjustment.
V14:
- Improve man-page desc of len_diff.
V13:
- Enforce flag BPF_MTU_CHK_SEGS cannot use len_diff.
V12:
- Simplify segment check that calls skb_gso_validate_network_len.
- Helpers should return long
V9:
- Use dev->hard_header_len (instead of ETH_HLEN)
- Annotate with unlikely req from Daniel
- Fix logic error using skb_gso_validate_network_len from Daniel
V6:
- Took John's advice and dropped BPF_MTU_CHK_RELAX
- Returned MTU is kept at L3-level (like fib_lookup)
V4: Lot of changes
- ifindex 0 now use current netdev for MTU lookup
- rename helper from bpf_mtu_check to bpf_check_mtu
- fix bug for GSO pkt length (as skb->len is total len)
- remove __bpf_len_adj_positive, simply allow negative len adj
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287790461.790810.3429728639563297353.stgit@firesoul
The BPF-helpers for FIB lookup (bpf_xdp_fib_lookup and bpf_skb_fib_lookup)
can perform MTU check and return BPF_FIB_LKUP_RET_FRAG_NEEDED. The BPF-prog
don't know the MTU value that caused this rejection.
If the BPF-prog wants to implement PMTU (Path MTU Discovery) (rfc1191) it
need to know this MTU value for the ICMP packet.
Patch change lookup and result struct bpf_fib_lookup, to contain this MTU
value as output via a union with 'tot_len' as this is the value used for
the MTU lookup.
V5:
- Fixed uninit value spotted by Dan Carpenter.
- Name struct output member mtu_result
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287789952.790810.13134700381067698781.stgit@firesoul
Perf failed to add a kretprobe event with debuginfo of vmlinux which is
compiled by gcc with -fpatchable-function-entry option enabled. The
same issue with kernel module.
Issue:
# perf probe -v 'kernel_clone%return $retval'
......
Writing event: r:probe/kernel_clone__return _text+599624 $retval
Failed to write event: Invalid argument
Error: Failed to add events. Reason: Invalid argument (Code: -22)
# cat /sys/kernel/debug/tracing/error_log
[156.75] trace_kprobe: error: Retprobe address must be an function entry
Command: r:probe/kernel_clone__return _text+599624 $retval
^
# llvm-dwarfdump vmlinux |grep -A 10 -w 0x00df2c2b
0x00df2c2b: DW_TAG_subprogram
DW_AT_external (true)
DW_AT_name ("kernel_clone")
DW_AT_decl_file ("/home/code/linux-next/kernel/fork.c")
DW_AT_decl_line (2423)
DW_AT_decl_column (0x07)
DW_AT_prototyped (true)
DW_AT_type (0x00dcd492 "pid_t")
DW_AT_low_pc (0xffff800010092648)
DW_AT_high_pc (0xffff800010092b9c)
DW_AT_frame_base (DW_OP_call_frame_cfa)
# cat /proc/kallsyms |grep kernel_clone
ffff800010092640 T kernel_clone
# readelf -s vmlinux |grep -i kernel_clone
183173: ffff800010092640 1372 FUNC GLOBAL DEFAULT 2 kernel_clone
# objdump -d vmlinux |grep -A 10 -w \<kernel_clone\>:
ffff800010092640 <kernel_clone>:
ffff800010092640: d503201f nop
ffff800010092644: d503201f nop
ffff800010092648: d503233f paciasp
ffff80001009264c: a9b87bfd stp x29, x30, [sp, #-128]!
ffff800010092650: 910003fd mov x29, sp
ffff800010092654: a90153f3 stp x19, x20, [sp, #16]
The entry address of kernel_clone converted by debuginfo is _text+599624
(0x92648), which is consistent with the value of DW_AT_low_pc attribute.
But the symbolic address of kernel_clone from /proc/kallsyms is
ffff800010092640.
This issue is found on arm64, -fpatchable-function-entry=2 is enabled when
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y;
Just as objdump displayed the assembler contents of kernel_clone,
GCC generate 2 NOPs at the beginning of each function.
kprobe_on_func_entry detects that (_text+599624) is not the entry address
of the function, which leads to the failure of adding kretprobe event.
kprobe_on_func_entry
->_kprobe_addr
->kallsyms_lookup_size_offset
->arch_kprobe_on_func_entry // FALSE
The cause of the issue is that the first instruction in the compile unit
indicated by DW_AT_low_pc does not include NOPs.
This issue exists in all gcc versions that support
-fpatchable-function-entry option.
I have reported it to the GCC community:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776
Currently arm64 and PA-RISC may enable fpatchable-function-entry option.
The kernel compiled with clang does not have this issue.
FIX:
This GCC issue only cause the registration failure of the kretprobe event
which doesn't need debuginfo. So, stop using debuginfo for retprobe.
map will be used to query the probe function address.
Signed-off-by: Jianlin Lv <Jianlin.Lv@arm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: clang-built-linux@googlegroups.com
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20210210062646.2377995-1-Jianlin.Lv@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The first time dso__load() was called on a PE file it always returned -1
error. This caused the first call to map__find_symbol() to always fail
on a PE file so the first sample from each PE file always had symbol
<unknown>. Subsequent samples succeed however because the DSO is already
loaded.
This fixes dso__load() to return 0 when successfully loading a DSO with
libbfd.
Fixes: eac9a4342e ("perf symbols: Try reading the symbol table with libbfd")
Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Huw Davies <huw@codeweavers.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Ulrich Czekalla <uczekalla@codeweavers.com>
Link: http://lore.kernel.org/lkml/1671b43b-09c3-1911-dbf8-7f030242fbf7@codeweavers.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
dso__load_bfd_symbols() attempts to load a DSO at its original path,
then closes it and loads the file in the debug cache. This is incorrect.
It should ignore the original file and work with only the debug cache.
The original file may have changed or may not even exist, for example if
the debug cache has been transferred to another machine via "perf
archive".
This fix makes it only load the file in the debug cache.
Further notes from Nicholas:
dso__load_bfd_symbols() is called in a loop from dso__load() for a variety
of paths. These are generated by the various DSO_BINARY_TYPEs in the
binary_type_symtab list at the top of util/symbol.c. In each case the
debugfile passed to dso__load_bfd_symbols() is the path to try.
One of those iterations (the first one I believe) passes the original path
as the debugfile. If the file still exists at the original path, this is
the one that ends up being used in case the debugcache was deleted or the
PE file doesn't have a build-id.
A later iteration (BUILD_ID_CACHE) passes debugfile as the file in the
debugcache if it has a build-id. Even if the file was previously loaded at
its original path, (if I understand correctly) this load will override it
so the debugcache file ends up being used.
Committer notes:
So if it fails to find in the cache, it will eventually hope for the
best and look at the path in the local filesystem, which in many cases
is enough.
At some point we need to switch from this "hope for the best" approach
to one that warns the user that there is no guarantee, if no buildid is
present, that just by looking at the pathname the symbolisation will
work.
Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Huw Davies <huw@codeweavers.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Ulrich Czekalla <uczekalla@codeweavers.com>
Link: http://lore.kernel.org/lkml/e58e1237-94ab-e1c9-a7b9-473531906954@codeweavers.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is what I see after compiling the kernel:
# bpf-next...bpf-next/master
?? tools/bpf/resolve_btfids/libbpf/
Fixes: fc6b48f692 ("tools/resolve_btfids: Build libbpf and libsubcmd in separate directories")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210212010053.668700-1-sdf@google.com
The test dumps information similar to /proc/pid/maps. The first line of
the output is compared against the /proc file to make sure they match.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210212183107.50963-4-songliubraving@fb.com
This patch is to store operation type in packet structure.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20210211133856.2137-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch is to store virtual and physical memory addresses in packet,
which will be used for memory samples.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210211133856.2137-2-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch is to enable sample type PERF_SAMPLE_DATA_SRC for Arm SPE in
the perf data, when output the tracing data, it tells tools that it
contains data source in the memory event.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210211133856.2137-1-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Minor cleanup.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210211183914.4093187-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Migrated to libperf in:
4b247fa731 ("libperf: Adopt xyarray class from perf")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210212043803.365993-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds a "void *owner" member. The existing
bpf_tcp_ca test will ensure the bpf_cubic.o and bpf_dctcp.o
can be loaded.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210212021037.267278-1-kafai@fb.com
When libbpf initializes the kernel's struct_ops in
"bpf_map__init_kern_struct_ops()", it enforces all
pointer types must be a function pointer and rejects
others. It turns out to be too strict. For example,
when directly using "struct tcp_congestion_ops" from vmlinux.h,
it has a "struct module *owner" member and it is set to NULL
in a bpf_tcp_cc.o.
Instead, it only needs to ensure the member is a function
pointer if it has been set (relocated) to a bpf-prog.
This patch moves the "btf_is_func_proto(kern_mtype)" check
after the existing "if (!prog) { continue; }". The original debug
message in "if (!prog) { continue; }" is also removed since it is
no longer valid. Beside, there is a later debug message to tell
which function pointer is set.
The "btf_is_func_proto(mtype)" has already been guaranteed
in "bpf_object__collect_st_ops_relos()" which has been run
before "bpf_map__init_kern_struct_ops()". Thus, this check
is removed.
v2:
- Remove outdated debug message (Andrii)
Remove because there is a later debug message to tell
which function pointer is set.
- Following mtype->type is no longer needed. Remove:
"skip_mods_and_typedefs(btf, mtype->type, &mtype_id)"
- Do "if (!prog)" test before skip_mods_and_typedefs.
Fixes: 590a008882 ("bpf: libbpf: Add STRUCT_OPS support")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210212021030.266932-1-kafai@fb.com
We have the environments where usage of AF_INET is prohibited
(cgroup/sock_create returns EPERM for AF_INET). Let's use
AF_LOCAL instead of AF_INET, it should perfectly work with SIOCETHTOOL.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20210209221826.922940-1-sdf@google.com
This is a tool that is intended to work around the fact that the
preemptoff, irqsoff, and preemptirqsoff tracers only work in
overwrite mode. The idea is to act randomly in such a way that we
do not systematically lose any latencies, so that if enough testing
is done, all latencies will be captured. If the same burst of
latencies is repeated, then sooner or later we will have captured all
the latencies.
It also works with the wakeup_dl, wakeup_rt, and wakeup tracers.
However, in that case it is probably not useful to use the random
sleep functionality.
The reason why it may be desirable to catch all latencies with a long
test campaign is that for some organizations, it's necessary to test
the kernel in the field and not practical for developers to work
iteratively with field testers. Because of cost and project schedules
it is not possible to start a new test campaign every time a latency
problem has been fixed.
It uses inotify to detect changes to /sys/kernel/tracing/trace.
When a latency is detected, it will either sleep or print
immediately, depending on a function that act as an unfair coin
toss.
If immediate print is chosen, it means that we open
/sys/kernel/tracing/trace and thereby cause a blackout period
that will hide any subsequent latencies.
If sleep is chosen, it means that we wait before opening
/sys/kernel/tracing/trace, by default for 1000 ms, to see if
there is another latency during this period. If there is, then we will
lose the previous latency. The coin will be tossed again with a
different probability, and we will either print the new latency, or
possibly a subsequent one.
The probability for the unfair coin toss is chosen so that there
is equal probability to obtain any of the latencies in a burst.
However, this assumes that we make an assumption of how many
latencies there can be. By default the program assumes that there
are no more than 2 latencies in a burst, the probability of immediate
printout will be:
1/2 and 1
Thus, the probability of getting each of the two latencies will be 1/2.
If we ever find that there is more than one latency in a series,
meaning that we reach the probability of 1, then the table will be
expanded to:
1/3, 1/2, and 1
Thus, we assume that there are no more than three latencies and each
with a probability of 1/3 of being captured. If the probability of 1
is reached in the new table, that is we see more than two closely
occurring latencies, then the table will again be extended, and so
on.
On my systems, it seems like this scheme works fairly well, as
long as the latencies we trace are long enough, 300 us seems to be
enough. This userspace program receive the inotify event at the end
of a latency, and it has time until the end of the next latency
to react, that is to open /sys/kernel/tracing/trace. Thus,
if we trace latencies that are >300 us, then we have at least 300 us
to react.
The minimum latency will of course not be 300 us on all systems, it
will depend on the hardware, kernel version, workload and
configuration.
Example usage:
In one shell, give the following command:
sudo latency-collector -rvv -t preemptirqsoff -s 2000 -a 3
This will trace latencies > 2000us with the preemptirqsoff tracer,
using random sleep with maximum verbosity, with a probability
table initialized to a size of 3.
In another shell, generate a few bursts of latencies:
root@host:~# modprobe preemptirq_delay_test delay=3000 test_mode=alternate
burst_size=3
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
If all goes well, you should be getting stack traces that shows
all the different latencies, i.e. you should see all the three
functions preemptirqtest_0, preemptirqtest_1, preemptirqtest_2 in the
stack traces.
Link: https://lkml.kernel.org/r/20210212134421.172750-2-Viktor.Rosendahl@bmw.de
Signed-off-by: Viktor Rosendahl <Viktor.Rosendahl@bmw.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
- Make the nVHE EL2 object relocatable, resulting in much more
maintainable code
- Handle concurrent translation faults hitting the same page
in a more elegant way
- Support for the standard TRNG hypervisor call
- A bunch of small PMU/Debug fixes
- Allow the disabling of symbol export from assembly code
- Simplification of the early init hypercall handling
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmAmjqEPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDoUEQAIrJ7YF4v4gz06a0HG9+b6fbmykHyxlG7jfm
trvctfaiKzOybKoY5odPpNFzhbYOOdXXqYipyTHGwBYtGSy9G/9SjMKSUrfln2Ni
lr1wBqapr9TE+SVKoR8pWWuZxGGbHVa7brNuMbMsMi1wwAsM2/n70H9PXrdq3QiK
Ge1DWLso2oEfhtTwqNKa4dwB2MHjBhBFhhq+Nq5pslm6mmxJaYqz7pyBmw/C+2cc
oU/6kpAa1yPAauptWXtYXJYOMHihxgEa1IdK3Gl0hUyFyu96xVkwH/KFsj+bRs23
QGGCSdy4313hzaoGaSOTK22R98Aeg0wI9a6tcCBvVVjTAztnlu1FPtUZr8e/F7uc
+r8xVJUJFiywt3Zktf/D7YDK9LuMMqFnj0BkI4U9nIBY59XZRNhENsBCmjru5lnL
iXa5cuta03H4emfssIChLpgn0XHFas6t5dFXBPGbXyw0qsQchTw98iQX9LVxefUK
rOUGPIN4nE9ESRIZe0SPlAVeCtNP8cLH7+0YG9MJ1QeDVYaUsnvy9Ln/ox+514mR
5y2KJ6y7xnLB136SKCzPDDloYtz7BDiJq6a/RPiXKGheKoxy+N+BSe58yWCqFZYE
Fx/cGUr7oSg39U7gCboog6BDp5e2CXBfbRllg6P47bZFfdPNwzNEzHvk49VltMxx
Rl2W05bk
=6EwV
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for Linux 5.12
- Make the nVHE EL2 object relocatable, resulting in much more
maintainable code
- Handle concurrent translation faults hitting the same page
in a more elegant way
- Support for the standard TRNG hypervisor call
- A bunch of small PMU/Debug fixes
- Allow the disabling of symbol export from assembly code
- Simplification of the early init hypercall handling
Merge in the recent paravirt changes to resolve conflicts caused
by objtool annotations.
Conflicts:
arch/x86/xen/xen-asm.S
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull RCU updates from Paul E. McKenney:
- Documentation updates.
- Miscellaneous fixes.
- kfree_rcu() updates: Addition of mem_dump_obj() to provide allocator return
addresses to more easily locate bugs. This has a couple of RCU-related commits,
but is mostly MM. Was pulled in with akpm's agreement.
- Per-callback-batch tracking of numbers of callbacks,
which enables better debugging information and smarter
reactions to large numbers of callbacks.
- The first round of changes to allow CPUs to be runtime switched from and to
callback-offloaded state.
- CONFIG_PREEMPT_RT-related changes.
- RCU CPU stall warning updates.
- Addition of polling grace-period APIs for SRCU.
- Torture-test and torture-test scripting updates, including a "torture everything"
script that runs rcutorture, locktorture, scftorture, rcuscale, and refscale.
Plus does an allmodconfig build.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This builds up on the existing socket cookie test which checks whether
the bpf_get_socket_cookie helpers provide the same value in
cgroup/connect6 and sockops programs for a socket created by the
userspace part of the test.
Instead of having an update_cookie sockops program tag a socket local
storage with 0xFF, this uses both an update_cookie_sockops program and
an update_cookie_tracing program which succesively tag the socket with
0x0F and then 0xF0.
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-5-revest@chromium.org
When migrating from the bpf.h's to the vmlinux.h's definition of struct
bps_sock, an interesting LLVM behavior happened. LLVM started producing
two fetches of ctx->sk in the sockops program this means that the
verifier could not keep track of the NULL-check on ctx->sk. Therefore,
we need to extract ctx->sk in a variable before checking and
dereferencing it.
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-4-revest@chromium.org
Currently, the selftest for the BPF socket_cookie helpers is built and
run independently from test_progs. It's easy to forget and hard to
maintain.
This patch moves the socket cookies test into prog_tests/ and vastly
simplifies its logic by:
- rewriting the loading code with BPF skeletons
- rewriting the server/client code with network helpers
- rewriting the cgroup code with test__join_cgroup
- rewriting the error handling code with CHECKs
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-3-revest@chromium.org
This needs a new helper that:
- can work in a sleepable context (using sock_gen_cookie)
- takes a struct sock pointer and checks that it's not NULL
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-2-revest@chromium.org
Since "92acdc58ab11 bpf, net: Rework cookie generator as per-cpu one"
socket cookies are not guaranteed to be non-decreasing. The
bpf_get_socket_cookie helper descriptions are currently specifying that
cookies are non-decreasing but we don't want users to rely on that.
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-1-revest@chromium.org
GCC (GCC) 8.4.0 20200304 fails to build perf with:
: util/symbol.c: In function 'dso__load_bfd_symbols':
: util/symbol.c:1626:16: error: comparison of integer expressions of different signednes
: for (i = 0; i < symbols_count; ++i) {
: ^
: util/symbol.c:1632:16: error: comparison of integer expressions of different signednes
: while (i + 1 < symbols_count &&
: ^
: util/symbol.c:1637:13: error: comparison of integer expressions of different signednes
: if (i + 1 < symbols_count &&
: ^
: cc1: all warnings being treated as errors
It's unlikely that the symtable will be that big, but the fix is an
oneliner and as perf has CORE_CFLAGS += -Wextra, which makes build to
fail together with CORE_CFLAGS += -Werror
Fixes: eac9a4342e ("perf symbols: Try reading the symbol table with libbfd")
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Jacek Caban <jacek@codeweavers.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Link: http://lore.kernel.org/lkml/20210209145148.178702-1-dima@arista.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Considering the following testcase:
int
foo(int a, int b)
{
for (unsigned i = 0; i < 1000000000; i++)
a += b;
return a;
}
int main()
{
foo (3, 4);
return 0;
}
'perf annotate' displays:
86.52 │40055e: → ja 40056c <foo(int, int)+0x26>
13.37 │400560: mov -0x18(%rbp),%eax
│400563: add %eax,-0x14(%rbp)
│400566: addl $0x1,-0x4(%rbp)
0.11 │40056a: → jmp 400557 <foo(int, int)+0x11>
│40056c: mov -0x14(%rbp),%eax
│40056f: pop %rbp
and the 'ja 40056c' does not link to the location in the function. It's
caused by fact that comma is wrongly parsed, it's part of function
signature.
With my patch I see:
86.52 │ ┌──ja 26
13.37 │ │ mov -0x18(%rbp),%eax
│ │ add %eax,-0x14(%rbp)
│ │ addl $0x1,-0x4(%rbp)
0.11 │ │↑ jmp 11
│26:└─→mov -0x14(%rbp),%eax
and 'o' output prints:
86.52 │4005┌── ↓ ja 40056c <foo(int, int)+0x26>
13.37 │4005│0: mov -0x18(%rbp),%eax
│4005│3: add %eax,-0x14(%rbp)
│4005│6: addl $0x1,-0x4(%rbp)
0.11 │4005│a: ↑ jmp 400557 <foo(int, int)+0x11>
│4005└─→ mov -0x14(%rbp),%eax
On the contrary, compiling the very same file with gcc -x c, the parsing
is fine because function arguments are not displayed:
jmp 400543 <foo+0x1d>
Committer testing:
Before:
$ cat cpp_args_annotate.c
int
foo(int a, int b)
{
for (unsigned i = 0; i < 1000000000; i++)
a += b;
return a;
}
int main()
{
foo (3, 4);
return 0;
}
$ gcc --version |& head -1
gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9)
$ gcc -g cpp_args_annotate.c -o cpp_args_annotate
$ perf record ./cpp_args_annotate
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.275 MB perf.data (7188 samples) ]
$ perf annotate --stdio2 foo
Samples: 7K of event 'cycles:u', 4000 Hz, Event count (approx.): 7468429289, [percent: local period]
foo() /home/acme/c/cpp_args_annotate
Percent
0000000000401106 <foo>:
foo():
int
foo(int a, int b)
{
push %rbp
mov %rsp,%rbp
mov %edi,-0x14(%rbp)
mov %esi,-0x18(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
movl $0x0,-0x4(%rbp)
↓ jmp 1d
a += b;
13.45 13: mov -0x18(%rbp),%eax
add %eax,-0x14(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
addl $0x1,-0x4(%rbp)
0.09 1d: cmpl $0x3b9ac9ff,-0x4(%rbp)
86.46 ↑ jbe 13
return a;
mov -0x14(%rbp),%eax
}
pop %rbp
← retq
$
I.e. works for C, now lets switch to C++:
$ g++ -g cpp_args_annotate.c -o cpp_args_annotate
$ perf record ./cpp_args_annotate
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.268 MB perf.data (6976 samples) ]
$ perf annotate --stdio2 foo
Samples: 6K of event 'cycles:u', 4000 Hz, Event count (approx.): 7380681761, [percent: local period]
foo() /home/acme/c/cpp_args_annotate
Percent
0000000000401106 <foo(int, int)>:
foo(int, int):
int
foo(int a, int b)
{
push %rbp
mov %rsp,%rbp
mov %edi,-0x14(%rbp)
mov %esi,-0x18(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
movl $0x0,-0x4(%rbp)
cmpl $0x3b9ac9ff,-0x4(%rbp)
86.53 → ja 40112c <foo(int, int)+0x26>
a += b;
13.32 mov -0x18(%rbp),%eax
0.00 add %eax,-0x14(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
addl $0x1,-0x4(%rbp)
0.15 → jmp 401117 <foo(int, int)+0x11>
return a;
mov -0x14(%rbp),%eax
}
pop %rbp
← retq
$
Reproduced.
Now with this patch:
Reusing the C++ built binary, as we can see here:
$ readelf -wi cpp_args_annotate | grep producer
<c> DW_AT_producer : (indirect string, offset: 0x2e): GNU C++14 10.2.1 20201125 (Red Hat 10.2.1-9) -mtune=generic -march=x86-64 -g
$
And furthermore:
$ file cpp_args_annotate
cpp_args_annotate: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=4fe3cab260204765605ec630d0dc7a7e93c361a9, for GNU/Linux 3.2.0, with debug_info, not stripped
$ perf buildid-list -i cpp_args_annotate
4fe3cab260204765605ec630d0dc7a7e93c361a9
$ perf buildid-list | grep cpp_args_annotate
4fe3cab260204765605ec630d0dc7a7e93c361a9 /home/acme/c/cpp_args_annotate
$
It now works:
$ perf annotate --stdio2 foo
Samples: 6K of event 'cycles:u', 4000 Hz, Event count (approx.): 7380681761, [percent: local period]
foo() /home/acme/c/cpp_args_annotate
Percent
0000000000401106 <foo(int, int)>:
foo(int, int):
int
foo(int a, int b)
{
push %rbp
mov %rsp,%rbp
mov %edi,-0x14(%rbp)
mov %esi,-0x18(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
movl $0x0,-0x4(%rbp)
11: cmpl $0x3b9ac9ff,-0x4(%rbp)
86.53 ↓ ja 26
a += b;
13.32 mov -0x18(%rbp),%eax
0.00 add %eax,-0x14(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
addl $0x1,-0x4(%rbp)
0.15 ↑ jmp 11
return a;
26: mov -0x14(%rbp),%eax
}
pop %rbp
← retq
$
Signed-off-by: Martin Liška <mliska@suse.cz>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Link: http://lore.kernel.org/lkml/13e1a405-edf9-e4c2-4327-a9b454353730@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some of the synthetic event errors and positions have changed in the
code - update those and add several more tests.
Also add a runtime check to ensure that the kernel supports dynamic
strings in synthetic events, which these tests require.
Link: https://lkml.kernel.org/r/51402656433455baead34f068c6e9466b64df9c0.1612208610.git.zanussi@kernel.org
Fixes: 81ff92a93d (selftests/ftrace: Add test case for synthetic event syntax errors)
Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
As started by commit 05a5f51ca5 ("Documentation: Replace lkml.org
links with lore"), replace lkml.org links with lore to better use a
single source that's more likely to stay available long-term.
Signed-off-by: Kees Kook <keescook@chromium.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: http://lore.kernel.org/lkml/20210210234220.2401035-1-keescook@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick a new prctl introduced in:
36a6c843fd ("entry: Use different define for selector variable in SUD")
That don't result in any changes in tooling:
$ tools/perf/trace/beauty/prctl_option.sh > before
$ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
$ tools/perf/trace/beauty/prctl_option.sh > after
$ diff -u before after
Just silences this perf tools build warning:
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The test_xdp_redirect.sh script uses a bash feature, '&>'. On systems,
e.g. Debian, where '/bin/sh' is dash, this will not work as
expected. Use bash in the shebang to get the expected behavior.
Further, using 'set -e' means that the error of a command cannot be
captured without the command being executed with '&&' or '||'. Let us
restructure the ping-commands, and use them as an if-expression, so
that we can capture the return value.
v4: Added missing Fixes:, and removed local variables. (Andrii)
v3: Reintroduced /bin/bash, and kept 'set -e'. (Andrii)
v2: Kept /bin/sh and removed bashisms. (Randy)
Fixes: 996139e801 ("selftests: bpf: add a test for XDP redirect")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210211082029.1687666-1-bjorn.topel@gmail.com
Add a basic test for map-in-map and per-cpu maps in sleepable programs.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/bpf/20210210033634.62081-10-alexei.starovoitov@gmail.com
Add per-program counter for number of times recursion prevention mechanism
was triggered and expose it via show_fdinfo and bpf_prog_info.
Teach bpftool to print it.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210033634.62081-7-alexei.starovoitov@gmail.com
Add recursive non-sleepable fentry program as a test.
All attach points where sleepable progs can execute are non recursive so far.
The recursion protection mechanism for sleepable cannot be activated yet.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210033634.62081-6-alexei.starovoitov@gmail.com
Since both sleepable and non-sleepable programs execute under migrate_disable
add recursion prevention mechanism to both types of programs when they're
executed via bpf trampoline.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210033634.62081-5-alexei.starovoitov@gmail.com
Add a test for the perf daemon 'lock' command ensuring only one instance
of daemon can run over one base directory.
Committer testing:
[root@five ~]# perf test -v daemon
76: daemon operations :
--- start ---
test child forked, pid 793255
test daemon list
test daemon reconfig
test daemon stop
test daemon signal
signal 12 sent to session 'test [793506]'
signal 12 sent to session 'test [793506]'
test daemon ping
test daemon lock
test child finished with 0
---- end ----
daemon operations: Ok
[root@five ~]#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-25-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>