- Yet another fix for non-CPU accesses to the memory backing
the VGICv3 subsystem
- A set of fixes for the setlftest checking for the S1PTW
behaviour after the fix that went in ealier in the cycle
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmPWwGEPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDi/wP/3kbZrZ+y/YNcYioQwqibRS5DKuACKXM1dbh
sMX0e8t3frmkrfkHZ1FsBNjSWtDLmRbjANNDWi8ypAXaPVm7/0whFqkJgyPWDO+v
/1VXYMwMjy2zpWfPGPu+/fQL0Ninp+EfLP3Y2/Lr8VW5rH21bfuQ1rm41ucK/jB5
IsMiQ+YObZUTrSq22fHfNJKc8fysSqeMHW96bl0QnJxf6aDDieZFGF9rlRQf/faq
lPux0faasgQC0VgXlokWGdU1x5kXIf3Ta4VtiKARKNwxziuG8B484+5hHXvoBR1h
bXFJJUQjQs2qBuH75BJftini9fvWvQPgbk4NvkD1tlyMhlZ5w2MTTKB4QmuW/WDT
OGuGXAcuP2stm0dUaSn1aCwzfYgtihssp+RCAB5DOoL64i/CtHl+FJgz8wZfDPRk
UNXdK2JccDfD6bGv/kQqPJoozjI5e8Ha2ks1O4IPHIDpIsVMIWRRGULgIRvLaHaS
iaR7Vx+XgzW50Knj++S85eak/aTSkVaykYZIiiB4DTai1/XuAZfMA79X6IvQLxHq
419FHmXwhJmYdWZ/JFBXWnbR6wRJiv4TR23A5u8X6o/YgBn6fmwAt6o8Avk1quZQ
mslRPHG45hM/7Z7uSEsIQnbVVnHPhbaKr3GmHlJJ4zXRI8GaSMe23wpnJdUj1q9a
w1Oe0rpq
=2l/n
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.2, take #3
- Yet another fix for non-CPU accesses to the memory backing
the VGICv3 subsystem
- A set of fixes for the setlftest checking for the S1PTW
behaviour after the fix that went in ealier in the cycle
With each test taking 4 seconds the runtime of pcm-test can add up. Since
generally each card in the system is physically independent and will be
unaffected by what's going on with other cards we can mitigate this by
testing each card in parallel. Make a list of cards as we enumerate the
system and then start a thread for each, then join the threads to ensure
they have all finished. The threads each run the same tests we currently
run for each PCM on the card before exiting.
The list of PCMs is kept global since it helps with global operations
like working out our planned number of tests and identifying missing PCMs
and it seemed neater to check for PCMs on the right card in the card
thread than make every PCM loop iterate over cards as well.
We don't run per-PCM tests in parallel since in embedded systems it can
be the case that resources are shared between the PCMs and operations on
one PCM on a card may constrain what can be done on another PCM on the same
card leading to potentially unstable results.
We use a mutex to ensure that the reporting of results is serialised and we
don't have issues with anything like the current test number, we could do
this in the kselftest framework but it seems like this might cause problems
for other tests that are doing lower level testing and building in
constrained environments such as nolibc so this seems more sensible.
Note that the ordering of the tests can't be guaranteed as things stand,
this does not seem like a major problem since the numbering of tests often
changes as test programs are changed so results parsers are expected to
rely on the test name rather than the test numbers. We also now prefix the
machine generated test name when printing the description of the test since
this is logged before streaming starts.
On my two card desktop system this reduces the overall runtime by a
third.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230203-alsa-pcm-test-card-thread-v1-1-59941640ebba@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Fix a simple typo in the documentation for bpf_perf_prog_read_value.
Signed-off-by: Florian Lehner <dev@der-flo.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20230203121439.25884-1-dev@der-flo.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Since setbuf(stdout, NULL) has been called in kvm_util.c with
__attribute((constructor)). Selftests no need to setup it in their own
code.
Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230203061038.277655-1-shahuang@redhat.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
The Retire Latency field is added in the var3_w of the
PERF_SAMPLE_WEIGHT_STRUCT. The Retire Latency reports the number of
elapsed core clocks between the retirement of the instruction indicated
by the Instruction Pointer field of the PEBS record and the retirement
of the prior instruction. That's quite useful to display the information
with perf script.
Add a new field retire_lat for the Retire Latency information.
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20230104201349.1451191-9-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Retire Latency field is added in the var3_w of the
PERF_SAMPLE_WEIGHT_STRUCT. The Retire Latency reports pipeline stall of
this instruction compared to the previous instruction in cycles. That's
quite useful to display the information with perf mem report.
The p_stage_cyc for Power is also from the var3_w. Union the p_stage_cyc
and retire_lat to share the code.
Implement X86 specific codes to display the X86 specific header.
Add a new sort key retire_lat for the Retire Latency.
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20230104201349.1451191-8-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It'd be useful to filter other than the current aggregation mode. For
example, users may want to see callstacks for specific locks only. Or
they may want tasks from a certain callstack.
The tracepoints already collected the information but it needs to check
the condition again when processing the event. And it needs to change
BPF to allow the key combinations.
The lock contentions on 'rcu_state' spinlock can be monitored:
$ sudo perf lock con -abv -L rcu_state sleep 1
...
contended total wait max wait avg wait type caller
4 151.39 us 62.57 us 37.85 us spinlock rcu_core+0xcb
0xffffffff81fd1666 _raw_spin_lock_irqsave+0x46
0xffffffff8172d76b rcu_core+0xcb
0xffffffff822000eb __softirqentry_text_start+0xeb
0xffffffff816a0ba9 __irq_exit_rcu+0xc9
0xffffffff81fc0112 sysvec_apic_timer_interrupt+0xa2
0xffffffff82000e46 asm_sysvec_apic_timer_interrupt+0x16
0xffffffff81d49f78 cpuidle_enter_state+0xd8
0xffffffff81d4a259 cpuidle_enter+0x29
1 30.21 us 30.21 us 30.21 us spinlock rcu_core+0xcb
0xffffffff81fd1666 _raw_spin_lock_irqsave+0x46
0xffffffff8172d76b rcu_core+0xcb
0xffffffff822000eb __softirqentry_text_start+0xeb
0xffffffff816a0ba9 __irq_exit_rcu+0xc9
0xffffffff81fc00c4 sysvec_apic_timer_interrupt+0x54
0xffffffff82000e46 asm_sysvec_apic_timer_interrupt+0x16
1 28.84 us 28.84 us 28.84 us spinlock rcu_accelerate_cbs_unlocked+0x40
0xffffffff81fd1c60 _raw_spin_lock+0x30
0xffffffff81728cf0 rcu_accelerate_cbs_unlocked+0x40
0xffffffff8172da82 rcu_core+0x3e2
0xffffffff822000eb __softirqentry_text_start+0xeb
0xffffffff816a0ba9 __irq_exit_rcu+0xc9
0xffffffff81fc0112 sysvec_apic_timer_interrupt+0xa2
0xffffffff82000e46 asm_sysvec_apic_timer_interrupt+0x16
0xffffffff81d49f78 cpuidle_enter_state+0xd8
...
To see tasks calling 'rcu_core' function:
$ sudo perf lock con -abt -S rcu_core sleep 1
contended total wait max wait avg wait pid comm
19 23.46 us 2.21 us 1.23 us 0 swapper
2 18.37 us 17.01 us 9.19 us 2061859 ThreadPoolForeg
3 5.76 us 1.97 us 1.92 us 3909 pipewire-pulse
1 2.26 us 2.26 us 2.26 us 1809271 MediaSu~isor #2
1 1.97 us 1.97 us 1.97 us 1514882 Chrome_ChildIOT
1 987 ns 987 ns 987 ns 3740 pipewire-pulse
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230203021324.143540-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a preparation work to support complex keys of BPF maps. Now it
has single value key according to the aggregation mode like stack_id or
pid. But we want to use a combination of those keys.
Then lock_contention_read() should still aggregate the result based on
the key that was requested by user. The other key info will be used for
filtering.
So instead of creating a lock_stat entry always, Check if it's already
there using lock_stat_find() first.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230203021324.143540-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The lock_contention_get_name() returns a name for the lock stat entry
based on the current aggregation mode. As it's called sequentially in a
single thread, it can return the address of a static buffer for symbol
and offset of the caller.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230203021324.143540-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arm SPEv1.2 adds a new optional address packet type: previous branch
target. The recorded address is the target virtual address of the most
recently taken branch in program order.
Add support for decoding the address packet in raw dumps.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230203162401.132931-1-robh@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a target that generates a log file for running metric_test.py and
make this a dependency on generating pmu-events.c. The log output is
displayed if the test fails like (the test was modified to make it
fail):
```
TEST /tmp/perf/pmu-events/metric_test.log
F......
======================================================================
FAIL: test_Brackets (__main__.TestMetricExpressions)
----------------------------------------------------------------------
Traceback (most recent call last):
File "tools/perf/pmu-events/metric_test.py", line 33, in test_Brackets
self.assertEqual((a * b + c).ToPerfJson(), 'a * b + d')
AssertionError: 'a * b + c' != 'a * b + d'
- a * b + c
? ^
+ a * b + d
? ^
----------------------------------------------------------------------
Ran 7 tests in 0.004s
FAILED (failures=1)
make[3]: *** [pmu-events/Build:32: /tmp/perf/pmu-events/metric_test.log] Error 1
```
However, normal execution will just show the TEST line.
This is roughly modeled on fortify testing in the kernel lib directory.
Modify metric_test.py so that it is executable. This is necessary when
PYTHON isn't specified in the build, the normal case.
Use variables to make the paths to files clearer and more consistent.
Committer notes:
Add pmu-events/metric_test.log to tools/perf/.gitignore and to the
'clean' target on tools/perf/Makefile.perf.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY9x+swAKCRDdBJ7gKXxA
joPwAP95XqB7gzy2l1Mc++Ta7Ih0fS34Pj1vTAxwsRQnqzr6rwD/QOt3YU9KgXpy
D7Fp8NnaQZq6m5o8cvV5+fBqA3uarAM=
=IIB8
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2023-02-02-19-24-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"25 hotfixes, mainly for MM. 13 are cc:stable"
* tag 'mm-hotfixes-stable-2023-02-02-19-24-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (26 commits)
mm: memcg: fix NULL pointer in mem_cgroup_track_foreign_dirty_slowpath()
Kconfig.debug: fix the help description in SCHED_DEBUG
mm/swapfile: add cond_resched() in get_swap_pages()
mm: use stack_depot_early_init for kmemleak
Squashfs: fix handling and sanity checking of xattr_ids count
sh: define RUNTIME_DISCARD_EXIT
highmem: round down the address passed to kunmap_flush_on_unmap()
migrate: hugetlb: check for hugetlb shared PMD in node migration
mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups
Revert "mm: kmemleak: alloc gray object for reserved region with direct map"
freevxfs: Kconfig: fix spelling
maple_tree: should get pivots boundary by type
.mailmap: update e-mail address for Eugen Hristev
mm, mremap: fix mremap() expanding for vma's with vm_ops->close()
squashfs: harden sanity check in squashfs_read_xattr_id_table
ia64: fix build error due to switch case label appearing next to declaration
mm: multi-gen LRU: fix crash during cgroup migration
Revert "mm: add nodes= arg to memory.reclaim"
zsmalloc: fix a race with deferred_handles storing
...
Add quiet_cmd_test so that:
$(Q)$(call echo-cmd,test)
will print:
TEST <path>
This is useful for executing compile-time tests similar to what
happens for fortify tests in the kernel's lib directory.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A character encoding issue added a "3D" character that breaks the
metrics test.
Fixes: 40769665b6 ("perf jevents: Parse metrics during conversion")
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The #slots literal will return NAN when not on ARM64 which causes a
perf test failure when not on an ARM64 for a JEVENTS_ARCH=all build:
..
10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
..
Add an is_test boolean so that the failure can be avoided when running
as a test.
Fixes: acef233b7c ("perf pmu: Add #slots literal support for arm64")
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This allows the set of generated jevents events and metrics be limited
to a subset of the model names. Appropriate if trying to minimize the
binary size where only a set of models are possible.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Turn a perf json event into an event, metric or both. This reduces the
number of events needed to scan to find an event or metric. As events
no longer need the relatively seldom used metric fields, 4 bytes is
saved per event. This reduces the big C string's size by 335kb (14.8%)
on x86.
Note, for the test PMU architecture pme_test_soc_cpu is renamed
pmu_events__test_soc_cpu for consistency with the event vs metric
naming convention.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a metrics table that is just a cast from pmu_events_table. This
changes the APIs so that event and metric usage of the underlying
table is different. For the no jevents case the tables are already
separate, later changes will separate the tables for the jevents case.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Combine into a single function to simplify, in a later change, writing
metrics separately.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Metrics are their own unit and these variables held broken metrics
previously and now just hold the value NULL. Remove code that used
these variables.
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Previous changes separated the uses of pmu_event and pmu_metric,
however, both structures contained all the variables of event and
metric. This change removes the event variables from metric and the
metric variables from event.
Note, this change removes the setting of evsel's metric_name/expr as
these fields are no longer part of struct pmu_event. The metric
remains but is no longer implicitly requested when the event is. This
impacts a few Intel uncore events, however, as the ScaleUnit is shared
by the event and the metric this utility is questionable. Also the
MetricNames look broken (contain spaces) in some cases and when trying
to use the functionality with '-e' the metrics fail but regular
metrics with '-M' work. For example, on SkylakeX '-M' works:
```
$ perf stat -M LLC_MISSES.PCIE_WRITE -a sleep 1
Performance counter stats for 'system wide':
0 UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 # 57896.0 Bytes LLC_MISSES.PCIE_WRITE (49.84%)
7,174 UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 (49.85%)
0 UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 (50.16%)
63 UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 (50.15%)
1.004576381 seconds time elapsed
```
whilst the event '-e' version is broken even with --group/-g (fwiw, we should also remove -g [1]):
```
$ perf stat -g -e LLC_MISSES.PCIE_WRITE -g -a sleep 1
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Performance counter stats for 'system wide':
27,316 Bytes LLC_MISSES.PCIE_WRITE
1.004505469 seconds time elapsed
```
The code also carries warnings where the user is supposed to select
events for metrics [2] but given the lack of use of such a feature,
let's clean the code and just remove.
[1] https://lore.kernel.org/lkml/20220707195610.303254-1-irogers@google.com/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/stat-shadow.c?id=01b8957b738f42f96a130079bc951b3cc78c5b8a#n425
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Separate the event and metric table when building without jevents. Add
find_core_metrics_table and perf_pmu__find_metrics_table while
renaming existing utilities to be event specific, so that users can
find the right table for their need.
Committer notes:
Fix the build on aarch64 with:
tools/perf/arch/arm64/util/pmu.c
@@ -32,7 +32,7 @@ const struct pmu_events_table *pmu_events_table__find(void)
- return perf_pmu__find_table(pmu);
+ return perf_pmu__find_events_table(pmu);
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This release adds following change:
- Minor fixes for coverity static analysis
- Don't read cpufreq on offline CPUs
- SST turbo-freq enable on auto mode when user disables SMT from
kernel command line
- Fix uncore frequency display
- Set uncore frequency max/min limits on perf level change
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
When perf level is changed, uncore limits can change. Set the uncore
limits via Linux uncore sysfs, when user changes perf level with
-o option.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Need memory frequency quirk as Sapphire Rapids in Emerald Rapids.
So add Emerald Rapids CPU model check in is_spr_platform().
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: Subject, changelog and code edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Uncore P1 is not uncore minmum frequency. This is uncore base frequency.
Correct display from uncore-frequency-min(MHz)
to uncore-frequency-base(Mhz).
To get uncore min frequency use mailbox command
CONFIG_TDP_GET_RATIO_INFO. Use this mailbox to get uncore frequency
limits when present.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
When SMT is disabled from kernel command line, sibling CPUs still
appears in the sysfs as offline CPUs. This is a problem when turbo-freq
is enabled in auto mode. They are still assigned to CLOS value
of 3 as they are still in the present CPU list. But they are not in the
sibling list of a CPU. When the CPU is a high priority CPU, because of
sibling it will be still set to CLOS to 3 as CLOS is assigned at core
level not at CPU level.
So, avoid setting CLOS 3 to offline CPU.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Due to some recent kernel changes, reading cpufreq attributes like
scaling_max_freq on offline CPUs returns error. So avoid reading
cpufreq attributes on offline CPUs.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
strlen() and strtok() takes null-termimated strings as input.
Make sure these strings are null-terminated before using them.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add handling for open() failure case to make sure a valid file
descriptor is passed to dup().
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
variable 'non_block' is always 0, thus remove the variable and the
handling for "non_block != 0" case.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
struct isst_id *id is a pointer, comparing it with less than zero is wrong.
The check is there to make sure the id->pkg and id->die is set to -1, when
it is illegal or unavailable. Here comparing with MAX_PACKAGE_COUNT and
MAX_DIE_PER_PACKAGE is sufficient.
Hence remove the wrong check.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: Subject and changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Right now there is no way to provide additional cflags/ldflags when
building tools/vm binaries. And using eg. make CFLAGS=<options> will
override the CFLAGS being set in the Makefile, making the build fail since
it requires the include of the ../lib dir (for libapi).
This change then allows you to specify:
CFLAGS=<options> LDFLAGS=<options> make V=1 -C tools/vm
And the options will be correctly appended as can be seen from the
make output.
Link: https://lkml.kernel.org/r/20230116224921.4106324-1-herton@redhat.com
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Justin Forbes <jforbes@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Scott Weaver <scweaver@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
__GFP_ATOMIC serves little purpose. Its main effect is to set
ALLOC_HARDER which adds a few little boosts to increase the chance of an
allocation succeeding, one of which is to lower the water-mark at which it
will succeed.
It is *always* paired with __GFP_HIGH which sets ALLOC_HIGH which also
adjusts this watermark. It is probable that other users of __GFP_HIGH
should benefit from the other little bonuses that __GFP_ATOMIC gets.
__GFP_ATOMIC also gives a warning if used with __GFP_DIRECT_RECLAIM.
There is little point to this. We already get a might_sleep() warning if
__GFP_DIRECT_RECLAIM is set.
__GFP_ATOMIC allows the "watermark_boost" to be side-stepped. It is
probable that testing ALLOC_HARDER is a better fit here.
__GFP_ATOMIC is used by tegra-smmu.c to check if the allocation might
sleep. This should test __GFP_DIRECT_RECLAIM instead.
This patch:
- removes __GFP_ATOMIC
- allows __GFP_HIGH allocations to ignore watermark boosting as well
as GFP_ATOMIC requests.
- makes other adjustments as suggested by the above.
The net result is not change to GFP_ATOMIC allocations. Other
allocations that use __GFP_HIGH will benefit from a few different extra
privileges. This affects:
xen, dm, md, ntfs3
the vermillion frame buffer
hibernation
ksm
swap
all of which likely produce more benefit than cost if these selected
allocation are more likely to succeed quickly.
[mgorman: Minor adjustments to rework on top of a series]
Link: https://lkml.kernel.org/r/163712397076.13692.4727608274002939094@noble.neil.brown.name
Link: https://lkml.kernel.org/r/20230113111217.14134-7-mgorman@techsingularity.net
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The parameter entry of mas_preallocate is not used, so drop it.
Link: https://lkml.kernel.org/r/20230110154211.1758562-1-vernon2gm@gmail.com
Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
A selftest case for DAMON debugfs interface has a test for expected
failure. To make the test output clean, hide the expected failure error
message.
Link: https://lkml.kernel.org/r/20230110190400.119388-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
DAMON selftests for sysfs (sysfs.sh) tests if some writes to DAMON sysfs
interface files fails as expected. It makes the test results noisy with
the failure error message because it tests a number of such failures.
Redirect the expected failure error messages to /dev/null to make the
results clean.
Link: https://lkml.kernel.org/r/20230110190400.119388-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The number of online cpu may be not equal to possible cpu.
"bpftool prog profile" can not create pmu event on possible
but on online cpu.
$ dmidecode -s system-product-name
PowerEdge R620
$ cat /sys/devices/system/cpu/possible
0-47
$ cat /sys/devices/system/cpu/online
0-31
Disable cpu dynamically:
$ echo 0 > /sys/devices/system/cpu/cpuX/online
If one cpu is offline, perf_event_open will return ENODEV.
To fix this issue:
* check value returned and skip offline cpu.
* close pmu_fd immediately on error path, avoid fd leaking.
Fixes: 47c09d6a9f ("bpftool: Introduce "prog profile" command")
Signed-off-by: Tonghao Zhang <tong@infragraf.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20230202131701.29519-1-tong@infragraf.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
The usual mixed bag. So far this has been a quiet cycle for IIO.
New device support
* adi,ad8686
- Add support for the AD5337 DAC - ID and 8 bit channel support.
* maxim,max5522
- New driver for this 2 channel DAC.
* nxp,imx93-adc
- New driver for this SoC ADC which is a fresh IP that will probably
turn up in additional SoCs going forwards.
* st,magn
- Add support for magnetometer part of LSM303C which is very similar
to standalone LIS3MDL already supported.
* ti,ads7924
- New driver for this 4 channel, 12-bit I2C ADC.
* ti,lmp92064
- New driver for this 12 bit SPI ADC.
* ti,tmag5273
- New driver for this 3D Hall-Effect Sensor.
Features
* core
- Add a standard structure for the value pairs in IIO_VAL_INT_PLUS_MICRO
available attributes and similar.
* cirrus,ep93xx
- Add DT binding docs and convert driver to DT based probing.
- Enable testing building with CONFIG_COMPILE_TEST.
* st,stm32-dfsdm
- Enable ID register support for discovery of hardware capabilities on
some devices.
Cleanups and minor fixes
* core
- Drop the custom iio_sysfs_match_string_with_gaps().
The special ability of this function to skip gaps in an array
was never used by any upstream driver.
- Sort headers whilst touching this file.
* tools
- Fix memory leak in iio_utils.c
* various
- leftover i2c probe_new() conversions.
- scnprintf() -> sysfs_emit() cleanups.
- hand rolled devm enables -> devm_regulator[_bulk]_get_enable()
- typo fixes
- dt-binding cleanup (whitespace, excess quotes and similar)
* adi,ad7746
- Set variable without pointless conditional.
* fsl,mma9551
- Squash false positives about use of uninitialized variable where
garbage undergoes an endian conversion before being ignored.
* measspec,ms5611
- Switch to fully devm_ managed probe() and so drop explicit remove()
* qcom,spmi-adc
- Use dev_err_probe() to suppress deferred print.
* qcom,spmi-adc5
- Define a missing channel used for battery identification.
* qcom,spmi-iadc
- Document a compatible seen in wild.
* semtech,sx9360
- Fix units on semtech,resolution dt-binding.
* sensiron,scd30
- dev_err_probe() usage to simplify error paths a little.
* st,lsm6dsx
- Add missing mount matrix for the gyro IIO device.
* taos,tsl2563
- Respect firmware configured interrupt polarity if present.
- Use i2c_smbus_write_word_data() in a few cases not previously covered.
- Factor out duplicated interrupt configuration.
- Switch to GENMASK() / BIT() from hand coded equivalents.
- Tidy up unused definitions.
- Use dev_err_probe() as appropriate.
- Drop platform_data as no in kernel users and there are better ways to
do equivalent if any are added.
- Add local struct device variable to tidy up code.
- Avoid dance via i2c_client to get the drvdata.
- Tidy up headers ordering and Makefile ordering.
* ti,adc128s052
- Use new spi_get_device_match_data().
- Drop ACPI_PTR() protection.
- Sort headers whilst here.
- Use asm instead of incorrect include of asm-generic/unaligned.h
* vishay,vcn4000
- Interrupt support for vcnl4040 (lots of refactoring needed)
* xilinx,ams
- Use fwnode_device_is_compatible() instead of open coding it.
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAmPb+A4RHGppYzIzQGtl
cm5lbC5vcmcACgkQVIU0mcT0FoiXQw/+N0ubxewmq7dUfhgzTv2GLjhNBlVJXgC0
GyXNEF3r2PMmKHQCpUHcJo2NxsOx4tpYhb0J61kuaCjBlRR3ON+KRKNWK+PLVGZh
I/ljRDxHo0vw9RQs1d+MRZAobZDVkEvJOoovtQC4Xna8vs+HDNVgpiQA/xVCHTF4
2qxqlRMloTpI4RprLTu6oeDgCthX3RcMM9yw+9N2dcQBw1MxqEZzkJej2dOJncz7
AJ7Wtw62EDhG+rOaWQHNUMTxOxVBZkJd3KEvc2aKDT8D08iZRtpWCGauaOrRn6Ss
IddopUW7ihAyyxDYXxLzSRJLs4XJ7t5HlKUJX4tSyuoMU/roPP8ACYte5SbiC9eD
wJ65wWQqkCIhgOD6KP2Q44GYDtWCARioun181Et4mcJTPduUiKZxRhyD6aiXcLGr
vJWaDcOmu1OtUbM57cNDaft3bEi8qh1+uY05ex7TRyonlw+d9QQvVBkaCWBvJwIm
DHJf6p9V6srxHhpsQziG3xYHzacc4GNP7+nr/rQw7lvatJYEYujUJEKfJ1QsDn9G
gNMkZeSga1b2Tsvg4InNPfrzO+K3XlT1HSF2nBZE9+o+w++5oX3DQng+2nnapzL0
jTTfpw6B4p40hbJB0bykfoy+XeDpgPhacUZvWwKmwLIy2/LnsorJgIc9aNkOfsvx
a+/m7/8/zaM=
=Dfec
-----END PGP SIGNATURE-----
Merge tag 'iio-for-6.3a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next
Jonathan writes:
1st set of IIO new device support, features and cleanups for the 6.3 cycle
The usual mixed bag. So far this has been a quiet cycle for IIO.
New device support
* adi,ad8686
- Add support for the AD5337 DAC - ID and 8 bit channel support.
* maxim,max5522
- New driver for this 2 channel DAC.
* nxp,imx93-adc
- New driver for this SoC ADC which is a fresh IP that will probably
turn up in additional SoCs going forwards.
* st,magn
- Add support for magnetometer part of LSM303C which is very similar
to standalone LIS3MDL already supported.
* ti,ads7924
- New driver for this 4 channel, 12-bit I2C ADC.
* ti,lmp92064
- New driver for this 12 bit SPI ADC.
* ti,tmag5273
- New driver for this 3D Hall-Effect Sensor.
Features
* core
- Add a standard structure for the value pairs in IIO_VAL_INT_PLUS_MICRO
available attributes and similar.
* cirrus,ep93xx
- Add DT binding docs and convert driver to DT based probing.
- Enable testing building with CONFIG_COMPILE_TEST.
* st,stm32-dfsdm
- Enable ID register support for discovery of hardware capabilities on
some devices.
Cleanups and minor fixes
* core
- Drop the custom iio_sysfs_match_string_with_gaps().
The special ability of this function to skip gaps in an array
was never used by any upstream driver.
- Sort headers whilst touching this file.
* tools
- Fix memory leak in iio_utils.c
* various
- leftover i2c probe_new() conversions.
- scnprintf() -> sysfs_emit() cleanups.
- hand rolled devm enables -> devm_regulator[_bulk]_get_enable()
- typo fixes
- dt-binding cleanup (whitespace, excess quotes and similar)
* adi,ad7746
- Set variable without pointless conditional.
* fsl,mma9551
- Squash false positives about use of uninitialized variable where
garbage undergoes an endian conversion before being ignored.
* measspec,ms5611
- Switch to fully devm_ managed probe() and so drop explicit remove()
* qcom,spmi-adc
- Use dev_err_probe() to suppress deferred print.
* qcom,spmi-adc5
- Define a missing channel used for battery identification.
* qcom,spmi-iadc
- Document a compatible seen in wild.
* semtech,sx9360
- Fix units on semtech,resolution dt-binding.
* sensiron,scd30
- dev_err_probe() usage to simplify error paths a little.
* st,lsm6dsx
- Add missing mount matrix for the gyro IIO device.
* taos,tsl2563
- Respect firmware configured interrupt polarity if present.
- Use i2c_smbus_write_word_data() in a few cases not previously covered.
- Factor out duplicated interrupt configuration.
- Switch to GENMASK() / BIT() from hand coded equivalents.
- Tidy up unused definitions.
- Use dev_err_probe() as appropriate.
- Drop platform_data as no in kernel users and there are better ways to
do equivalent if any are added.
- Add local struct device variable to tidy up code.
- Avoid dance via i2c_client to get the drvdata.
- Tidy up headers ordering and Makefile ordering.
* ti,adc128s052
- Use new spi_get_device_match_data().
- Drop ACPI_PTR() protection.
- Sort headers whilst here.
- Use asm instead of incorrect include of asm-generic/unaligned.h
* vishay,vcn4000
- Interrupt support for vcnl4040 (lots of refactoring needed)
* xilinx,ams
- Use fwnode_device_is_compatible() instead of open coding it.
* tag 'iio-for-6.3a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (71 commits)
iio: adc: ad7291: Fix indentation error by adding extra spaces
iio: accel: mma9551_core: Prevent uninitialized variable in mma9551_read_config_word()
iio: accel: mma9551_core: Prevent uninitialized variable in mma9551_read_status_word()
dt-bindings: iio/proximity: semtech,sx9360: Fix 'semtech,resolution' type
iio: imu: fix spdx format
iio: adc: imx93: Fix spelling mistake "geting" -> "getting"
dt-bindings: iio: cleanup examples - indentation
dt-bindings: iio: use lowercase hex in examples
dt-bindings: iio: correct node names in examples
dt-bindings: iio: minor whitespace cleanups
dt-bindings: iio: drop unneeded quotes
dt-bindings: iio: adc: Add NXP IMX93 ADC
iio: adc: add imx93 adc support
dt-bindings: iio: adc: add Texas Instruments ADS7924
iio: adc: ti-ads7924: add Texas Instruments ADS7924 driver
iio: imu: st_lsm6dsx: add 'mount_matrix' sysfs entry to gyro channel.
iio: imu: st_lsm6dsx: fix naming of 'struct iio_info' in st_lsm6dsx_shub.c.
iio: light: vcnl4000: Add interrupt support for vcnl4040
iio: light: vcnl4000: Make irq handling more generic
iio: light: vcnl4000: Prepare for more generic setup
...
Introduce xdp_features tool in order to test XDP features supported by
the NIC and match them against advertised ones.
In order to test supported/advertised XDP features, xdp_features must
run on the Device Under Test (DUT) and on a Tester device.
xdp_features opens a control TCP channel between DUT and Tester devices
to send control commands from Tester to the DUT and a UDP data channel
where the Tester sends UDP 'echo' packets and the DUT is expected to
reply back with the same packet. DUT installs multiple XDP programs on the
NIC to test XDP capabilities and reports back to the Tester some XDP stats.
Currently xdp_features supports the following XDP features:
- XDP_DROP
- XDP_ABORTED
- XDP_PASS
- XDP_TX
- XDP_REDIRECT
- XDP_NDO_XMIT
Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/7c1af8e7e6ef0614cf32fa9e6bdaa2d8d605f859.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Extend bpf_xdp_query routine in order to get XDP/XSK supported features
of netdev over route netlink interface.
Extend libbpf netlink implementation in order to support netlink_generic
protocol.
Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Co-developed-by: Marek Majtyka <alardam@gmail.com>
Signed-off-by: Marek Majtyka <alardam@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/a72609ef4f0de7fee5376c40dbf54ad7f13bfb8d.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a Netlink spec-compatible family for netdevs.
This is a very simple implementation without much
thought going into it.
It allows us to reap all the benefits of Netlink specs,
one can use the generic client to issue the commands:
$ ./cli.py --spec netdev.yaml --dump dev_get
[{'ifindex': 1, 'xdp-features': set()},
{'ifindex': 2, 'xdp-features': {'basic', 'ndo-xmit', 'redirect'}},
{'ifindex': 3, 'xdp-features': {'rx-sg'}}]
the generic python library does not have flags-by-name
support, yet, but we also don't have to carry strings
in the messages, as user space can get the names from
the spec.
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Co-developed-by: Marek Majtyka <alardam@gmail.com>
Signed-off-by: Marek Majtyka <alardam@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/327ad9c9868becbe1e601b580c962549c8cd81f2.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Just silence the following build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/1675319486-27744-2-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
xdp_synproxy/xdp fails in CI with:
Error: bpf_tc_hook_create: File exists
The XDP version of the test should not be calling bpf_tc_hook_create();
the reason it's happening anyway is that if we don't specify --tc on the
command line, tc variable remains uninitialized.
Fixes: 784d5dc0ef ("selftests/bpf: Add selftests for raw syncookie helpers in TC mode")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Reported-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230202235335.3403781-1-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Current release - regressions:
- phy: fix null-deref in phy_attach_direct
- mac802154: fix possible double free upon parsing error
Previous releases - regressions:
- bpf: preserve reg parent/live fields when copying range info,
prevent mis-verification of programs as safe
- ip6: fix GRE tunnels not generating IPv6 link local addresses
- phy: dp83822: fix null-deref on DP83825/DP83826 devices
- sctp: do not check hb_timer.expires when resetting hb_timer
- eth: mtk_sock: fix SGMII configuration after phylink conversion
Previous releases - always broken:
- eth: xdp: execute xdp_do_flush() before napi_complete_done()
- skb: do not mix page pool and page referenced frags in GRO
- bpf:
- fix a possible task gone issue with bpf_send_signal[_thread]()
- fix an off-by-one bug in bpf_mem_cache_idx() to select
the right cache
- add missing btf_put to register_btf_id_dtor_kfuncs
- sockmap: fon't let sock_map_{close,destroy,unhash} call itself
- gso: fix null-deref in skb_segment_list()
- mctp: purge receive queues on sk destruction
- fix UaF caused by accept on already connected socket in exotic
socket families
- tls: don't treat list head as an entry in tls_is_tx_ready()
- netfilter: br_netfilter: disable sabotage_in hook after first
suppression
- wwan: t7xx: fix runtime PM implementation
Misc:
- MAINTAINERS: spring cleanup of networking maintainers
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmPcKlkACgkQMUZtbf5S
IrtHgxAAxvuXeN9kQ+/gDYbxTa4Fc3gwCAA7SdJiBdfxP9xJAUEBs9TNfh5MPynb
iIWl9tTe1OOQtWJTnp5CKMuegUj4hlJVYuLu4MK36hW56Q41cbCFvhsDk/1xoFTF
2mtmEr5JiXZfgEdf9/tEITTGKgfANsbXOAzL4b5HcV9qtvwvEd2aFxll3VO9N2Xn
k6o59Lr2mff7NnXQsrVkTLBxXRK9oir8VwyTnCs7j+T7E8Qe4hIkx2LDWcDiRC1e
vSxfoWFoe+uOGTQMxlarMOAgAOt7nvOQngwCaaVpMBa/OLzGo51Clf98cpPm+cSi
18ZjmA8r5owGghd75p2PtxcND8U1vXeeRJW+FKK60K8qznMc0D4s7HX+wHBfevnV
U649F0OHsNsX0Y75Z5sqA1eSmJbTR9QeyiRbuS6nfOnT7ZKDw6vPJrKGJRN789y0
erzG/JwrfnrCDavXfNNEgk0mMlP0squemffWjuH6FfL4Pt4/gUz63Nokr5vRFSns
yskHZgXVs4gq++yrbZDZjKfQaPDkA1P/z6VADVRWh4Y5m1m6GtvDUy6vJxK7/hUs
5giqmhbAAQq0U+2L5RIaoYGfjk2UaDcunNpqTCgjdCdMXx9Yz41QBloFgBjRshp/
3kg6ElYPzflXXCYK4pGRrzqO4Gqz1Hrvr9YuT+kka+wrpZPDtWc=
=OGU7
-----END PGP SIGNATURE-----
Merge tag 'net-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, can and netfilter.
Current release - regressions:
- phy: fix null-deref in phy_attach_direct
- mac802154: fix possible double free upon parsing error
Previous releases - regressions:
- bpf: preserve reg parent/live fields when copying range info,
prevent mis-verification of programs as safe
- ip6: fix GRE tunnels not generating IPv6 link local addresses
- phy: dp83822: fix null-deref on DP83825/DP83826 devices
- sctp: do not check hb_timer.expires when resetting hb_timer
- eth: mtk_sock: fix SGMII configuration after phylink conversion
Previous releases - always broken:
- eth: xdp: execute xdp_do_flush() before napi_complete_done()
- skb: do not mix page pool and page referenced frags in GRO
- bpf:
- fix a possible task gone issue with bpf_send_signal[_thread]()
- fix an off-by-one bug in bpf_mem_cache_idx() to select the right
cache
- add missing btf_put to register_btf_id_dtor_kfuncs
- sockmap: fon't let sock_map_{close,destroy,unhash} call itself
- gso: fix null-deref in skb_segment_list()
- mctp: purge receive queues on sk destruction
- fix UaF caused by accept on already connected socket in exotic
socket families
- tls: don't treat list head as an entry in tls_is_tx_ready()
- netfilter: br_netfilter: disable sabotage_in hook after first
suppression
- wwan: t7xx: fix runtime PM implementation
Misc:
- MAINTAINERS: spring cleanup of networking maintainers"
* tag 'net-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits)
mtk_sgmii: enable PCS polling to allow SFP work
net: mediatek: sgmii: fix duplex configuration
net: mediatek: sgmii: ensure the SGMII PHY is powered down on configuration
MAINTAINERS: update SCTP maintainers
MAINTAINERS: ipv6: retire Hideaki Yoshifuji
mailmap: add John Crispin's entry
MAINTAINERS: bonding: move Veaceslav Falico to CREDITS
net: openvswitch: fix flow memory leak in ovs_flow_cmd_new
net: ethernet: mtk_eth_soc: disable hardware DSA untagging for second MAC
virtio-net: Keep stop() to follow mirror sequence of open()
selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy benchmarking
selftests: net: udpgso_bench: Fix racing bug between the rx/tx programs
selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are provided
selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warning
can: mcp251xfd: mcp251xfd_ring_set_ringparam(): assign missing tx_obj_num_coalesce_irq
can: isotp: split tx timer into transmission and timeout
can: isotp: handle wait_event_interruptible() return values
can: raw: fix CAN FD frame transmissions over CAN XL devices
can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivate
hv_netvsc: Fix missed pagebuf entries in netvsc_dma_map/unmap()
...
Create a new pmu_metric for the metric related variables from pmu_event
but that is initially just a clone of pmu_event. Add iterators for
pmu_metric and use in places that metrics are desired rather than
events. Make the event iterator skip metric only events, and the metric
iterator skip event only events.
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rewrite metrics within the same file in terms of each other. For example, on Power8
other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
Which more closely matches the definition on Power9.
To avoid recomputation decorate the function with a cache.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add RewriteMetricsInTermsOfOthers that iterates over pairs of names and
expressions trying to replace an expression, within the current
expression, with its name.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
rhs may not be defined, say for source_count, so add a guard.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Show the branch speculation info if provided by the branch recording
hardware feature. This can be useful for optimizing code further.
The speculation info is appended to the end of the list of fields so any
existing tools that use "/" as a delimiter for access fields via an index
remain unaffected. Also show "-" instead of "N/A" when speculation info
is unavailable because "/" is used as the field separator.
E.g.
$ perf record -j any,u,save_type ./test_branch
$ perf script --fields brstacksym
Before:
[...]
check_match+0x60/strcmp+0x0/P/-/-/0/CALL
do_lookup_x+0x3c5/check_match+0x0/P/-/-/0/CALL
[...]
After:
[...]
check_match+0x60/strcmp+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
do_lookup_x+0x3c5/check_match+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
[...]
The bitfield swapping scheme used duing sample parsing has changed
because of the addition of new branch flags, namely "spec", "new_type"
and "priv". Earlier, these were all part of the "reserved" field but
now, each of these fields get swapped separately. Change the expected
flag values accordingly for the test to pass.
E.g.
$ perf test -v 27
Before:
27: Sample parsing :
--- start ---
test child forked, pid 61979
parsing failed for sample_type 0x800
test child finished with -1
---- end ----
Sample parsing: FAILED!
After:
27: Sample parsing :
--- start ---
test child forked, pid 63293
test child finished with 0
---- end ----
Sample parsing: Ok
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/56e272583552526e999ba0b536ac009ae3613966.1675333809.git.sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This commit adds the execve syscall benchmark, more syscall benchmarks
can be added in the future.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1668052208-14047-5-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This commit adds a simple getpgid syscall benchmark, more syscall
benchmarks can be added in the future.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1668052208-14047-4-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the current code, there is only a basic syscall benchmark via
getppid, this is not enough. Introduce bench_syscall_common() so that we
can add more syscalls to benchmark.
This is preparation for later patch, no functionality change.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1668052208-14047-3-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is better to keep list sorted by number in unistd_{32,64}.h,
so that we can add more syscall number to a proper position.
This is preparation for later patch, no functionality change.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1668052208-14047-2-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As detailed in https://www.shellcheck.net/wiki/SC2006:
The use of `...` is legacy syntax with several issues:
1. It has a series of undefined behaviors related to quoting in POSIX.
2. It imposes a custom escaping mode with surprising results.
3. It's exceptionally hard to nest.
$(...) command substitution has none of these problems,
and is therefore strongly encouraged.
Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Acked-by: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230201214945.127474-3-didi.debian@cknow.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To count the number of results from grep, use the '-c' parameter
instead of piping it to 'wc'.
See also https://www.shellcheck.net/wiki/SC2126
Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Acked-by: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230201214945.127474-2-didi.debian@cknow.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current display code for perf stat iterates given cpus and build the
aggr map to collect the event data for the aggregation mode.
But uncore events have their own cpu maps and it won't guarantee that
it'd match to the aggr map. For example, per-package uncore events
would generate a single value for each socket. When user asks per-core
aggregation mode, the output would contain 0 values for other cores.
Thus it needs to check the uncore PMU's cpumask and if it matches to the
current aggregation id.
Before:
$ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1
Performance counter stats for 'system wide':
S0-D0-C0 1 3.73 Joules power/energy-pkg/
S0-D0-C1 0 <not counted> Joules power/energy-pkg/
S0-D0-C2 0 <not counted> Joules power/energy-pkg/
S0-D0-C3 0 <not counted> Joules power/energy-pkg/
1.001404046 seconds time elapsed
Some events weren't counted. Try disabling the NMI watchdog:
echo 0 > /proc/sys/kernel/nmi_watchdog
perf stat ...
echo 1 > /proc/sys/kernel/nmi_watchdog
The core 1, 2 and 3 should not be printed because the event is handled
in a cpu in the core 0 only. With this change, the output becomes like
below.
After:
$ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1
Performance counter stats for 'system wide':
S0-D0-C0 1 2.09 Joules power/energy-pkg/
Fixes: b897613510 ("perf stat: Update event skip condition for system-wide per-thread mode and merged uncore and hybrid events")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230125192431.2929677-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -S/--callstack-filter is to limit display entries having the given
string in the callstack (not only in the caller in the output).
The following example shows lock contention results if the callstack
has 'net' substring somewhere. Note that the caller '__dev_queue_xmit'
does not match to it, but it has 'inet6_csk_xmit' in the callstack.
This applies even if you don't use -v option to show the full callstack.
$ sudo ./perf lock con -abv -S net sleep 1
...
contended total wait max wait avg wait type caller
5 70.20 us 16.13 us 14.04 us spinlock __dev_queue_xmit+0xb6d
0xffffffffa5dd1c60 _raw_spin_lock+0x30
0xffffffffa5b8f6ed __dev_queue_xmit+0xb6d
0xffffffffa5cd8267 ip6_finish_output2+0x2c7
0xffffffffa5cdac14 ip6_finish_output+0x1d4
0xffffffffa5cdb477 ip6_xmit+0x457
0xffffffffa5d1fd17 inet6_csk_xmit+0xd7
0xffffffffa5c5f4aa __tcp_transmit_skb+0x54a
0xffffffffa5c6467d tcp_keepalive_timer+0x2fd
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230126000936.3017683-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's no field for the cgroup, let's add one. To do that, users need to
specify --all-cgroup option for perf record to capture the cgroup info.
$ perf record --all-cgroups -- true
$ perf script -F comm,pid,cgroup
true 337112 /user.slice/user-657345.slice/user@657345.service/...
true 337112 /user.slice/user-657345.slice/user@657345.service/...
true 337112 /user.slice/user-657345.slice/user@657345.service/...
true 337112 /user.slice/user-657345.slice/user@657345.service/...
If it's recorded without the --all-cgroups, it'd complain.
$ perf script -F comm,pid,cgroup
Samples for 'cycles:u' event do not have CGROUP attribute set. Cannot print 'cgroup' field.
Hint: run 'perf record --all-cgroups ...'
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230126213610.3381147-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The canonical location for the tracefs filesystem is at /sys/kernel/tracing.
But, from Documentation/trace/ftrace.rst:
Before 4.1, all ftrace tracing control files were within the debugfs
file system, which is typically located at /sys/kernel/debug/tracing.
For backward compatibility, when mounting the debugfs file system,
the tracefs file system will be automatically mounted at:
/sys/kernel/debug/tracing
A few spots in the perf docs still refer to this older debugfs path, so
let's update them to avoid confusion.
Signed-off-by: Ross Zwisler <zwisler@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: linux-trace-kernel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20230130181915.1113313-5-zwisler@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Unknown address packet indexes are not an error as the Arm architecture
can (and has with SPEv1.2) define new ones and implementation defined
ones are also allowed. The error message for every occurrence of the
packet is needlessly noisy as well. Change the message to print just
once for each unknown index.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230127205546.667740-1-robh@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This problem was encountered on an arm64 system with a lot of memory.
Without kernel debug symbols installed, and with both kcore and kallsyms
available, perf managed to get confused and returned "unknown" for all
of the kernel symbols that it tried to look up.
On this system, stext fell within the vmalloc segment. The kcore symbol
matching code tries to find the first segment that contains stext and
uses that to replace the segment generated from just the kallsyms
information. In this case, however, there were two: a very large
vmalloc segment, and the text segment. This caused perf to get confused
because multiple overlapping segments were inserted into the RB tree
that holds the discovered segments. However, that alone wasn't
sufficient to cause the problem. Even when we could find the segment,
the offsets were adjusted in such a way that the newly generated symbols
didn't line up with the instruction addresses in the trace. The most
obvious solution would be to consult which segment type is text from
kcore, but this information is not exposed to users.
Instead, select the smallest matching segment that contains stext
instead of the first matching segment. This allows us to match the text
segment instead of vmalloc, if one is contained within the other.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Reaver <me@davidreaver.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230125183418.GD1963@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
While parsing the tracepoint events in parse_events_add_tracepoint()
function, code checks for HAVE_LIBTRACEEVENT support. This is needed
since libtraceevent is necessary for tracepoint. But while adding probe
points, check for LIBTRACEEVENT is not done in case of perf probe.
Hence, in environment with missing libtraceevent-devel, it is observed
that adding a probe point shows below message though it can't be used
via perf record.
Example:
Adding probe point:
./perf probe 'vfs_getname=getname_flags:72 pathname=result->name:string'
Added new event:
probe:vfs_getname (on getname_flags:72 with pathname=result->name:string)
You can now use it in all perf tools, such as:
perf record -e probe:vfs_getname -aR sleep 1
But trying perf record:
./perf record -e probe:vfs_getname -aR sleep 1
event syntax error: 'probe:vfs_getname'
\___ unsupported tracepoint
libtraceevent is necessary for tracepoint support
Run 'perf list' for a list of valid events
The builtin tool like perf record needs libtraceevent to
parse tracefs. But still the probe can be used by enabling
via tracefs. Patch fixes the probe usage message to the user
based on presence of libtraceevent. With the fix,
# ./perf probe 'pmu:myprobe=schedule'
Added new event:
pmu:myprobe (on schedule)
perf is not linked with libtraceevent, to use the new probe you can use tracefs:
cd /sys/kernel/tracing/
echo 1 > events/pmu/myprobe/enable
echo 1 > tracing_on
cat trace_pipe
Before removing the probe, echo 0 > events/pmu/myprobe/enable
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230131134748.54567-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For x86_64, determine a symbol for .plt.got entries. That requires
computing the target offset and finding that in .rela.dyn, which in
turn means .rela.dyn needs to be sorted by offset.
Example:
In this example, the GNU C Library is using .plt.got for malloc and
free.
Before:
$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.027 MB perf.data ]
$ perf script --itrace=be --ns -F-event,+addr,-period,-comm,-tid,-cpu > /tmp/cmp1.txt
After:
$ perf script --itrace=be --ns -F-event,+addr,-period,-comm,-tid,-cpu > /tmp/cmp2.txt
$ diff /tmp/cmp1.txt /tmp/cmp2.txt | head -12
15509,15510c15509,15510
< 27046.755390907: 7f0b2943e3ab _nl_normalize_codeset+0x5b (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b29428380 offset_0x28380@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
< 27046.755390907: 7f0b29428384 offset_0x28380@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b294a5120 malloc+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
---
> 27046.755390907: 7f0b2943e3ab _nl_normalize_codeset+0x5b (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b29428380 malloc@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
> 27046.755390907: 7f0b29428384 malloc@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b294a5120 malloc+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
15821,15822c15821,15822
< 27046.755394865: 7f0b2943850c _nl_load_locale_from_archive+0x5bc (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b29428370 offset_0x28370@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
< 27046.755394865: 7f0b29428374 offset_0x28370@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b294a5460 cfree@GLIBC_2.2.5+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
---
> 27046.755394865: 7f0b2943850c _nl_load_locale_from_archive+0x5bc (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b29428370 free@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
> 27046.755394865: 7f0b29428374 free@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b294a5460 cfree@GLIBC_2.2.5+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, timerlat top displays the timerlat tracer latency results, saving
the intuitive timerlat trace for the developer to analyze.
This patch goes a step forward in the automaton of the scheduling latency
analysis by providing a summary of the root cause of a latency higher than
the passed "stop tracing" parameter if the trace stops.
The output is intuitive enough for non-expert users to have a general idea
of the root cause by looking at each factor's contribution percentage while
keeping the technical detail in the output for more expert users to start
an in dept debug or to correlate a root cause with an existing one.
The terminology is in line with recent industry and academic publications
to facilitate the understanding of both audiences.
Here is one example of tool output:
----------------------------------------- %< -----------------------------------------------------
# taskset -c 0 timerlat -a 40 -c 1-23 -q
Timer Latency
0 00:00:12 | IRQ Timer Latency (us) | Thread Timer Latency (us)
CPU COUNT | cur min avg max | cur min avg max
1 #12322 | 0 0 1 15 | 10 3 9 31
2 #12322 | 3 0 1 12 | 10 3 9 23
3 #12322 | 1 0 1 21 | 8 2 8 34
4 #12322 | 1 0 1 17 | 10 2 11 33
5 #12322 | 0 0 1 12 | 8 3 8 25
6 #12322 | 1 0 1 14 | 16 3 11 35
7 #12322 | 0 0 1 14 | 9 2 8 29
8 #12322 | 1 0 1 22 | 9 3 9 34
9 #12322 | 0 0 1 14 | 8 2 8 24
10 #12322 | 1 0 0 12 | 9 3 8 24
11 #12322 | 0 0 0 15 | 6 2 7 29
12 #12321 | 1 0 0 13 | 5 3 8 23
13 #12319 | 0 0 1 14 | 9 3 9 26
14 #12321 | 1 0 0 13 | 6 2 8 24
15 #12321 | 1 0 1 15 | 12 3 11 27
16 #12318 | 0 0 1 13 | 7 3 10 24
17 #12319 | 0 0 1 13 | 11 3 9 25
18 #12318 | 0 0 0 12 | 8 2 8 20
19 #12319 | 0 0 1 18 | 10 2 9 28
20 #12317 | 0 0 0 20 | 9 3 8 34
21 #12318 | 0 0 0 13 | 8 3 8 28
22 #12319 | 0 0 1 11 | 8 3 10 22
23 #12320 | 28 0 1 28 | 41 3 11 41
rtla timerlat hit stop tracing
## CPU 23 hit stop tracing, analyzing it ##
IRQ handler delay: 27.49 us (65.52 %)
IRQ latency: 28.13 us
Timerlat IRQ duration: 9.59 us (22.85 %)
Blocking thread: 3.79 us (9.03 %)
objtool:49256 3.79 us
Blocking thread stacktrace
-> timerlat_irq
-> __hrtimer_run_queues
-> hrtimer_interrupt
-> __sysvec_apic_timer_interrupt
-> sysvec_apic_timer_interrupt
-> asm_sysvec_apic_timer_interrupt
-> _raw_spin_unlock_irqrestore
-> cgroup_rstat_flush_locked
-> cgroup_rstat_flush_irqsafe
-> mem_cgroup_flush_stats
-> mem_cgroup_wb_stats
-> balance_dirty_pages
-> balance_dirty_pages_ratelimited_flags
-> btrfs_buffered_write
-> btrfs_do_write_iter
-> vfs_write
-> __x64_sys_pwrite64
-> do_syscall_64
-> entry_SYSCALL_64_after_hwframe
------------------------------------------------------------------------
Thread latency: 41.96 us (100%)
The system has exit from idle latency!
Max timerlat IRQ latency from idle: 17.48 us in cpu 4
Saving trace to timerlat_trace.txt
----------------------------------------- >% -----------------------------------------------------
In this case, the major factor was the delay suffered by the IRQ handler
that handles timerlat wakeup: 65.52 %. This can be caused by the
current thread masking interrupts, which can be seen in the blocking
thread stacktrace: the current thread (objtool:49256) disabled interrupts
via raw spin lock operations inside mem cgroup, while doing write
syscall in a btrfs file system.
A simple search for the function name on Google shows that this is
a legit case for disabling the interrupts:
cgroup: Use irqsave in cgroup_rstat_flush_locked()
lore.kernel.org/linux-mm/20220301122143.1521823-2-bigeasy@linutronix.de/
The output also prints other reasons for the latency root cause, such as:
- an IRQ that happened before the IRQ handler that caused delays
- The interference from NMI, IRQ, Softirq, and Threads
The details about how these factors affect the scheduling latency
can be found here:
https://bristot.me/demystifying-the-real-time-linux-latency/
Link: https://lkml.kernel.org/r/3d45f40e630317f51ac6d678e2d96d310e495729.1675179318.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently, timerlat displays a summary of the timerlat tracer results
saving the trace if the system hits a stop condition.
While this represented a huge step forward, the root cause was not
that is accessible to non-expert users.
The auto-analysis fulfill this gap by parsing the trace timerlat runs,
printing an intuitive auto-analysis.
Link: https://lkml.kernel.org/r/1ee073822f6a2cbb33da0c817331d0d4045e837f.1675179318.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The canonical location for the tracefs filesystem is at /sys/kernel/tracing.
But, from Documentation/trace/ftrace.rst:
Before 4.1, all ftrace tracing control files were within the debugfs
file system, which is typically located at /sys/kernel/debug/tracing.
For backward compatibility, when mounting the debugfs file system,
the tracefs file system will be automatically mounted at:
/sys/kernel/debug/tracing
A few scripts in tools/power still refer to this older debugfs path, so
let's update them to avoid confusion.
Signed-off-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The test tool can check that the zerocopy number of completions value is
valid taking into consideration the number of datagram send calls. This can
catch the system into a state where the datagrams are still in the system
(for example in a qdisk, waiting for the network interface to return a
completion notification, etc).
This change adds a retry logic of computing the number of completions up to
a configurable (via CLI) timeout (default: 2 seconds).
Fixes: 79ebc3c260 ("net/udpgso_bench_tx: options to exercise TX CMSG")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-4-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
"udpgro_bench.sh" invokes udpgso_bench_rx/udpgso_bench_tx programs
subsequently and while doing so, there is a chance that the rx one is not
ready to accept socket connections. This racing bug could fail the test
with at least one of the following:
./udpgso_bench_tx: connect: Connection refused
./udpgso_bench_tx: sendmsg: Connection refused
./udpgso_bench_tx: write: Connection refused
This change addresses this by making udpgro_bench.sh wait for the rx
program to be ready before firing off the tx one - up to a 10s timeout.
Fixes: 3a687bef14 ("selftests: udp gso benchmark")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-3-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Leaving unrecognized arguments buried in the output, can easily hide a
CLI/script typo. Avoid this by exiting when wrong arguments are provided to
the udpgso_bench test programs.
Fixes: 3a687bef14 ("selftests: udp gso benchmark")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-2-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This change fixes the following compiler warning:
/usr/include/x86_64-linux-gnu/bits/error.h:40:5: warning: ‘gso_size’ may
be used uninitialized [-Wmaybe-uninitialized]
40 | __error_noreturn (__status, __errnum, __format,
__va_arg_pack ());
|
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
udpgso_bench_rx.c: In function ‘main’:
udpgso_bench_rx.c:253:23: note: ‘gso_size’ was declared here
253 | int ret, len, gso_size, budget = 256;
Fixes: 3327a9c463 ("selftests: add functionals test for UDP GRO")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-1-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The linux/net_tstamp.h is included more than once, thus clean it up.
Signed-off-by: Ye Xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/202301311440516312161@zte.com.cn
We only need to consume TX completion instead of refilling 'fill' ring.
It's currently not an issue because we never RX more than 8 packets.
Fixes: e2a46d54d7 ("selftests/bpf: Verify xdp_metadata xdp->af_xdp path")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230201233640.367646-1-sdf@google.com
A statically linked executable can have a .plt due to IFUNCs, in which
case .symtab is used not .dynsym. Check the section header link to see
if that is the case, and then use symtab instead.
Example:
Before:
$ cat tstifunc.c
#include <stdio.h>
void thing1(void)
{
printf("thing1\n");
}
void thing2(void)
{
printf("thing2\n");
}
typedef void (*thing_fn_t)(void);
thing_fn_t thing_ifunc(void)
{
int x;
if (x & 1)
return thing2;
return thing1;
}
void thing(void) __attribute__ ((ifunc ("thing_ifunc")));
int main()
{
thing();
return 0;
}
$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gcc -static -Wall -Wextra -Wno-uninitialized -o tstifuncstatic tstifunc.c
$ readelf -SW tstifuncstatic | grep 'Name\|plt\|dyn'
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 4] .rela.plt RELA 00000000004002e8 0002e8 000258 18 AI 29 20 8
[ 6] .plt PROGBITS 0000000000401020 001020 000190 00 AX 0 0 16
[20] .got.plt PROGBITS 00000000004c5000 0c4000 0000e0 08 WA 0 0 8
$ perf record -e intel_pt//u --filter 'filter main @ ./tstifuncstatic' ./tstifuncstatic
thing1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.008 MB perf.data ]
$ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
15786.690189535: tr strt 0 [unknown] => 4017cd main+0x0
15786.690189535: tr end call 4017d5 main+0x8 => 401170 [unknown]
15786.690197660: tr strt 0 [unknown] => 4017da main+0xd
15786.690197660: tr end return 4017e0 main+0x13 => 401c1a __libc_start_call_main+0x6a
After:
$ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
15786.690189535: tr strt 0 [unknown] => 4017cd main+0x0
15786.690189535: tr end call 4017d5 main+0x8 => 401170 thing_ifunc@plt+0x0
15786.690197660: tr strt 0 [unknown] => 4017da main+0xd
15786.690197660: tr end return 4017e0 main+0x13 => 401c1a __libc_start_call_main+0x6a
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A static executable can have a .plt due to the presence of IFUNCs. In
that case the .plt does not have a header. Check for whether there is a
header by comparing the number of entries to the number of relocation
entries.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To assist with synthesizing plt symbols for IFUNCs, record whether a
symbol is an alias of an IFUNC symbol.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The section .plt.sec was originally added for MPX and was first called
.plt.bnd. While MPX has been deprecated, .plt.sec is now also used for
IBT. On x86_64, IBT may be enabled by default, but can be switched off
using gcc option -fcf-protection=none, or switched on by -z ibt or -z
ibtplt. On 32-bit, option -z ibt or -z ibtplt will enable IBT.
With .plt.sec, calls are made into .plt.sec instead of .plt, so it makes
more sense to put the symbols there instead of .plt. A notable
difference is that .plt.sec does not have a header entry.
For x86, when synthesizing symbols for plt, use offset and entry size of
.plt.sec instead of .plt when there is a .plt.sec section.
Example on Ubuntu 22.04 gcc 11.3:
Before:
$ cat tstpltlib.c
void fn1(void) {}
void fn2(void) {}
void fn3(void) {}
void fn4(void) {}
$ cat tstplt.c
void fn1(void);
void fn2(void);
void fn3(void);
void fn4(void);
int main()
{
fn4();
fn1();
fn2();
fn3();
return 0;
}
$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gcc -Wall -Wextra -shared -o libtstpltlib.so tstpltlib.c
$ gcc -Wall -Wextra -z ibt -o tstplt tstplt.c -L . -ltstpltlib -Wl,-rpath=$(pwd)
$ readelf -SW tstplt | grep 'plt\|Name'
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[11] .rela.plt RELA 0000000000000698 000698 000060 18 AI 6 24 8
[13] .plt PROGBITS 0000000000001020 001020 000050 10 AX 0 0 16
[14] .plt.got PROGBITS 0000000000001070 001070 000010 10 AX 0 0 16
[15] .plt.sec PROGBITS 0000000000001080 001080 000040 10 AX 0 0 16
$ perf record -e intel_pt//u --filter 'filter main @ ./tstplt' ./tstplt
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB perf.data ]
$ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
38970.522546686: tr strt 0 [unknown] => 55fc222a81a9 main+0x0
38970.522546686: tr end call 55fc222a81b1 main+0x8 => 55fc222a80a0 [unknown]
38970.522546687: tr strt 0 [unknown] => 55fc222a81b6 main+0xd
38970.522546687: tr end call 55fc222a81b6 main+0xd => 55fc222a8080 [unknown]
38970.522546688: tr strt 0 [unknown] => 55fc222a81bb main+0x12
38970.522546688: tr end call 55fc222a81bb main+0x12 => 55fc222a80b0 [unknown]
38970.522546688: tr strt 0 [unknown] => 55fc222a81c0 main+0x17
38970.522546688: tr end call 55fc222a81c0 main+0x17 => 55fc222a8090 [unknown]
38970.522546689: tr strt 0 [unknown] => 55fc222a81c5 main+0x1c
38970.522546894: tr end return 55fc222a81cb main+0x22 => 7f3a4dc29d90 __libc_start_call_main+0x80
After:
$ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
38970.522546686: tr strt 0 [unknown] => 55fc222a81a9 main+0x0
38970.522546686: tr end call 55fc222a81b1 main+0x8 => 55fc222a80a0 fn4@plt+0x0
38970.522546687: tr strt 0 [unknown] => 55fc222a81b6 main+0xd
38970.522546687: tr end call 55fc222a81b6 main+0xd => 55fc222a8080 fn1@plt+0x0
38970.522546688: tr strt 0 [unknown] => 55fc222a81bb main+0x12
38970.522546688: tr end call 55fc222a81bb main+0x12 => 55fc222a80b0 fn2@plt+0x0
38970.522546688: tr strt 0 [unknown] => 55fc222a81c0 main+0x17
38970.522546688: tr end call 55fc222a81c0 main+0x17 => 55fc222a8090 fn3@plt+0x0
38970.522546689: tr strt 0 [unknown] => 55fc222a81c5 main+0x1c
38970.522546894: tr end return 55fc222a81cb main+0x22 => 7f3a4dc29d90 __libc_start_call_main+0x80
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Test “Use vfs_getname probe to get syscall args filenames” fails in
environment with missing libtraceevent support as below:
82: Use vfs_getname probe to get syscall args filenames :
--- start ---
test child forked, pid 304726
Recording open file:
event syntax error: 'probe:vfs_getname*'
\___ unsupported tracepoint
libtraceevent is necessary for tracepoint support
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
test child finished with -1
---- end ----
Use vfs_getname probe to get syscall args filenames: FAILED!
The environment has debuginfo but is missing the libtraceevent devel.
Hence perf is compiled without libtraceevent support. The test tries to
add probe “probe:vfs_getname” and then uses it with “perf record”. This
fails at function “parse_events_add_tracepoint" due to missing
libtraceevent.
Similarly "probe libc's inet_pton & backtrace it with ping" test slso
fails with same reason.
Add a function in 'perf test shell' library to check if perf record with
—dry-run reports any error on missing support for libtraceevent. Update
both the tests to use this new function “skip_no_probe_record_support”
before proceeding With using probe point via perf builtin record.
With the change,
82: Use vfs_getname probe to get syscall args filenames :
--- start ---
test child forked, pid 305014
Recording open file:
libtraceevent is necessary for tracepoint support
test child finished with -2
---- end ----
Use vfs_getname probe to get syscall args filenames: Skip
81: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 305036
libtraceevent is necessary for tracepoint support
test child finished with -2
---- end ----
probe libc's inet_pton & backtrace it with ping: Skip
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kjain@linux.ibm.com,
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/r/20230201180421.59640-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The "probe libc's inet_pton & backtrace it with ping" test installs a
uprobe and uses perf record/script to check the backtrace. Currently
even if the "perf record" fails, the test reports success. Logs below:
# ./perf test -v "probe libc's inet_pton & backtrace it with ping"
81: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 304211
failed to open /tmp/perf.data.Btf: No such file or directory
test child finished with 0
---- end ----
probe libc's inet_pton & backtrace it with ping: Ok
Fix this by adding check for presence of perf.data file
before proceeding with "perf script".
With the patch changes, test reports fail correctly.
# ./perf test -v "probe libc's inet_pton & backtrace it with ping"
81: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 304358
FAIL: perf record failed to create "/tmp/perf.data.Uoi"
test child finished with -1
---- end ----
probe libc's inet_pton & backtrace it with ping: FAILED!
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/r/20230201180421.59640-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The test_pipe() function will check perf report and perf inject with
pipe input.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We should not call lseek(2) for pipes as it won't work. And we already
in the proper place to read the data for AUXTRACE. Add the comment like
in the PERF_RECORD_HEADER_TRACING_DATA.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When it processes AUXTRACE_INFO, it calls to auxtrace_queue_data() to
collect AUXTRACE data first. That won't work with pipe since it needs
lseek() to read the scattered aux data.
$ perf record -o- -e intel_pt// true | perf report -i- --itrace=i100
# To display the perf.data header info, please use --header/--header-only options.
#
0x4118 [0xa0]: failed to process type: 70
Error:
failed to process sample
For the pipe mode, it can handle the aux data as it gets. But there's
no guarantee it can get the aux data in time. So the following warning
will be shown at the beginning:
WARNING: Intel PT with pipe mode is not recommended.
The output cannot relied upon. In particular,
time stamps and the order of events may be incorrect.
Fixes: dbd134322e ("perf intel-pt: Add support for decoding AUX area samples")
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a sub-test to verify that KVM stuffs the APIC_ID when userspace forces
a transition from x2APIC to xAPIC without first disabling the APIC. Such
a transition is architecturally disallowed (WRMSR will #GP), but needs to
be handled by KVM to allow userspace to emulate RESET (ignoring that
userspace should also stuff local APIC state on RESET).
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Link: https://lore.kernel.org/r/20230109130605.2013555-3-eesposit@redhat.com
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
The ifname char pointer is taken directly from the command line
as input and the string is copied directly into struct ifreq
via strcpy. This makes it easy to corrupt other members of ifreq
and generally do stack overflows.
Most often the ioctl will fail with:
./xdp_hw_metadata: ioctl(SIOCETHTOOL): Bad address
As people will likely copy-paste code for getting NIC queue
channels (rxq_num) and enabling HW timestamping (hwtstamp_ioctl)
lets make this code a bit more secure by using strncpy.
Fixes: 297a3f1241 ("selftests/bpf: Simple program to dump XDP RX metadata")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/167527272543.937063.16993147790832546209.stgit@firesoul
The glibc error reporting function error():
void error(int status, int errnum, const char *format, ...);
The status argument should be a positive value between 0-255 as it
is passed over to the exit(3) function as the value as the shell exit
status. The least significant byte of status (i.e., status & 0xFF) is
returned to the shell parent.
Fix this by using 1 instead of -1. As 1 corresponds to C standard
constant EXIT_FAILURE.
Fixes: 297a3f1241 ("selftests/bpf: Simple program to dump XDP RX metadata")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/167527272038.937063.9137108142012298120.stgit@firesoul
Using xdp_hw_metadata I experince Segmentation fault after seeing
"detaching bpf program....".
On my system the segfault happened when accessing bpf_obj->skeleton
in xdp_hw_metadata__destroy(bpf_obj) call. That doesn't make any sense
as this memory have not been freed by program at this point in time.
Prior to calling xdp_hw_metadata__destroy(bpf_obj) the function
close_xsk() is called for each RX-queue xsk. The real bug lays
in close_xsk() that unmap via munmap() the wrong memory pointer.
The call xsk_umem__delete(xsk->umem) will free xsk->umem, thus
the call to munmap(xsk->umem, UMEM_SIZE) will have unpredictable
behavior. And man page explain subsequent references to these
pages will generate SIGSEGV.
Unmapping xsk->umem_area instead removes the segfault.
Fixes: 297a3f1241 ("selftests/bpf: Simple program to dump XDP RX metadata")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/167527271533.937063.5717065138099679142.stgit@firesoul
The AF_XDP userspace part of xdp_hw_metadata see non-zero as a signal of
the availability of rx_timestamp and rx_hash in data_meta area. The
kernel-side BPF-prog code doesn't initialize these members when kernel
returns an error e.g. -EOPNOTSUPP. This memory area is not guaranteed to
be zeroed, and can contain garbage/previous values, which will be read
and interpreted by AF_XDP userspace side.
Tested this on different drivers. The experiences are that for most
packets they will have zeroed this data_meta area, but occasionally it
will contain garbage data.
Example of failure tested on ixgbe:
poll: 1 (0)
xsk_ring_cons__peek: 1
0x18ec788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000
rx_hash: 3697961069
rx_timestamp: 9024981991734834796 (sec:9024981991.7348)
0x18ec788: complete idx=8 addr=8000
Converting to date:
date -d @9024981991
2255-12-28T20:26:31 CET
I choose a simple fix in this patch. When kfunc fails or isn't supported
assign zero to the corresponding struct meta value.
It's up to the individual BPF-programmer to do something smarter e.g.
that fits their use-case, like getting a software timestamp and marking
a flag that gives the type of timestamp.
Fixes: 297a3f1241 ("selftests/bpf: Simple program to dump XDP RX metadata")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/167527271027.937063.5177725618616476592.stgit@firesoul
The function close_xsk() unmap via munmap() the wrong memory pointer.
The call xsk_umem__delete(xsk->umem) have already freed xsk->umem.
Thus the call to munmap(xsk->umem, UMEM_SIZE) will have unpredictable
behavior that can lead to Segmentation fault elsewhere, as man page
explain subsequent references to these pages will generate SIGSEGV.
Fixes: e2a46d54d7 ("selftests/bpf: Verify xdp_metadata xdp->af_xdp path")
Reported-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/167527517464.938135.13750760520577765269.stgit@firesoul
kfuncs are allowed to be static, or not use one or more of their
arguments. For example, bpf_xdp_metadata_rx_hash() in net/core/xdp.c is
meant to be implemented by drivers, with the default implementation just
returning -EOPNOTSUPP. As described in [0], such kfuncs can have their
arguments elided, which can cause BTF encoding to be skipped. The new
__bpf_kfunc macro should address this, and this patch adds a selftest
which verifies that a static kfunc with at least one unused argument can
still be encoded and invoked by a BPF program.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230201173016.342758-5-void@manifault.com
Now that we have the __bpf_kfunc tag, we should use add it to all
existing kfuncs to ensure that they'll never be elided in LTO builds.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20230201173016.342758-4-void@manifault.com
Hyper-V extended hypercalls by default exit to userspace. Verify
userspace gets the call, update the result and then verify in guest
correct result is received.
Add KVM_EXIT_HYPERV to list of "known" hypercalls so errors generate
pretty strings.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20221212183720.4062037-14-vipinsh@google.com
[sean: add KVM_EXIT_HYPERV to exit_reasons_known]
Signed-off-by: Sean Christopherson <seanjc@google.com>
In copy_bytes(), it reads the data from the (input) fd and writes it to
the output file. But it does with the read(2) unconditionally which
caused a problem of mixing buffered vs unbuffered I/O together.
You can see the problem when using pipes.
$ perf record -e intel_pt// -o- true | perf inject -b > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
0x45c0 [0x30]: failed to process type: 71
It should use perf_data__read() to honor the 'use_stdio' setting.
Fixes: 601366678c ("perf data: Allow to use stdio functions for pipe mode")
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Test Hyper-V extended hypercall, HV_EXT_CALL_QUERY_CAPABILITIES
(0x8001), access denied and invalid parameter cases.
Access is denied if CPUID.0x40000003.EBX BIT(20) is not set.
Invalid parameter if call has fast bit set.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
Link: https://lore.kernel.org/r/20221212183720.4062037-11-vipinsh@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Just small bugfixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmPTs30PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpjzIIAMxHPX4KamjTQ0C9A4uPoWbdLAJfft6GpjK0
To2aZSwv48Evjk9eilZIU7QOmU5vAYiyvK27MMUZQ74W/1maZUWGF3Hu3g2TwBNj
qcxNKIZu9eurrEvYSgIAbH2R4al3WhdQjF+h/ZUmYO2RM//X64a2LM6n2p6aFZjs
yBnR3LOiKC7TIkO7YEVyiIQT3ayulu3viLeL0d2HsQJKDQ60YZX1NDBOXgY0UsIq
gG50l3Nr3RA7CgeJyWOB1x9/8DouXbg9uaHu6WR1+qTwFSW++Z07iz3V/ZYzPmNX
1GaeJUNhOIR5F4DvhND3Me1WcuaE28j6Flu8SECYNgnV8WG4lSg=
=EjKi
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"Just small bugfixes all over the place"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpa: ifcvf: Do proper cleanup if IFCVF init fails
vhost-scsi: unbreak any layout for response
tools/virtio: fix the vringh test for virtio ring changes
vhost/net: Clear the pending messages when the backend is removed
The current signal handling tests for SME do not account for the fact that
unlike SVE all SME vector lengths are optional so we can't guarantee that
we will encounter the minimum possible VL, they will hang enumerating VLs
on such systems. Abort enumeration when we find the lowest VL in the newly
added ssve_za_regs test.
Fixes: bc69da5ff0 ("kselftest/arm64: Verify simultaneous SSVE and ZA context generation")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230131-arm64-kselftest-sig-sme-no-128-v1-2-d47c13dc8e1e@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The current signal handling tests for SME do not account for the fact that
unlike SVE all SME vector lengths are optional so we can't guarantee that
we will encounter the minimum possible VL, they will hang enumerating VLs
on such systems. Abort enumeration when we find the lowest VL.
Fixes: 4963aeb35a ("kselftest/arm64: signal: Add SME signal handling tests")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230131-arm64-kselftest-sig-sme-no-128-v1-1-d47c13dc8e1e@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
During early development a dependedncy was added on having FA64
available so we could use the full FPSIMD register set in the signal
handler. Subsequently the ABI was finialised so the handler is run with
streaming mode disabled meaning this is redundant but the dependency was
never removed, do so now.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230131-arm64-kselfetest-ssve-fa64-v1-1-f418efcc2b60@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The existing users of these helpers have been converted to iproute2 dcb.
Drop the helpers.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Set up default port priority through the iproute2 dcb tool, which is easier
to understand and manage.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Set up DSCP prioritization through the iproute2 dcb tool, which is easier
to understand and manage.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Set up DSCP prioritization through the iproute2 dcb tool, which is easier
to understand and manage.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The scripts require Python 3 and some distros are dropping
Python 2 support.
Reported-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The CLI script tries to validate jsonschema by default.
It's seems better to validate too many times than too few.
However, when copying the scripts to random servers having
to install jsonschema is tedious. Load jsonschema via
importlib, and let the user opt out.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When I wrote the first version of the Python code I was quite
excited that we can generate class methods directly from the
spec. Unfortunately we need to use valid identifiers for method
names (specifically no dashes are allowed). Don't reuse those
names on the CLI, it's much more natural to use the operation
names exactly as listed in the spec.
Instead of:
./cli --do rings_get
use:
./cli --do rings-get
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
One of my favorite features of the Netlink specs is that they
make decoding structured extack a ton easier.
Implement pretty printing bad attribute names in YNL.
For example it will now say:
'bad-attr': '.header.flags'
rather than the useless:
'bad-attr-offs': 32
Proof:
$ ./cli.py --spec ethtool.yaml --do rings_get \
--json '{"header":{"dev-index":1, "flags":4}}'
Netlink error: Invalid argument
nl_len = 68 (52) nl_flags = 0x300 nl_type = 2
error: -22 extack: {'msg': 'reserved bit set',
'bad-attr': '.header.flags'}
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There's a lot of copy and pasting going on between the "cli"
and code gen when it comes to representing the parsed spec.
Create a library which both can use.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move the CLI code out of samples/ and the library part
of it into tools/net/ynl/lib/. This way we can start
sharing some code with the code gen.
Initially I thought that code gen is too C-specific to
share anything but basic stuff like calculating values
for enums can easily be shared.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
An earlier fix tried to address generated code jumping around
one code-gen run to another. Turns out dict()s are already
ordered since Python 3.7, the problem is that we iterate over
operation modes using a set(). Sets are unordered in Python.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Calculate average value in osnoise-hist summary with two-digit
precision to avoid displaying too optimitic results.
Link: https://lkml.kernel.org/r/20230103103400.275566-3-br015@umbiko.net
Signed-off-by: Andreas Ziegler <br015@umbiko.net>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Sampled durations must be weighted by observed quantity, to arrive at a correct
average duration value.
Perform calculation of total duration by summing (duration * count).
Link: https://lkml.kernel.org/r/20230103103400.275566-2-br015@umbiko.net
Fixes: 829a6c0b56 ("rtla/osnoise: Add the hist mode")
Signed-off-by: Andreas Ziegler <br015@umbiko.net>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
It was found that the check to see if a partition could use up all
the cpus from the parent cpuset in update_parent_subparts_cpumask()
was incorrect. As a result, it is possible to leave parent with no
effective cpu left even if there are tasks in the parent cpuset. This
can lead to system panic as reported in [1].
Fix this probem by updating the check to fail the enabling the partition
if parent's effective_cpus is a subset of the child's cpus_allowed.
Also record the error code when an error happens in update_prstate()
and add a test case where parent partition and child have the same cpu
list and parent has task. Enabling partition in the child will fail in
this case.
[1] https://www.spinics.net/lists/cgroups/msg36254.html
Fixes: f0af1bfc27 ("cgroup/cpuset: Relax constraints to partition & cpus changes")
Cc: stable@vger.kernel.org # v6.1
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The newly added zt-test program copied the pattern from the other FP
stress test programs of having a redundant _start label which is
rejected by clang, as we did in a parallel series for the other tests
remove the label so we can build with clang.
No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230130-arm64-fix-sme2-clang-v1-1-3ce81d99ea8f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When SVE was initially merged we chose to export the maximum VQ in the ABI
as being 512, rather more than the architecturally supported maximum of 16.
For the ptrace tests this results in us generating a lot of test cases and
hence log output which are redundant since a system couldn't possibly
support them. Instead only check values up to the current architectural
limit, plus one more so that we're covering the constraining of higher
vector lengths.
This makes no practical difference to our test coverage, speeds things up
on slower consoles and makes the output much more managable.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111-arm64-kselftest-ptrace-max-vl-v1-1-8167f41d1ad8@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
SDc/Y/c=
=zpaD
-----END PGP SIGNATURE-----
Merge tag 'v6.2-rc6' into sched/core, to pick up fixes
Pick up fixes before merging another batch of cpuidle updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
OpenCSD version 1.4 is released with support for FEAT_ITE.
This adds a new packet type, with associated output element ID in the
packet type enum - OCSD_GEN_TRC_ELEM_INSTRUMENTATION.
As we just ignore this packet in perf, add to the switch statement to
avoid the "enum not handled in switch error", but conditionally so as
not to break the perf build for older OpenCSD installations.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120153706.20388-1-mike.leach@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'DWARF unwind' 'perf test' can sometimes fail:
$ perf test -v 74
Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
74: Test dwarf unwind :
--- start ---
test child forked, pid 3785254
Problems creating module maps, continuing anyway...
Problems creating module maps, continuing anyway...
unwind: test__arch_unwind_sample:ip = 0x102d0ad4c (0x36ad4c)
unwind: access_mem addr 0x7fffc33128c8, val 1031c3228, offset 120
unwind: access_mem addr 0x7fffc33128d0, val 12427cc70, offset 128
<snip>
unwind: test_dwarf_unwind__krava_3:ip = 0x102b8768b (0x1e768b)
unwind: access_mem addr 0x7fffc3313048, val 7fffc3313050, offset 2040
unwind: access_mem addr 0x7fffc3313060, val 102b8777c, offset 2064
unwind: test_dwarf_unwind__krava_2:ip = 0x102b8770b (0x1e770b)
unwind: access_mem addr 0x7fffc3313088, val 7fffc3313090, offset 2104
unwind: access_mem addr 0x7fffc33130a0, val 102b87890, offset 2128
unwind: test_dwarf_unwind__krava_1:ip = 0x102b8777b (0x1e777b)
unwind: access_mem addr 0x7fffc3313108, val 10323a274, offset 2232
unwind: access_mem addr 0x7fffc3313110, val ffffffffffffffff, offset 2240
unwind: access_mem addr 0x7fffc3313118, val 102c08ed0, offset 2248
unwind: access_mem addr 0x7fffc3313120, val 1031db000, offset 2256
unwind: access_mem addr 0x7fffc3313128, val 7fffc3313130, offset 2264
unwind: access_mem addr 0x7fffc3313140, val 102b45ee8, offset 2288
unwind: '':ip = 0x102b8788f (0x1e788f)
failed: got unresolved address 0x102b8788f
unwind: failed with 'no error'
got wrong number of stack entries 0 != 8
test child finished with -1
---- end ----
Test dwarf unwind: FAILED!
We expect to resolve test__dwarf_unwind as the last symbol, but that
function can be optimized away:
$ objdump -tT /usr/bin/perf | grep dwarf_unwind
000000000083b018 g DO .data 0000000000000040 Base tests__dwarf_unwind
00000000001e7750 g DF .text 0000000000000068 Base 0x60 test_dwarf_unwind__krava_1
00000000001e76e0 g DF .text 0000000000000068 Base 0x60 test_dwarf_unwind__krava_2
00000000001e7620 g DF .text 00000000000000b4 Base 0x60 test_dwarf_unwind__krava_3
00000000001e74f0 g DF .text 0000000000000128 Base 0x60 test_dwarf_unwind__compare
00000000001e7350 g DF .text 000000000000019c Base 0x60 test_dwarf_unwind__thread
000000000083b000 g DO .data 0000000000000018 Base suite__dwarf_unwind
Fix this similar to commit fdf7c49c20 ("perf tests: Fix dwarf
unwind for stripped binaries") by marking the function as a global and
adding the 'noinline' attribute to it.
With this patch:
$ objdump -tT perf | grep dwarf_unwind
000000000083b018 g DO .data 0000000000000040 Base tests__dwarf_unwind
00000000001e80f0 g DF .text 0000000000000068 Base 0x60 test_dwarf_unwind__krava_1
00000000001e8080 g DF .text 0000000000000068 Base 0x60 test_dwarf_unwind__krava_2
00000000001e7fc0 g DF .text 00000000000000b4 Base 0x60 test_dwarf_unwind__krava_3
00000000001e7e90 g DF .text 0000000000000128 Base 0x60 test_dwarf_unwind__compare
00000000001e7cf0 g DF .text 000000000000019c Base 0x60 test_dwarf_unwind__thread
00000000001e8160 g DF .text 0000000000000248 Base 0x60 test__dwarf_unwind
000000000083b000 g DO .data 0000000000000018 Base suite__dwarf_unwind
$ ./perf test 74
74: Test dwarf unwind : Ok
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Link: http://lore.kernel.org/lkml/20230125123442.107156-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The KVM rseq test is failing to build in -next due to a commit merged
from the tip tree which adds a wrapper for sys_getcpu() to the rseq
kselftests, conflicting with the wrapper already included in the KVM
selftest:
rseq_test.c:48:13: error: conflicting types for 'sys_getcpu'
48 | static void sys_getcpu(unsigned *cpu)
| ^~~~~~~~~~
In file included from rseq_test.c:23:
../rseq/rseq.c:82:12: note: previous definition of 'sys_getcpu' was here
82 | static int sys_getcpu(unsigned *cpu, unsigned *node)
| ^~~~~~~~~~
Fix this by removing the local wrapper and moving the result check up to
the caller.
Fixes: 99babd04b2 ("selftests/rseq: Implement rseq numa node id field selftest")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/r/20230106-fix-kvm-rseq-build-v1-1-b704d9831d02@kernel.org
sk_assign is failing on an s390x machine running Debian "bookworm" for
2 reasons: legacy server_map definition and uninitialized addrlen in
recvfrom() call.
Fix by adding a new-style server_map definition and dropping addrlen
(recvfrom() allows NULL values for src_addr and addrlen).
Since the test should support tc built without libbpf, build the prog
twice: with the old-style definition and with the new-style definition,
then select the right one at runtime. This could be done at compile
time too, but this would not be cross-compilation friendly.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230129190501.1624747-2-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Extend the read-only memslot tests in page_fault_test to test
read-only PT (Page table) memslots. Note that this was not allowed
before commit 406504c7b0 ("KVM: arm64: Fix S1PTW handling on RO
memslots") as all S1PTW faults were treated as writes which resulted
in an (unrecoverable) exception inside the guest.
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230127214353.245671-5-ricarkol@google.com
The dirty log checks are mistakenly testing the first page in the page
table (PT) memory region instead of the page holding the test data
page PTE. This wasn't an issue before commit 406504c7b0 ("KVM:
arm64: Fix S1PTW handling on RO memslots") as all PT pages (including
the first page) were treated as writes.
Fix the page_fault_test dirty logging tests by checking for the right
page: the one for the PTE of the data test page.
Fixes: a4edf25b3e ("KVM: selftests: aarch64: Add dirty logging tests into page_fault_test")
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230127214353.245671-4-ricarkol@google.com
Only Stage1 Page table walks (S1PTW) trying to write into a PTE should
result in the PTE page being dirty in the log. However, the dirty log
tests in page_fault_test default to treat all S1PTW accesses as writes.
Fix the relevant tests by asserting dirty pages only for S1PTW writes,
which in these tests only applies to when Hardware management of the Access
Flag is enabled.
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230127214353.245671-3-ricarkol@google.com
Only Stage1 Page table walks (S1PTW) writing a PTE on an unmapped page
should result in a userfaultfd write. However, the userfaultfd tests in
page_fault_test wrongly assert that any S1PTW is a PTE write.
Fix this by relaxing the read vs. write checks in all userfaultfd
handlers. Note that this is also an attempt to focus less on KVM (and
userfaultfd) behavior, and more on architectural behavior. Also note
that after commit 406504c7b0 ("KVM: arm64: Fix S1PTW handling on RO
memslots"), the userfaultfd fault (S1PTW with AF on an unmaped PTE
page) is actually a read: the translation fault that comes before the
permission fault.
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230127214353.245671-2-ricarkol@google.com
BPF_PROBE_READ_INTO() and BPF_PROBE_READ_STR_INTO() should map to
bpf_probe_read() and bpf_probe_read_str() respectively in order to work
correctly on architectures with !ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-24-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Loading programs that use bpf_usdt_arg() on s390x fails with:
; if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
128: (79) r1 = *(u64 *)(r10 -24) ; frame1: R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
129: (25) if r1 > 0xb goto pc+83 ; frame1: R1_w=scalar(umax=11,var_off=(0x0; 0xf))
...
; arg_spec = &spec->args[arg_num];
135: (79) r1 = *(u64 *)(r10 -24) ; frame1: R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
...
; switch (arg_spec->arg_type) {
139: (61) r1 = *(u32 *)(r2 +8)
R2 unbounded memory access, make sure to bounds check any such access
The reason is that, even though the C code enforces that
arg_num < BPF_USDT_MAX_ARG_CNT, the verifier cannot propagate this
constraint to the arg_spec assignment yet. Help it by forcing r1 back
to stack after comparison.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-23-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Use bpf_probe_read_kernel() and bpf_probe_read_kernel_str() instead
of bpf_probe_read() and bpf_probe_read_kernel().
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-21-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Use the correct datatype for the values map values; currently the test
works by accident, since on little-endian machines it is sometimes
acceptable to access u64 as u32.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-20-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Use a syscall macro to access the nanosleep()'s first argument;
currently the code uses gprs[2] instead of orig_gpr2.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-18-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
s390x cache line size is 256 bytes, so skb_shared_info must be aligned
on a much larger boundary than for x86.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-17-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Use syscall macros to access the setdomainname() arguments; currently
the code uses gprs[2] instead of orig_gpr2 for the first argument.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-16-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
s390x ABI requires the caller to zero- or sign-extend the arguments.
eBPF already deals with zero-extension (by definition of its ABI), but
not with sign-extension.
Add a test to cover that potentially problematic area.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-15-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
If stack_mprotect() succeeds, errno is not changed. This can produce
misleading error messages, that show stale errno.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-13-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Sync the definition of socket_cookie between the eBPF program and the
test. Currently the test works by accident, since on little-endian it
is sometimes acceptable to access u64 as u32.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-12-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
s390x cache line size is 256 bytes, so skb_shared_info must be aligned
on a much larger boundary than for x86. This makes the maximum packet
size smaller.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-11-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Use bpf_probe_read_kernel() instead of bpf_probe_read(), which is not
defined on all architectures.
While at it, improve the error handling: do not hide the verifier log,
and check the return values of bpf_probe_read_kernel() and
bpf_copy_from_user().
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-10-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
decap_sanity prints the following on the 1st run:
decap_sanity: sh: 1: Syntax error: Bad fd number
and the following on the 2nd run:
Cannot create namespace file "/run/netns/decap_sanity_ns": File exists
The problem is that the cleanup command has a typo and does nothing.
Fix the typo.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-9-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The result of urand_spawn() is checked with ASSERT_OK_PTR, which treats
NULL as success if errno == 0.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-8-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
h_proto is big-endian; use htons() in order to make comparison work on
both little- and big-endian machines.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-7-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When building with O=, the following error occurs:
ln: failed to create symbolic link 'no_alu32/bpftool': No such file or directory
Adjust the code to account for $(OUTPUT).
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-6-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When building with O=, the following linker error occurs:
clang: error: no such file or directory: 'liburandom_read.so'
Fix by adding $(OUTPUT) to the linker search path.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-5-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
"tcpdump" is used to capture traffic in these tests while using a random,
temporary and not suffixed file for it. This can interfere with apparmor
configuration where the tool is only allowed to read from files with
'known' extensions.
The MINE type application/vnd.tcpdump.pcap was registered with IANA for
pcap files and .pcap is the extension that is both most common but also
aligned with standard apparmor configurations. See TCPDUMP(8) for more
details.
This improves compatibility with standard apparmor configurations by
using ".pcap" as the file extension for the tests' temporary files.
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY9RqJgAKCRDbK58LschI
gw2IAP9G5uhFO5abBzYLupp6SY3T5j97MUvPwLfFqUEt7EXmuwEA2lCUEWeW0KtR
QX+QmzCa6iHxrW7WzP4DUYLue//FJQY=
=yYqA
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
bpf-next 2023-01-28
We've added 124 non-merge commits during the last 22 day(s) which contain
a total of 124 files changed, 6386 insertions(+), 1827 deletions(-).
The main changes are:
1) Implement XDP hints via kfuncs with initial support for RX hash and
timestamp metadata kfuncs, from Stanislav Fomichev and
Toke Høiland-Jørgensen.
Measurements on overhead: https://lore.kernel.org/bpf/875yellcx6.fsf@toke.dk
2) Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case, from Andrii Nakryiko.
3) Significantly reduce the search time for module symbols by livepatch
and BPF, from Jiri Olsa and Zhen Lei.
4) Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs
in different time intervals, from David Vernet.
5) Fix several issues in the dynptr processing such as stack slot liveness
propagation, missing checks for PTR_TO_STACK variable offset, etc,
from Kumar Kartikeya Dwivedi.
6) Various performance improvements, fixes, and introduction of more
than just one XDP program to XSK selftests, from Magnus Karlsson.
7) Big batch to BPF samples to reduce deprecated functionality,
from Daniel T. Lee.
8) Enable struct_ops programs to be sleepable in verifier,
from David Vernet.
9) Reduce pr_warn() noise on BTF mismatches when they are expected under
the CONFIG_MODULE_ALLOW_BTF_MISMATCH config anyway, from Connor O'Brien.
10) Describe modulo and division by zero behavior of the BPF runtime
in BPF's instruction specification document, from Dave Thaler.
11) Several improvements to libbpf API documentation in libbpf.h,
from Grant Seltzer.
12) Improve resolve_btfids header dependencies related to subcmd and add
proper support for HOSTCC, from Ian Rogers.
13) Add ipip6 and ip6ip decapsulation support for bpf_skb_adjust_room()
helper along with BPF selftests, from Ziyang Xuan.
14) Simplify the parsing logic of structure parameters for BPF trampoline
in the x86-64 JIT compiler, from Pu Lehui.
15) Get BTF working for kernels with CONFIG_RUST enabled by excluding
Rust compilation units with pahole, from Martin Rodriguez Reboredo.
16) Get bpf_setsockopt() working for kTLS on top of TCP sockets,
from Kui-Feng Lee.
17) Disable stack protection for BPF objects in bpftool given BPF backends
don't support it, from Holger Hoffstätte.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (124 commits)
selftest/bpf: Make crashes more debuggable in test_progs
libbpf: Add documentation to map pinning API functions
libbpf: Fix malformed documentation formatting
selftests/bpf: Properly enable hwtstamp in xdp_hw_metadata
selftests/bpf: Calls bpf_setsockopt() on a ktls enabled socket.
bpf: Check the protocol of a sock to agree the calls to bpf_setsockopt().
bpf/selftests: Verify struct_ops prog sleepable behavior
bpf: Pass const struct bpf_prog * to .check_member
libbpf: Support sleepable struct_ops.s section
bpf: Allow BPF_PROG_TYPE_STRUCT_OPS programs to be sleepable
selftests/bpf: Fix vmtest static compilation error
tools/resolve_btfids: Alter how HOSTCC is forced
tools/resolve_btfids: Install subcmd headers
bpf/docs: Document the nocast aliasing behavior of ___init
bpf/docs: Document how nested trusted fields may be defined
bpf/docs: Document cpumask kfuncs in a new file
selftests/bpf: Add selftest suite for cpumask kfuncs
selftests/bpf: Add nested trust selftests suite
bpf: Enable cpumasks to be queried and used as kptrs
bpf: Disallow NULLable pointers for trusted kfuncs
...
====================
Link: https://lore.kernel.org/r/20230128004827.21371-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY9RCwAAKCRDbK58LschI
g7drAQDfMPc1Q2CE4LZ9oh2wu1Nt2/85naTDK/WirlCToKs0xwD+NRQOyO3hcoJJ
rOCwfjOlAm+7uqtiwwodBvWgTlDgVAM=
=0fC+
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
bpf 2023-01-27
We've added 10 non-merge commits during the last 9 day(s) which contain
a total of 10 files changed, 170 insertions(+), 59 deletions(-).
The main changes are:
1) Fix preservation of register's parent/live fields when copying
range-info, from Eduard Zingerman.
2) Fix an off-by-one bug in bpf_mem_cache_idx() to select the right
cache, from Hou Tao.
3) Fix stack overflow from infinite recursion in sock_map_close(),
from Jakub Sitnicki.
4) Fix missing btf_put() in register_btf_id_dtor_kfuncs()'s error path,
from Jiri Olsa.
5) Fix a splat from bpf_setsockopt() via lsm_cgroup/socket_sock_rcv_skb,
from Kui-Feng Lee.
6) Fix bpf_send_signal[_thread]() helpers to hold a reference on the task,
from Yonghong Song.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Fix the kernel crash caused by bpf_setsockopt().
selftests/bpf: Cover listener cloning with progs attached to sockmap
selftests/bpf: Pass BPF skeleton to sockmap_listen ops tests
bpf, sockmap: Check for any of tcp_bpf_prots when cloning a listener
bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself
bpf: Add missing btf_put to register_btf_id_dtor_kfuncs
selftests/bpf: Verify copy_register_state() preserves parent/live fields
bpf: Fix to preserve reg parent/live fields when copying range info
bpf: Fix a possible task gone issue with bpf_send_signal[_thread]() helpers
bpf: Fix off-by-one error in bpf_mem_cache_idx()
====================
Link: https://lore.kernel.org/r/20230127215820.4993-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This fixes the doxygen format documentation above the
user_ring_buffer__* APIs. There has to be a newline
before the @brief, otherwise doxygen won't render them
for libbpf.readthedocs.org.
Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230126024749.522278-1-grantseltzer@gmail.com
- fix the -c option in the gpio-event-mode user-space example program
- fix the irq number translation in gpio-ep93xx and make its irqchip immutable
- add a missing spin_unlock in error path in gpio-mxc
- fix a suspend breakage on System76 and Lenovo Gen2a introduced in GPIO ACPI
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmPUKckACgkQEacuoBRx
13KjuQ//fL9MagRQVsE85SVhhMsjmTTbXnvGJOIT0iTTvqgh9DACX9m6EDQQ/8Kd
bKuciI0h2zO5xt/fZPAyJWgGw0w3qvld7f3f6KQE60X4MRyckyZB9xEyHnvuUs+X
v0SiH8rLm+znezFXFiIB3lq004T9DU+wxUCaSq4WH1eE7eqKP8ZXhg2sLUgiskiY
c1Albv2X0xM4duLmPBK4vsXV2x35yj5qS0qkIymezff2sQekzb3TOZsaS7Doznex
cKr2OvJR/VuH4SMv+uIoKjsVKzdvtP/6hBKJtTH5i1krzNsBW/S+/7odwxzdxPVU
EgQc5I7/ubV7FP3Jo3ThO3exTFyd50hl4jKmrkrzzFpeh2xZ2eKU/AD8/MNsPVuA
KAoBUxHnoJfdZuUNL0Cmf1pTmzZbS80qntznzO700QtBaIxS/6redfJ8CKUK4Iy8
f491K3SbzpXzRAjVMBv8ua7JrB8MT82IKxbeXd1bellGrLaNuMzj1Yd8XhWlepnQ
wZxMSo3457FpOghiiMM3YnLSCcpbs4WU8VV/NMCc8jfS14oq7gA26EalQLPwKeCF
5c+UMAZ95gafJwPjuZcDJcRCGpDlhqPawRBo5cDVUlSokh1PSmPhcGNtE8XKUKlE
fHyd6WEYn8AzXnf3PMceiuJYCADs8wtg87GHKuLF1PPXmcqM4C8=
=wh3L
-----END PGP SIGNATURE-----
Merge tag 'gpio-fixes-for-v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix the -c option in the gpio-event-mode user-space example program
- fix the irq number translation in gpio-ep93xx and make its irqchip
immutable
- add a missing spin_unlock in error path in gpio-mxc
- fix a suspend breakage on System76 and Lenovo Gen2a introduced in
GPIO ACPI
* tag 'gpio-fixes-for-v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
tools: gpio: fix -c option of gpio-event-mon
gpio: ep93xx: remove unused variable
gpio: ep93xx: Make irqchip immutable
gpio: ep93xx: Fix port F hwirq numbers in handler
gpio: mxc: Unlock on error path in mxc_flip_edge()
gpiolib-acpi: Don't set GPIOs for wakeup in S3 mode
This Kselftest fixes update for Linux 6.2-rc6 consists of a single
fix to a amd-pstate test Makefile bug that deletes source files
during make clean run.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmPS4awACgkQCwJExA0N
QxyxIA/+LSuXceaj895VX/wwCeB9SFarG5oZesX5sg4Z8RID91w505hY+ekuKyUT
6Wt2sjc9XJ3RJfPZ5L9fK1TncIjkkQbKoDIJjKZ9nM10zT4wj0fK+vjnKY96Da9w
s70stRKr0zyVE+uPV5oEuj89LsZITuOsDJxbxcF7RWHOCid+nYaO8RdoANAHMPnF
f70ziaAAb8zNqv55Lnh05N+gd7VptaJ6CuS3d/WxRABhNgqSHheEMWymY51G0aO1
3Om3/HLVFWLr7nqdt7Vnc2EjcllzwxaRnIACctmLIzVlabJSuDm4FtDpOU2D+sG4
pGW/NVc+ydrM1PJuugery7tMMP1vV+frhhDvfnuRdLaPFl5Wxc0W8D7WOgMce0fo
hK+wahfIrcNrXX8mQQi54P0K5imIZcgPxpZz3rgZHrbP+hJmDdStMh93zCLLfL/t
zmVYf9xYJzb32OOkmjZdN0AKNDlDowgZQcbesmH22bDVMuKORb6YG920J6Qa2pmK
5A168cLYS5yDHtbouZlcDK5XAroydcAYWGLXmVMoLKpS2Sa58DsKlVgq7BeHcUgX
dQLhfS18YsW9XKiKYakWb6z67B+MQ0SulnOdOLhOO3ekds04hWFSASR4nFvDKHce
h9MZKBk2osHYfX//RCfmyCVqE4SleMXBWNzFOfrfm0de0d+NILQ=
=Wb/P
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-fixes-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fixes from Shuah Khan:
"A single fix to a amd-pstate test Makefile bug that deletes source
files during make clean run"
* tag 'linux-kselftest-fixes-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: amd-pstate: Don't delete source files via Makefile
build_id__init() only copies the buildid data up to size leaving the
rest of the data array uninitialized. Copying the full array during
synthesis means the written event contains uninitialized memory.
Ensure the size is less that the buffer size and only copy the bytes
that were initialized. This was detected by the Clang/LLVM memory
sanitizer.
v2. Avoids the potential for copying too much as suggested by Arnaldo.
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230120185828.43231-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Setup is non-trivial so also link to the full SPE docs.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-perf-users@vger.kernel.or
Link: https://lore.kernel.org/r/20230124145929.557891-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make the sink error message more similar to the event error message that
reminds about missing kernel support. The available sinks are also
determined by the hardware so mention that too.
Also, usually it's not necessary to specify the sink, so add that as a
hint.
Now the error for a made up sink looks like this:
$ perf record -e cs_etm/@abc/
Couldn't find sink "abc" on event cs_etm/@abc/.
Missing kernel or device support?
Hint: An appropriate sink will be picked automatically if one isn't is specified.
For any error other than ENOENT, the same message as before is
displayed.
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/ec7502e6-b406-3997-c2a5-24f98e5c4854@arm.com
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230124110220.460551-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update ptrace tests according to all potential Yama security policies.
This is required to make such tests pass even if Yama is enabled.
Tests are not skipped but they now check both Landlock and Yama boundary
restrictions at run time to keep a maximum test coverage (i.e. positive
and negative testing).
Signed-off-by: Jeff Xu <jeffxu@google.com>
Link: https://lore.kernel.org/r/20230114020306.1407195-2-jeffxu@google.com
Cc: stable@vger.kernel.org
[mic: Add curly braces around EXPECT_EQ() to make it build, and improve
commit message]
Co-developed-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Following line should listen for a rising edge and exit after the first
one since '-c 1' is provided.
# gpio-event-mon -n gpiochip1 -o 0 -r -c 1
It works with kernel 4.19 but it doesn't work with 5.10. In 5.10 the
above command doesn't exit after the first rising edge it keep listening
for an event forever. The '-c 1' is not taken into an account.
The problem is in commit 62757c32d5 ("tools: gpio: add multi-line
monitoring to gpio-event-mon").
Before this commit the iterator 'i' in monitor_device() is used for
counting of the events (loops). In the case of the above command (-c 1)
we should start from 0 and increment 'i' only ones and hit the 'break'
statement and exit the process. But after the above commit counting
doesn't start from 0, it start from 1 when we listen on one line.
It is because 'i' is used from one more purpose, counting of lines
(num_lines) and it isn't restore to 0 after following code
for (i = 0; i < num_lines; i++)
gpiotools_set_bit(&values.mask, i);
Restore the initial value of the iterator to 0 in order to allow counting
of loops to work for any cases.
Fixes: 62757c32d5 ("tools: gpio: add multi-line monitoring to gpio-event-mon")
Signed-off-by: Ivo Borisov Shopov <ivoshopov@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
[Bartosz: tweak the commit message]
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Fix the build caused by missing kmsan_handle_dma() and is_power_of_2() that
are used in drivers/virtio/virtio_ring.c.
Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Message-Id: <20230110034310.779744-1-mie@igel.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The existing timestamping_enable() is a no-op because it applies
to the socket-related path that we are not verifying here
anymore. (but still leaving the code around hoping we can
have xdp->skb path verified here as well)
poll: 1 (0)
xsk_ring_cons__peek: 1
0xf64788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000
rx_hash: 3697961069
rx_timestamp: 1674657672142214773 (sec:1674657672.1422)
XDP RX-time: 1674657709561774876 (sec:1674657709.5618) delta sec:37.4196
AF_XDP time: 1674657709561871034 (sec:1674657709.5619) delta
sec:0.0001 (96.158 usec)
0xf64788: complete idx=8 addr=8000
Also, maybe something to archive here, see [0] for Jesper's note
about NIC vs host clock delta.
0: https://lore.kernel.org/bpf/f3a116dc-1b14-3432-ad20-a36179ef0608@redhat.com/
v2:
- Restore original value (Martin)
Fixes: 297a3f1241 ("selftests/bpf: Simple program to dump XDP RX metadata")
Reported-by: Jesper Dangaard Brouer <jbrouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <jbrouer@redhat.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230126225030.510629-1-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Log overflow is marked by a separate trace message.
Simulate a log with lots of messages and flag overflow until space is
cleared.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-8-2316a5c8f7d8@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Each type of event has different trace point outputs.
Add mock General Media Event, DRAM event, and Memory Module Event
records to the mock list of events returned.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-7-2316a5c8f7d8@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Facilitate testing basic Get/Clear Event functionality by creating
multiple logs and generic events with made up UUID's.
Data is completely made up with data patterns which should be easy to
spot in trace output.
A single sysfs entry resets the event data and triggers collecting the
events for testing.
Test traces are easy to obtain with a small script such as this:
#!/bin/bash -x
devices=`find /sys/devices/platform -name cxl_mem*`
# Turn on tracing
echo "" > /sys/kernel/tracing/trace
echo 1 > /sys/kernel/tracing/events/cxl/enable
echo 1 > /sys/kernel/tracing/tracing_on
# Generate fake interrupt
for device in $devices; do
echo 1 > $device/event_trigger
done
# Turn off tracing and report events
echo 0 > /sys/kernel/tracing/tracing_on
cat /sys/kernel/tracing/trace
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-6-2316a5c8f7d8@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When rendering code we should walk the ops in the order in which
they are declared in the spec. This is both more intuitive and
prevents code from jumping around when hashing in the dict changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ops_list contains all the operations, but the main iteration use
case is to walk only ops which define attrs. Rename ops_list to
msg_list, because now it looks like the contents are the same,
just the format is different. While at it convert from tuple
to just keys, none of the users care about the name of the op.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lorenzo reports that after switching from enum to flags netdev
family lost ability to render kdoc (and the enum contents got
generally garbled).
Combine the flags and enum handling in uAPI handling.
Reported-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit "tools/testing/cxl: Add XOR Math support to cxl_test" added
a module parameter to cxl_test for the interleave_arithmetic option.
In doing so, it also added this dev_dbg() message describing which
option cxl_test used during load:
"[ 111.743246] (NULL device *): cxl_test loading modulo math option"
That "(NULL device *)" has raised needless user concern.
Remove the dev_dbg() message and make the module_param readable via
sysfs for users that need to know which math option is active.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20230126170555.701240-1-alison.schofield@intel.com
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
As the number of test cases and length of execution grows it's
useful to select only a subset of tests. In TLS for instance we
have a matrix of variants for different crypto protocols and
during development mostly care about testing a handful.
This is quicker and makes reading output easier.
This patch adds argument parsing to kselftest_harness.
It supports a couple of ways to filter things, I could not come
up with one way which will cover all cases.
The first and simplest switch is -r which takes the name of
a test to run (can be specified multiple times). For example:
$ ./my_test -r some.test.name -r some.other.name
will run tests some.test.name and some.other.name (where "some"
is the fixture, "test" and "other" and "name is the test.)
Then there is a handful of group filtering options. f/v/t for
filtering by fixture/variant/test. They have both positive
(match -> run) and negative versions (match -> skip).
If user specifies any positive option we assume the default
is not to run the tests. If only negative options are set
we assume the tests are supposed to be run by default.
Usage: ./tools/testing/selftests/net/tls [-h|-l] [-t|-T|-v|-V|-f|-F|-r name]
-h print help
-l list all tests
-t name include test
-T name exclude test
-v name include variant
-V name exclude variant
-f name include fixture
-F name exclude fixture
-r name run specified test
Test filter options can be specified multiple times. The filtering stops
at the first match. For example to include all tests from variant 'bla'
but not test 'foo' specify '-T foo -v bla'.
Here we can request for example all tests from fixture "foo" to run:
./my_test -f foo
or to skip variants var1 and var2:
./my_test -V var1 -V var2
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
During the cleanup phase, the server pids were killed with a SIGTERM
directly, not using a SIGUSR1 first to quit safely. As a result, this
test was often ending with two error messages:
read: Connection reset by peer
While at it, use a for-loop to terminate all the PIDs the same way.
Also the different files are now removed after having killed the PIDs
using them. It makes more sense to do that in this order.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Before, only '[FAIL]' was printed in case of error during the validation
phase.
Now, in case of failure, the variable name, its value and expected one
are displayed to help understand what was wrong.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Instead of having a long list of conditions to check, it is possible to
give a list of variable names to compare with their 'e_XXX' version.
This will ease the introduction of the following commit which will print
which condition has failed (if any).
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This script is running a few tests after having setup the environment.
Printing titles helps understand what is being tested.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Note that we can't guess the listener family anymore based on the client
target address: always use IPv6.
The fullmesh flag with endpoints from different families is also
validated here.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Exercise IP_LOCAL_PORT_RANGE socket option in various scenarios:
1. pass invalid values to setsockopt
2. pass a range outside of the per-netns port range
3. configure a single-port range
4. exhaust a configured multi-port range
5. check interaction with late-bind (IP_BIND_ADDRESS_NO_PORT)
6. set then get the per-socket port range
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ensures that whenever bpf_setsockopt() is called with the SOL_TCP
option on a ktls enabled socket, the call will be accepted by the
system. The provided test makes sure of this by performing an
examination when the server side socket is in the CLOSE_WAIT state. At
this stage, ktls is still enabled on the server socket and can be used
to test if bpf_setsockopt() works correctly with linux.
Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
Link: https://lore.kernel.org/r/20230125201608.908230-3-kuifeng@meta.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
size_t is limited to 32-bits and so the gen_pool_alloc() using
the size of SZ_64G would map to 0, triggering a low allocation
which is not expected. Force the dependency on 64-bit for cxl_test
as that is what it was designed for.
This issue was found by build test reports when converting this
driver as a proper upstream driver.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20221219195050.325959-1-mcgrof@kernel.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In a set of prior changes, we added the ability for struct_ops programs
to be sleepable. This patch enhances the dummy_st_ops selftest suite to
validate this behavior by adding a new sleepable struct_ops entry to
dummy_st_ops.
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230125164735.785732-5-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
In a prior change, the verifier was updated to support sleepable
BPF_PROG_TYPE_STRUCT_OPS programs. A caller could set the program as
sleepable with bpf_program__set_flags(), but it would be more ergonomic
and more in-line with other sleepable program types if we supported
suffixing a struct_ops section name with .s to indicate that it's
sleepable.
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230125164735.785732-3-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
BPF struct_ops programs currently cannot be marked as sleepable. This
need not be the case -- struct_ops programs can be sleepable, and e.g.
invoke kfuncs that export the KF_SLEEPABLE flag. So as to allow future
struct_ops programs to invoke such kfuncs, this patch updates the
verifier to allow struct_ops programs to be sleepable. A follow-on patch
will add support to libbpf for specifying struct_ops.s as a sleepable
struct_ops program, and then another patch will add testcases to the
dummy_st_ops selftest suite which test sleepable struct_ops behavior.
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230125164735.785732-2-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
As stated in README.rst, in order to resolve errors with linker errors,
'LDLIBS=-static' should be used. Most problems will be solved by this
option, but in the case of urandom_read, this won't fix the problem. So
the Makefile is currently implemented to strip the 'static' option when
compiling the urandom_read. However, stripping this static option isn't
configured properly on $(LDLIBS) correctly, which is now causing errors
on static compilation.
# LDLIBS=-static ./vmtest.sh
ld.lld: error: attempted static link of dynamic object liburandom_read.so
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [Makefile:190: /linux/tools/testing/selftests/bpf/urandom_read] Error 1
make: *** Waiting for unfinished jobs....
This commit fixes this problem by configuring the strip with $(LDLIBS).
Fixes: 68084a1364 ("selftests/bpf: Fix building bpf selftests statically")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230125100440.21734-1-danieltimlee@gmail.com
HOSTCC is always wanted when building. Setting CC to HOSTCC happens
after tools/scripts/Makefile.include is included, meaning flags are
set assuming say CC is gcc, but then it can be later set to HOSTCC
which may be clang. tools/scripts/Makefile.include is needed for host
set up and common macros in objtool's Makefile. Rather than override
CC to HOSTCC, just pass CC as HOSTCC to Makefile.build, the libsubcmd
builds and the linkage step. This means the Makefiles don't see things
like CC changing and tool flag determination, and similar, work
properly.
Also, clear the passed subdir as otherwise an outer build may break by
inadvertently passing an inappropriate value.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230124064324.672022-2-irogers@google.com
Previously tools/lib/subcmd was added to the include path, switch to
installing the headers and then including from that directory. This
avoids dependencies on headers internal to tools/lib/subcmd. Add the
missing subcmd directory to the affected #include.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230124064324.672022-1-irogers@google.com
A recent patch added a new set of kfuncs for allocating, freeing,
manipulating, and querying cpumasks. This patch adds a new 'cpumask'
selftest suite which verifies their behavior.
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230125143816.721952-5-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now that defining trusted fields in a struct is supported, we should add
selftests to verify the behavior. This patch adds a few such testcases.
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230125143816.721952-4-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
KF_TRUSTED_ARGS kfuncs currently have a subtle and insidious bug in
validating pointers to scalars. Say that you have a kfunc like the
following, which takes an array as the first argument:
bool bpf_cpumask_empty(const struct cpumask *cpumask)
{
return cpumask_empty(cpumask);
}
...
BTF_ID_FLAGS(func, bpf_cpumask_empty, KF_TRUSTED_ARGS)
...
If a BPF program were to invoke the kfunc with a NULL argument, it would
crash the kernel. The reason is that struct cpumask is defined as a
bitmap, which is itself defined as an array, and is accessed as a memory
address by bitmap operations. So when the verifier analyzes the
register, it interprets it as a pointer to a scalar struct, which is an
array of size 8. check_mem_reg() then sees that the register is NULL and
returns 0, and the kfunc crashes when it passes it down to the cpumask
wrappers.
To fix this, this patch adds a check for KF_ARG_PTR_TO_MEM which
verifies that the register doesn't contain a possibly-NULL pointer if
the kfunc is KF_TRUSTED_ARGS.
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230125143816.721952-2-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Update the selftests to include a test of passing a stacktrace between the
events of a synthetic event.
Link: https://lkml.kernel.org/r/20230117152236.475439286@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Ross Zwisler <zwisler@google.com>
Cc: Ching-lin Yu <chinglinyu@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
With the new filter logic of passing in the name of a function to match an
instruction pointer (or the address of the function), add a test to make
sure that it is functional.
This is also the first test to test plain filtering. The filtering has
been tested via the trigger logic, which uses the same code, but there was
nothing to test just the event filter, so this test is the first to add
such a case.
Link: https://lkml.kernel.org/r/20221219183214.075559302@goodmis.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Zheng Yejian <zhengyejian1@huawei.com>
Cc: linux-kselftest@vger.kernel.org
Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Today we test if a child socket is cloned properly from a listening socket
inside a sockmap only when there are no BPF programs attached to the map.
A bug has been reported [1] for the case when sockmap has a verdict program
attached. So cover this case as well to prevent regressions.
[1]: https://lore.kernel.org/r/00000000000073b14905ef2e7401@google.com
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20230113-sockmap-fix-v2-4-1e0ee7ac2f90@cloudflare.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Following patch extends the sockmap ops tests to cover the scenario when a
sockmap with attached programs holds listening sockets.
Pass the BPF skeleton to sockmap ops test so that the can access and attach
the BPF programs.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20230113-sockmap-fix-v2-3-1e0ee7ac2f90@cloudflare.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When developing tests, it is much faster to use the QEMU Linux
emulator instead of the system emulator, which among other things avoids
kernel-build latencies. Although use of the QEMU Linux emulator does have
its limitations (please see below), it is sufficient to test startup code,
stdlib code, and syscall calling conventions.
However, the current mainline Linux-kernel nolibc setup does not
support this. Therefore, add a "run-user" target that immediately
executes the prebuilt executable.
Again, this approach does have its limitations. For example, the
executable runs with the user's privilege level, which can cause some
false-positive failures due to insufficient permissions. In addition,
if the underlying kernel is old enough to lack some features that
nolibc relies on, the result will be false-positive failures in the
corresponding tests. However, for nolibc changes not affected by these
limittions, the result is a much faster code-compile-test-debug cycle.
With this patch, running a userland test is as simple as issuing:
make ARCH=xxx CROSS_COMPILE=xxx run-user
Signed-off-by: Willy Tarreau <w@1wt.eu>
Tested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Building the kernel with ARCH=x86_64 works fine, but nolibc-test
only supports "x86", which causes errors when reusing existing build
environment. Let's permit this environment to be used as well by making
nolibc also accept ARCH=x86_64.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Tested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Remove the assumption from kvm_binary_stats_test that all stats are
laid out contiguously in memory. The current stats in KVM are
contiguously laid out in memory, but that may change in the future and
the ABI specifically allows holes in the stats data (since each stat
exposes its own offset).
While here drop the check that each stats' offset is less than
size_data, as that is now always true by construction.
Link: https://lore.kernel.org/kvm/20221208193857.4090582-9-dmatlack@google.com/
Fixes: 0b45d58738 ("KVM: selftests: Add selftest for KVM statistics data binary interface")
Signed-off-by: Jing Zhang <jingzhangos@google.com>
[dmatlack: Re-worded the commit message.]
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20230117222707.3949974-1-dmatlack@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Use the host CPU's native hypercall instruction, i.e. VMCALL vs. VMMCALL,
in kvm_hypercall(), as relying on KVM to patch in the native hypercall on
a #UD for the "wrong" hypercall requires KVM_X86_QUIRK_FIX_HYPERCALL_INSN
to be enabled and flat out doesn't work if guest memory is encrypted with
a private key, e.g. for SEV VMs.
Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
Link: https://lore.kernel.org/r/20230111004445.416840-4-vannapurve@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Cache the host CPU vendor for userspace and share it with guest code.
All the current callers of this_cpu* actually care about host cpu so
they are updated to check host_cpu_is*.
Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
Link: https://lore.kernel.org/r/20230111004445.416840-3-vannapurve@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Replace is_intel/amd_cpu helpers with this_cpu_* helpers to better
convey the intent of querying vendor of the current cpu.
Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
Link: https://lore.kernel.org/r/20230111004445.416840-2-vannapurve@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
The assert incorrectly identifies the ioctl being called. Switch it
from KVM_GET_MSRS to KVM_SET_MSRS.
Fixes: 6ebfef83f0 ("KVM: selftest: Add proper helpers for x86-specific save/restore ioctls")
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20221209201326.2781950-1-aaronlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
kvm_vm_elf_load() and elfhdr_get() open one file each, but they
never close the opened file descriptor. If a test repeatedly
creates and destroys a VM with __vm_create(), which
(directly or indirectly) calls those two functions, the test
might end up getting a open failure with EMFILE.
Fix those two functions to close the file descriptor.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20221220170921.2499209-2-reijiw@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add testing to show that a pmu event can be filtered with a generalized
match on it's unit mask.
These tests set up test cases to demonstrate various ways of filtering
a pmu event that has multiple unit mask values. It does this by
setting up the filter in KVM with the masked events provided, then
enabling three pmu counters in the guest. The test then verifies that
the pmu counters agree with which counters should be counting and which
counters should be filtered for both a sparse filter list and a dense
filter list.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/r/20221220161236.555143-8-aaronlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Test that masked events are not using invalid bits, and if they are,
ensure the pmu event filter is not accepted by KVM_SET_PMU_EVENT_FILTER.
The only valid bits that can be used for masked events are set when
using KVM_PMU_ENCODE_MASKED_ENTRY() with one exception: If any of the
high bits (35:32) of the event select are set when using Intel, the pmu
event filter will fail.
Also, because validation was not being done prior to the introduction
of masked events, only expect validation to fail when masked events
are used. E.g. in the first test a filter event with all its bits set
is accepted by KVM_SET_PMU_EVENT_FILTER when flags = 0.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/r/20221220161236.555143-7-aaronlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Now that the flags field can be non-zero, pass it in when creating a
pmu event filter.
This is needed in preparation for testing masked events.
No functional change intended.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/r/20221220161236.555143-6-aaronlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
ARM:
* Fix the PMCR_EL0 reset value after the PMU rework
* Correctly handle S2 fault triggered by a S1 page table walk
by not always classifying it as a write, as this breaks on
R/O memslots
* Document why we cannot exit with KVM_EXIT_MMIO when taking
a write fault from a S1 PTW on a R/O memslot
* Put the Apple M2 on the naughty list for not being able to
correctly implement the vgic SEIS feature, just like the M1
before it
* Reviewer updates: Alex is stepping down, replaced by Zenghui
x86:
* Fix various rare locking issues in Xen emulation and teach lockdep
to detect them
* Documentation improvements
* Do not return host topology information from KVM_GET_SUPPORTED_CPUID
Add a CLI sample which can take in arbitrary request
in JSON format, convert it to Netlink and do the inverse
for output.
It's meant as a development tool primarily and perhaps
for selftests which need to tickle netlink in a special way.
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Code generators to turn Netlink specs into C code.
I'm definitely not proud of it.
The main generator is in Python, there's a bash script
to regen all code-gen'ed files in tree after making
spec changes.
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
William reports kernel soft-lockups on some OVS topologies when TC mirred
egress->ingress action is hit by local TCP traffic [1].
The same can also be reproduced with SCTP (thanks Xin for verifying), when
client and server reach themselves through mirred egress to ingress, and
one of the two peers sends a "heartbeat" packet (from within a timer).
Enqueueing to backlog proved to fix this soft lockup; however, as Cong
noticed [2], we should preserve - when possible - the current mirred
behavior that counts as "overlimits" any eventual packet drop subsequent to
the mirred forwarding action [3]. A compromise solution might use the
backlog only when tcf_mirred_act() has a nest level greater than one:
change tcf_mirred_forward() accordingly.
Also, add a kselftest that can reproduce the lockup and verifies TC mirred
ability to account for further packet drops after TC mirred egress->ingress
(when the nest level is 1).
[1] https://lore.kernel.org/netdev/33dc43f587ec1388ba456b4915c75f02a8aae226.1663945716.git.dcaratti@redhat.com/
[2] https://lore.kernel.org/netdev/Y0w%2FWWY60gqrtGLp@pop-os.localdomain/
[3] such behavior is not guaranteed: for example, if RPS or skb RX
timestamping is enabled on the mirred target device, the kernel
can defer receiving the skb and return NET_RX_SUCCESS inside
tcf_mirred_forward().
Reported-by: William Zhao <wizhao@redhat.com>
CC: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
In commit 72653ae530 ("selftests: net: tcp_mmap:
Use huge pages in send path") I made a change to use hugepages
for the buffer used by the client (tx path)
Today, I understood that the cause for poor zerocopy
performance was that after a mmap() for a 512KB memory
zone, kernel uses a single zeropage, mapped 128 times.
This was really the reason for poor tx path performance
in zero copy mode, because this zero page refcount is
under high pressure, especially when TCP ACK packets
are processed on another cpu.
We need either to force a COW on all the memory range,
or use MAP_POPULATE so that a zero page is not abused.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230120181136.3764521-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Each architecture supports at least 6 syscall argument registers, so now
that specs for each architecture is defined in bpf_tracing.h, remove
unnecessary macro overrides, which previously were required to keep
existing BPF_KSYSCALL() uses compiling and working.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230120200914.3008030-26-andrii@kernel.org
Turns out splice() is one of the syscalls that's using current maximum
number of arguments (six). This is perfect for testing, so extend
bpf_syscall_macro selftest to also trace splice() syscall, using
BPF_KSYSCALL() macro. This makes sure all the syscall argument register
definitions are correct.
Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x
Link: https://lore.kernel.org/bpf/20230120200914.3008030-25-andrii@kernel.org
Define explicit table of registers used for syscall argument passing.
Note that 7th arg is supported on 32-bit powerpc architecture, by not on
powerpc64.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230120200914.3008030-20-andrii@kernel.org
Define explicit table of registers used for syscall argument passing.
We need PT_REGS_PARM1_[CORE_]SYSCALL macros overrides, similarly to
s390x, due to orig_x0 not being present in UAPI's pt_regs, so we need to
utilize BPF CO-RE and custom pt_regs___arm64 definition.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64
Link: https://lore.kernel.org/bpf/20230120200914.3008030-18-andrii@kernel.org
Define explicit table of registers used for syscall argument passing.
Note that we need custom overrides for PT_REGS_PARM1_[CORE_]SYSCALL
macros due to the need to use BPF CO-RE and custom local pt_regs
definitions to fetch orig_gpr2, storing 1st argument.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/20230120200914.3008030-16-andrii@kernel.org
Define explicit table of registers used for syscall argument passing.
Remove now unnecessary overrides of PT_REGS_PARM5_[CORE_]SYSCALL macros.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230120200914.3008030-14-andrii@kernel.org
Set up generic support in bpf_tracing.h for up to 7 syscall arguments
tracing with BPF_KSYSCALL, which seems to be the limit according to
syscall(2) manpage. Also change the way that syscall convention is
specified to be more explicit. Subsequent patches will adjust and define
proper per-architecture syscall conventions.
__PT_PARM1_SYSCALL_REG through __PT_PARM6_SYSCALL_REG is added
temporarily to keep everything working before each architecture has
syscall reg tables defined. They will be removed afterwards.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64
Link: https://lore.kernel.org/bpf/20230120200914.3008030-13-andrii@kernel.org
Update uprobe_autoattach selftest to validate architecture-specific
argument passing through registers. Use new BPF_UPROBE and
BPF_URETPROBE, and construct both BPF-side and user-space side in such
a way that for different architectures we are fetching and checking
different number of arguments, matching architecture-specific limit of
how many registers are available for argument passing.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x
Link: https://lore.kernel.org/bpf/20230120200914.3008030-12-andrii@kernel.org
Add BPF_UPROBE and BPF_URETPROBE macros, aliased to BPF_KPROBE and
BPF_KRETPROBE, respectively. This makes uprobe-based BPF program code
much less confusing, especially to people new to tracing, at no cost in
terms of maintainability. We'll use this macro in selftests in
subsequent patch.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230120200914.3008030-11-andrii@kernel.org
Add PARM6 through PARM8 definitions. Also fix frame pointer (FP)
register definition. Also leave a link to where to find ABI spec.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230120200914.3008030-9-andrii@kernel.org
Add PARM6 through PARM8 definitions for RISC V (riscv) arch. Leave the
link for ABI doc for future reference.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Pu Lehui <pulehui@huawei.com> # RISC-V
Link: https://lore.kernel.org/bpf/20230120200914.3008030-8-andrii@kernel.org
Add definitions of PARM6 through PARM8 for powerpc architecture. Add
also a link to a functiona call sequence documentation for future reference.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230120200914.3008030-6-andrii@kernel.org
Add registers for PARM6 through PARM8. Add a link to an ABI. We don't
distinguish between O32, N32, and N64, so document that we assume N64
right now.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230120200914.3008030-5-andrii@kernel.org
Remove invalid support for PARM5 on 32-bit arm, as per ABI. Add three
more argument registers for arm64. Also leave links to ABI specs for
future reference.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64
Link: https://lore.kernel.org/bpf/20230120200914.3008030-4-andrii@kernel.org
Add r9 as register containing 6th argument on x86-64 architecture, as
per its ABI. Add also a link to a page describing ABI for easier future
reference.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230120200914.3008030-3-andrii@kernel.org
Add BPF_KPROBE() and PT_REGS_PARMx() support for up to 8 arguments, if
target architecture supports this. Currently all architectures are
limited to only 5 register-placed arguments, which is limiting even on
x86-64.
This patch adds generic macro machinery to support up to 8 arguments
both when explicitly fetching it from pt_regs through PT_REGS_PARMx()
macros, as well as more ergonomic access in BPF_KPROBE().
Also, for i386 architecture we now don't have to define fake PARM4 and
PARM5 definitions, they will be generically substituted, just like for
PARM6 through PARM8.
Subsequent patches will fill out architecture-specific definitions,
where appropriate.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x
Link: https://lore.kernel.org/bpf/20230120200914.3008030-2-andrii@kernel.org
In commit 537c3f66ea ("selftests/bpf: add generic BPF program tester-loader"),
a new mechanism was added to the BPF selftest framework to allow testsuites to
use macros to define expected failing testcases.
This allows any testsuite which tests verification failure to remove a good
amount of boilerplate code. This patch updates the task_kfunc selftest suite
to use these new macros.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20230120021844.3048244-1-void@manifault.com
The test_cmd_destroy_access() should end with a semicolon, so add one.
There is a test_ioctl_destroy(ioas_id) following already, so drop one.
Fixes: 57f0988706 ("iommufd: Add a selftest")
Link: https://lore.kernel.org/r/20230120074204.1368-1-nicolinc@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
To be used for verification of driver implementations. Note that
the skb path is gone from the series, but I'm still keeping the
implementation for any possible future work.
$ xdp_hw_metadata <ifname>
On the other machine:
$ echo -n xdp | nc -u -q1 <target> 9091 # for AF_XDP
$ echo -n skb | nc -u -q1 <target> 9092 # for skb
Sample output:
# xdp
xsk_ring_cons__peek: 1
0x19f9090: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000
rx_timestamp_supported: 1
rx_timestamp: 1667850075063948829
0x19f9090: complete idx=8 addr=8000
# skb
found skb hwtstamp = 1668314052.854274681
Decoding:
# xdp
rx_timestamp=1667850075.063948829
$ date -d @1667850075
Mon Nov 7 11:41:15 AM PST 2022
$ date
Mon Nov 7 11:42:05 AM PST 2022
# skb
$ date -d @1668314052
Sat Nov 12 08:34:12 PM PST 2022
$ date
Sat Nov 12 08:37:06 PM PST 2022
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-18-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
- create new netns
- create veth pair (veTX+veRX)
- setup AF_XDP socket for both interfaces
- attach bpf to veRX
- send packet via veTX
- verify the packet has expected metadata at veRX
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-12-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Generic check has a different error message, update the selftest.
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-7-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
New flag BPF_F_XDP_DEV_BOUND_ONLY plus all the infra to have a way
to associate a netdev with a BPF program at load time.
netdevsim checks are dropped in favor of generic check in dev_xdp_attach.
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-6-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
The perf test named “build id cache operations” skips with below error
on some distros:
<<>>
78: build id cache operations :
test child forked, pid 111101
WARNING: wine not found. PE binaries will not be run.
test binaries: /tmp/perf.ex.SHA1.PKz /tmp/perf.ex.MD5.Gt3 ./tests/shell/../pe-file.exe
DEBUGINFOD_URLS=
Adding 4abd406f041feb4f10ecde3fc30fd0639e1a91cb /tmp/perf.ex.SHA1.PKz: Ok
build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
./tests/shell/buildid.sh: 69: ./tests/shell/buildid.sh: Bad substitution
test child finished with -2
build id cache operations: Skip
<<>>
The test script "tests/shell/buildid.sh" uses some of the string
substitution ways which are supported in bash, but not in "sh" or other
shells. Above error on line number 69 that reports "Bad substitution"
is:
<<>>
link=${build_id_dir}/.build-id/${id:0:2}/${id:2}
<<>>
Here the way of getting first two characters from id ie, ${id:0:2} and
similarly expressions like ${id:2} is not recognised in "sh". So the
line errors and instead of hitting failure, the test gets skipped as
shown in logs. So the syntax issue causes test not to be executed in
such cases. Similarly usage : "${@: -1}" [ to pick last argument passed
to a function] in “test_record” doesn’t work in all distros.
Fix this by using alternative way with shell substitution to pick
required characters from the string. Also fix the usage of “${@: -1}” to
work in all cases.
Another usage in “test_record” is:
<<>>
${perf} record --buildid-all -o ${data} $@ &> ${log}
<<>>
This causes the 'perf record' to start in background and Results in the
data file not being created by the time "check" function is invoked.
Below log shows 'perf record' result getting displayed after the call to
"check" function.
<<>>
running: perf record /tmp/perf.ex.SHA1.EAU
build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
link: /tmp/perf.debug.mLT/.build-id/4a/bd406f041feb4f10ecde3fc30fd0639e1a91cb
failed: link /tmp/perf.debug.mLT/.build-id/4a/bd406f041feb4f10ecde3fc30fd0639e1a91cb does not exist
test child finished with -1
build id cache operations: FAILED!
root@machine:~/athira/linux/tools/perf# Couldn't synthesize bpf events.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.010 MB /tmp/perf.data.bFF ]
<<>>
Fix this by redirecting output instead of using “&” which starts the
command in background.
Reviewed-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230119142719.32628-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix various spelling errors as reported by Debian's lintian tool.
"amount of times" -> "number of times"
ocurrence -> occurrence
upto -> up to
Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230122122034.48020-1-didi.debian@cknow.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
BPF filtering tests can sometime fail. Running the test in verbose mode
shows the following:
$ sudo perf test 42
42: BPF filter :
42.1: Basic BPF filtering : FAILED!
42.2: BPF pinning : Skip
42.3: BPF prologue generation : Skip
$ perf --version
perf version 4.18.0-425.3.1.el8.ppc64le
$ sudo perf test -v 42
42: BPF filter :
42.1: Basic BPF filtering :
--- start ---
test child forked, pid 711060
...
bpf: config 'func=do_epoll_wait' is ok
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/lib/modules/4.18.0-425.3.1.el8.ppc64le/vmlinux for symbols
Open Debuginfo file: /usr/lib/debug/.build-id/81/56f5a07f92ccb62c5600ba0e4aacfb5f3a7534.debug
Try to find probe point from debuginfo.
Matched function: do_epoll_wait [4ef8cb0]
found inline addr: 0xc00000000061dbe4
Probe point found: __se_compat_sys_epoll_pwait+196
found inline addr: 0xc00000000061d9f4
Probe point found: __se_sys_epoll_pwait+196
found inline addr: 0xc00000000061d824
Probe point found: __se_sys_epoll_wait+36
Found 3 probe_trace_events.
Opening /sys/kernel/tracing//kprobe_events write=1
...
BPF filter result incorrect, expected 56, got 56 samples
test child finished with -1
---- end ----
BPF filter subtest 1: FAILED!
The statement above about the result being incorrect looks weird, and it
is due to that particular perf build missing commit 3e11300cdf
("perf test: Fix bpf test sample mismatch reporting"). In reality, due
to commit 4b04e0decd ("perf test: Fix basic bpf filtering test"),
perf expects there to be 56*3 samples.
However, the number of samples we receive is going to be dependent on
where the probes are installed, which is dependent on where
do_epoll_wait gets inlined. On s390x, it looks like probes at all the
inlined locations are hit. But, that is not the case on ppc64le.
Fix this by switching the test to instead use the syscall tracepoint.
This ensures that we will only ever install a single event enabling us
to reliably determine the sample count.
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20230123083224.276404-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are some edge cases around estimated timestamps that can result
in them going backwards.
One is that after a discontinuity, the last used timestamp is set to 0.
The duration of the next range is then subtracted which could result in
an earlier timestamp than the last instruction. Fix this by not
resetting the last timestamp used on a discontinuity, and make sure that
new estimated timestamps are clamped to be later than that.
Another case is that estimated timestamps could compound over time to
end up being more than the next real timestamp in the trace. Fix this by
clamping the estimates in cs_etm_decoder__do_soft_timestamp() to be no
later than it.
cs_etm_decoder__do_soft_timestamp() also updated next_cs_timestamp,
which meant that the next real timestamp was lost and not stored
anywhere. Fix that by only updating cs_timestamp for estimates and keep
next_cs_timestamp untouched.
Finally, use next_cs_timestamp to signify if a timestamp has been
received previously. Because cs_timestamp has the first range
subtracted, it could technically go to 0 which would break the logic.
Testing
=======
It can be verified that timestamps don't go backwards when tracing on a
single core with the following commands. Across multiple cores it's
expected that timestamps are interleaved:
$ perf record -e cs_etm/@tmc_etr0/k -C 4 taskset -c 4 sleep 1
$ perf script --itrace=i1ns --ns -Fcomm,tid,pid,time,cpu,event,ip,sym,addr,symoff,flags,callindent > itrace
$ sed 's/://g' itrace | awk -F ' ' ' { print $4 } ' | awk '{ if ($1 < prev) { print "line:" NR " " $0 } {prev=$1}}'
Reported-by: Tanmay Jagdale <tanmay@marvell.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-9-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If virtual timestamps are detected, set sample time field accordingly,
otherwise warn the user that the samples will not include accurate
time data.
| Test notes (FEAT_TRF platform)
|
| $ ./perf record -e cs_etm//u -a -- sleep 4
| $ ./perf script --fields +time
| perf 422 [000] 163.375100: 1 branches:uH: 0 [unknown] ([unknown])
| perf 422 [000] 163.375100: 1 branches:uH: ffffb8009544 ioctl+0x14 (/lib/aarch64-linux-gnu/libc-2.27.so)
| perf 422 [000] 163.375100: 1 branches:uH: aaaaab6bebf4 perf_evsel__run_ioctl+0x90 (/home/german/linux/tools/perf/perf)
| [...]
| perf 422 [000] 167.393100: 1 branches:uH: aaaaab6bda00 __xyarray__entry+0x74 (/home/german/linux/tools/perf/perf)
| perf 422 [000] 167.393099: 1 branches:uH: aaaaab6bda0c __xyarray__entry+0x80 (/home/german/linux/tools/perf/perf)
| perf 422 [000] 167.393099: 1 branches:uH: ffffb8009538 ioctl+0x8 (/lib/aarch64-linux-gnu/libc-2.27.so)
|
| The time from the first sample to the last sample is 4 seconds
Now that times are converted to nanoseconds, also try to estimate the
timestamps more accurately be dividing by some fixed value for
instructions per ns. This prevents long ranges from being estimated
too far in the past than would be realistic.
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-8-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Read the value of ts_source exposed by the driver and store it in the
ETMv4 and ETE header. If the interface doesn't exist (such as in older
Kernels), defaults to a safe value of -1.
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-7-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Previously, adding a new parameter at the end of ETMv4 meant adding it
somewhere in the middle of ETE, which is not supported by the current
header version.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-6-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a utility function perf_pmu__file_exists() to check if a given pmu
file exists in the sysfs filesystem.
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-5-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove some code that duplicates existing methods. Copy strings where
const strings are required.
No functional changes.
Committer notes:
Add a stub for erf_pmu__scan_file() in tools/perf/util/python.c not to
drag tools/perf/util/pmu.c into the python binding.
This fixes 'perf test python' at this point in this patchset.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The pattern for accessing EVENT_SOURCE_DEVICE_PATH is duplicated in a
few places, so add two utility functions to cover it. Also just use
perf_pmu__scan_file() instead of pmu_type() which already does the same
thing.
No functional changes.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the context of LBR stitching documentation.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230119201036.156441-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make the code more readable by checking for SHT_RELA and SHT_REL type
earlier.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
SHT_REL and SHT_RELA are handled the same way. Simplify by combining the
handling.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Create a sensible name for .plt entries with no symbol.
Example:
Before:
$ perf test --dso /usr/lib/x86_64-linux-gnu/libc.so.6 -vv Symbols 2>/tmp/cmp1.txt
After:
$ perf test --dso /usr/lib/x86_64-linux-gnu/libc.so.6 -vv Symbols 2>/tmp/cmp2.txt
$ diff /tmp/cmp1.txt /tmp/cmp2.txt
4c4
< test child forked, pid 53043
---
> test child forked, pid 54372
23,62c23,62
< 280f0-28100 g @plt
< 28100-28110 g @plt
< 28110-28120 g @plt
< 28120-28130 g @plt
< 28130-28140 g @plt
< 28140-28150 g @plt
< 28150-28160 g @plt
< 28160-28170 g @plt
< 28170-28180 g @plt
< 28180-28190 g @plt
< 28190-281a0 g @plt
< 281a0-281b0 g @plt
< 281b0-281c0 g @plt
< 281c0-281d0 g @plt
< 281d0-281e0 g @plt
< 281e0-281f0 g @plt
< 281f0-28200 g @plt
< 28200-28210 g @plt
< 28210-28220 g @plt
< 28220-28230 g @plt
< 28230-28240 g @plt
< 28240-28250 g @plt
< 28250-28260 g @plt
< 28260-28270 g @plt
< 28270-28280 g @plt
< 28280-28290 g @plt
< 28290-282a0 g @plt
< 282a0-282b0 g @plt
< 282b0-282c0 g @plt
< 282c0-282d0 g @plt
< 282d0-282e0 g @plt
< 282e0-282f0 g @plt
< 282f0-28300 g @plt
< 28300-28310 g @plt
< 28310-28320 g @plt
< 28320-28330 g @plt
< 28330-28340 g @plt
< 28340-28350 g @plt
< 28350-28360 g @plt
< 28360-28370 g @plt
---
> 280f0-28100 g offset_0x280f0@plt
> 28100-28110 g offset_0x28100@plt
> 28110-28120 g offset_0x28110@plt
> 28120-28130 g offset_0x28120@plt
> 28130-28140 g offset_0x28130@plt
> 28140-28150 g offset_0x28140@plt
> 28150-28160 g offset_0x28150@plt
> 28160-28170 g offset_0x28160@plt
> 28170-28180 g offset_0x28170@plt
> 28180-28190 g offset_0x28180@plt
> 28190-281a0 g offset_0x28190@plt
> 281a0-281b0 g offset_0x281a0@plt
> 281b0-281c0 g offset_0x281b0@plt
> 281c0-281d0 g offset_0x281c0@plt
> 281d0-281e0 g offset_0x281d0@plt
> 281e0-281f0 g offset_0x281e0@plt
> 281f0-28200 g offset_0x281f0@plt
> 28200-28210 g offset_0x28200@plt
> 28210-28220 g offset_0x28210@plt
> 28220-28230 g offset_0x28220@plt
> 28230-28240 g offset_0x28230@plt
> 28240-28250 g offset_0x28240@plt
> 28250-28260 g offset_0x28250@plt
> 28260-28270 g offset_0x28260@plt
> 28270-28280 g offset_0x28270@plt
> 28280-28290 g offset_0x28280@plt
> 28290-282a0 g offset_0x28290@plt
> 282a0-282b0 g offset_0x282a0@plt
> 282b0-282c0 g offset_0x282b0@plt
> 282c0-282d0 g offset_0x282c0@plt
> 282d0-282e0 g offset_0x282d0@plt
> 282e0-282f0 g offset_0x282e0@plt
> 282f0-28300 g offset_0x282f0@plt
> 28300-28310 g offset_0x28300@plt
> 28310-28320 g offset_0x28310@plt
> 28320-28330 g offset_0x28320@plt
> 28330-28340 g offset_0x28330@plt
> 28340-28350 g offset_0x28340@plt
> 28350-28360 g offset_0x28350@plt
> 28360-28370 g offset_0x28360@plt
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf expands the _init symbol over .plt because there are no PLT symbols
at that point, but then dso__synthesize_plt_symbols() creates them.
Fix by truncating the previous symbol and inserting a symbol for .plt
header.
Example:
Before:
$ perf test --dso `which uname` -v Symbols
74: Symbols :
--- start ---
test child forked, pid 191028
Problems creating module maps, continuing anyway...
Testing /usr/bin/uname
Overlapping symbols:
2000-25f0 g _init
2040-2050 g free@plt
test child finished with -1
---- end ----
Symbols: FAILED!
$ perf test --dso `which uname` -vv Symbols 2>/tmp/cmp1.txt
After:
$ perf test --dso `which uname` -v Symbols
74: Symbols :
--- start ---
test child forked, pid 194291
Testing /usr/bin/uname
test child finished with 0
---- end ----
Symbols: Ok
$ perf test --dso `which uname` -vv Symbols 2>/tmp/cmp2.txt
$ diff /tmp/cmp1.txt /tmp/cmp2.txt
4,5c4
< test child forked, pid 191031
< Problems creating module maps, continuing anyway...
---
> test child forked, pid 194296
9c8,9
< 2000-25f0 g _init
---
> 2000-2030 g _init
> 2030-2040 g .plt
100,103c100
< Overlapping symbols:
< 2000-25f0 g _init
< 2040-2050 g free@plt
< test child finished with -1
---
> test child finished with 0
105c102
< Symbols: FAILED!
---
> Symbols: Ok
$
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
ss->dynsym is checked to be not NULL twice. Remove the first check
because, in fact, there can be a plt with no dynsym, which is something
that will be dealt with later.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Return zero directly instead of needless 'goto out_elf_end' that does
the same thing. That allows 'err' to be initialized to -1 instead of
having to change its value later.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Symbols should not be cached when there are more symbols still to add.
Add dso__find_symbol_nocache() to facilitate that.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The code expects non-zero plt_entry_size. Check it and add a debug
message to print if it is zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out get_plt_sizes() to make the code more readable and further
changes to dso__synthesize_plt_symbols() easier to follow.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a test to check function symbols do not overlap and are not zero
length.
The main motivation for the test is to make it easier to review changes
to PLT symbol synthesis i.e. changes to dso__synthesize_plt_symbols().
By default the test uses the perf executable as a test DSO, but a
specific DSO can be specified via a new perf test option "--dso".
The test is useful in the following ways:
- Any DSO can be tested, even ones that do not run on the current
architecture. For example, using cross-compiled DSOs to see how
well perf handles different architectures.
- With verbose > 1 (e.g. -vv), all the symbols are printed, which
makes it easier to see issues.
- perf removes duplicate symbols and expands zero-length symbols
to reach the next symbol, however that is done before adding
synthesized symbols, so the test is checking those also.
Example:
$ perf test -v Symbols
74: Symbols :
--- start ---
test child forked, pid 154918
Testing /home/user/bin/perf
Overlapping symbols:
7d000-7f3a0 g _init
7d030-7d040 g __printf_chk@plt
test child finished with -1
---- end ----
Symbols: FAILED!
Note the test fails because perf expands the _init symbol over the PLT
because there are no PLT symbols at that point, but then
dso__synthesize_plt_symbols() creates them.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
reclaim_period_ms used to be positive only but the commit 0001725d0f
("KVM: selftests: Add atoi_positive() and atoi_non_negative() for input
validation") incorrectly changed it to non-negative validation.
Change validation to allow only positive input.
Fixes: 0001725d0f ("KVM: selftests: Add atoi_positive() and atoi_non_negative() for input validation")
Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reported-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230111183408.104491-1-vipinsh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Placing a declaration of evt_reset is pedantically invalid
according to the C standard. While GCC does not really care
and only warns with -Wpedantic, clang ignores the declaration
altogether with an error:
x86_64/xen_shinfo_test.c:965:2: error: expected expression
struct kvm_xen_hvm_attr evt_reset = {
^
x86_64/xen_shinfo_test.c:969:38: error: use of undeclared identifier evt_reset
vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &evt_reset);
^
Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Reported-by: Sean Christopherson <seanjc@google.com>
Fixes: a79b53aaaa ("KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESET", 2022-12-28)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
1. fopen sysfs without fclose.
2. asprintf filename without free.
3. if asprintf return error,do not need to free the buffer.
Signed-off-by: Yulong Zhang <yulong.zhang@metoak.net>
Link: https://lore.kernel.org/r/20230117025147.69890-1-yulong.zhang@metoak.net
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
First test that we allow overwriting dynptr slots and reinitializing
them in unreferenced case, and disallow overwriting for referenced case.
Include tests to ensure slices obtained from destroyed dynptrs are being
invalidated on their destruction. The destruction needs to be scoped, as
in slices of dynptr A should not be invalidated when dynptr B is
destroyed. Next, test that MEM_UNINIT doesn't allow writing dynptr stack
slots.
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-13-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Try creating a dynptr, then overwriting second slot with first slot of
another dynptr. Then, the first slot of first dynptr should also be
invalidated, but without our fix that does not happen. As a consequence,
the unfixed case allows passing first dynptr (as the kernel check only
checks for slot_type and then first_slot == true).
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-12-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Ensure that variable offset is handled correctly, and verifier takes
both fixed and variable part into account. Also ensures that only
constant var_off is allowed.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-11-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add verifier tests that verify the new pruning behavior for STACK_DYNPTR
slots, and ensure that state equivalence takes into account changes to
the old and current verifier state correctly. Also ensure that the
stacksafe changes are actually enabling pruning in case states are
equivalent from pruning PoV.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-10-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The previous commit implemented destroy_if_dynptr_stack_slot. It
destroys the dynptr which given spi belongs to, but still doesn't
invalidate the slices that belong to such a dynptr. While for the case
of referenced dynptr, we don't allow their overwrite and return an error
early, we still allow it and destroy the dynptr for unreferenced dynptr.
To be able to enable precise and scoped invalidation of dynptr slices in
this case, we must be able to associate the source dynptr of slices that
have been obtained using bpf_dynptr_data. When doing destruction, only
slices belonging to the dynptr being destructed should be invalidated,
and nothing else. Currently, dynptr slices belonging to different
dynptrs are indistinguishible.
Hence, allocate a unique id to each dynptr (CONST_PTR_TO_DYNPTR and
those on stack). This will be stored as part of reg->id. Whenever using
bpf_dynptr_data, transfer this unique dynptr id to the returned
PTR_TO_MEM_OR_NULL slice pointer, and store it in a new per-PTR_TO_MEM
dynptr_id register state member.
Finally, after establishing such a relationship between dynptrs and
their slices, implement precise invalidation logic that only invalidates
slices belong to the destroyed dynptr in destroy_if_dynptr_stack_slot.
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-5-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Currently, while reads are disallowed for dynptr stack slots, writes are
not. Reads don't work from both direct access and helpers, while writes
do work in both cases, but have the effect of overwriting the slot_type.
While this is fine, handling for a few edge cases is missing. Firstly,
a user can overwrite the stack slots of dynptr partially.
Consider the following layout:
spi: [d][d][?]
2 1 0
First slot is at spi 2, second at spi 1.
Now, do a write of 1 to 8 bytes for spi 1.
This will essentially either write STACK_MISC for all slot_types or
STACK_MISC and STACK_ZERO (in case of size < BPF_REG_SIZE partial write
of zeroes). The end result is that slot is scrubbed.
Now, the layout is:
spi: [d][m][?]
2 1 0
Suppose if user initializes spi = 1 as dynptr.
We get:
spi: [d][d][d]
2 1 0
But this time, both spi 2 and spi 1 have first_slot = true.
Now, when passing spi 2 to dynptr helper, it will consider it as
initialized as it does not check whether second slot has first_slot ==
false. And spi 1 should already work as normal.
This effectively replaced size + offset of first dynptr, hence allowing
invalid OOB reads and writes.
Make a few changes to protect against this:
When writing to PTR_TO_STACK using BPF insns, when we touch spi of a
STACK_DYNPTR type, mark both first and second slot (regardless of which
slot we touch) as STACK_INVALID. Reads are already prevented.
Second, prevent writing to stack memory from helpers if the range may
contain any STACK_DYNPTR slots. Reads are already prevented.
For helpers, we cannot allow it to destroy dynptrs from the writes as
depending on arguments, helper may take uninit_mem and dynptr both at
the same time. This would mean that helper may write to uninit_mem
before it reads the dynptr, which would be bad.
PTR_TO_MEM: [?????dd]
Depending on the code inside the helper, it may end up overwriting the
dynptr contents first and then read those as the dynptr argument.
Verifier would only simulate destruction when it does byte by byte
access simulation in check_helper_call for meta.access_size, and
fail to catch this case, as it happens after argument checks.
The same would need to be done for any other non-trivial objects created
on the stack in the future, such as bpf_list_head on stack, or
bpf_rb_root on stack.
A common misunderstanding in the current code is that MEM_UNINIT means
writes, but note that writes may also be performed even without
MEM_UNINIT in case of helpers, in that case the code after handling meta
&& meta->raw_mode will complain when it sees STACK_DYNPTR. So that
invalid read case also covers writes to potential STACK_DYNPTR slots.
The only loophole was in case of meta->raw_mode which simulated writes
through instructions which could overwrite them.
A future series sequenced after this will focus on the clean up of
helper access checks and bugs around that.
Fixes: 97e03f5210 ("bpf: Add verifier support for dynptrs")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Currently, the dynptr function is not checking the variable offset part
of PTR_TO_STACK that it needs to check. The fixed offset is considered
when computing the stack pointer index, but if the variable offset was
not a constant (such that it could not be accumulated in reg->off), we
will end up a discrepency where runtime pointer does not point to the
actual stack slot we mark as STACK_DYNPTR.
It is impossible to precisely track dynptr state when variable offset is
not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc.
simply reject the case where reg->var_off is not constant. Then,
consider both reg->off and reg->var_off.value when computing the stack
pointer index.
A new helper dynptr_get_spi is introduced to hide over these details
since the dynptr needs to be located in multiple places outside the
process_dynptr_func checks, hence once we know it's a PTR_TO_STACK, we
need to enforce these checks in all places.
Note that it is disallowed for unprivileged users to have a non-constant
var_off, so this problem should only be possible to trigger from
programs having CAP_PERFMON. However, its effects can vary.
Without the fix, it is possible to replace the contents of the dynptr
arbitrarily by making verifier mark different stack slots than actual
location and then doing writes to the actual stack address of dynptr at
runtime.
Fixes: 97e03f5210 ("bpf: Add verifier support for dynptrs")
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This Kselftest fixes update for Linux 6.2-rc5 consists of a single
fix address error seen during unconfigured LLVM builds.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmPKv0MACgkQCwJExA0N
QxyVzg//SDB26SRVRIClssnq66PUlT6/MDBSRIO1fyzsurgvpZuB0As2sMBN8JB9
lfmMcG/bR61FYpF9aUAnObz9HhV9Kil/oEik63sTZkE+3LjWbOyePQqykjtqk6/z
Rk1Y4ZDrwVztR8XiwzlOEuIyeR/tdnkwJlx4WE0VI+NcsAKzI5012guYbr59tnN0
6eKm1ZjCy5wLSyiEVEYgCMgX7mJEvWYiqSQl/kGdffuIu+4QSJYnEXY5HMMnYIiX
HasoIBTnAC1yNdeLEc9rQF5WETzOO5PZdHypkgKwkwAzxPcM2VsCLYwbItgJqTj6
qhoJSrXlwQ8lVrnS+GXR5rAInObhWmN4WSTq4sX0WJWE8gSbYH03ieMAkJRDPmWE
jb/WolrPuSIUSkuGRN6cTBM0I08BeNKKPMh26F1RXf0/eDa51FjaOUqFOJohSi9k
JUprlni/tgLygj+A6GUVP4gjd1K6jzLEFXrbBPFztn1lxiuRkdW+XqpoExDB9xtd
D88K+wvthc5sYYfOxwkDLftDVgjjRSpQ4XiZ2DHxyAQWy3dmRDA1xVjhDW+Qj2aA
8lWRxDAk5Jaqz8E1DFh/pcwHKnsY12Lt0s0iJ4Vq1H7FFcoXiPa4TfzTMOF4xuXa
OZRPzfrbrURd/Lvwrr4uCGbCI+J3Ky/Exz27VDocX+OTLJDMtTY=
=O4Lm
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-fixes-6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fix from Shuah Khan:
"Fix an error seen during unconfigured LLVM builds"
* tag 'linux-kselftest-fixes-6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kselftest: Fix error message for unconfigured LLVM builds
Current release - regressions:
- Revert "net: team: use IFF_NO_ADDRCONF flag to prevent ipv6
addrconf", fix nsna_ping mode of team
- wifi: mt76: fix bugs in Rx queue handling and DMA mapping
- eth: mlx5:
- add missing mutex_unlock in error reporter
- protect global IPsec ASO with a lock
Current release - new code bugs:
- rxrpc: fix wrong error return in rxrpc_connect_call()
Previous releases - regressions:
- bluetooth: hci_sync: fix use of HCI_OP_LE_READ_BUFFER_SIZE_V2
- wifi:
- mac80211: fix crashes on Rx due to incorrect initialization of
rx->link and rx->link_sta
- mac80211: fix bugs in iTXQ conversion - Tx stalls, incorrect
aggregation handling, crashes
- brcmfmac: fix regression for Broadcom PCIe wifi devices
- rndis_wlan: prevent buffer overflow in rndis_query_oid
- netfilter: conntrack: handle tcp challenge acks during connection
reuse
- sched: avoid grafting on htb_destroy_class_offload when destroying
- virtio-net: correctly enable callback during start_xmit, fix stalls
- tcp: avoid the lookup process failing to get sk in ehash table
- ipa: disable ipa interrupt during suspend
- eth: stmmac: enable all safety features by default
Previous releases - always broken:
- bpf:
- fix pointer-leak due to insufficient speculative store bypass
mitigation (Spectre v4)
- skip task with pid=1 in send_signal_common() to avoid a splat
- fix BPF program ID information in BPF_AUDIT_UNLOAD as well as
PERF_BPF_EVENT_PROG_UNLOAD events
- fix potential deadlock in htab_lock_bucket from same bucket index
but different map_locked index
- bluetooth:
- fix a buffer overflow in mgmt_mesh_add()
- hci_qca: fix driver shutdown on closed serdev
- ISO: fix possible circular locking dependency
- CIS: hci_event: fix invalid wait context
- wifi: brcmfmac: fixes for survey dump handling
- mptcp: explicitly specify sock family at subflow creation time
- netfilter: nft_payload: incorrect arithmetics when fetching VLAN
header bits
- tcp: fix rate_app_limited to default to 1
- l2tp: close all race conditions in l2tp_tunnel_register()
- eth: mlx5: fixes for QoS config and eswitch configuration
- eth: enetc: avoid deadlock in enetc_tx_onestep_tstamp()
- eth: stmmac: fix invalid call to mdiobus_get_phy()
Misc:
- ethtool: add netlink attr in rss get reply only if the value is
not empty
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmPKxFAACgkQMUZtbf5S
Irtdlg/+OUggv3sKhcpUv39SnmMyiIhnNj9KhG+25Iiy7MJxaoNntCGsW3KXXTGo
JylOGMesociz+hCv4xHl9J61uwmz+qrUPKqqi7hSbEoHlAa3OrubQb+LW4/x0Jp0
bNJqqAYt04C+txhvsuF9odfZbKtvQ7RIU0XEzqEoES4UXTYQoCHEAwfn5twKDNNS
/x9OLSZnctAv1pinKZ8QTLjz0IZwHaBAbWNXkLe/HEu9nGrUndFtA5rJjyzrjw10
ZltTDfV2lr3SWVHsJShnTJ64u+aPBGmJmVzeNw64qRrmnYdFMCpUVoH222IurexO
aVPY9WUOwgUovetB8fmhPF0+n5Aa6pbTb4toQB1oVZ8X0h7WNrdfXZug1QDQOMbC
eGzsNdk6hvOeqBhbIKPLQXzaIxbPyXM+KUUbOxi+V4dahG79vG2BaQsrpFymueVs
cna7pL8dE1S9dR3SEB0KW4nyoWIugukZrzuX0efv1hxovuWn4yNJBt2lp8gQwY6v
yTk93Ou2LYDrm4yXLrHHWYNXU1u68Pq0o14xbx7tOYGan/evqfaaa1lmAvj7b1bq
g19FB4IrwA/1ZBoaOIMV8Ml7u5ww9LAFzJRAClEptOopADN4Gro2jgUYWjmxm+uV
RdlpQ2mI8iEeEH0FOITmdlFy7cbh7TWIkoiXHcCWifgfUE7sxnY=
=F3be
-----END PGP SIGNATURE-----
Merge tag 'net-6.2-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless, bluetooth, bpf and netfilter.
Current release - regressions:
- Revert "net: team: use IFF_NO_ADDRCONF flag to prevent ipv6
addrconf", fix nsna_ping mode of team
- wifi: mt76: fix bugs in Rx queue handling and DMA mapping
- eth: mlx5:
- add missing mutex_unlock in error reporter
- protect global IPsec ASO with a lock
Current release - new code bugs:
- rxrpc: fix wrong error return in rxrpc_connect_call()
Previous releases - regressions:
- bluetooth: hci_sync: fix use of HCI_OP_LE_READ_BUFFER_SIZE_V2
- wifi:
- mac80211: fix crashes on Rx due to incorrect initialization of
rx->link and rx->link_sta
- mac80211: fix bugs in iTXQ conversion - Tx stalls, incorrect
aggregation handling, crashes
- brcmfmac: fix regression for Broadcom PCIe wifi devices
- rndis_wlan: prevent buffer overflow in rndis_query_oid
- netfilter: conntrack: handle tcp challenge acks during connection
reuse
- sched: avoid grafting on htb_destroy_class_offload when destroying
- virtio-net: correctly enable callback during start_xmit, fix stalls
- tcp: avoid the lookup process failing to get sk in ehash table
- ipa: disable ipa interrupt during suspend
- eth: stmmac: enable all safety features by default
Previous releases - always broken:
- bpf:
- fix pointer-leak due to insufficient speculative store bypass
mitigation (Spectre v4)
- skip task with pid=1 in send_signal_common() to avoid a splat
- fix BPF program ID information in BPF_AUDIT_UNLOAD as well as
PERF_BPF_EVENT_PROG_UNLOAD events
- fix potential deadlock in htab_lock_bucket from same bucket
index but different map_locked index
- bluetooth:
- fix a buffer overflow in mgmt_mesh_add()
- hci_qca: fix driver shutdown on closed serdev
- ISO: fix possible circular locking dependency
- CIS: hci_event: fix invalid wait context
- wifi: brcmfmac: fixes for survey dump handling
- mptcp: explicitly specify sock family at subflow creation time
- netfilter: nft_payload: incorrect arithmetics when fetching VLAN
header bits
- tcp: fix rate_app_limited to default to 1
- l2tp: close all race conditions in l2tp_tunnel_register()
- eth: mlx5: fixes for QoS config and eswitch configuration
- eth: enetc: avoid deadlock in enetc_tx_onestep_tstamp()
- eth: stmmac: fix invalid call to mdiobus_get_phy()
Misc:
- ethtool: add netlink attr in rss get reply only if the value is not
empty"
* tag 'net-6.2-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
Revert "Merge branch 'octeontx2-af-CPT'"
tcp: fix rate_app_limited to default to 1
bnxt: Do not read past the end of test names
net: stmmac: enable all safety features by default
octeontx2-af: add mbox to return CPT_AF_FLT_INT info
octeontx2-af: update cpt lf alloc mailbox
octeontx2-af: restore rxc conf after teardown sequence
octeontx2-af: optimize cpt pf identification
octeontx2-af: modify FLR sequence for CPT
octeontx2-af: add mbox for CPT LF reset
octeontx2-af: recover CPT engine when it gets fault
net: dsa: microchip: ksz9477: port map correction in ALU table entry register
selftests/net: toeplitz: fix race on tpacket_v3 block close
net/ulp: use consistent error code when blocking ULP
octeontx2-pf: Fix the use of GFP_KERNEL in atomic context on rt
tcp: avoid the lookup process failing to get sk in ehash table
Revert "net: team: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf"
MAINTAINERS: add networking entries for Willem
net: sched: gred: prevent races when adding offloads to stats
l2tp: prevent lockdep issue in l2tp_tunnel_register()
...
It looks like a copy-paste error to describe the ZA buffer size using (the
number of P registers * the maximum size of a Z register). This doesn't
have practical impact though as we're always allocating enough space even
for the architectural maximum ZA storage, with SVL equals to 2048 bits.
Switch to use ZA_SIG_REGS_SIZE(SVE_VQ_MAX). setup_za() will need to
initialize two 64MB arraies with this change and can be optimized later (if
someone complain).
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221218092942.1940-2-yuzenghui@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
It was introduced in commit b77e995e3b ("kselftest/arm64: Add a test
program to exercise the syscall ABI") but never actually used. Remove it.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221218092942.1940-1-yuzenghui@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>