'perf inject' with Coresight data generates files that cannot be opened
when only the last branch option is specified:
perf inject -i perf.data --itrace=l -o inject.data
perf script -i inject.data
0x33faa8 [0x8]: failed to process type: 9 [Bad address]
This is because cs_etm__synth_instruction_sample() is called even when
the sample type for instructions hasn't been setup. Last branch records
are attached to instruction samples so it doesn't make sense to generate
them when --itrace=i isn't specified anyway.
This change disables all calls of cs_etm__synth_instruction_sample()
unless --itrace=i is specified, resulting in a file with no samples if
only --itrace=l is provided, rather than a bad file.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220210200620.1227232-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
sample_branches and sample_instructions are already saved in the
synth_opts struct. Other usages like synth_opts.last_branch don't save a
value, so make this more consistent by always going through synth_opts
and not saving duplicate values.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220210200620.1227232-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The function trace__symbols_init() runs "perf-read-vdso32" and that ends up
with a SIGCHLD delivered to 'perf'. And this SIGCHLD make perf exit early.
'perf trace' should exit only if the SIGCHLD is from our workload process.
So let's use sigaction() instead of signal() to match such condition.
Committer notes:
Use memset to zero the 'struct sigaction' variable as the '= { 0 }'
method isn't accepted in many compiler versions, e.g.:
4 34.02 alpine:3.6 : FAIL clang version 4.0.0 (tags/RELEASE_400/final)
builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces]
struct sigaction sigchld_act = { 0 };
^
{}
builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers]
struct sigaction sigchld_act = { 0 };
^
2 errors generated.
6 32.60 alpine:3.8 : FAIL gcc version 6.4.0 (Alpine 6.4.0)
builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces]
struct sigaction sigchld_act = { 0 };
^
{}
builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers]
struct sigaction sigchld_act = { 0 };
^
2 errors generated.
7 34.82 alpine:3.9 : FAIL gcc version 8.3.0 (Alpine 8.3.0)
builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces]
struct sigaction sigchld_act = { 0 };
^
{}
builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers]
struct sigaction sigchld_act = { 0 };
^
2 errors generated.
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220208140725.3947-1-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is a spelling mistake in a debug message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220214093547.44590-1-colin.i.king@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Return the result from hist_entry_iter__add() directly instead of taking
this in another redundant variable.
Signed-off-by: tangmeng <tangmeng@uniontech.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220216030425.27779-1-tangmeng@uniontech.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The variable 'err' in the perf_event__process_sample() is only used in
the only one judgment statement, it is not used in other places.
So, use the return value from hist_entry_iter__add() directly instead of
taking this in another redundant variable.
Signed-off-by: tangmeng <tangmeng@uniontech.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220216030425.27779-2-tangmeng@uniontech.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When testing metric expressions we fake counter values from 1 going
upward. For some metrics this can yield negative values that are clipped
to zero, and then cause divide by zero failures.
Such clipping is questionable but may be a result of tools automatically
generating metrics. A workaround for this case is to try a second time
with counter values going in the opposite direction.
This case was seen in a metric like:
event1 / max(event2 - event3, 0)
But it may also happen in more sensible metrics like:
event1 / (event2 + event3 - 1 - event4)
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211223185622.3435128-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that a config flag for branch broadcast has been added, take it into
account when trying to deduce what the driver would have programmed the
TRCCONFIGR register to.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/20220113091056.1297982-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some code in builtin-c2c.c calls bitmap_weight() to check if any bit of
a given bitmap is set.
It's better to use bitmap_empty() in that case because bitmap_empty()
stops traversing the bitmap as soon as it finds first set bit, while
bitmap_weight() counts all bits unconditionally.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: David Laight <david.laight@aculab.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: http://lore.kernel.org/lkml/20220123183925.1052919-13-yury.norov@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make the --tui command line flags dependent HAVE_SLANG_SUPPORT. This was
reported as confusing in:
https://lore.kernel.org/linux-perf-users/YevaTkzdXmFKdGpc@zx-spectrum.none/
Reported-by: xaizek <xaizek@posteo.net>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: xaizek <xaizek@posteo.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220123191849.3655855-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add documentation for Event Trace and TNT disable to the perf Intel PT man
page.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-26-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add sample flags to the PostgreSQL database definition and export.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-25-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add sample flags to the SQLite database definition and export.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-24-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, the transaction flag (x) is kept separate from branch flags.
Instead of doing the same for the interrupt disabled flags (D and t), add
all flags so that new flags will not need to be handled separately in the
future.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-23-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add Event Trace to the intel-pt-events.py script. This shows how to unpack
the raw data from the new sample events in a Python script.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-22-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Amend the display to include D and t flags in the same way as the x flag.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-21-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Similar to other Intel PT synth events, display changes to the interrupt
flag represented by the MODE.Exec packet.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-20-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Similar to other Intel PT synth events, display Event Trace events recorded
by CFE / EVD packets.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-19-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is not possible to walk the executable code without TNT packets, so
force 'quick' mode when TNT is disabled, because 'quick' mode does not walk
the code.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-18-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update sample flags to represent the state and changes to the interrupt
flag.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-17-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Synthesize an attribute event and sample events for changes to the
interrupt flag represented by the MODE.Exec packet.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Synthesize an attribute event and sample events for Intel PT Event Trace
events represented by CFE and EVD packets.
Committer notes:
Make 'struct perf_synth_intel_evd evd[]' evd[0] at the end of 'struct
perf_synth_intel_evt' as it is breaking the build with in many compilers
with (e.g. clang version 13.0.0 (Fedora 13.0.0-3.fc35)):
util/intel-pt.c:2213:31: error: field 'cfe' with variable sized type 'struct perf_synth_intel_evt' not at the end of a struct or class is a GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end]
struct perf_synth_intel_evt cfe;
^
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The change to the MODE.Exec packet means processing must distinguish
between the old and new cases. Record the Event Trace capability flag to
make that possible.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add itrace option "I" to synthesize interrupt or similar (asynchronous)
events. This will be used for Intel PT Event Trace events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Define 2 new flags to represent:
- when interrupts are disabled (D)
- when interrupt disabling toggles (t)
This gives 4 combinations:
no flag, interrupts enabled
t interrupts were enabled but become disabled
D interrupts are disabled
Dt interrupts were disabled but become enabled
Committer notes:
Those are control flow flags, as per 'tools/perf/Documentation/perf-intel-pt.txt:
<quote>
An interesting field that is not printed by default is 'flags' which can be
displayed as follows:
perf script --itrace=ibxwpe -F+flags
The flags are "bcrosyiABExgh" which stand for branch, call, return, conditional,
system, asynchronous, interrupt, transaction abort, trace begin, trace end,
in transaction, VM-entry, and VM-exit respectively.
</quote>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Similar to other Intel PT synth events, define a structure to hold
information about a change to the interrupt flag.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Similar to other Intel PT synth events, define structures to hold CFE
and EVD data.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As of Intel SDM (https://www.intel.com/sdm) version 076, there is a new
Intel PT feature called Event Trace which adds a bit to the existing
MODE.Exec packet to record the interrupt flag.
Previously, the MODE.Exec packet did not generate any events, so the
new processing required is practically the same as a new packet.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As of Intel SDM (https://www.intel.com/sdm) version 076, there is a new
Intel PT feature called Event Trace which requires 2 new packets CFE
(Control Flow Event) and EVD (Event Data).
Each Event Trace event is represented by a CFE packet that is preceded
by zero or more EVD packets. It may be bound to a following FUP (Flow
Update) packet that provides the IP.
Event Trace exposes details about asynchronous events. The CFE packet
contains a type field to identify one of the following:
1 INTR interrupt, fault, exception, NMI
2 IRET interrupt return
3 SMI system management interrupt
4 RSM resume from system management mode
5 SIPI startup interprocessor interrupt
6 INIT INIT signal
7 VMENTRY VM-Entry
8 VMEXIT VM-Entry
9 VMEXIT_INTR VM-Exit due to interrupt
10 SHUTDOWN Shutdown
For more details, refer to the Intel SDM, Intel Processor Trace chapter.
Add processing to the decoder for the new packets.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out clearing of FUP (Flow Update) event variables, to avoid code duplication.
Committer Notes:
From the Intel documentation:
<quote>
Flow Update Packets (FUP): FUPs provide the source IP addresses for
asynchronous events (interrupt and exceptions), as well as other cases
where the source address cannot be determined from the binary.
</quote>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tidy up config bit constants to use #define.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As of Intel SDM (https://www.intel.com/sdm) version 076, there is a new
Intel PT feature called Event Trace which adds a bit to the existing
MODE.Exec packet to record the interrupt flag. Amend the packet decoder and
packet decoder test accordingly.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As of Intel SDM (https://www.intel.com/sdm) version 076, there is a new
Intel PT feature called Event Trace which requires 2 new packets CFE and
EVD. Add them to the packet decoder and packet decoder test.
Committer notes:
I got the "Intel® 64 and IA-32 architectures software developer’s manual
combined volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" PDF at:
https://cdrdv2.intel.com/v1/dl/getContent/671200
And these new packets are described in page 3951:
<quote>
32.2.4
Event Trace is a capability that exposes details about the asynchronous
events, when they are generated, and when their corresponding software
event handler completes execution. These include:
o Interrupts, including NMI and SMI, including the interrupt vector when
defined.
o Faults, exceptions including the fault vector.
— Page faults additionally include the page fault address, when in context.
o Event handler returns, including IRET and RSM.
o VM exits and VM entries.¹
— VM exits include the values written to the “exit reason” and “exit qualification” VMCS fields.
INIT and SIPI events.
o TSX aborts, including the abort status returned for the RTM instructions.
o Shutdown.
Additionally, it provides indication of the status of the Interrupt Flag
(IF), to indicate when interrupts are masked.
</quote>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make test_data 'static' otherwise it will conflict with any global
variable of the same name.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
$(or ...) is available since GNU Make 3.81, and useful to shorten the
code in some places.
Covert as follows:
$(if A,A,B) --> $(or A,B)
This patch also converts:
$(if A, A, B) --> $(or A, B)
Strictly speaking, the latter is not an equivalent conversion because
GNU Make keeps spaces after commas; if A is not empty, $(if A, A, B)
expands to " A", while $(or A, B) expands to "A".
Anyway, preceding spaces are not significant in the code hunks I touched.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Note 01.org has no TremontX directory but in mapfile.csv the family and
model:
...
GenuineIntel-6-86,V1.17,/SNR/snowridgex_core_v1.17.json,core,,,
GenuineIntel-6-86,V1.17,/SNR/snowridgex_uncore_v1.17.json,uncore,,,
...
match TremontX in the perf mapfile.csv:
...
GenuineIntel-6-86,v1,tremontx,core
...
Events are at version 1.17:
https://download.01.org/perfmon/SNR
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
floating-point.json is added.
Tested:
Not tested on a SnowridgeX/TremontX, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-27-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are still at version 2:
https://download.01.org/perfmon/WSM-EX
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Westmere EX, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-26-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are still at version 2:
https://download.01.org/perfmon/WSM-EP-SP
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Westmere EP-SP, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-25-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are updated to version 1.06:
https://download.01.org/perfmon/TGL
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Tigerlake, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-24-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are still at version 14:
https://download.01.org/perfmon/SLM
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
The addition of a floating-point.json is due to events having
their topic better identified by the converter script.
Tested:
Not tested on a Silvermont, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-23-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are still at version 16:
https://download.01.org/perfmon/SNB
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Sandybridge, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-22-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are still version 52:
https://download.01.org/perfmon/SKL
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Skylake, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-21-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are still version 2:
https://download.01.org/perfmon/NHM-EP
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Nehalem EP, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-20-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are still at version 9:
https://download.01.org/perfmon/KNL
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
The addition of a floating-point.json is due to events having
their topic better identified by the converter script.
Tested:
Not tested on a Knights Landing, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-19-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are still at version 20:
https://download.01.org/perfmon/JKT
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Jaketown, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-18-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are still at version 20:
https://download.01.org/perfmon/IVT
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on an Ivytown, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 1.12:
https://download.01.org/perfmon/ICL
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on an Icelake, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 30:
https://download.01.org/perfmon/HSW
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Haswell, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are still at version 1.01:
https://download.01.org/perfmon/GLP
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
The addition of a floating-point.json is due to events having
their topic better identified by the converter script.
Tested:
Not tested on a GoldmontPlus, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are still at version 13:
https://download.01.org/perfmon/GLM
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
The addition of a floating-point.json is due to events having
their topic better identified by the converter script.
Tested:
Not tested on a Goldmont, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 17:
https://download.01.org/perfmon/BDX
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a BroadwellX, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 26:
https://download.01.org/perfmon/BDW
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Broadwell, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are still at version 4:
https://download.01.org/perfmon/BNL
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Bonnell, on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 1.11:
https://download.01.org/perfmon/ICX
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : FAILED!
91: perf all PMU test : Ok
...
Test 90 failed due to MEM_PMM_Read_Latency as the test machine
lacks optane memory, and the divide by 0 causes the metric not to
print - which is intended behavior.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are still at version 2:
https://download.01.org/perfmon/WSM-EP-DP
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : Ok
91: perf all PMU test : Ok
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are still at version 21:
https://download.01.org/perfmon/IVB
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : FAILED!
91: perf all PMU test : Ok
...
Test 90 failed for Load_Miss_Real_Latency with <not counted> events:
Performance counter stats for 'perf bench internals synthesize':
<not counted> mem_load_uops_retired.hit_lfb (0.00%)
<not counted> MEM_LOAD_UOPS_RETIRED.L1_MISS (0.00%)
<not counted> L1D_PEND_MISS.PENDING (0.00%)
558185217 ns duration_time
This is exposing a somewhat known issue with weak groups that can
be worked around with:
$ perf stat --metric-no-group -M Load_Miss_Real_Latency -a sleep 1
Performance counter stats for 'system wide':
14935022 mem_load_uops_retired.hit_lfb # 23.55 Load_Miss_Real_Latency (83.23%)
4716714 MEM_LOAD_UOPS_RETIRED.L1_MISS (66.68%)
462705675 L1D_PEND_MISS.PENDING (83.22%)
1001548340 ns duration_time
1.001548340 seconds time elapsed
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 22:
https://download.01.org/perfmon/HSX
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : FAILED!
91: perf all PMU test : Ok
...
Test 90 failed for Load_Miss_Real_Latency with <not counted> events:
Performance counter stats for 'system wide':
<not counted> mem_load_uops_retired.hit_lfb (0.00%)
<not counted> MEM_LOAD_UOPS_RETIRED.L1_MISS (0.00%)
<not counted> L1D_PEND_MISS.PENDING (0.00%)
1002638743 ns duration_time
This is exposing a somewhat known issue with weak groups that can
be worked around with:
$ perf stat --metric-no-group -M Load_Miss_Real_Latency -a sleep 1
Performance counter stats for 'system wide':
9539883 mem_load_uops_retired.hit_lfb # 25.87 Load_Miss_Real_Latency (83.24%)
10876212 MEM_LOAD_UOPS_RETIRED.L1_MISS (66.68%)
528172960 L1D_PEND_MISS.PENDING (83.26%)
1001964165 ns duration_time
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 1.14:
https://download.01.org/perfmon/CLX
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : FAILED!
91: perf all PMU test : Ok
...
Test 90 failed due to MEM_PMM_Read_Latency as the test machine lacks
optane memory, and the divide by 0 causes the metric not to print -
which is intended behavior.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are still at version 7:
https://download.01.org/perfmon/BDW-DE
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
This adds TopdownL1_SMT metrics to bdwde-metrics.json. A discussed in:
https://lore.kernel.org/all/20220129080929.837293-4-irogers@google.com/
The TMA_Metrics-full.csv was modified so that BDW-DE is in the server
column with BDX, the Page_Walks_Utilization and
Page_Walks_Utilization_SMT metrics are then copied from BDW.
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : Skip
91: perf all PMU test : Ok
...
90 skips due to a lack of floating point samples, which is
understandable.
Suggested-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 1.26:
https://download.01.org/perfmon/SKX
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
Fixes were made that allow the skx-metrics.json to successfully
generate, bringing back TopdownL1 metrics.
Tested:
$ perf test
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : Skip
91: perf all PMU test : Ok
...
90 skips due to a lack of floating point samples, which is
understandable.
Fixes: c4ad8fabd0 ("perf vendor events: Update metrics for SkyLake Server")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some Intel TMA metrics compute a ratio that may divide by 0, which
causes the metric not to print. This happens for metrics with FP_ARITH
events. If we see these events in the result and would otherwise fail,
then switch to a skip.
Also, don't early exit when processing metrics.
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220201015858.1226914-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now unmap_ip is const, make contains symbol const.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220211103415.2737789-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The maps code has its own header, move the corresponding C function
definitions to their own C file. In the process tidy and minimize
includes.
Committer notes:
Add back the 'static' for maps__init() and maps__exit().
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220211103415.2737789-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now purely accessed through new and delete, so reduce to file scope.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220211103415.2737789-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
struct maps is reference counted, using a pointer is more idiomatic.
Committer notes:
Check maps__new() return.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220211103415.2737789-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
struct maps is reference counted, using a pointer is more idiomatic.
Committer notes:
Delay:
maps = machine__kernel_maps(&vmlinux);
To after:
machine__init(&vmlinux, "", HOST_KERNEL_ID);
To avoid this on f34:
In file included from /var/home/acme/git/perf/tools/perf/util/build-id.h:10,
from /var/home/acme/git/perf/tools/perf/util/dso.h:13,
from tests/vmlinux-kallsyms.c:8:
In function ‘machine__kernel_maps’,
inlined from ‘test__vmlinux_matches_kallsyms’ at tests/vmlinux-kallsyms.c:122:22:
/var/home/acme/git/perf/tools/perf/util/machine.h:86:23: error: ‘vmlinux.kmaps’ is used uninitialized [-Werror=uninitialized]
86 | return machine->kmaps;
| ~~~~~~~^~~~~~~
tests/vmlinux-kallsyms.c: In function ‘test__vmlinux_matches_kallsyms’:
tests/vmlinux-kallsyms.c:121:34: note: ‘vmlinux’ declared here
121 | struct machine kallsyms, vmlinux;
| ^~~~~~~
cc1: all warnings being treated as errors
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220211103415.2737789-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Libbpf has deprecated the ability to keep track of object list inside
libbpf, it now requires applications to track usage multiple bpf objects
directly. Remove usage of bpf_object__next() API and hoist the tracking
logic to perf.
Signed-off-by: Christy Lee <christylee@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/bpf/20220212073054.1052880-3-andrii@kernel.org
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
bpf_load_program() API is deprecated, remove perf's usage of the
deprecated function. Add a __weak function declaration for libbpf
version compatibility.
Signed-off-by: Christy Lee <christylee@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: bpf@vger.kernel.org
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/bpf/20220212073054.1052880-2-andrii@kernel.org
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Having functions to access nsinfo reduces the places where reference
counting checking needs to be added.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220211103415.2737789-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Functions purely determine a value from the map and don't need to modify
it. Move functions to C file as they are most commonly used via a
function pointer.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220211103415.2737789-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Improve readability in perf_pmu__cpus_match() by using
perf_cpu_map__for_each_cpu().
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220211103415.2737789-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Switch from directly accessing the perf_cpu_map to using the appropriate
libperf API when possible. Using the API simplifies the job of
refactoring use of perf_cpu_map.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220211103415.2737789-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Load data directory files and provide basic raw dump and aggregated
analysis support of data directories in report mode, still with no
memory consumption optimizations.
READER_MAX_SIZE is chosen based on the results of measurements on
different machines on perf.data directory sizes >1GB. On machines
with big core count (192 cores) the difference between 1MB and 2MB
is about 4%. Other sizes (>2MB) are quite equal to 2MB.
On machines with small core count (4-24) there is no differences
between 1-16 MB sizes. So this constant is 2MB.
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/3f10c13a226c0ceb53e88a082f847b91c1ae2c25.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement compatibility checks for other modes and related command line
options: asynchronous (--aio) trace streaming and affinity (--affinity)
modes, pipe mode, AUX area tracing --snapshot and --aux-sample options,
--switch-output, --switch-output-event, --switch-max-files and
--timestamp-filename options. Parallel data streaming is compatible with
Zstd compression (--compression-level) and external control commands
(--control). CPU mask provided via -C option filters --threads
specification masks.
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/fadc1cf74057af4d5766248fcfe5cdde40732aa9.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Extend --threads option in perf record command line interface.
The option can have a value in the form of masks that specify
CPUs to be monitored with data streaming threads and its layout
in system topology. The masks can be filtered using CPU mask
provided via -C option.
The specification value can be user defined list of masks. Masks
separated by colon define CPUs to be monitored by one thread and
affinity mask of that thread is separated by slash. For example:
<cpus mask 1>/<affinity mask 1>:<cpu mask 2>/<affinity mask 2>
specifies parallel threads layout that consists of two threads
with corresponding assigned CPUs to be monitored.
The specification value can be a string e.g. "cpu", "core" or
"package" meaning creation of data streaming thread for every
CPU or core or package to monitor distinct CPUs or CPUs grouped
by core or package.
The option provided with no or empty value defaults to per-cpu
parallel threads layout creating data streaming thread for every
CPU being monitored.
Document --threads option syntax and parallel data streaming modes
in Documentation/perf-record.txt.
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/079e2619be70c465317cf7c9fdaf5fa069728c32.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Provide --threads option in perf record command line interface.
The option creates a data streaming thread for each CPU in the system.
Document --threads option in Documentation/perf-record.txt.
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/01aeae43b047f428596c4ef9f9342ab94865cedd.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce bytes_transferred and bytes_compressed stats so they
would capture statistics for the related data buffer transfers.
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/b5d598034c507dfb7544d2125500280b7d434764.1642440724.git.alexey.v.bayduraev@linux.intel.com
[ Use PRiu64 to print u64 values, fixing the build on 32-bit architectures ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce compressor object into mmap object so it could be used to
pack the data stream from the corresponding kernel data buffer.
Initialize and make use of the introduced per mmap compressor.
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: https://lore.kernel.org/r/80edc286cf6543139a7d5a91217605123aa0b50d.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce a function to calculate the total amount of data written
and use it to support the --max-size option.
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/3e2c69186641446f8ab003ec209bccc762b3394d.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce data file objects into mmap object so it could be used to
process and store data stream from the corresponding kernel data buffer.
Initialize data files located at mmap buffer objects so trace data
can be written into several data file located at data directory.
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: https://lore.kernel.org/r/177077f7734b63e5c999ccd75ac6dc3c694f0d0d.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Start thread in detached state because its management is implemented
via messaging to avoid any scaling issues. Block signals prior thread
start so only main tool thread would be notified on external async
signals during data collection. Thread affinity mask is used to assign
eligible CPUs for the thread to run. Wait and sync on thread start using
thread ack pipe.
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/95784dd9f7c81ee408eab27b50b4c09ad4cf7be6.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signal thread to terminate by closing write fd of msg pipe.
Receive THREAD_MSG__READY message as the confirmation of the
thread's termination. Stop threads created for parallel trace
streaming prior their stats processing.
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/55ef8cc5ec3a96360660d9dc1763573225325f8c.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce thread local variable and use it for threaded trace streaming.
Use thread affinity mask instead of record affinity mask in affinity
modes. Use evlist__ctlfd_update() to propagate control commands from
thread object to global evlist object to enable evlist__ctlfd_*
functionality. Move waking and sample statistic to struct record_thread
and introduce record__waking function to calculate the total number of
wakes.
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/0d127555219991c1dcd6c6bb76b24fa6b78d2932.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce thread specific data object and array of such objects
to store and manage thread local data. Implement functions to
allocate, initialize, finalize and release thread specific data.
Thread local maps and overwrite_maps arrays keep pointers to
mmap buffer objects to serve according to maps thread mask.
Thread local pollfd array keeps event fds connected to mmaps
buffers according to maps thread mask.
Thread control commands are delivered via thread local comm pipes
and ctlfd_pos fd. External control commands (--control option)
are delivered via evlist ctlfd_pos fd and handled by the main
tool thread.
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/fc9f74af6f822d9c0fa0e145c3564a760dbe3d4b.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce affinity and mmap thread masks. Thread affinity mask
defines CPUs that a thread is allowed to run on. Thread maps
mask defines mmap data buffers the thread serves to stream
profiling data from.
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/9042bf7daf988e17e17e6acbf5d29590bde869cd.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stats from discarded entries should be omitted.
But a lock class may have both good and bad entries.
If the first entry was bad, we can zero-fill the stats and only add good
stats if any.
The entry can remove the discard state if it finds a good entry later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220127000050.3011493-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of the random order, sort it by lock class name.
Before:
# perf lock info -m
Address of instance: name of class
0xffffa0d940ac5310: &dentry->d_lockref.lock
0xffffa0c20b0e1cb0: &dentry->d_lockref.lock
0xffffa0d8e051cc48: &base->lock
0xffffa0d94f992110: &anon_vma->rwsem
0xffffa0d947a4f278: (null)
0xffffa0c208f6e108: &map->lock
0xffffa0c213ad32c8: &cfs_rq->removed.lock
0xffffa0c20d695888: &parent->list_lock
0xffffa0c278775278: (null)
0xffffa0c212ad4690: &dentry->d_lockref.lock
After:
# perf lock info -m
Address of instance: name of class
0xffffa0c20d538800: &(&sig->stats_lock)->lock
0xffffa0c216d4ec40: &(&sig->stats_lock)->lock
0xffffa1fe4cb04610: &(__futex_data.queues)[i].lock
0xffffa1fe4cb07750: &(__futex_data.queues)[i].lock
0xffffa1fe4cb07b50: &(__futex_data.queues)[i].lock
0xffffa1fe4cb0b850: &(__futex_data.queues)[i].lock
0xffffa1fe4cb0bcd0: &(__futex_data.queues)[i].lock
0xffffa1fe4cb0e5d0: &(__futex_data.queues)[i].lock
0xffffa1fe4cb11ad0: &(__futex_data.queues)[i].lock
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220127000050.3011493-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As evsel__intval() returns u64, we can just use it as is.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220127000050.3011493-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hlist_head has a single entry so we can save some memory.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220127000050.3011493-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Likewise, it should use a proper name in case the task runs under
chroot.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220202070828.143303-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When reading build-id from a DSO, it should consider if it's from a
chroot task. In that case, the path is different so it needs to prepend
the root directory to access the file correctly.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220202070828.143303-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently it doesn't handle tasks in chroot properly. As filenames in
MMAP records base on their root directory, it's different than what perf
tool can see from outside.
Add filename_with_chroot() helper to deal with those cases. The
function returns a new filename only if it's in a different root
directory. Since it needs to access /proc for the process, it only
works until the task exits.
With this change, I can see symbols in my program like below.
# perf record -o- chroot myroot myprog 3 | perf report -i-
...
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. .............................
#
99.83% myprog myprog [.] loop
0.04% chroot [kernel.kallsyms] [k] fxregs_fixup
0.04% chroot [kernel.kallsyms] [k] rsm_load_seg_32
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220202070828.143303-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-02-09
We've added 126 non-merge commits during the last 16 day(s) which contain
a total of 201 files changed, 4049 insertions(+), 2215 deletions(-).
The main changes are:
1) Add custom BPF allocator for JITs that pack multiple programs into a huge
page to reduce iTLB pressure, from Song Liu.
2) Add __user tagging support in vmlinux BTF and utilize it from BPF
verifier when generating loads, from Yonghong Song.
3) Add per-socket fast path check guarding from cgroup/BPF overhead when
used by only some sockets, from Pavel Begunkov.
4) Continued libbpf deprecation work of APIs/features and removal of their
usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko
and various others.
5) Improve BPF instruction set documentation by adding byte swap
instructions and cleaning up load/store section, from Christoph Hellwig.
6) Switch BPF preload infra to light skeleton and remove libbpf dependency
from it, from Alexei Starovoitov.
7) Fix architecture-agnostic macros in libbpf for accessing syscall
arguments from BPF progs for non-x86 architectures,
from Ilya Leoshkevich.
8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be
of 16-bit field with anonymous zero padding, from Jakub Sitnicki.
9) Add new bpf_copy_from_user_task() helper to read memory from a different
task than current. Add ability to create sleepable BPF iterator progs,
from Kenny Yu.
10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and
utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski.
11) Generate temporary netns names for BPF selftests to avoid naming
collisions, from Hangbin Liu.
12) Implement bpf_core_types_are_compat() with limited recursion for
in-kernel usage, from Matteo Croce.
13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5
to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor.
14) Misc minor fixes to libbpf and selftests from various folks.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits)
selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup
bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide
libbpf: Fix compilation warning due to mismatched printf format
selftests/bpf: Test BPF_KPROBE_SYSCALL macro
libbpf: Add BPF_KPROBE_SYSCALL macro
libbpf: Fix accessing the first syscall argument on s390
libbpf: Fix accessing the first syscall argument on arm64
libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL
selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390
libbpf: Fix accessing syscall arguments on riscv
libbpf: Fix riscv register names
libbpf: Fix accessing syscall arguments on powerpc
selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro
libbpf: Add PT_REGS_SYSCALL_REGS macro
selftests/bpf: Fix an endianness issue in bpf_syscall_macro test
bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE
bpf: Fix leftover header->pages in sparc and powerpc code.
libbpf: Fix signedness bug in btf_dump_array_data()
selftests/bpf: Do not export subtest as standalone test
bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures
...
====================
Link: https://lore.kernel.org/r/20220209210050.8425-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove my old invalid email address which can be found in a couple of
files. Instead of updating it, just remove my contact data completely
from source files.
We have git and other tools which allow to figure out who is responsible
for what with recent contact data.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The ftrace.target.system_wide must be set before invoking
evlist__create_maps(), otherwise it has no effect.
Fixes: 53be502822 ("perf ftrace: Add 'latency' subcommand")
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220127132010.4836-1-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch fixes a spelling typo in error message.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211225005558.503935-1-standby24x7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For perf recording, it retrieves process info by iterating nodes in proc
fs. If we run perf in a non-root PID namespace with command:
# unshare --fork --pid perf record -e cycles -a -- test_program
... in this case, unshare command creates a child PID namespace and
launches perf tool in it, but the issue is the proc fs is not mounted
for the non-root PID namespace, this leads to the perf tool gathering
process info from its parent PID namespace.
We can use below command to observe the process nodes under proc fs:
# unshare --pid --fork ls /proc
1 137 1968 2128 3 342 48 62 78 crypto kcore net uptime
10 138 2 2142 30 35 49 63 8 devices keys pagetypeinfo version
11 139 20 2143 304 36 50 64 82 device-tree key-users partitions vmallocinfo
12 14 2011 22 305 37 51 65 83 diskstats kmsg self vmstat
128 140 2038 23 307 39 52 656 84 driver kpagecgroup slabinfo zoneinfo
129 15 2074 24 309 4 53 67 9 execdomains kpagecount softirqs
13 16 2094 241 31 40 54 68 asound fb kpageflags stat
130 164 2096 242 310 41 55 69 buddyinfo filesystems loadavg swaps
131 17 2098 25 317 42 56 70 bus fs locks sys
132 175 21 26 32 43 57 71 cgroups interrupts meminfo sysrq-trigger
133 179 2102 263 329 44 58 75 cmdline iomem misc sysvipc
134 1875 2103 27 330 45 59 76 config.gz ioports modules thread-self
135 19 2117 29 333 46 6 77 consoles irq mounts timer_list
136 1941 2121 298 34 47 60 773 cpuinfo kallsyms mtd tty
So it shows many existed tasks, since unshared command has not mounted
the proc fs for the new created PID namespace, it still accesses the
proc fs of the root PID namespace. This leads to two prominent issues:
- Firstly, PID values are mismatched between thread info and samples.
The gathered thread info are coming from the proc fs of the root PID
namespace, but samples record its PID from the child PID namespace.
- The second issue is profiled program 'test_program' returns its forked
PID number from the child PID namespace, perf tool wrongly uses this
PID number to retrieve the process info via the proc fs of the root
PID namespace.
To avoid issues, we need to mount proc fs for the child PID namespace
with the option '--mount-proc' when use unshare command:
# unshare --fork --pid --mount-proc perf record -e cycles -a -- test_program
Conversely, when the proc fs of the root PID namespace is used by child
namespace, perf tool can detect the multiple PID levels and
nsinfo__is_in_root_namespace() returns false, this patch reports error
for this case:
# unshare --fork --pid perf record -e cycles -a -- test_program
Couldn't synthesize bpf events.
Perf runs in non-root PID namespace but it tries to gather process info from its parent PID namespace.
Please mount the proc file system properly, e.g. add the option '--mount-proc' for unshare command.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20211224124014.2492751-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move NULL pointer check before dereferencing the variable.
Addresses-Coverity: 1497622 ("Derereference before null check")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ameer Hamza <amhamza.mgc@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: https://lore.kernel.org/r/20220125121141.18347-1-amhamza.mgc@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The stderr should be set to a pipe when using TUI. Otherwise it'd
print to stdout and break TUI windows with an error message.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220202070828.143303-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This updates branch sample type with missing PERF_SAMPLE_BRANCH_TYPE_SAVE.
Suggested-by: James Clark <james.clark@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/1643799443-15109-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
bpf_object__open_buffer() API is deprecated, use the unified opts
bpf_object__open_mem() API in perf instead. This requires at least
libbpf 0.0.6.
Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220125005923.418339-3-christylee@fb.com
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-01-24
We've added 80 non-merge commits during the last 14 day(s) which contain
a total of 128 files changed, 4990 insertions(+), 895 deletions(-).
The main changes are:
1) Add XDP multi-buffer support and implement it for the mvneta driver,
from Lorenzo Bianconi, Eelco Chaudron and Toke Høiland-Jørgensen.
2) Add unstable conntrack lookup helpers for BPF by using the BPF kfunc
infra, from Kumar Kartikeya Dwivedi.
3) Extend BPF cgroup programs to export custom ret value to userspace via
two helpers bpf_get_retval() and bpf_set_retval(), from YiFei Zhu.
4) Add support for AF_UNIX iterator batching, from Kuniyuki Iwashima.
5) Complete missing UAPI BPF helper description and change bpf_doc.py script
to enforce consistent & complete helper documentation, from Usama Arif.
6) Deprecate libbpf's legacy BPF map definitions and streamline XDP APIs to
follow tc-based APIs, from Andrii Nakryiko.
7) Support BPF_PROG_QUERY for BPF programs attached to sockmap, from Di Zhu.
8) Deprecate libbpf's bpf_map__def() API and replace users with proper getters
and setters, from Christy Lee.
9) Extend libbpf's btf__add_btf() with an additional hashmap for strings to
reduce overhead, from Kui-Feng Lee.
10) Fix bpftool and libbpf error handling related to libbpf's hashmap__new()
utility function, from Mauricio Vásquez.
11) Add support to BTF program names in bpftool's program dump, from Raman Shukhau.
12) Fix resolve_btfids build to pick up host flags, from Connor O'Brien.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (80 commits)
selftests, bpf: Do not yet switch to new libbpf XDP APIs
selftests, xsk: Fix rx_full stats test
bpf: Fix flexible_array.cocci warnings
xdp: disable XDP_REDIRECT for xdp frags
bpf: selftests: add CPUMAP/DEVMAP selftests for xdp frags
bpf: selftests: introduce bpf_xdp_{load,store}_bytes selftest
net: xdp: introduce bpf_xdp_pointer utility routine
bpf: generalise tail call map compatibility check
libbpf: Add SEC name for xdp frags programs
bpf: selftests: update xdp_adjust_tail selftest to include xdp frags
bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature
bpf: introduce frags support to bpf_prog_test_run_xdp()
bpf: move user_size out of bpf_test_init
bpf: add frags support to xdp copy helpers
bpf: add frags support to the bpf_xdp_adjust_tail() API
bpf: introduce bpf_xdp_get_buff_len helper
net: mvneta: enable jumbo frames if the loaded XDP program support frags
bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf program
net: mvneta: add frags support to XDP_TX
xdp: add frags support to xdp_return_{buff/frame}
...
====================
Link: https://lore.kernel.org/r/20220124221235.18993-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The x86 instruction decoder is used for both kernel instructions and
user space instructions (e.g. uprobes, perf tools Intel PT), so it is
good to update it with new instructions.
Add AVX512-FP16 instructions to x86 instruction decoder test.
A subsequent patch adds the instructions to the instruction decoder.
Reference:
Intel AVX512-FP16 Architecture Specification
June 2021
Revision 1.0
Document Number: 347407-001US
Example:
$ perf test -v "x86 instruction decoder" |& grep vfcmaddcph | head -2
Failed to decode: 62 f6 6f 48 56 cb vfcmaddcph %zmm3,%zmm2,%zmm1
Failed to decode: 62 f6 6f 48 56 8c c8 78 56 34 12 vfcmaddcph 0x12345678(%eax,%ecx,8),%zmm2,%zmm1
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20211202095029.2165714-6-adrian.hunter@intel.com
The x86 instruction decoder is used for both kernel instructions and
user space instructions (e.g. uprobes, perf tools Intel PT), so it is
good to update it with new instructions.
Add the following instructions to the x86 instruction decoder test:
User Interrupt
clui
senduipi
stui
testui
uiret
Prediction history reset
hreset
Serialize instruction execution
serialize
TSX suspend load address tracking
xresldtrk
xsusldtrk
A subsequent patch adds the instructions to the instruction decoder.
Reference:
Intel Architecture Instruction Set Extensions and Future Features
Programming Reference
May 2021
Document Number: 319433-044
Example:
$ perf test -v "x86 instruction decoder" |& grep -i hreset
Failed to decode length (4 vs expected 6): f3 0f 3a f0 c0 00 hreset $0x0
Failed to decode length (4 vs expected 6): f3 0f 3a f0 c0 00 hreset $0x0
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20211202095029.2165714-4-adrian.hunter@intel.com
Return value from perf_event__process_tracing_data() directly instead
of taking this in another redundant variable.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220112080109.666800-1-chi.minghao@zte.com.cn
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a test which allows us to test parsing an event alias with hyphens.
Since these events typically do not exist on most host systems, add the
alias to the fake pmu.
Function perf_pmu__test_parse_init() has terms added to match known test
aliases.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/1642432215-234089-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a test for aliases with hyphens in the name to ensure that the
pmu-events tables are as expects. There should be no reason why these sort
of aliases would be treated differently, but no harm in checking.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/1642432215-234089-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Event aliasing for events whose name in the form foo-bar-baz is not
supported, while foo-bar, foo_bar_baz, and other combinations are, i.e.
two hyphens are not supported.
The HiSilicon D06 platform has events in such form:
$ ./perf list sdir-home-migrate
List of pre-defined events (to be used in -e):
uncore hha:
sdir-home-migrate
[Unit: hisi_sccl,hha]
$ sudo ./perf stat -e sdir-home-migrate
event syntax error: 'sdir-home-migrate'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event>event selector. use 'perf list' to list available events
To support, add an extra PMU event symbol type for "baz", and add a new
rule in the bison file.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/1642432215-234089-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A previous patch preventing "attr->sample_period" values from being
overridden in pfm events changed a related behaviour in arm-spe.
Before said patch:
perf record -c 10000 -e arm_spe_0// -- sleep 1
Would yield an SPE event with period=10000. After the patch, the period
in "-c 10000" was being ignored because the arm-spe code initializes
sample_period to a non-zero value.
This patch restores the previous behaviour for non-libpfm4 events.
Fixes: ae5dcc8abe (“perf record: Prevent override of attr->sample_period for libpfm4 events”)
Reported-by: Chase Conklin <chase.conklin@arm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220118144054.2541-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove all but the first include of stdbool.h from cpumap.h.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220117083730.863200-1-lv.ruyi@zte.com.cn
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Switch from directly accessing the perf_cpu_map to using the appropriate
libperf API when possible. Using the API simplifies the job of
refactoring use of perf_cpu_map.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220122045811.3402706-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Value should be built as an integer.
Switch some uses of perf_cpu_map to use the library API.
Fixes: 6d18804b96 ("perf cpumap: Give CPUs their own type")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220122045811.3402706-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf script was failed to print the phys_addr for SPE profiling.
One 'dummy' event is added by SPE profiling but it doesn't have PHYS_ADDR
attribute set, perf script then exits with error.
Now referring to 'addr', use evsel__do_check_stype() to check the type.
Before:
# perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
store_filter=0,min_latency=0,event_filter=2/ -p 4064384 -- sleep 3
# perf script -F pid,tid,addr,phys_addr
Samples for 'dummy:u' event do not have PHYS_ADDR attribute set. Cannot print 'phys_addr' field.
After:
# perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
store_filter=0,min_latency=0,event_filter=2/ -p 4064384 -- sleep 3
# perf script -F pid,tid,addr,phys_addr
4064384/4064384 ffff802f921be0d0 2f921be0d0
4064384/4064384 ffff802f921be0d0 2f921be0d0
Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Yao Jin <jinyao5@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220121065954.2121900-1-liwei391@huawei.com
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Its more intention revealing, and if we're interested in the odd cases
where this may end up truncating we can do debug checks at one
centralized place.
Motivation, of all the container builds, fedora rawhide started
complaining of:
util/machine.c: In function ‘machine__create_modules’:
util/machine.c:1419:50: error: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size between 0 and 4095 [-Werror=format-truncation=]
1419 | snprintf(path, sizeof(path), "%s/%s", dir_name, dent->d_name);
| ^~
In file included from /usr/include/stdio.h:894,
from util/branch.h:9,
from util/callchain.h:8,
from util/machine.c:7:
In function ‘snprintf’,
inlined from ‘maps__set_modules_path_dir’ at util/machine.c:1419:3,
inlined from ‘machine__set_modules_path’ at util/machine.c:1473:9,
inlined from ‘machine__create_modules’ at util/machine.c:1519:7:
/usr/include/bits/stdio2.h:71:10: note: ‘__builtin___snprintf_chk’ output between 2 and 4352 bytes into a destination of size 4096
There are other places where we should use path__join(), but lets get rid of
this one first.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: Link: https://lore.kernel.org/r/YebZKjwgfdOz0lAs@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the target is a pid, not started by 'perf stat' we need to disable
the events, and in that case there is no need to setup affinities as we
use a dummy CPU map, with just one entry set to -1.
So stop doing it to avoid this needless call to sched_getaffinity():
# strace -ke sched_getaffinity perf stat -e cycles -p 241957 sleep 1
<SNIP>
sched_getaffinity(0, 512, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]) = 8
> /usr/lib64/libc-2.33.so(sched_getaffinity@@GLIBC_2.3.4+0x1a) [0xe6eea]
> /var/home/acme/bin/perf(affinity__setup+0x6a) [0x532a2a]
> /var/home/acme/bin/perf(__evlist__disable.constprop.0+0x27) [0x4b9827]
> /var/home/acme/bin/perf(cmd_stat+0x29b5) [0x431725]
> /var/home/acme/bin/perf(run_builtin+0x6a) [0x4a2cfa]
> /var/home/acme/bin/perf(main+0x612) [0x40f8c2]
> /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
> /var/home/acme/bin/perf(_start+0x2d) [0x40fadd]
<SNIP>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220117160931.1191712-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the target is a pid, not started by 'perf stat' we need to enable
the events, and in that case there is no need to setup affinities as we
use a dummy CPU map, with just one entry set to -1.
So stop doing it to avoid this needless call to sched_getaffinity():
# strace -ke sched_getaffinity perf stat -e cycles -p 241957 sleep 1
<SNIP>
sched_getaffinity(0, 512, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]) = 8
> /usr/lib64/libc-2.33.so(sched_getaffinity@@GLIBC_2.3.4+0x1a) [0xe6eea]
> /var/home/acme/bin/perf(affinity__setup+0x6a) [0x5329ca]
> /var/home/acme/bin/perf(__evlist__enable.constprop.0+0x23) [0x4b9693]
> /var/home/acme/bin/perf(enable_counters+0x14d) [0x42de5d]
> /var/home/acme/bin/perf(cmd_stat+0x2358) [0x4310c8]
> /var/home/acme/bin/perf(run_builtin+0x6a) [0x4a2cfa]
> /var/home/acme/bin/perf(main+0x612) [0x40f8c2]
> /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
> /var/home/acme/bin/perf(_start+0x2d) [0x40fadd]
<SNIP>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220117160931.1191712-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I.e. the simple:
$ perf stat sleep 1
Uses a dummy CPU map and thus there is no need to setup/cleanup
affinities to avoid IPIs, etc.
With this we're down to a sched_getaffinity() call, in the libnuma
initialization, that probably can be removed in a followup patch.
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220117160931.1191712-3-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just like with free(), NULL is checked to avoid having all callers do
it.
Its convenient for when not using affinity setup/cleanup for dummy CPU
maps, i.e. CPU maps for pid targets.
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220117160931.1191712-2-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Because of commit bf794bf52a ("powerpc/kprobes: Fix kallsyms
lookup across powerpc ABIv1 and ABIv2"), in ppc64 ABIv1, our perf
command eliminates the need to use the prefix "." at the symbol name.
But when the command "perf probe -a schedule" is executed on ppc64
ABIv1, it obtains two symbol address information through /proc/kallsyms,
for example:
cat /proc/kallsyms | grep -w schedule
c000000000657020 T .schedule
c000000000d4fdb8 D schedule
The symbol "D schedule" is not a function symbol, and perf will print:
"p:probe/schedule _text+13958584"Failed to write event: Invalid argument
Therefore, when searching symbols from map and adding probe point for
them, a symbol type check is added. If the type of symbol is not a
function, skip it.
Fixes: bf794bf52a ("powerpc/kprobes: Fix kallsyms lookup across powerpc ABIv1 and ABIv2")
Signed-off-by: Zechuan Chen <chenzechuan1@huawei.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jianlin Lv <Jianlin.Lv@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20211228111338.218602-1-chenzechuan1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
New features:
- Add 'trace' subcommand for 'perf ftrace', setting the stage for more
'perf ftrace' subcommands. Not using a subcommand yields the previous
behaviour of 'perf ftrace'.
- Add 'latency' subcommand to 'perf ftrace', that can use the function
graph tracer or a BPF optimized one, via the -b/--use-bpf option.
E.g.:
$ sudo perf ftrace latency -a -T mutex_lock sleep 1
# DURATION | COUNT | GRAPH |
0 - 1 us | 4596 | ######################## |
1 - 2 us | 1680 | ######### |
2 - 4 us | 1106 | ##### |
4 - 8 us | 546 | ## |
8 - 16 us | 562 | ### |
16 - 32 us | 1 | |
32 - 64 us | 0 | |
64 - 128 us | 0 | |
128 - 256 us | 0 | |
256 - 512 us | 0 | |
512 - 1024 us | 0 | |
1 - 2 ms | 0 | |
2 - 4 ms | 0 | |
4 - 8 ms | 0 | |
8 - 16 ms | 0 | |
16 - 32 ms | 0 | |
32 - 64 ms | 0 | |
64 - 128 ms | 0 | |
128 - 256 ms | 0 | |
256 - 512 ms | 0 | |
512 - 1024 ms | 0 | |
1 - ... s | 0 | |
The original implementation of this command was in the bcc tool.
- Support --cputype option for hybrid events in 'perf stat'.
Improvements:
- Call chain improvements for ARM64.
- No need to do any affinity setup when profiling pids.
- Reduce multiplexing with duration_time in 'perf stat' metrics.
- Improve error message for uncore events, stating that some event groups are
can only be used in system wide (-a) mode.
- perf stat metric group leader fixes/improvements, including arch specific
changes to better support Intel topdown events.
- Probe non-deprecated sysfs path 1st, i.e. try /sys/devices/system/cpu/cpuN/topology/thread_siblings
first, then the old /sys/devices/system/cpu/cpuN/topology/core_cpus.
- Disable debuginfod by default in 'perf record', to avoid stalls on distros
such as Fedora 35.
- Use unbuffered output in 'perf bench' when pipe/tee'ing to a file.
- Enable ignore_missing_thread in 'perf trace'
Fixes:
- Avoid TUI crash when navigating in the annotation of recursive functions.
- Fix hex dump character output in 'perf script'.
- Fix JSON indentation to 4 spaces standard in the ARM vendor event files.
- Fix use after free in metric__new().
- Fix IS_ERR_OR_NULL() usage in the perf BPF loader.
- Fix up cross-arch register support, i.e. when printing register names take
into account the architecture where the perf.data file was collected.
- Fix SMT fallback with large core counts.
- Don't lower case MetricExpr when parsing JSON files so as not to lose info
such as the ":G" event modifier in metrics.
perf test:
- Add basic stress test for sigtrap handling to 'perf test'.
- Fix 'perf test' failures on s/390
- Enable system wide for metricgroups test in 'perf test´.
- Use 3 digits for test numbering now we can have more tests.
Arch specific:
- Add events for Arm Neoverse N2 in the ARM JSON vendor event files
- Support PERF_MEM_LVLNUM encodings in powerpc, that came from a single
patch series, where I incorrectly merged the kernel bits, that were then
reverted after coordination with Michael Ellerman and Stephen Rothwell.
- Add ARM SPE total latency as PERF_SAMPLE_WEIGHT.
- Update AMD documentation, with info on raw event encoding.
- Add support for global and local variants of the "p_stage_cyc" sort key,
applicable to perf.data files collected on powerpc.
- Remove duplicate and incorrect aux size checks in the ARM CoreSight ETM code.
Refactorings:
- Add a perf_cpu abstraction to disambiguate CPUs and CPU map indexes, fixing
problems along the way.
- Document CPU map methods.
UAPI sync:
- Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'
- Sync UAPI files with the kernel sources: drm, msr-index, cpufeatures.
Build system
- Enable warnings through HOSTCFLAGS.
- Drop requirement for libstdc++.so for libopencsd check
libperf:
- Make libperf adopt perf_counts_values__scale() from tools/perf/util/.
- Add a stat multiplexing test to libperf.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYeQj6AAKCRCyPKLppCJ+
JwyWAQCBmU8OJxhSJQnNCwTB9zNkPPBbihvIztepOJ7zsw7JcQD+KfAidHGQvI/Y
EmXIYkmdNkWPYJafONllnKK5cckjxgI=
=aj9V
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v5.17-2022-01-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo:
"New features:
- Add 'trace' subcommand for 'perf ftrace', setting the stage for
more 'perf ftrace' subcommands. Not using a subcommand yields the
previous behaviour of 'perf ftrace'.
- Add 'latency' subcommand to 'perf ftrace', that can use the
function graph tracer or a BPF optimized one, via the -b/--use-bpf
option.
E.g.:
$ sudo perf ftrace latency -a -T mutex_lock sleep 1
# DURATION | COUNT | GRAPH |
0 - 1 us | 4596 | ######################## |
1 - 2 us | 1680 | ######### |
2 - 4 us | 1106 | ##### |
4 - 8 us | 546 | ## |
8 - 16 us | 562 | ### |
16 - 32 us | 1 | |
32 - 64 us | 0 | |
64 - 128 us | 0 | |
128 - 256 us | 0 | |
256 - 512 us | 0 | |
512 - 1024 us | 0 | |
1 - 2 ms | 0 | |
2 - 4 ms | 0 | |
4 - 8 ms | 0 | |
8 - 16 ms | 0 | |
16 - 32 ms | 0 | |
32 - 64 ms | 0 | |
64 - 128 ms | 0 | |
128 - 256 ms | 0 | |
256 - 512 ms | 0 | |
512 - 1024 ms | 0 | |
1 - ... s | 0 | |
The original implementation of this command was in the bcc tool.
- Support --cputype option for hybrid events in 'perf stat'.
Improvements:
- Call chain improvements for ARM64.
- No need to do any affinity setup when profiling pids.
- Reduce multiplexing with duration_time in 'perf stat' metrics.
- Improve error message for uncore events, stating that some event
groups are can only be used in system wide (-a) mode.
- perf stat metric group leader fixes/improvements, including arch
specific changes to better support Intel topdown events.
- Probe non-deprecated sysfs path first, i.e. try the path
/sys/devices/system/cpu/cpuN/topology/thread_siblings first, then
the old /sys/devices/system/cpu/cpuN/topology/core_cpus.
- Disable debuginfod by default in 'perf record', to avoid stalls on
distros such as Fedora 35.
- Use unbuffered output in 'perf bench' when pipe/tee'ing to a file.
- Enable ignore_missing_thread in 'perf trace'
Fixes:
- Avoid TUI crash when navigating in the annotation of recursive
functions.
- Fix hex dump character output in 'perf script'.
- Fix JSON indentation to 4 spaces standard in the ARM vendor event
files.
- Fix use after free in metric__new().
- Fix IS_ERR_OR_NULL() usage in the perf BPF loader.
- Fix up cross-arch register support, i.e. when printing register
names take into account the architecture where the perf.data file
was collected.
- Fix SMT fallback with large core counts.
- Don't lower case MetricExpr when parsing JSON files so as not to
lose info such as the ":G" event modifier in metrics.
perf test:
- Add basic stress test for sigtrap handling to 'perf test'.
- Fix 'perf test' failures on s/390
- Enable system wide for metricgroups test in 'perf test´.
- Use 3 digits for test numbering now we can have more tests.
Arch specific:
- Add events for Arm Neoverse N2 in the ARM JSON vendor event files
- Support PERF_MEM_LVLNUM encodings in powerpc, that came from a
single patch series, where I incorrectly merged the kernel bits,
that were then reverted after coordination with Michael Ellerman
and Stephen Rothwell.
- Add ARM SPE total latency as PERF_SAMPLE_WEIGHT.
- Update AMD documentation, with info on raw event encoding.
- Add support for global and local variants of the "p_stage_cyc" sort
key, applicable to perf.data files collected on powerpc.
- Remove duplicate and incorrect aux size checks in the ARM CoreSight
ETM code.
Refactorings:
- Add a perf_cpu abstraction to disambiguate CPUs and CPU map
indexes, fixing problems along the way.
- Document CPU map methods.
UAPI sync:
- Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench
mem memcpy'
- Sync UAPI files with the kernel sources: drm, msr-index,
cpufeatures.
Build system
- Enable warnings through HOSTCFLAGS.
- Drop requirement for libstdc++.so for libopencsd check
libperf:
- Make libperf adopt perf_counts_values__scale() from tools/perf/util/.
- Add a stat multiplexing test to libperf"
* tag 'perf-tools-for-v5.17-2022-01-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (115 commits)
perf record: Disable debuginfod by default
perf evlist: No need to do any affinity setup when profiling pids
perf cpumap: Add is_dummy() method
perf metric: Fix metric_leader
perf cputopo: Fix CPU topology reading on s/390
perf metricgroup: Fix use after free in metric__new()
libperf tests: Update a use of the new cpumap API
perf arm: Fix off-by-one directory path
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools headers cpufeatures: Sync with the kernel sources
tools headers UAPI: Update tools's copy of drm.h header
tools arch: Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'
perf pmu-events: Don't lower case MetricExpr
perf expr: Add debug logging for literals
perf tools: Probe non-deprecated sysfs path 1st
perf tools: Fix SMT fallback with large core counts
perf cpumap: Give CPUs their own type
perf stat: Correct first_shadow_cpu to return index
perf script: Fix flipped index and cpu
perf c2c: Use more intention revealing iterator
...
New:
- The Real Time Linux Analysis (RTLA) tool is added to the tools directory.
- Can safely filter on user space pointers with: field.ustring ~ "match-string"
- eprobes can now be filtered like any other event.
- trace_marker(_raw) now uses stream_open() to allow multiple threads to safely
write to it. Note, this could possibly break existing user space, but we will
not know until we hear about it, and then can revert the change if need be.
- New field in events to display when bottom halfs are disabled.
- Sorting of the ftrace functions are now done at compile time instead of
at bootup.
Infrastructure changes to support future efforts:
- Added __rel_loc type for trace events. Similar to __data_loc but the offset
to the dynamic data is based off of the location of the descriptor and not
the beginning of the event. Needed for user defined events.
- Some simplification of event trigger code.
- Make synthetic events process its callback better to not hinder other
event callbacks that are registered. Needed for user defined events.
And other small fixes and clean ups.
-
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYeGvcxQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qrZtAP9ICjJxX54MTErElhhUL/NFLV7wqhJi
OIAgmp6jGVRqPAD+JxQtBnGH+3XMd71ioQkTfQ1rp+jBz2ERBj2DmELUAg0=
=zmda
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"New:
- The Real Time Linux Analysis (RTLA) tool is added to the tools
directory.
- Can safely filter on user space pointers with: field.ustring ~
"match-string"
- eprobes can now be filtered like any other event.
- trace_marker(_raw) now uses stream_open() to allow multiple threads
to safely write to it. Note, this could possibly break existing
user space, but we will not know until we hear about it, and then
can revert the change if need be.
- New field in events to display when bottom halfs are disabled.
- Sorting of the ftrace functions are now done at compile time
instead of at bootup.
Infrastructure changes to support future efforts:
- Added __rel_loc type for trace events. Similar to __data_loc but
the offset to the dynamic data is based off of the location of the
descriptor and not the beginning of the event. Needed for user
defined events.
- Some simplification of event trigger code.
- Make synthetic events process its callback better to not hinder
other event callbacks that are registered. Needed for user defined
events.
And other small fixes and cleanups"
* tag 'trace-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (50 commits)
tracing: Add ustring operation to filtering string pointers
rtla: Add rtla timerlat hist documentation
rtla: Add rtla timerlat top documentation
rtla: Add rtla timerlat documentation
rtla: Add rtla osnoise hist documentation
rtla: Add rtla osnoise top documentation
rtla: Add rtla osnoise man page
rtla: Add Documentation
rtla/timerlat: Add timerlat hist mode
rtla: Add timerlat tool and timelart top mode
rtla/osnoise: Add the hist mode
rtla/osnoise: Add osnoise top mode
rtla: Add osnoise tool
rtla: Helper functions for rtla
rtla: Real-Time Linux Analysis tool
tracing/osnoise: Properly unhook events if start_per_cpu_kthreads() fails
tracing: Remove duplicate warnings when calling trace_create_file()
tracing/kprobes: 'nmissed' not showed correctly for kretprobe
tracing: Add test for user space strings when filtering on string pointers
tracing: Have syscall trace events use trace_event_buffer_lock_reserve()
...
Fedora 35 sets DEBUGINFOD_URLS by default, which might lead to
unexpected stalls in perf record exit path, when we try to cache
profiled binaries.
# DEBUGINFOD_PROGRESS=1 ./perf record -a
^C[ perf record: Woken up 1 times to write data ]
Downloading from https://debuginfod.fedoraproject.org/ 447069
Downloading from https://debuginfod.fedoraproject.org/ 1502175
Downloading \^Z
Disabling DEBUGINFOD_URLS by default in perf record and adding
debuginfod option and .perfconfig variable support to enable id.
Default without debuginfo processing:
# perf record -a
Using system debuginfod setup:
# perf record -a --debuginfod
Using custom debuginfd url:
# perf record -a --debuginfod='https://evenbetterdebuginfodserver.krava'
Adding single perf_debuginfod_setup function and using
it also in perf buildid-cache command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211209200425.303561-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The cpumap is dummy, so no need to go on figuring out affinity.o
This way we reduce the setup time for simple scenarios like:
$ perf stat sleep 1
Acked-by: Andi Kleen <andi@firstfloor.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Needed to check if a cpu_map is dummy, i.e. not a cpu map at all, for
pid monitoring scenarios.
This probably needs to move to libperf, but since perf itself is the
first and so far only user, leave it at tools/perf/util/.
Acked-by: Andi Kleen <andi@firstfloor.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Multiple events may have a metric_leader to aggregate into.
This happens for uncore events where, for example, uncore_imc is
expanded into uncore_imc_0, uncore_imc_1, etc.
Such events all have the same metric_id and should aggregate into the
first event.
The change introducing metric_ids had a bug where the metric_id was
compared to itself, creating an always true condition.
Correct this by comparing the event in the metric_evlist and the
metric_leader.
Fixes: ec5c5b3d2c ("perf metric: Encode and use metric-id as qualifier")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20220115062852.1959424-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit fdf1e29b61 ("perf expr: Add metric literals for topology.")
fails on s390:
# ./perf test -Fv 7
...
# FAILED tests/expr.c:173 #num_dies >= #num_packages
---- end ----
Simple expression parser: FAILED!
#
Investigating this issue leads to these functions:
build_cpu_topology()
+--> has_die_topology(void)
{
struct utsname uts;
if (uname(&uts) < 0)
return false;
if (strncmp(uts.machine, "x86_64", 6))
return false;
....
}
which always returns false on s390. The caller build_cpu_topology()
checks has_die_topology() return value. On false the
the struct cpu_topology::die_cpu_list is not contructed and has zero
entries. This leads to the failing comparison: #num_dies >= #num_packages.
s390 of course has a positive number of packages.
Fix this by adding s390 architecture to support CPU die list.
Output after:
# ./perf test -Fv 7
7: Simple expression parser :
--- start ---
division by zero
syntax error
---- end ----
Simple expression parser: Ok
#
Fixes: fdf1e29b61 ("perf expr: Add metric literals for topology.")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20211124090343.9436-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We shouldn't free() something that will be used in the next line, fix
it.
Fixes: b85a4d61d3 ("perf metric: Allow modifiers on metrics")
Addresses-Coverity-ID: 1494000
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211208171113.22089-1-jose.exposito89@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Relative path include works in the regular build due to -I paths but may
fail in other situations.
Fixes: 83869019c7 ("perf arch: Support register names from all archs")
Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220114064822.1806019-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up fixes and get in line with other trees, powerpc kernel
mostly this time, but BPF as well.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
libbpf bpf_map__def() API is being deprecated, replace perf's
usage with the appropriate getters and setters.
Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220108004218.355761-4-christylee@fb.com
This patch changes MetricExpr to be written out in the same case. This
enables events in metrics to use modifiers like 'G' which currently
yield parse errors when made lower case. To keep tests passing the
literal #smt_on is compared in a non-case sensitive way - #SMT_on is
present in at least SkylakeX metrics.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211126071305.3733878-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Useful for diagnosing problems with metrics.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211124001231.3277836-1-irogers@google.com
[ Fixed up perf_cpu conflict, i.e. we need to append ".cpu" to cpu__max_present_cpu() result ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Following Documentation/ABI/stable/sysfs-devices-system-cpu the
/sys/devices/system/cpu/cpuX/topology/core_cpus is deprecated in favor
of thread_siblings, so probe thread_siblings before falling back on
core_cpus.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211124001231.3277836-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
strtoull can only read a 64-bit bitmap. On an AMD EPYC core_cpus may look
like:
00000000,00000000,00000000,00000001,00000000,00000000,00000000,00000001
and so the sibling wasn't spotted. Fix by writing a simple hweight string
parser.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211124001231.3277836-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A common problem is confusing CPU map indices with the CPU, by wrapping
the CPU with a struct then this is avoided. This approach is similar to
atomic_t.
Committer notes:
To make it build with BUILD_BPF_SKEL=1 these files needed the
conversions to 'struct perf_cpu' usage:
tools/perf/util/bpf_counter.c
tools/perf/util/bpf_counter_cgroup.c
tools/perf/util/bpf_ftrace.c
Also perf_env__get_cpu() was removed back in "perf cpumap: Switch
cpu_map__build_map to cpu function".
Additionally these needed to be fixed for the ARM builds to complete:
tools/perf/arch/arm/util/cs-etm.c
tools/perf/arch/arm64/util/pmu.c
Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-49-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_stat__update_shadow_stats() and perf_stat__print_shadow_stats() use
a cpu map index rather than a CPU, but first_shadow_cpu is returning the
wrong value for this. Change first_shadow_cpu to
first_shadow_cpu_map_idx to make things agree.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-48-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>