Update the haswell metrics and events using the new tooling from:
https://github.com/intel/perfmon
The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1". The
events are unchanged but unused json values are removed. The
formatting changes increase consistency across the json files.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065510.1621979-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the goldmontplus events using the new tooling from:
https://github.com/intel/perfmon
The events are unchanged but unused json values are removed. This
increases consistency across the json files.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065510.1621979-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the goldmont events using the new tooling from:
https://github.com/intel/perfmon
The events are unchanged but unused json values are removed. This
increases consistency across the json files.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065510.1621979-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the elkhartlake events using the new tooling from:
https://github.com/intel/perfmon
The events are unchanged but unused json values are removed. This
increases consistency across the json files.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065510.1621979-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the cascadelakex metrics and events using the new tooling from:
https://github.com/intel/perfmon
The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1". The
order of metrics varies as TMA metrics are first converted and then
removed if perfmon versions are found. The events are updated with
fixes to uncore events and improved descriptions. The formatting
changes increase consistency across the json files.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065017.1621020-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the broadwellx metrics and events using the new tooling from:
https://github.com/intel/perfmon
The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1". The
order of metrics varies as TMA metrics are first converted and then
removed if perfmon versions are found. The events are updated with
fixes to uncore events and improved descriptions. The formatting
changes increase consistency across the json files.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065017.1621020-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the broadwellde metrics and events using the new tooling from:
https://github.com/intel/perfmon
The metrics vary as tma_false_sharing, MEM_Parallel_Requests and
MEM_Request_Latency are explicitly dropped from having missing events:
https://github.com/captain5050/perfmon/blob/main/scripts/create_perf_json.py#L934
The formulas also differ due to parentheses, use of exponents and
removal of redundant operations like "* 1". The events are unchanged
but unused json values are removed and implicit umasks of 0 are
dropped. This increases consistency across the json files.
mapfile.csv's version number is set to match that in the perfmon
repository.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065017.1621020-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the broadwell metrics and events using the new tooling from:
https://github.com/intel/perfmon
The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1". The
events are unchanged but unused json values are removed, implicit
umasks of 0 are dropped and duplicate short and long descriptions have
the long one dropped. This increases consistency across the json
files.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215064755.1620246-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the bonnell events using the new tooling from:
https://github.com/intel/perfmon
The events are unchanged but unused json values are removed and
implicit umasks of 0 are dropped. This increases consistency across
the json files.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215064755.1620246-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the alderlake-n metrics using the new tooling from:
https://github.com/intel/perfmon
The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1".
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215064755.1620246-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the alderlake metrics using the new tooling from:
https://github.com/intel/perfmon
The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1".
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215064755.1620246-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We test metrics with fake events with fake values. The fake values may
yield division by zero and so we count both up and down to try to
avoid this. Unfortunately this isn't sufficient for some metrics and
so don't fail the test for them.
Add the metric name to debug output.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20221215064755.1620246-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Likewise, modify ->cmp() callback to compare sample address and map
address. And add ->collapse() and ->sort() to check the actual
srcfile string. Also add ->init() to make sure it has the srcfile.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Likewise, modify ->cmp() callback to compare sample address and map
address. And add ->collapse() and ->sort() to check the actual
srcfile string. Also add ->init() to make sure it has the srcfile.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The sort_entry->cmp() will be called for eventy sample data to find a
matching entry. When it has 'srcline' sort key, that means it needs to
call addr2line or libbfd everytime.
This is not optimal because many samples will have same address and it
just can call addr2line once. So postpone the actual srcline check to
the sort_entry->collpase() and compare addresses in ->cmp().
Also it needs to add ->init() callback to make sure it has srcline info.
If a sample has a unique data, chances are the entry can be sorted out
by other (previous) keys and callbacks in sort_srcline never called.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In __hists__insert_output_entry(), it calls fmt->sort() for dynamic
entries with NULL to update column width for tracepoint fields.
But it's a hacky abuse of the sort callback, better to have a proper
callback for that. I'll add more use cases later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It has symbol_conf.disable_add2line_warn to suppress some warnings. Let's
make it consistent with others.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The srcline info is from the .debug_line section. No need to setup
addr2line subprocess if the section is missing.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20221215192817.2734573-5-namhyung@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The filename__has_section() is to check if the given section name is in
the binary. It'd be used for checking debug info for srcline.
Committer notes:
Added missing __maybe_unused to the unused filename__has_section()
arguments in tools/perf/util/symbol-minimal.c.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Parametrized events are not only a powerpc domain. They occur on other
platforms too (e.g. aarch64). They should be ignored in this testcase,
since proper setup of the parameters is out of scope of this script.
Let's not filter them out by PMU name, but rather based on the fact that
they expect a parameter.
Fixes: 451ed8058c ("perf test: Fix "all PMU test" to skip hv_24x7/hv_gpci tests on powerpc")
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20221219163008.9691-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As kbuild, this adds .DELETE_ON_ERROR special target to clean up
partially updated files on error. A known issue is the empty vmlinux.h
generted by bpftool if it failed to dump btf info.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221217225151.90387-1-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add more tests for the new filters.
$ sudo perf test contention -v
87: kernel lock contention analysis test :
--- start ---
test child forked, pid 412379
Testing perf lock record and perf lock contention
Testing perf lock contention --use-bpf
Testing perf lock record and perf lock contention at the same time
Testing perf lock contention --threads
Testing perf lock contention --lock-addr
Testing perf lock contention --type-filter
Testing perf lock contention --lock-filter
test child finished with 0
---- end ----
kernel lock contention analysis test: Ok
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Likewise, add addr_filter BPF hash map and check it with the lock
address.
$ sudo ./perf lock con -ab -L tasklist_lock -- ./perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 0.169 [sec]
contended total wait max wait avg wait type caller
18 174.09 us 25.31 us 9.67 us rwlock:W do_exit+0x36d
5 32.34 us 10.87 us 6.47 us rwlock:R do_wait+0x8b
4 15.41 us 4.73 us 3.85 us rwlock:W release_task+0x6e
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -L/--lock-filter option is to filter only given locks. The locks
can be specified by address or name (if exists).
$ sudo ./perf lock record -a sleep 1
$ sudo ./perf lock con -l
contended total wait max wait avg wait address symbol
57 1.11 ms 42.83 us 19.54 us ffff9f4140059000
15 280.88 us 23.51 us 18.73 us ffffffff9d007a40 jiffies_lock
1 20.49 us 20.49 us 20.49 us ffffffff9d0d50c0 rcu_state
1 9.02 us 9.02 us 9.02 us ffff9f41759e9ba0
$ sudo ./perf lock con -L jiffies_lock,rcu_state
contended total wait max wait avg wait type caller
15 280.88 us 23.51 us 18.73 us spinlock tick_sched_do_timer+0x93
1 20.49 us 20.49 us 20.49 us spinlock __softirqentry_text_start+0xeb
$ sudo ./perf lock con -L ffff9f4140059000
contended total wait max wait avg wait type caller
38 779.40 us 42.83 us 20.51 us spinlock worker_thread+0x50
11 216.30 us 39.87 us 19.66 us spinlock queue_work_on+0x39
8 118.13 us 20.51 us 14.77 us spinlock kthread+0xe5
Committer testing:
# uname -a
Linux quaco 6.0.12-200.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Dec 8 17:15:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
# perf lock record
^C[ perf record: Woken up 1 times to write data ]
# perf lock con -L jiffies_lock,rcu_state
contended total wait max wait avg wait type caller
# perf lock con
contended total wait max wait avg wait type caller
1 9.06 us 9.06 us 9.06 us spinlock call_timer_fn+0x24
# perf lock con -L call
ignore unknown symbol: call
contended total wait max wait avg wait type caller
1 9.06 us 9.06 us 9.06 us spinlock call_timer_fn+0x24
#
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Likewise, add type_filter BPF hash map and check it when user gave a
lock type filter.
$ sudo ./perf lock con -ab -Y rwlock -- ./perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 0.203 [sec]
contended total wait max wait avg wait type caller
15 156.19 us 19.45 us 10.41 us rwlock:W do_exit+0x36d
1 11.12 us 11.12 us 11.12 us rwlock:R do_wait+0x8b
1 5.09 us 5.09 us 5.09 us rwlock:W release_task+0x6e
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -Y/--type-filter option is to filter the result for specific lock
types only. It can accept comma-separated values. Note that it would
accept type names like one in the output. spinlock, mutex, rwsem:R and
so on.
For RW-variant lock types, it converts the name to the both variants.
In other words, "rwsem" is same as "rwsem:R,rwsem:W". Also note that
"mutex" has two different encoding - one for sleeping wait, another for
optimistic spinning. Add "mutex-spin" entry for the lock_type_table so
that we can add it for "mutex" under the table.
$ sudo ./perf lock record -a -- ./perf bench sched messaging
$ sudo ./perf lock con -E 5 -Y spinlock
contended total wait max wait avg wait type caller
802 1.26 ms 11.73 us 1.58 us spinlock __wake_up_common_lock+0x62
13 787.16 us 105.44 us 60.55 us spinlock remove_wait_queue+0x14
12 612.96 us 78.70 us 51.08 us spinlock prepare_to_wait+0x27
114 340.68 us 12.61 us 2.99 us spinlock try_to_wake_up+0x1f5
83 226.38 us 9.15 us 2.73 us spinlock folio_lruvec_lock_irqsave+0x5e
Committer notes:
Make get_type_flag() return UINT_MAX for error instad of -1UL, as that
function returns 'unsigned int' and we store the value on a 'unsigned
int' 'flags' variable which makes clang unhappy:
35 98.23 fedora:37 : FAIL clang version 15.0.6 (Fedora 15.0.6-1.fc37)
builtin-lock.c:2012:14: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always true [-Werror,-Wtautological-constant-out-of-range-compare]
if (flags != -1UL) {
~~~~~ ^ ~~~~
builtin-lock.c:2021:14: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always true [-Werror,-Wtautological-constant-out-of-range-compare]
if (flags != -1UL) {
~~~~~ ^ ~~~~
builtin-lock.c:2037:14: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always true [-Werror,-Wtautological-constant-out-of-range-compare]
if (flags != -1UL) {
~~~~~ ^ ~~~~
3 errors generated.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move it out of get_type_str() so that we can reuse the table for others
later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Check the -q and -v options first to return earlier on error.
Before:
# perf probe -q -v test
probe-definition(0): test
symbol:test file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Error: -v and -q are exclusive.
After:
# perf probe -q -v test
Error: -v and -q are exclusive.
Fixes: 5e17b28f1e ("perf probe: Add --quiet option to suppress output result message")
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20221220035702.188413-4-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The data type of the verbose variable is integer and can be negative,
replace improperly used cases in a unified manner:
1. if (verbose) => if (verbose > 0)
2. if (!verbose) => if (verbose <= 0)
3. if (XX && verbose) => if (XX && verbose > 0)
4. if (XX && !verbose) => if (XX && verbose <= 0)
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20221220035702.188413-3-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When perf uses quiet mode, perf_quiet_option() sets the 'debug_peo_args'
variable to -1, and display_attr() incorrectly determines the value of
'debug_peo_args'. As a result, unexpected information is displayed.
Before:
# perf record --quiet -- ls > /dev/null
------------------------------------------------------------
perf_event_attr:
size 128
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|PERIOD
read_format ID|LOST
disabled 1
inherit 1
mmap 1
comm 1
freq 1
enable_on_exec 1
task 1
precise_ip 3
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
------------------------------------------------------------
...
After:
# perf record --quiet -- ls > /dev/null
#
redirect_to_stderr is a similar problem.
Fixes: f78eaef0e0 ("perf tools: Allow to force redirect pr_debug to stderr.")
Fixes: ccd26741f5 ("perf tool: Provide an option to print perf_event_open args and return value")
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: martin.lau@kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: https://lore.kernel.org/r/20221220035702.188413-2-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in:
86bdf3ebcf ("KVM: Support dirty ring in conjunction with bitmap")
That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the the 'perf trace' ioctl syscall argument
beautifiers.
This is also by now used by tools/testing/selftests/kvm/, a simple test
build didn't succeed, but for another reason:
lib/kvm_util.c: In function ‘vm_enable_dirty_ring’:
lib/kvm_util.c:125:30: error: ‘KVM_CAP_DIRTY_LOG_RING_ACQ_REL’ undeclared (first use in this function); did you mean ‘KVM_CAP_DIRTY_LOG_RING’?
125 | if (vm_check_cap(vm, KVM_CAP_DIRTY_LOG_RING_ACQ_REL))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| KVM_CAP_DIRTY_LOG_RING
I'll send a separate patch for that.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/Y6H3b1Q4Msjy5Yz3@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in these csets:
ce883a2ba3 ("powerpc/32: fix syscall wrappers with 64-bit arguments")
That doesn't cause any changes in the perf tools.
This table is used in tools perf to allow features as described in the
last update to this file.
This addresses this perf build warning:
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Y6H0C5plZ4V4aiPm@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes in:
97fa21f65c ("x86/resctrl: Move MSR defines into msr-index.h")
7420ae3bb9 ("x86/intel_epb: Set Alder Lake N and Raptor Lake P normal EPB")
Addressing these tools/perf build warnings:
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
That makes the beautification scripts to pick some new entries:
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
$ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
$ diff -u before after
--- before 2022-12-20 14:28:40.893794072 -0300
+++ after 2022-12-20 14:28:54.831993914 -0300
@@ -266,6 +266,7 @@
[0xc0000104 - x86_64_specific_MSRs_offset] = "AMD64_TSC_RATIO",
[0xc000010e - x86_64_specific_MSRs_offset] = "AMD64_LBR_SELECT",
[0xc000010f - x86_64_specific_MSRs_offset] = "AMD_DBG_EXTN_CFG",
+ [0xc0000200 - x86_64_specific_MSRs_offset] = "IA32_MBA_BW_BASE",
[0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS",
[0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL",
[0xc0000302 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS_CLR",
$
Now one can trace systemwide asking to see backtraces to where that MSR
is being read/written, see this example with a previous update:
# perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
^C#
If we use -v (verbose mode) we can see what it does behind the scenes:
# perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
Using CPUID AuthenticAMD-25-21-0
0x6a0
0x6a8
New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
0x6a0
0x6a8
New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
mmap size 528384B
^C#
Example with a frequent msr:
# perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
Using CPUID AuthenticAMD-25-21-0
0x48
New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
0x48
New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
mmap size 528384B
Looking at the vmlinux_path (8 entries long)
symsrc__init: build id mismatch for vmlinux.
Using /proc/kcore for kernel data
Using /proc/kallsyms for symbols
0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
__switch_to_xtra ([kernel.kallsyms])
__switch_to ([kernel.kallsyms])
__schedule ([kernel.kallsyms])
schedule ([kernel.kallsyms])
futex_wait_queue_me ([kernel.kallsyms])
futex_wait ([kernel.kallsyms])
do_futex ([kernel.kallsyms])
__x64_sys_futex ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
__futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
__switch_to_xtra ([kernel.kallsyms])
__switch_to ([kernel.kallsyms])
__schedule ([kernel.kallsyms])
schedule_idle ([kernel.kallsyms])
do_idle ([kernel.kallsyms])
cpu_startup_entry ([kernel.kallsyms])
secondary_startup_64_no_verify ([kernel.kallsyms])
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/lkml/Y6HyTOGRNvKfCVe4@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes in:
bc7ed4d308 ("drm/i915/perf: Apply Wa_18013179988")
81d5f7d914 ("drm/i915/perf: Add 32-bit OAG and OAR formats for DG2")
8133a6daad ("drm/i915: enable PS64 support for DG2")
b76c14c8fb ("drm/i915/huc: better define HuC status getparam possible return values.")
94dfc73e7c ("treewide: uapi: Replace zero-length arrays with flexible-array members")
That doesn't add any ioctl, so no changes in tooling.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://lore.kernel.org/lkml/Y6HukoRaZh2R4j5U@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It makes sense to move the important monitor structure into rodata to
prevent accidental structure modification.
Link: https://lkml.kernel.org/r/20221122173648.4732-1-acarmina@redhat.com
Signed-off-by: Alessandro Carminati <acarmina@redhat.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Here are 2 small updates for LICENSES and some kernel files that add the
Copyleft-next license and use it in a SPDX tag as a dual-license for
some kernel files.
These have been discussed thoroughly in public on the linux-spdx mailing
list, and have the needed acks on them, as well as having been in
linux-next with no reported issues for quite some time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY6F1Qg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynGWwCfVJ+Z1CVWSFC8KaaGNiFu/gXmgNUAoKy11gWJ
8igpSNEkOiGiaGA+AvN+
=j8iu
-----END PGP SIGNATURE-----
Merge tag 'spdx-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX/License additions from Greg KH:
"Here are two small updates for LICENSES and some kernel files that add
the Copyleft-next license and use it in a SPDX tag as a dual-license
for some kernel files.
These have been discussed thoroughly in public on the linux-spdx
mailing list, and have the needed acks on them, as well as having been
in linux-next with no reported issues for quite some time"
* tag 'spdx-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
testing: use the copyleft-next-0.3.1 SPDX tag
LICENSES: Add the copyleft-next-0.3.1 license
Fixes:
- Fix potential null-ptr-deref in start_task()
- Fix kgdb console on serial port
- Add missing FORCE prerequisites in Makefile
- Drop PMD_SHIFT from calculation in pgtable.h
Enhancements:
- Implement a wrapper to align madvise() MADV_* constants with other
architectures
- If machine supports running MPE/XL, show the MPE model string
Cleanups:
- Drop duplicate kgdb console code
- Indenting fixes in setup_cmdline()
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCY6B/cgAKCRD3ErUQojoP
X85pAQCC6YpSYON3KZRfABeiDTRCKcGm72p7JQRnyj88XCq6ZAEA40T2qpRpjoYi
NaXr28mxHFYh4Z0c5Y7K5EuFTT7gAA4=
=e2Jd
-----END PGP SIGNATURE-----
Merge tag 'parisc-for-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"There is one noteable patch, which allows the parisc kernel to use the
same MADV_xxx constants as the other architectures going forward. With
that change only alpha has one entry left (MADV_DONTNEED is 6 vs 4 on
others) which is different. To prevent an ABI breakage, a wrapper is
included which translates old MADV values to the new ones, so existing
userspace isn't affected. Reason for that patch is, that some
applications wrongly used the standard MADV_xxx values even on some
non-x86 platforms and as such those programs failed to run correctly
on parisc (examples are qemu-user, tor browser and boringssl).
Then the kgdb console and the LED code received some fixes, and some
0-day warnings are now gone. Finally, the very last compile warning
which was visible during a kernel build is now fixed too (in the vDSO
code).
The majority of the patches are tagged for stable series and in
summary this patchset is quite small and drops more code than it adds:
Fixes:
- Fix potential null-ptr-deref in start_task()
- Fix kgdb console on serial port
- Add missing FORCE prerequisites in Makefile
- Drop PMD_SHIFT from calculation in pgtable.h
Enhancements:
- Implement a wrapper to align madvise() MADV_* constants with other
architectures
- If machine supports running MPE/XL, show the MPE model string
Cleanups:
- Drop duplicate kgdb console code
- Indenting fixes in setup_cmdline()"
* tag 'parisc-for-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Show MPE/iX model string at bootup
parisc: Add missing FORCE prerequisites in Makefile
parisc: Move pdc_result struct to firmware.c
parisc: Drop locking in pdc console code
parisc: Drop duplicate kgdb_pdc console
parisc: Fix locking in pdc_iodc_print() firmware call
parisc: Drop PMD_SHIFT from calculation in pgtable.h
parisc: Align parisc MADV_XXX constants with all other architectures
parisc: led: Fix potential null-ptr-deref in start_task()
parisc: Fix inconsistent indenting in setup_cmdline()
To pick the changes from:
f8b435f93b ("fscrypt: remove unused Speck definitions")
e0cefada13 ("fscrypt: Add SM4 XTS/CTS symmetric algorithm support")
That don't result in any changes in tooling, just causes this to be
rebuilt:
CC /tmp/build/perf-urgent/trace/beauty/sync_file_range.o
LD /tmp/build/perf-urgent/trace/beauty/perf-in.o
addressing this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fscrypt.h' differs from latest version at 'include/uapi/linux/fscrypt.h'
diff -u tools/include/uapi/linux/fscrypt.h include/uapi/linux/fscrypt.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Link: https://lore.kernel.org/lkml/Y6CHSS6Rn9YOqpAd@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
5e85c4ebf2 ("x86: KVM: Advertise AVX-IFMA CPUID to user space")
af2872f622 ("x86: KVM: Advertise AMX-FP16 CPUID to user space")
6a19d7aa58 ("x86: KVM: Advertise CMPccXADD CPUID to user space")
aaa65d17ee ("x86/tsx: Add a feature bit for TSX control MSR support")
b1599915f0 ("x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit")
16a7fe3728 ("KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest")
80e4c1cd42 ("x86/retbleed: Add X86_FEATURE_CALL_DEPTH")
7df548840c ("x86/bugs: Add "unknown" reporting for MMIO Stale Data")
This only causes these perf files to be rebuilt:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiaxi Chen <jiaxi.chen@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/Y6CD%2FIcEbDW5X%2FpN@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
15e15d64bd ("x86/cpufeatures: Add X86_FEATURE_XENPV to disabled-features.h")
This only causes these perf files to be rebuilt:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h
Cc: Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Y6B2w3WqifB%2FV70T@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- Add powerpc qspinlock implementation optimised for large system scalability and
paravirt. See the merge message for more details.
- Enable objtool to be built on powerpc to generate mcount locations.
- Use a temporary mm for code patching with the Radix MMU, so the writable mapping is
restricted to the patching CPU.
- Add an option to build the 64-bit big-endian kernel with the ELFv2 ABI.
- Sanitise user registers on interrupt entry on 64-bit Book3S.
- Many other small features and fixes.
Thanks to: Aboorva Devarajan, Angel Iglesias, Benjamin Gray, Bjorn Helgaas, Bo Liu, Chen
Lifu, Christoph Hellwig, Christophe JAILLET, Christophe Leroy, Christopher M. Riedl, Colin
Ian King, Deming Wang, Disha Goel, Dmitry Torokhov, Finn Thain, Geert Uytterhoeven,
Gustavo A. R. Silva, Haowen Bai, Joel Stanley, Jordan Niethe, Julia Lawall, Kajol Jain,
Laurent Dufour, Li zeming, Miaoqian Lin, Michael Jeanson, Nathan Lynch, Naveen N. Rao,
Nayna Jain, Nicholas Miehlbradt, Nicholas Piggin, Pali Rohár, Randy Dunlap, Rohan McLure,
Russell Currey, Sathvika Vasireddy, Shaomin Deng, Stephen Kitt, Stephen Rothwell, Thomas
Weißschuh, Tiezhu Yang, Uwe Kleine-König, Xie Shaowen, Xiu Jianfeng, XueBing Chen, Yang
Yingliang, Zhang Jiaming, ruanjinjie, Jessica Yu, Wolfram Sang.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmOfrj8THG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgIWtD/9mGF/ze2k+qFTo+30fb7bO8WJIDgsR
dIASnZjXV7q/45elvymhUdkQv4R7xL3pzC40P1+ZKtWzGTNe+zWUQLoALNwRK85j
8CsxZbqefGNKE5Z6ZHo9s37wsu3+jJu9yEQpGFo1LINyzeclCn5St5oqfRam+Hd/
cPF+VfvREwZ0+YOKGBhJ2EgC+Gc9xsFY7DLQsoYlu71iZZr6Z6rgZW/EY5h3RMGS
YKBoVwDsWaU0FpFWrr/rYTI6DqSr3AHr1+ftDg7ncCZMD6vQva6aMCCt94aLB1aE
vC+DNdhZlA558bXGa5yA7Wr//7aUBUIwyC60DogOeZ6vw3kD9tdEd1fbH5hmqNKY
K5bfqm28XU2959CTE8RDgsYYZvwDcfrjBIML14WZGdCQOTcGKpgOGp22o6yNb1Pq
JKpHHnVpvu2PZ/p2XdKSm9+etr2yI6lXZAEVTS7ehdtMukButjSHEVbSCEZ8tlWz
KokQt2J23BMHuSrXK6+67wWQBtdsLEk+LBOQmweiwarMocqvL/Zjz/5J7DR2DtH8
wlY3wOtB1+E5j7xZ+RgK3c3jNg5dH39ZwvFsSATWTI3P+iq6OK/bbk4q4LmZt2l9
ZIfH/CXPf9BvGCHzHa3AAd3UBbJLFwj17btMEv1wFVPS0T4LPUzkgTNTNUYeP6zL
h1e5QfgUxvKPuQ==
=7k3p
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Add powerpc qspinlock implementation optimised for large system
scalability and paravirt. See the merge message for more details
- Enable objtool to be built on powerpc to generate mcount locations
- Use a temporary mm for code patching with the Radix MMU, so the
writable mapping is restricted to the patching CPU
- Add an option to build the 64-bit big-endian kernel with the ELFv2
ABI
- Sanitise user registers on interrupt entry on 64-bit Book3S
- Many other small features and fixes
Thanks to Aboorva Devarajan, Angel Iglesias, Benjamin Gray, Bjorn
Helgaas, Bo Liu, Chen Lifu, Christoph Hellwig, Christophe JAILLET,
Christophe Leroy, Christopher M. Riedl, Colin Ian King, Deming Wang,
Disha Goel, Dmitry Torokhov, Finn Thain, Geert Uytterhoeven, Gustavo A.
R. Silva, Haowen Bai, Joel Stanley, Jordan Niethe, Julia Lawall, Kajol
Jain, Laurent Dufour, Li zeming, Miaoqian Lin, Michael Jeanson, Nathan
Lynch, Naveen N. Rao, Nayna Jain, Nicholas Miehlbradt, Nicholas Piggin,
Pali Rohár, Randy Dunlap, Rohan McLure, Russell Currey, Sathvika
Vasireddy, Shaomin Deng, Stephen Kitt, Stephen Rothwell, Thomas
Weißschuh, Tiezhu Yang, Uwe Kleine-König, Xie Shaowen, Xiu Jianfeng,
XueBing Chen, Yang Yingliang, Zhang Jiaming, ruanjinjie, Jessica Yu,
and Wolfram Sang.
* tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (181 commits)
powerpc/code-patching: Fix oops with DEBUG_VM enabled
powerpc/qspinlock: Fix 32-bit build
powerpc/prom: Fix 32-bit build
powerpc/rtas: mandate RTAS syscall filtering
powerpc/rtas: define pr_fmt and convert printk call sites
powerpc/rtas: clean up includes
powerpc/rtas: clean up rtas_error_log_max initialization
powerpc/pseries/eeh: use correct API for error log size
powerpc/rtas: avoid scheduling in rtas_os_term()
powerpc/rtas: avoid device tree lookups in rtas_os_term()
powerpc/rtasd: use correct OF API for event scan rate
powerpc/rtas: document rtas_call()
powerpc/pseries: unregister VPA when hot unplugging a CPU
powerpc/pseries: reset the RCU watchdogs after a LPM
powerpc: Take in account addition CPU node when building kexec FDT
powerpc: export the CPU node count
powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state
powerpc/dts/fsl: Fix pca954x i2c-mux node names
cxl: Remove unnecessary cxl_pci_window_alignment()
selftests/powerpc: Fix resource leaks
...
- Two minor feature patches which were awkwardly dependent on mm-nonmm.
I need to set up a new branch to handle such things.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY56V1wAKCRDdBJ7gKXxA
juVQAP9pr5XBx880RJEil6skMCxYJmae8LvYShhvxJi9keot7QEA3wZRlGcllw/3
fiHcsaBlXqtXBWUbtnMezcdP6gb3TQo=
=8T1p
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-12-17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more mm updates from Andrew Morton:
- A few late-breaking minor fixups
- Two minor feature patches which were awkwardly dependent on mm-nonmm.
I need to set up a new branch to handle such things.
* tag 'mm-stable-2022-12-17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
MAINTAINERS: zram: zsmalloc: Add an additional co-maintainer
mm/kmemleak: use %pK to display kernel pointers in backtrace
mm: use stack_depot for recording kmemleak's backtrace
maple_tree: update copyright dates for test code
maple_tree: fix mas_find_rev() comment
mm/gup_test: free memory allocated via kvcalloc() using kvfree()
Adjust some MADV_XXX constants to be in sync what their values are on
all other platforms. There is currently no reason to have an own
numbering on parisc, but it requires workarounds in many userspace
sources (e.g. glibc, qemu, ...) - which are often forgotten and thus
introduce bugs and different behaviour on parisc.
A wrapper avoids an ABI breakage for existing userspace applications by
translating any old values to the new ones, so this change allows us to
move over all programs to the new ABI over time.
Signed-off-by: Helge Deller <deller@gmx.de>
Libraries:
- Drop the old copy of libtraceevent in tools/lib/traceevent/ now that all major distros
ship it from its external repository.
This is now just another feature detection, emitting a warning when the
libtraceevent-dev[el] package isn't installed, disabling the build of perf features
and tools that strictly require parsing things from tracefs while keeping
the core functionality present and working with a subset of the events, the
most used ones like CPU cycles, hardware cache and also vendor events, etc.
This was tested with lots of containers for Fedora, Debian, OpenSUSE, Alpine Linux,
Ubuntu, with cross builds, etc.
Build:
- Update to C standard to gnu11, like was done for the kernel.
- Install the tools/lib/ libraries locally instead of having headers searched
directly from the source code directories, to help the cases where we can
build either from in-kernel source libraries or from the same library shipped
as a distro package, as is the case with libbpf and was the case with
libtraceevent.
perf stat:
- Do not delay the workload with --delay, the delay is just for starting to count
the events, to skip noise at workload startup.
- When we have events for each cgroup, the metric should be printed for each
cgroup separately.
$ perf stat -a --for-each-cgroup system.slice,user.slice --metric-only sleep 1
Performance counter stats for 'system wide':
GHz insn per cycle branch-misses of all branches
system.slice 3.792 0.61 3.24%
user.slice 3.661 2.32 0.37%
- Fix printing field separator in CSV metrics output.
- Fix --metric-only --json output.
- Fix summary output in CSV with --metric-only.
- Update event group check for support of uncore event.
perf test:
- Stop requiring a C toolchain in shell tests, instead add a workload option that has
all the previously C snippets built as part of 'perf test -w' that then get used in
the 'perf test' shell scripts.
- Add event group test for events in multiple PMUs
- The "kernel lock contention analysis" test should not print warnings in quiet mode.
- Add attr tests for ARM64's new VG register.
- Fix record test on KVM guests, as using precise flag with the
br_inst_retired.near_call event causes the test fail on KVM guests, even when
the guests have PMU forwarding enabled and the event itself is supported, so just
remove the precise flag from the event.
- Add mechanism for skipping attr tests on specific kernel versions where it is known that
these checks will fail.
- Skip watchpoint tests if no watchpoints available.
- Add more Intel PT 'perf test' entries: hybrid CPUs, split the packet decoder
into a suite of subtests.
perf script:
- Introduce task analyzer python script, where one first records some events:
Recording can be done in two ways:
$ perf script record tasks-analyzer -- sleep 10
$ perf record -e sched:sched_switch -a -- sleep 10
The script can parse any perf.data files, as long as it has sched:sched_switch events,
other events will be ignored.
The most simple report use case is to just call the script without arguments.
Runtime is the time the task was running on the CPU, Time Out-In is the time
between the process being scheduled *out* and scheduled back *in*. So the last
time span between two executions:
$ perf script report tasks-analyzer
Switched-In Switched-Out CPU PID TID Comm Runtime Time Out-In
15576.658891407 15576.659156086 4 2412 2428 gdbus 265 1949
15576.659111320 15576.659455410 0 2412 2412 gnome-shell 344 2267
15576.659491326 15576.659506173 2 74 74 kworker/2:1 15 13145
15576.659506173 15576.659825748 2 2858 2858 gnome-terminal- 320 63263
15576.659871270 15576.659902872 6 20932 20932 kworker/u16:0 32 2314582
15576.659909951 15576.659945501 3 27264 27264 sh 36 -1
15576.659853285 15576.659971052 7 27265 27265 perf 118 5050741
[...]
perf lock:
- Allow concurrent record and report to support live monitoring of kernel lock
contention without BPF:
# perf lock record -a -o- sleep 1 | perf lock contention -i-
contended total wait max wait avg wait type caller
2 10.27 us 6.17 us 5.13 us spinlock load_balance+0xc03
1 5.29 us 5.29 us 5.29 us rwlock:W ep_scan_ready_list+0x54
1 4.12 us 4.12 us 4.12 us spinlock smpboot_thread_fn+0x116
1 3.28 us 3.28 us 3.28 us mutex pipe_read+0x50
- Implement -t/--threads option when using BPF:
$ sudo ./perf lock contention -abt -E 5 sleep 1
contended total wait max wait avg wait pid comm
1 740.66 ms 740.66 ms 740.66 ms 1950 nv_queue
3 305.50 ms 298.19 ms 101.83 ms 1884 nvidia-modeset/
1 25.14 us 25.14 us 25.14 us 2725038 EventManager_De
12 23.09 us 9.30 us 1.92 us 0 swapper
1 20.18 us 20.18 us 20.18 us 2725033 EventManager_De
- Add -l/--lock-addr to aggregate per-lock-instance contention:
$ sudo ./perf lock contention -abl sleep 1
contended total wait max wait avg wait address symbol
1 36.28 us 36.28 us 36.28 us ffff92615d6448b8
9 10.91 us 1.84 us 1.21 us ffffffffbaed50c0 rcu_state
1 10.49 us 10.49 us 10.49 us ffff9262ac4f0c80
8 4.68 us 1.67 us 585 ns ffffffffbae07a40 jiffies_lock
3 3.03 us 1.45 us 1.01 us ffff9262277861e0
1 924 ns 924 ns 924 ns ffff926095ba9d20
1 436 ns 436 ns 436 ns ffff9260bfda4f60
perf record:
- Add remaining branch filters: "no_cycles", "no_flags" & "hw_index", to be
used with hardware such as Intel's LBR that allows things like stitching
stacks of two samples to overcome the limits of the number of LBR registers.
Symbol resolution:
- Handle .debug files created with 'objcopy --only-keep-debug', where program
headers are zeroed and thus can't be used for adjustments, use the info in
the runtime_ss (runtime ELF) instead.
perf trace:
- Add BPF based augmenter for the 'perf_event_open's 'struct perf_event_attr' argument.
- Add BPF based augmenter for the 'clock_gettime's 'struct timespec' argument.
- In both cases the syscall tracepoint has just the pointer value, we
need to hook a BPF program to collect the pointer contents, and then,
in userspace, pretty print it in 'perf trace'.
perf list:
- Introduce JSON output of events.
- Streamline how the expression specifying what events should be shown is handled,
fixing several corner cases, such as the metric filter that is specified as a glob
but was using strstr().
perf probe:
- Fix to avoid crashing if DW_AT_decl_file is NULL, coping with clang generating
DWARF5 like that.
- Use dwarf_attr_integrate() as generic DWARF attr accessor as it supersedes dwarf_attr(),
supporting abstact origin DIEs.
perf inject:
- Set PERF_RECORD_MISC_BUILD_ID_SIZE in the PERF_RECORD_HEADER_BUILD_ID so that
perf.data readers can get the real build-id size and avoid trailing zeros.
perf data:
- Add tracepoint fields when converting a perf.data file to JSON.
arm64:
- Fix mksyscalltbl, don't lose syscalls due to sort -nu.
- Add Arm Neoverse V2 PMU events.
riscv:
- Add riscv sbi firmware std event files.
- Add Sifive U74 vendor events (JSON) file.
- Add some more events and metrics for Alderlake/Alderlake-N.
Documentation:
- Add data documentation for the PMU structs in the C source code.
Miscellaneous:
- Periodic sanitization of headers, adding missing includes, removing needless ones,
creating new ones, etc.
- Use sig_atomic_t for signal handlers to avoid undefined behaviour in all perf
tools.
- Fixes for libbpf 1.0+ compatibility (maps, etc) on 'perf trace' BPF examples.
- Remove some old perf bpf examples, leave the best ones that demonstrate how
to associate BPF functions to points in the kernel.
- Make quiet mode consistent between tools.
- Use dedicated non-atomic clear/set bit helpers.
- Use "grep -E" instead of "egrep" as recommended by warning emitted by GNU
grep since at least version 3.8.
- Complete list of supported subcommands in the 'perf daemon' help message.
- Update John Garry's email address for arm64 perf tooling on the MAINTAINERS file,
he moved from Huawei to Oracle.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCY5yARAAKCRCyPKLppCJ+
J7bdAQCO4Y4gXKWv+AQc77aptQaCRmWy6T9ynsdv5gOV43NpCwD/TWZz8zcBqLSS
fxYSgf2kOQ3Z9soE4/udsL5sDhFbsgA=
=hLlg
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v6.2-1-2022-12-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
"Libraries:
- Drop the old copy of libtraceevent in tools/lib/traceevent/ now
that all major distros ship it from its external repository.
This is now just another feature detection, emitting a warning when
the libtraceevent-dev[el] package isn't installed, disabling the
build of perf features and tools that strictly require parsing
things from tracefs while keeping the core functionality present
and working with a subset of the events, the most used ones like
CPU cycles, hardware cache and also vendor events, etc.
This was tested with lots of containers for Fedora, Debian,
OpenSUSE, Alpine Linux, Ubuntu, with cross builds, etc.
Build:
- Update to C standard to gnu11, like was done for the kernel.
- Install the tools/lib/ libraries locally instead of having headers
searched directly from the source code directories, to help the
cases where we can build either from in-kernel source libraries or
from the same library shipped as a distro package, as is the case
with libbpf and was the case with libtraceevent.
perf stat:
- Do not delay the workload with --delay, the delay is just for
starting to count the events, to skip noise at workload startup.
- When we have events for each cgroup, the metric should be printed
for each cgroup separately.
$ perf stat -a --for-each-cgroup system.slice,user.slice --metric-only sleep 1
Performance counter stats for 'system wide':
GHz insn per cycle branch-misses of all branches
system.slice 3.792 0.61 3.24%
user.slice 3.661 2.32 0.37%
- Fix printing field separator in CSV metrics output.
- Fix --metric-only --json output.
- Fix summary output in CSV with --metric-only.
- Update event group check for support of uncore event.
perf test:
- Stop requiring a C toolchain in shell tests, instead add a workload
option that has all the previously C snippets built as part of
'perf test -w' that then get used in the 'perf test' shell scripts.
- Add event group test for events in multiple PMUs
- The "kernel lock contention analysis" test should not print
warnings in quiet mode.
- Add attr tests for ARM64's new VG register.
- Fix record test on KVM guests, as using precise flag with the
br_inst_retired.near_call event causes the test fail on KVM guests,
even when the guests have PMU forwarding enabled and the event
itself is supported, so just remove the precise flag from the
event.
- Add mechanism for skipping attr tests on specific kernel versions
where it is known that these checks will fail.
- Skip watchpoint tests if no watchpoints available.
- Add more Intel PT 'perf test' entries: hybrid CPUs, split the
packet decoder into a suite of subtests.
perf script:
- Introduce task analyzer python script, where one first records some events:
Recording can be done in two ways:
$ perf script record tasks-analyzer -- sleep 10
$ perf record -e sched:sched_switch -a -- sleep 10
The script can parse any perf.data files, as long as it has
sched:sched_switch events, other events will be ignored.
The most simple report use case is to just call the script without
arguments.
Runtime is the time the task was running on the CPU, Time Out-In is
the time between the process being scheduled *out* and scheduled
back *in*. So the last time span between two executions:
$ perf script report tasks-analyzer
Switched-In Switched-Out CPU PID TID Comm Runtime Time Out-In
15576.658891407 15576.659156086 4 2412 2428 gdbus 265 1949
15576.659111320 15576.659455410 0 2412 2412 gnome-shell 344 2267
15576.659491326 15576.659506173 2 74 74 kworker/2:1 15 13145
15576.659506173 15576.659825748 2 2858 2858 gnome-terminal- 320 63263
15576.659871270 15576.659902872 6 20932 20932 kworker/u16:0 32 2314582
15576.659909951 15576.659945501 3 27264 27264 sh 36 -1
15576.659853285 15576.659971052 7 27265 27265 perf 118 5050741
[...]
perf lock:
- Allow concurrent record and report to support live monitoring of
kernel lock contention without BPF:
# perf lock record -a -o- sleep 1 | perf lock contention -i-
contended total wait max wait avg wait type caller
2 10.27 us 6.17 us 5.13 us spinlock load_balance+0xc03
1 5.29 us 5.29 us 5.29 us rwlock:W ep_scan_ready_list+0x54
1 4.12 us 4.12 us 4.12 us spinlock smpboot_thread_fn+0x116
1 3.28 us 3.28 us 3.28 us mutex pipe_read+0x50
- Implement -t/--threads option when using BPF:
$ sudo ./perf lock contention -abt -E 5 sleep 1
contended total wait max wait avg wait pid comm
1 740.66 ms 740.66 ms 740.66 ms 1950 nv_queue
3 305.50 ms 298.19 ms 101.83 ms 1884 nvidia-modeset/
1 25.14 us 25.14 us 25.14 us 2725038 EventManager_De
12 23.09 us 9.30 us 1.92 us 0 swapper
1 20.18 us 20.18 us 20.18 us 2725033 EventManager_De
- Add -l/--lock-addr to aggregate per-lock-instance contention:
$ sudo ./perf lock contention -abl sleep 1
contended total wait max wait avg wait address symbol
1 36.28 us 36.28 us 36.28 us ffff92615d6448b8
9 10.91 us 1.84 us 1.21 us ffffffffbaed50c0 rcu_state
1 10.49 us 10.49 us 10.49 us ffff9262ac4f0c80
8 4.68 us 1.67 us 585 ns ffffffffbae07a40 jiffies_lock
3 3.03 us 1.45 us 1.01 us ffff9262277861e0
1 924 ns 924 ns 924 ns ffff926095ba9d20
1 436 ns 436 ns 436 ns ffff9260bfda4f60
perf record:
- Add remaining branch filters: "no_cycles", "no_flags" & "hw_index",
to be used with hardware such as Intel's LBR that allows things
like stitching stacks of two samples to overcome the limits of the
number of LBR registers.
Symbol resolution:
- Handle .debug files created with 'objcopy --only-keep-debug', where
program headers are zeroed and thus can't be used for adjustments,
use the info in the runtime_ss (runtime ELF) instead.
perf trace:
- Add BPF based augmenter for the 'perf_event_open's 'struct
perf_event_attr' argument.
- Add BPF based augmenter for the 'clock_gettime's 'struct timespec'
argument.
- In both cases the syscall tracepoint has just the pointer value, we
need to hook a BPF program to collect the pointer contents, and
then, in userspace, pretty print it in 'perf trace'.
perf list:
- Introduce JSON output of events.
- Streamline how the expression specifying what events should be
shown is handled, fixing several corner cases, such as the metric
filter that is specified as a glob but was using strstr().
perf probe:
- Fix to avoid crashing if DW_AT_decl_file is NULL, coping with clang
generating DWARF5 like that.
- Use dwarf_attr_integrate() as generic DWARF attr accessor as it
supersedes dwarf_attr(), supporting abstact origin DIEs.
perf inject:
- Set PERF_RECORD_MISC_BUILD_ID_SIZE in the PERF_RECORD_HEADER_BUILD_ID
so that perf.data readers can get the real build-id size and avoid
trailing zeroes.
perf data:
- Add tracepoint fields when converting a perf.data file to JSON.
arm64:
- Fix mksyscalltbl, don't lose syscalls due to sort -nu.
- Add Arm Neoverse V2 PMU events.
riscv:
- Add riscv sbi firmware std event files.
- Add Sifive U74 vendor events (JSON) file.
- Add some more events and metrics for Alderlake/Alderlake-N.
Documentation:
- Add data documentation for the PMU structs in the C source code.
Miscellaneous:
- Periodic sanitization of headers, adding missing includes, removing
needless ones, creating new ones, etc.
- Use sig_atomic_t for signal handlers to avoid undefined behaviour
in all perf tools.
- Fixes for libbpf 1.0+ compatibility (maps, etc) on 'perf trace' BPF
examples.
- Remove some old perf bpf examples, leave the best ones that
demonstrate how to associate BPF functions to points in the kernel.
- Make quiet mode consistent between tools.
- Use dedicated non-atomic clear/set bit helpers.
- Use "grep -E" instead of "egrep" as recommended by warning emitted
by GNU grep since at least version 3.8.
- Complete list of supported subcommands in the 'perf daemon' help
message.
- Update John Garry's email address for arm64 perf tooling on the
MAINTAINERS file, he moved from Huawei to Oracle"
* tag 'perf-tools-for-v6.2-1-2022-12-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (239 commits)
libperf: Fix install_pkgconfig target
perf tools: Use "grep -E" instead of "egrep"
perf stat: Do not delay the workload with --delay
perf evlist: Remove group option.
perf build: Fix python/perf.so library's name
perf test arm64: Add attr tests for new VG register
perf test: Add mechanism for skipping attr tests on kernel versions
perf test: Add mechanism for skipping attr tests on auxiliary vector values
perf test: Add ability to test exit code for attr tests
perf test: add new task-analyzer tests
perf script: task-analyzer add csv support
perf script: Introduce task analyzer python script
perf cs-etm: Print auxtrace info even if OpenCSD isn't linked
perf cs-etm: Cleanup cs_etm__process_auxtrace_info()
perf cs-etm: Tidy up auxtrace info header printing
perf cs-etm: Remove unused stub methods
perf cs-etm: Print unknown header version as an error
perf test: Update perf lock contention test
perf lock contention: Add -l/--lock-addr option
perf lock contention: Implement -t/--threads option for BPF
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY5yc9AAKCRDbK58LschI
g3n4AP4heagqBHPH+WcC0N/Vc4K2dmPmil4ZTuZ/Xt+EMrX0MQEAygWsa272V2C9
vx51DOu5/D+DPC20/1+mRpEnC4JIqAA=
=utn+
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2022-12-16
We've added 7 non-merge commits during the last 2 day(s) which contain
a total of 9 files changed, 119 insertions(+), 36 deletions(-).
1) Fix for recent syzkaller XDP dispatcher update splat, from Jiri Olsa.
2) Fix BPF program refcount leak in LSM attachment failure path,
from Milan Landaverde.
3) Fix BPF program type in map compatibility check for fext,
from Toke Høiland-Jørgensen.
4) Fix a BPF selftest compilation error under !CONFIG_SMP config,
from Yonghong Song.
5) Fix CI to enable CONFIG_FUNCTION_ERROR_INJECTION after it got changed
to a prompt, from Song Liu.
6) Various BPF documentation fixes for socket local storage,
from Donald Hunter.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Add a test for using a cpumap from an freplace-to-XDP program
bpf: Resolve fext program type when checking map compatibility
bpf: Synchronize dispatcher update with bpf_dispatcher_xdp_func
bpf: prevent leak of lsm program after failed attach
selftests/bpf: Select CONFIG_FUNCTION_ERROR_INJECTION
selftests/bpf: Fix a selftest compilation error with CONFIG_SMP=n
docs/bpf: Reword docs for BPF_MAP_TYPE_SK_STORAGE
====================
Link: https://lore.kernel.org/r/20221216174540.16598-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To resolve a trivial merge conflict with c302378bc1 ("libbpf:
Hashmap interface update to allow both long and void* keys/values"),
where a function present upstream was removed in the perf tools
development tree.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Here is the large set of USB and Thunderbolt driver changes for 6.2-rc1.
Overall, thanks to the removal of a driver, more lines were removed than
added, a nice change. Highlights include:
- removal of the sisusbvga driver that was not used by anyone anymore
- minor thunderbolt driver changes and tweaks
- chipidea driver updates
- usual set of typec driver features and hardware support added
- musb minor driver fixes
- fotg210 driver fixes, bringing that hardware back from the "dead"
- minor dwc3 driver updates
- addition, and then removal, of a list.h helper function for many USB
and other subsystem drivers, that ended up breaking the build. That
will come back for 6.3-rc1, it missed this merge window.
- usual xhci updates and enhancements
- usb-serial driver updates and support for new devices
- other minor USB driver updates
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY5wvYg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yl5DACgssl/ag4zDePHpfoiG5zEGEzH8XsAoMFrzvzu
d43hsH3qsfDGSZRkJJMu
=ORDd
-----END PGP SIGNATURE-----
Merge tag 'usb-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB and Thunderbolt driver updates from Greg KH:
"Here is the large set of USB and Thunderbolt driver changes for
6.2-rc1. Overall, thanks to the removal of a driver, more lines were
removed than added, a nice change. Highlights include:
- removal of the sisusbvga driver that was not used by anyone anymore
- minor thunderbolt driver changes and tweaks
- chipidea driver updates
- usual set of typec driver features and hardware support added
- musb minor driver fixes
- fotg210 driver fixes, bringing that hardware back from the "dead"
- minor dwc3 driver updates
- addition, and then removal, of a list.h helper function for many
USB and other subsystem drivers, that ended up breaking the build.
That will come back for 6.3-rc1, it missed this merge window.
- usual xhci updates and enhancements
- usb-serial driver updates and support for new devices
- other minor USB driver updates
All of these have been in linux-next for a while with no reported
problems"
* tag 'usb-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (153 commits)
usb: gadget: uvc: Rename bmInterfaceFlags -> bmInterlaceFlags
usb: dwc2: power on/off phy for peripheral mode in dual-role mode
usb: dwc2: disable lpm feature on Rockchip SoCs
dt-bindings: usb: mtk-xhci: add support for mt7986
usb: dwc3: core: defer probe on ulpi_read_id timeout
usb: ulpi: defer ulpi_register on ulpi_read_id timeout
usb: misc: onboard_usb_hub: add Genesys Logic GL850G hub support
dt-bindings: usb: Add binding for Genesys Logic GL850G hub controller
dt-bindings: vendor-prefixes: add Genesys Logic
usb: fotg210-udc: fix potential memory leak in fotg210_udc_probe()
usb: typec: tipd: Set mode of operation for USB Type-C connector
usb: gadget: udc: drop obsolete dependencies on COMPILE_TEST
usb: musb: remove extra check in musb_gadget_vbus_draw
usb: gadget: uvc: Prevent buffer overflow in setup handler
usb: dwc3: qcom: Fix memory leak in dwc3_qcom_interconnect_init
usb: typec: wusb3801: fix fwnode refcount leak in wusb3801_probe()
usb: storage: Add check for kcalloc
USB: sisusbvga: use module_usb_driver()
USB: sisusbvga: rename sisusb.c to sisusbvga.c
USB: sisusbvga: remove console support
...
NetworkManager (and other daemons) may bring the interface up
and cause failures in quiescence checks. Print a helpful warning,
and take the interface down again.
I seem to forget about this every time I run these tests on a new VM.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
$number + > bash means redirect FD $number, e.g. commonly
used 2> redirects stderr (fd 2). The test uses 8192> to
write the number 8192 to a file, this results in:
./devlink.sh: line 499: 8192: Bad file descriptor
Oddly the test also papers over this issue by checking
for failure (expecting an error rather than success)
so it passes, anyway.
Fixes: ff18176ad8 ("selftests: Add a test of large binary to devlink health test")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the span to the year of the development.
Link: https://lkml.kernel.org/r/20221025173709.2718725-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* Enable the per-vcpu dirty-ring tracking mechanism, together with an
option to keep the good old dirty log around for pages that are
dirtied by something other than a vcpu.
* Switch to the relaxed parallel fault handling, using RCU to delay
page table reclaim and giving better performance under load.
* Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option,
which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a97d:
"Fix a number of issues with MTE, such as races on the tags being
initialised vs the PG_mte_tagged flag as well as the lack of support
for VM_SHARED when KVM is involved. Patches from Catalin Marinas and
Peter Collingbourne").
* Merge the pKVM shadow vcpu state tracking that allows the hypervisor
to have its own view of a vcpu, keeping that state private.
* Add support for the PMUv3p5 architecture revision, bringing support
for 64bit counters on systems that support it, and fix the
no-quite-compliant CHAIN-ed counter support for the machines that
actually exist out there.
* Fix a handful of minor issues around 52bit VA/PA support (64kB pages
only) as a prefix of the oncoming support for 4kB and 16kB pages.
* Pick a small set of documentation and spelling fixes, because no
good merge window would be complete without those.
s390:
* Second batch of the lazy destroy patches
* First batch of KVM changes for kernel virtual != physical address support
* Removal of a unused function
x86:
* Allow compiling out SMM support
* Cleanup and documentation of SMM state save area format
* Preserve interrupt shadow in SMM state save area
* Respond to generic signals during slow page faults
* Fixes and optimizations for the non-executable huge page errata fix.
* Reprogram all performance counters on PMU filter change
* Cleanups to Hyper-V emulation and tests
* Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest
running on top of a L1 Hyper-V hypervisor)
* Advertise several new Intel features
* x86 Xen-for-KVM:
** Allow the Xen runstate information to cross a page boundary
** Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
** Add support for 32-bit guests in SCHEDOP_poll
* Notable x86 fixes and cleanups:
** One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
** Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
years back when eliminating unnecessary barriers when switching between
vmcs01 and vmcs02.
** Clean up vmread_error_trampoline() to make it more obvious that params
must be passed on the stack, even for x86-64.
** Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
of the current guest CPUID.
** Fudge around a race with TSC refinement that results in KVM incorrectly
thinking a guest needs TSC scaling when running on a CPU with a
constant TSC, but no hardware-enumerated TSC frequency.
** Advertise (on AMD) that the SMM_CTL MSR is not supported
** Remove unnecessary exports
Generic:
* Support for responding to signals during page faults; introduces
new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
Selftests:
* Fix an inverted check in the access tracking perf test, and restore
support for asserting that there aren't too many idle pages when
running on bare metal.
* Fix build errors that occur in certain setups (unsure exactly what is
unique about the problematic setup) due to glibc overriding
static_assert() to a variant that requires a custom message.
* Introduce actual atomics for clear/set_bit() in selftests
* Add support for pinning vCPUs in dirty_log_perf_test.
* Rename the so called "perf_util" framework to "memstress".
* Add a lightweight psuedo RNG for guest use, and use it to randomize
the access pattern and write vs. read percentage in the memstress tests.
* Add a common ucall implementation; code dedup and pre-work for running
SEV (and beyond) guests in selftests.
* Provide a common constructor and arch hook, which will eventually be
used by x86 to automatically select the right hypercall (AMD vs. Intel).
* A bunch of added/enabled/fixed selftests for ARM64, covering memslots,
breakpoints, stage-2 faults and access tracking.
* x86-specific selftest changes:
** Clean up x86's page table management.
** Clean up and enhance the "smaller maxphyaddr" test, and add a related
test to cover generic emulation failure.
** Clean up the nEPT support checks.
** Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
** Fix an ordering issue in the AMX test introduced by recent conversions
to use kvm_cpu_has(), and harden the code to guard against similar bugs
in the future. Anything that tiggers caching of KVM's supported CPUID,
kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if
the caching occurs before the test opts in via prctl().
Documentation:
* Remove deleted ioctls from documentation
* Clean up the docs for the x86 MSR filter.
* Various fixes
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOaFrcUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPemQgAq49excg2Cc+EsHnZw3vu/QWdA0Rt
KhL3OgKxuHNjCbD2O9n2t5di7eJOTQ7F7T0eDm3xPTr4FS8LQ2327/mQePU/H2CF
mWOpq9RBWLzFsSTeVA2Mz9TUTkYSnDHYuRsBvHyw/n9cL76BWVzjImldFtjYjjex
yAwl8c5itKH6bc7KO+5ydswbvBzODkeYKUSBNdbn6m0JGQST7XppNwIAJvpiHsii
Qgpk0e4Xx9q4PXG/r5DedI6BlufBsLhv0aE9SHPzyKH3JbbUFhJYI8ZD5OhBQuYW
MwxK2KlM5Jm5ud2NZDDlsMmmvd1lnYCFDyqNozaKEWC1Y5rq1AbMa51fXA==
=QAYX
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM64:
- Enable the per-vcpu dirty-ring tracking mechanism, together with an
option to keep the good old dirty log around for pages that are
dirtied by something other than a vcpu.
- Switch to the relaxed parallel fault handling, using RCU to delay
page table reclaim and giving better performance under load.
- Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
option, which multi-process VMMs such as crosvm rely on (see merge
commit 382b5b87a97d: "Fix a number of issues with MTE, such as
races on the tags being initialised vs the PG_mte_tagged flag as
well as the lack of support for VM_SHARED when KVM is involved.
Patches from Catalin Marinas and Peter Collingbourne").
- Merge the pKVM shadow vcpu state tracking that allows the
hypervisor to have its own view of a vcpu, keeping that state
private.
- Add support for the PMUv3p5 architecture revision, bringing support
for 64bit counters on systems that support it, and fix the
no-quite-compliant CHAIN-ed counter support for the machines that
actually exist out there.
- Fix a handful of minor issues around 52bit VA/PA support (64kB
pages only) as a prefix of the oncoming support for 4kB and 16kB
pages.
- Pick a small set of documentation and spelling fixes, because no
good merge window would be complete without those.
s390:
- Second batch of the lazy destroy patches
- First batch of KVM changes for kernel virtual != physical address
support
- Removal of a unused function
x86:
- Allow compiling out SMM support
- Cleanup and documentation of SMM state save area format
- Preserve interrupt shadow in SMM state save area
- Respond to generic signals during slow page faults
- Fixes and optimizations for the non-executable huge page errata
fix.
- Reprogram all performance counters on PMU filter change
- Cleanups to Hyper-V emulation and tests
- Process Hyper-V TLB flushes from a nested guest (i.e. from a L2
guest running on top of a L1 Hyper-V hypervisor)
- Advertise several new Intel features
- x86 Xen-for-KVM:
- Allow the Xen runstate information to cross a page boundary
- Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
- Add support for 32-bit guests in SCHEDOP_poll
- Notable x86 fixes and cleanups:
- One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
- Reinstate IBPB on emulated VM-Exit that was incorrectly dropped
a few years back when eliminating unnecessary barriers when
switching between vmcs01 and vmcs02.
- Clean up vmread_error_trampoline() to make it more obvious that
params must be passed on the stack, even for x86-64.
- Let userspace set all supported bits in MSR_IA32_FEAT_CTL
irrespective of the current guest CPUID.
- Fudge around a race with TSC refinement that results in KVM
incorrectly thinking a guest needs TSC scaling when running on a
CPU with a constant TSC, but no hardware-enumerated TSC
frequency.
- Advertise (on AMD) that the SMM_CTL MSR is not supported
- Remove unnecessary exports
Generic:
- Support for responding to signals during page faults; introduces
new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
Selftests:
- Fix an inverted check in the access tracking perf test, and restore
support for asserting that there aren't too many idle pages when
running on bare metal.
- Fix build errors that occur in certain setups (unsure exactly what
is unique about the problematic setup) due to glibc overriding
static_assert() to a variant that requires a custom message.
- Introduce actual atomics for clear/set_bit() in selftests
- Add support for pinning vCPUs in dirty_log_perf_test.
- Rename the so called "perf_util" framework to "memstress".
- Add a lightweight psuedo RNG for guest use, and use it to randomize
the access pattern and write vs. read percentage in the memstress
tests.
- Add a common ucall implementation; code dedup and pre-work for
running SEV (and beyond) guests in selftests.
- Provide a common constructor and arch hook, which will eventually
be used by x86 to automatically select the right hypercall (AMD vs.
Intel).
- A bunch of added/enabled/fixed selftests for ARM64, covering
memslots, breakpoints, stage-2 faults and access tracking.
- x86-specific selftest changes:
- Clean up x86's page table management.
- Clean up and enhance the "smaller maxphyaddr" test, and add a
related test to cover generic emulation failure.
- Clean up the nEPT support checks.
- Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
- Fix an ordering issue in the AMX test introduced by recent
conversions to use kvm_cpu_has(), and harden the code to guard
against similar bugs in the future. Anything that tiggers
caching of KVM's supported CPUID, kvm_cpu_has() in this case,
effectively hides opt-in XSAVE features if the caching occurs
before the test opts in via prctl().
Documentation:
- Remove deleted ioctls from documentation
- Clean up the docs for the x86 MSR filter.
- Various fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits)
KVM: x86: Add proper ReST tables for userspace MSR exits/flags
KVM: selftests: Allocate ucall pool from MEM_REGION_DATA
KVM: arm64: selftests: Align VA space allocator with TTBR0
KVM: arm64: Fix benign bug with incorrect use of VA_BITS
KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow
KVM: x86: Advertise that the SMM_CTL MSR is not supported
KVM: x86: remove unnecessary exports
KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic"
tools: KVM: selftests: Convert clear/set_bit() to actual atomics
tools: Drop "atomic_" prefix from atomic test_and_set_bit()
tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers
KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests
perf tools: Use dedicated non-atomic clear/set bit helpers
tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers
KVM: arm64: selftests: Enable single-step without a "full" ucall()
KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself
KVM: Remove stale comment about KVM_REQ_UNHALT
KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
KVM: Reference to kvm_userspace_memory_region in doc and comments
KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
...
This adds a simple test for inserting an XDP program into a cpumap that is
"owned" by an XDP program that was loaded as PROG_TYPE_EXT (as libxdp
does). Prior to the kernel fix this would fail because the map type
ownership would be set to PROG_TYPE_EXT instead of being resolved to
PROG_TYPE_XDP.
v5:
- Fix a few nits from Andrii, add his ACK
v4:
- Use skeletons for selftest
v3:
- Update comment to better explain the cause
- Add Yonghong's ACK
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20221214230254.790066-2-toke@redhat.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Since uprobe's argument must contain the user-space data, that
should not be converted to kernel symbols. Reject if user
specifies these types on uprobe events. e.g.
/sys/kernel/debug/tracing # echo 'p /bin/sh:10 %ax:symbol' >> uprobe_events
sh: write error: Invalid argument
/sys/kernel/debug/tracing # echo 'p /bin/sh:10 %ax:symstr' >> uprobe_events
sh: write error: Invalid argument
/sys/kernel/debug/tracing # cat error_log
[ 1783.134883] trace_uprobe: error: Unknown type is specified
Command: p /bin/sh:10 %ax:symbol
^
[ 1792.201120] trace_uprobe: error: Unknown type is specified
Command: p /bin/sh:10 %ax:symstr
^
Link: https://lore.kernel.org/all/166679931679.1528100.15540755370726009882.stgit@devnote3/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
been long in the making. It is a lighterweight software-only fix for
Skylake-based cores where enabling IBRS is a big hammer and causes a
significant performance impact.
What it basically does is, it aligns all kernel functions to 16 bytes
boundary and adds a 16-byte padding before the function, objtool
collects all functions' locations and when the mitigation gets applied,
it patches a call accounting thunk which is used to track the call depth
of the stack at any time.
When that call depth reaches a magical, microarchitecture-specific value
for the Return Stack Buffer, the code stuffs that RSB and avoids its
underflow which could otherwise lead to the Intel variant of Retbleed.
This software-only solution brings a lot of the lost performance back,
as benchmarks suggest:
https://lore.kernel.org/all/20220915111039.092790446@infradead.org/
That page above also contains a lot more detailed explanation of the
whole mechanism
- Implement a new control flow integrity scheme called FineIBT which is
based on the software kCFI implementation and uses hardware IBT support
where present to annotate and track indirect branches using a hash to
validate them
- Other misc fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmOZp5EACgkQEsHwGGHe
VUrZFxAAvi/+8L0IYSK4mKJvixGbTFjxN/Swo2JVOfs34LqGUT6JaBc+VUMwZxdb
VMTFIZ3ttkKEodjhxGI7oGev6V8UfhI37SmO2lYKXpQVjXXnMlv/M+Vw3teE38CN
gopi+xtGnT1IeWQ3tc/Tv18pleJ0mh5HKWiW+9KoqgXj0wgF9x4eRYDz1TDCDA/A
iaBzs56j8m/FSykZHnrWZ/MvjKNPdGlfJASUCPeTM2dcrXQGJ93+X2hJctzDte0y
Nuiw6Y0htfFBE7xoJn+sqm5Okr+McoUM18/CCprbgSKYk18iMYm3ZtAi6FUQZS1A
ua4wQCf49loGp15PO61AS5d3OBf5D3q/WihQRbCaJvTVgPp9sWYnWwtcVUuhMllh
ZQtBU9REcVJ/22bH09Q9CjBW0VpKpXHveqQdqRDViLJ6v/iI6EFGmD24SW/VxyRd
73k9MBGrL/dOf1SbEzdsnvcSB3LGzp0Om8o/KzJWOomrVKjBCJy16bwTEsCZEJmP
i406m92GPXeaN1GhTko7vmF0GnkEdJs1GVCZPluCAxxbhHukyxHnrjlQjI4vC80n
Ylc0B3Kvitw7LGJsPqu+/jfNHADC/zhx1qz/30wb5cFmFbN1aRdp3pm8JYUkn+l/
zri2Y6+O89gvE/9/xUhMohzHsWUO7xITiBavewKeTP9GSWybWUs=
=cRy1
-----END PGP SIGNATURE-----
Merge tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Borislav Petkov:
- Add the call depth tracking mitigation for Retbleed which has been
long in the making. It is a lighterweight software-only fix for
Skylake-based cores where enabling IBRS is a big hammer and causes a
significant performance impact.
What it basically does is, it aligns all kernel functions to 16 bytes
boundary and adds a 16-byte padding before the function, objtool
collects all functions' locations and when the mitigation gets
applied, it patches a call accounting thunk which is used to track
the call depth of the stack at any time.
When that call depth reaches a magical, microarchitecture-specific
value for the Return Stack Buffer, the code stuffs that RSB and
avoids its underflow which could otherwise lead to the Intel variant
of Retbleed.
This software-only solution brings a lot of the lost performance
back, as benchmarks suggest:
https://lore.kernel.org/all/20220915111039.092790446@infradead.org/
That page above also contains a lot more detailed explanation of the
whole mechanism
- Implement a new control flow integrity scheme called FineIBT which is
based on the software kCFI implementation and uses hardware IBT
support where present to annotate and track indirect branches using a
hash to validate them
- Other misc fixes and cleanups
* tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits)
x86/paravirt: Use common macro for creating simple asm paravirt functions
x86/paravirt: Remove clobber bitmask from .parainstructions
x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al
x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit
x86/Kconfig: Enable kernel IBT by default
x86,pm: Force out-of-line memcpy()
objtool: Fix weak hole vs prefix symbol
objtool: Optimize elf_dirty_reloc_sym()
x86/cfi: Add boot time hash randomization
x86/cfi: Boot time selection of CFI scheme
x86/ibt: Implement FineIBT
objtool: Add --cfi to generate the .cfi_sites section
x86: Add prefix symbols for function padding
objtool: Add option to generate prefix symbols
objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
objtool: Slice up elf_create_section_symbol()
kallsyms: Revert "Take callthunks into account"
x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces
x86/retpoline: Fix crash printing warning
x86/paravirt: Fix a !PARAVIRT build warning
...
* add tests that trigger reallocation of memblock structures from
memblock itself via memblock_double_array()
* add tests for memblock_alloc_exact_nid_raw() that verify that requested
node and memory range constraints are respected.
-----BEGIN PGP SIGNATURE-----
iQFMBAABCAA2FiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmOYL14YHG1pa2UucmFw
b3BvcnRAZ21haWwuY29tAAoJEDkDhibLDv2RZdcH/2AE447oXzVO2lzOgkqQH1EX
xJdaa7hu00h2Euzv2lgcOHroHGXDP8wYjUV2cEyNZMP0WOMiO8i6rwIKmrzWufcm
R+ZoKPQV/Nc+7rIycpW455yLxcgsVIpUILK2BQEkDCGYugSHKb7IYdcA9KDJwtmR
xIG9j8nsuwWJtmtAuQqNOBmsc5FzKNYFa/RtDiJoMFmQNK3UqB8G8VCASdP0DYvH
7MXPcyRmlwpmOsKoNKi2/wQBsiag8/PLgcZv5vYg+E6no1tMG6u7pgDS12Sn6ZvA
I8gThJ8HNAo0d1O2SnbkicMx2CqrPFSub3QXaEFjCZF5mdBcirxHc/VBKj50TXU=
=iXEA
-----END PGP SIGNATURE-----
Merge tag 'memblock-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock updates from Mike Rapoport:
"Extend test coverage:
- add tests that trigger reallocation of memblock structures from
memblock itself via memblock_double_array()
- add tests for memblock_alloc_exact_nid_raw() that verify that
requested node and memory range constraints are respected"
* tag 'memblock-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock tests: remove completed TODO item
memblock tests: add generic NUMA tests for memblock_alloc_exact_nid_raw
memblock tests: add bottom-up NUMA tests for memblock_alloc_exact_nid_raw
memblock tests: add top-down NUMA tests for memblock_alloc_exact_nid_raw
memblock tests: introduce range tests for memblock_alloc_exact_nid_raw
memblock test: Update TODO list
memblock test: Add test to memblock_reserve() 129th region
memblock test: Add test to memblock_add() 129th region
The latest version of grep claims the egrep is now obsolete so the build
now contains warnings that look like:
egrep: warning: egrep is obsolescent; using grep -E
fix this up by moving the related file to use "grep -E" instead.
sed -i "s/egrep/grep -E/g" `grep egrep -rwl tools/perf`
Here are the steps to install the latest grep:
wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz
tar xf grep-3.8.tar.gz
cd grep-3.8 && ./configure && make
sudo make install
export PATH=/usr/local/bin:$PATH
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1668762999-9297-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -D/--delay option is to delay the measure after the program starts.
But the current code goes to sleep before starting the program so the
program is delayed too. This is not the intention, let's fix it.
Before:
$ time sudo ./perf stat -a -e cycles -D 3000 sleep 4
Events disabled
Events enabled
Performance counter stats for 'system wide':
4,326,949,337 cycles
4.007494118 seconds time elapsed
real 0m7.474s
user 0m0.356s
sys 0m0.120s
It ran the workload for 4 seconds and gave the 3 second delay. So it
should skip the first 3 second and measure the last 1 second only. But
as you can see, it delays 3 seconds and ran the workload after that for
4 seconds. So the total time (real) was 7 seconds.
After:
$ time sudo ./perf stat -a -e cycles -D 3000 sleep 4
Events disabled
Events enabled
Performance counter stats for 'system wide':
1,063,551,013 cycles
1.002769510 seconds time elapsed
real 0m4.484s
user 0m0.385s
sys 0m0.086s
The bug was introduced when it changed enablement of system-wide events
with a command line workload. But it should've considered the initial
delay case. The code was reworked since then (in bb8bc52e75) so I'm
afraid it won't be applied cleanly.
Fixes: d0a0a51149 ("perf stat: Fix forked applications enablement of counters")
Reported-by: Kevin Nomura <nomurak@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: https://lore.kernel.org/r/20221212230820.901382-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The group option predates grouping events using curly braces added in
commit 89efb02950 ("perf tools: Add support to parse event group
syntax").
The --group option was retained for legacy support (in August
2012) but keeping it adds complexity.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Eelco Chaudron <echaudro@redhat.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Shaomin Deng <dengshaomin@cdjrlc.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213232651.1269909-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
BPF selftests require CONFIG_FUNCTION_ERROR_INJECTION to work. However,
CONFIG_FUNCTION_ERROR_INJECTION is no longer 'y' by default after recent
changes. As a result, we are seeing errors like the following from BPF CI:
bpf_testmod_test_read() is not modifiable
__x64_sys_setdomainname is not sleepable
__x64_sys_getpgid is not sleepable
Fix this by explicitly selecting CONFIG_FUNCTION_ERROR_INJECTION in the
selftest config.
Fixes: a4412fdd49 ("error-injection: Add prompt for function error injection")
Reported-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20221213220500.3427947-1-song@kernel.org
Kernel test robot reported bpf selftest build failure when CONFIG_SMP
is not set. The error message looks below:
>> progs/rcu_read_lock.c:256:34: error: no member named 'last_wakee' in 'struct task_struct'
last_wakee = task->real_parent->last_wakee;
~~~~~~~~~~~~~~~~~ ^
1 error generated.
When CONFIG_SMP is not set, the field 'last_wakee' is not available in struct
'task_struct'. Hence the above compilation failure. To fix the issue, let us
choose another field 'group_leader' which is available regardless of
CONFIG_SMP set or not.
Fixes: fe147956fc ("bpf/selftests: Add selftests for new task kfuncs")
Fixes: 48671232fc ("selftests/bpf: Add tests for bpf_rcu_read_lock()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20221213012224.379581-1-yhs@fb.com
iommufd is the user API to control the IOMMU subsystem as it relates to
managing IO page tables that point at user space memory.
It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO
container) which is the VFIO specific interface for a similar idea.
We see a broad need for extended features, some being highly IOMMU device
specific:
- Binding iommu_domain's to PASID/SSID
- Userspace IO page tables, for ARM, x86 and S390
- Kernel bypassed invalidation of user page tables
- Re-use of the KVM page table in the IOMMU
- Dirty page tracking in the IOMMU
- Runtime Increase/Decrease of IOPTE size
- PRI support with faults resolved in userspace
Many of these HW features exist to support VM use cases - for instance the
combination of PASID, PRI and Userspace IO Page Tables allows an
implementation of DMA Shared Virtual Addressing (vSVA) within a
guest. Dirty tracking enables VM live migration with SRIOV devices and
PASID support allow creating "scalable IOV" devices, among other things.
As these features are fundamental to a VM platform they need to be
uniformly exposed to all the driver families that do DMA into VMs, which
is currently VFIO and VDPA.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCY5ct7wAKCRCFwuHvBreF
YZZ5AQDciXfcgXLt0UBEmWupNb0f/asT6tk717pdsKm8kAZMNAEAsIyLiKT5HqGl
s7fAu+CQ1pr9+9NKGevD+frw8Solsw4=
=jJkd
-----END PGP SIGNATURE-----
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd implementation from Jason Gunthorpe:
"iommufd is the user API to control the IOMMU subsystem as it relates
to managing IO page tables that point at user space memory.
It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO
container) which is the VFIO specific interface for a similar idea.
We see a broad need for extended features, some being highly IOMMU
device specific:
- Binding iommu_domain's to PASID/SSID
- Userspace IO page tables, for ARM, x86 and S390
- Kernel bypassed invalidation of user page tables
- Re-use of the KVM page table in the IOMMU
- Dirty page tracking in the IOMMU
- Runtime Increase/Decrease of IOPTE size
- PRI support with faults resolved in userspace
Many of these HW features exist to support VM use cases - for instance
the combination of PASID, PRI and Userspace IO Page Tables allows an
implementation of DMA Shared Virtual Addressing (vSVA) within a guest.
Dirty tracking enables VM live migration with SRIOV devices and PASID
support allow creating "scalable IOV" devices, among other things.
As these features are fundamental to a VM platform they need to be
uniformly exposed to all the driver families that do DMA into VMs,
which is currently VFIO and VDPA"
For more background, see the extended explanations in Jason's pull request:
https://lore.kernel.org/lkml/Y5dzTU8dlmXTbzoJ@nvidia.com/
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (62 commits)
iommufd: Change the order of MSI setup
iommufd: Improve a few unclear bits of code
iommufd: Fix comment typos
vfio: Move vfio group specific code into group.c
vfio: Refactor dma APIs for emulated devices
vfio: Wrap vfio group module init/clean code into helpers
vfio: Refactor vfio_device open and close
vfio: Make vfio_device_open() truly device specific
vfio: Swap order of vfio_device_container_register() and open_device()
vfio: Set device->group in helper function
vfio: Create wrappers for group register/unregister
vfio: Move the sanity check of the group to vfio_create_group()
vfio: Simplify vfio_create_group()
iommufd: Allow iommufd to supply /dev/vfio/vfio
vfio: Make vfio_container optionally compiled
vfio: Move container related MODULE_ALIAS statements into container.c
vfio-iommufd: Support iommufd for emulated VFIO devices
vfio-iommufd: Support iommufd for physical VFIO devices
vfio-iommufd: Allow iommufd to be used in place of a container fd
vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent()
...
Since Python 3.3 extensions have a suffix encoding platform and
version information. For example, the perf extension was previously
perf.so but now maybe perf.cpython-310-x86_64-linux-gnu.so. Compute
the extension using Python and then use this in the target name. Doing
this avoids the "perf.so" target always being rebuilt.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Eelco Chaudron <echaudro@redhat.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Shaomin Deng <dengshaomin@cdjrlc.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213232651.1269909-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ensure that the availability of the VG register behaves as expected
depending on the kernel version and SVE support.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213114739.2312862-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The first two version numbers are used since that is where the ABI
changes happen, so seems to be the most useful for now.
'Until' is exclusive and 'since' is inclusive so that the same version
number can be used to mark a point where the change comes into effect.
This allows keeping the tests in a state where new tests will also pass
on older kernels if the existence of a new feature isn't explicitly
broadcast by the kernel. For example extended user regs are currently
discovered by trial and error calls to perf_event_open.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213114739.2312862-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This can be used to skip tests or provide different test values on
different platforms. For example to run a test only where Arm SVE is
present add this to the config section:
auxv = auxv["AT_HWCAP"] & 0x200000 == 0x200000
The value is a freeform Python expression that is evaled in the context
of a map called "auxv" that contains the decoded auxiliary vector.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213114739.2312862-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the return value is used to skip the test, but sometimes it
can be useful to test if a certain command should return a certain exit
code.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213114739.2312862-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Provide task-analyzer test cases for all possible arguments and a subset of possible
combinations.
12 Tests in total.
test_basic:
- cmd:"perf script report task-analyzer"
- Fundamental test of script without arguments.
- Check for standard output.
test_ns_rename:
- cmd:"perf script report task-analyzer --ns --rename-comms-by-tids 0:random"
- Standard task with timestamps in nanoseconds and comm renamed.
- Check for standard output.
test_ms_filtertasks_highlight:
- cmd:"perf script report task-analyzer --ms --filter-tasks perf --highlight-tasks perf"
- Standard task with timestamps in milliseconds, task filtered out and highlighted.
- Check for standard output.
test_extended_times_timelimit_limittasks:
- cmd "perf script report task-analyzer --extended-times --time-limit :99999"
- Standard task with additional schedule out/in info and timlimit active at 99999.
- Check for extended table output.
test_summary:
- cmd:"perf script report task-analyzer --summary"
- Standard task with additional summary output.
- Check for summary print.
test_summary_extended:
- cmd:"perf script report task-analyzer --summary-extended"
- Standard task with summary and additional schedule in/out info.
- Chceck for extended table print.
test_summaryonly:
- cmd:"perf script report task-analyzer --summary-only"
- Only summary should be printed.
- Check for summary print.
test_extended_times_summary_ns:
- cmd:"perf script report task-analyzer --extended-times --summary --ns"
- Standard task with extended schedule in/out information and summary in ns.
- Check for extended table and summary.
test_csv:
- cmd:"perf script report task-analyzer --csv csv"
- Print standard task to csv file in csv format.
- Check for csv format.
test_csv_extended_times:
- cmd:"perf script report task-analyzer --csv csv --extended-times"
- Print standard task to csv file in csv format with additional schedule in/out
information.
- Check for additional information and csv format.
test_csvsummary:
- cmd:"perf script report task-analyzer --csv-summary csvsummary"
- Print summary to csvsummary file in csv format.
- Check for csv format.
test_csvsummary_extended:
- cmd:"perf script report task-analyzer --csv-summary csvsummary --summary-extended"
- Print summary to csvsummary file in csv format with additional schedule in/out
information.
- Check for additional information and csv format.
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221206154406.41941-4-petar.gligor@gmail.com
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds the possibility to write the trace and the summary as csv files
to a user specified file. A format as such simplifies further data processing.
This is achieved by having ";" as separators instead of spaces and solely one
header per file.
Additional parameters are being considered, like in the normal usage of the
script. Colors are turned off in the case of a csv output, thus the highlight
option is also being ignored.
Usage:
Write standard task to csv file:
$ perf script report tasks-analyzer --csv <file>
write limited output to csv file in nanoseconds:
$ perf script report tasks-analyzer --csv <file> --ns --limit-to-tasks 1337
Write summary to a csv file:
$ perf script report tasks-analyzer --csv-summary <file>
Write summary to csv file with additional schedule information:
$ perf script report tasks-analyzer --csv-summary <file> --summary-extended
Write both summary and standard task to a csv file:
$ perf script report tasks-analyzer --csv --csv-summary
The following examples illustrate what is possible with the CSV output. The
first command sequence will record all scheduler switch events for 10 seconds,
the task-analyzer calculates task information like runtimes as CSV. A small
python snippet using pandas and matplotlib will visualize the most frequent
task (e.g. kworker/1:1) runtimes - each runtime as a bar in a bar chart:
$ perf record -e sched:sched_switch -a -- sleep 10
$ perf script report tasks-analyzer --ns --csv tasks.csv
$ cat << EOF > /tmp/freq-comm-runtimes-bar.py
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("tasks.csv", sep=';')
most_freq_comm = df["COMM"].value_counts().idxmax()
most_freq_runtimes = df[df["COMM"]==most_freq_comm]["Runtime"]
plt.title(f"Runtimes for Task {most_freq_comm} in Nanoseconds")
plt.bar(range(len(most_freq_runtimes)), most_freq_runtimes)
plt.show()
$ python3 /tmp/freq-comm-runtimes-bar.py
As a seconds example, the subsequent script generates a pie chart of all
accumulated tasks runtimes for 10 seconds of system recordings:
$ perf record -e sched:sched_switch -a -- sleep 10
$ perf script report tasks-analyzer --csv-summary task-summary.csv
$ cat << EOF > /tmp/accumulated-task-pie.py
import pandas as pd
from matplotlib.pyplot import pie, axis, show
df = pd.read_csv("task-summary.csv", sep=';')
sums = df.groupby(df["Comm"])["Accumulated"].sum()
axis("equal")
pie(sums, labels=sums.index);
show()
EOF
$ python3 /tmp/accumulated-task-pie.py
A variety of other visualizations are possible in matplotlib and other
environments. Of course, pandas, numpy and co. also allow easy
statistical analysis of the data!
Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221206154406.41941-3-petar.gligor@gmail.com
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce a new 'perf script' to analyze task scheduling behavior.
During the task analysis, some data is always needed - which goes beyond
the simple time of switching on and off a task (process/thread). This
concerns for example the runtime of a process or the frequency with
which the process was called. This script serves to simplify this
recurring analyze process. It immediately provides the user with helpful
task characteristic information about the tasks runtimes.
Usage:
Recorded can be in two ways:
$ perf script record tasks-analyzer -- sleep 10
$ perf record -e sched:sched_switch -a -- sleep 10
The script can parse all perf.data files, most important: sched:sched_switch
events are mandatory, other events will be ignored.
Most simple report use case is to just call the script without arguments:
$ perf script report tasks-analyzer
Switched-In Switched-Out CPU PID TID Comm Runtime Time Out-In
15576.658891407 15576.659156086 4 2412 2428 gdbus 265 1949
15576.659111320 15576.659455410 0 2412 2412 gnome-shell 344 2267
15576.659491326 15576.659506173 2 74 74 kworker/2:1 15 13145
15576.659506173 15576.659825748 2 2858 2858 gnome-terminal- 320 63263
15576.659871270 15576.659902872 6 20932 20932 kworker/u16:0 32 2314582
15576.659909951 15576.659945501 3 27264 27264 sh 36 -1
15576.659853285 15576.659971052 7 27265 27265 perf 118 5050741
[...]
What is not shown here are the ASCII color sequences. For example, if
the task consists of only one thread, the TID is grayed out.
Runtime is the time the task was running on the CPU, Time Out-In is the
time between the process being scheduled *out* and scheduled back *in*.
So the last time span between two executions. If -1 is printed, then the
task simply ran the first time in the measurements - a Out-In delta
could not be calculated.
In addition to the chronological representation, there is a summary on
task level. This output can be additionally switched on via the
--summary option and provides information such as max, min & average
runtime per process. The maximum runtime is often important for
debugging. The call looks like this:
$ perf script report tasks-analyzer --summary
Summary
Task Information Runtime Information
PID TID Comm Runs Accumulated Mean Median Min Max Max At
14 14 ksoftirqd/0 13 334 26 15 9 127 15571.621211956
15 15 rcu_preempt 133 1778 13 13 2 33 15572.581176024
16 16 migration/0 3 49 16 13 12 24 15571.608915425
20 20 migration/1 3 34 11 13 8 13 15571.639101555
25 25 migration/2 3 32 11 12 9 12 15575.639239896
[...]
Besides these two options, there are a number of other options that change the
output and behavior. This can be queried via --help. Options worth mentioning include:
- filter-tasks - filter out unneeded tasks, --filter-task 1337,/sbin/init
- highlight-tasks - more pleasant focusing, --highlight-tasks 1:red,mutt:yellow
- extended-times - show combinations of elapsed times between schedule in/schedule out
- summary-extended - summary with additional information, like maximum delta time statistics
- rename-comms-by-tids - handy for inexpressive processnames like python, --rename 1337:my-python-app
- ms - show timestamps in milliseconds, nanoseconds is also possible (--ns)
- time-limit - limit the analyzer to a time range, --time-limit 15576.0:15576.1
Script is tested and prime time ready for python2 & python3:
- make PYTHON=python3 prefix=/usr/local install
- make PYTHON=python2 prefix=/usr/local install
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221206154406.41941-2-petar.gligor@gmail.com
Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Printing the info doesn't have any dependency on OpenCSD, and neither
does recording Coresight data. Because it's sometimes useful to look at
the info for debugging, it makes sense to be able to see it on the same
platform that the recording was made on.
So pull the auxtrace info printing parts into a new file that is always
compiled into Perf.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221212155513.2259623-6-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
hdr is a copy of 3 values of ptr and doesn't need to be long lived. So
just use ptr instead which means the malloc and the extra error path can
be removed to simplify things.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221212155513.2259623-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
cs_etm__print_auxtrace_info() is called twice in case there is an error
somewhere in cs_etm__process_auxtrace_info(), but all the info is
already available at the beginning so just print it there instead.
Also use u64 and the already cast ptr variable to make it more
consistent with the rest of the etm code.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221212155513.2259623-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These aren't used outside of cs-etm so don't need stubs. Leave
cs_etm__process_auxtrace_info() which is used externally, and add an
error message so that it's obvious to users why it causes errors.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221212155513.2259623-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is an error rather than just for the raw trace dump so always print
it as an error. Also remove the duplicate header version check.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221212155513.2259623-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add test cases for the task and addr aggregation modes.
$ sudo ./perf test -v contention
86: kernel lock contention analysis test :
--- start ---
test child forked, pid 680006
Testing perf lock record and perf lock contention
Testing perf lock contention --use-bpf
Testing perf lock record and perf lock contention at the same time
Testing perf lock contention --threads
Testing perf lock contention --lock-addr
test child finished with 0
---- end ----
kernel lock contention analysis test: Ok
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -l/--lock-addr option is to implement per-lock-instance contention
stat using LOCK_AGGR_ADDR. It displays lock address and optionally
symbol name if exists.
$ sudo ./perf lock con -abl sleep 1
contended total wait max wait avg wait address symbol
1 36.28 us 36.28 us 36.28 us ffff92615d6448b8
9 10.91 us 1.84 us 1.21 us ffffffffbaed50c0 rcu_state
1 10.49 us 10.49 us 10.49 us ffff9262ac4f0c80
8 4.68 us 1.67 us 585 ns ffffffffbae07a40 jiffies_lock
3 3.03 us 1.45 us 1.01 us ffff9262277861e0
1 924 ns 924 ns 924 ns ffff926095ba9d20
1 436 ns 436 ns 436 ns ffff9260bfda4f60
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The BPF didn't show the per-thread stat properly. Use task's thread id (PID)
as a key instead of stack_id and add a task_data map to save task comm names.
$ sudo ./perf lock con -abt -E 5 sleep 1
contended total wait max wait avg wait pid comm
1 740.66 ms 740.66 ms 740.66 ms 1950 nv_queue
3 305.50 ms 298.19 ms 101.83 ms 1884 nvidia-modeset/
1 25.14 us 25.14 us 25.14 us 2725038 EventManager_De
12 23.09 us 9.30 us 1.92 us 0 swapper
1 20.18 us 20.18 us 20.18 us 2725033 EventManager_De
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Accessing BPF maps should use the same data types. Add bpf_skel/lock_data.h
to define the common data structures. No functional changes.
Committer notes:
Fixed contention_key.stack_id missing rename to contention_key.stack_or_task_id.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sometimes build systems may append options e.g. --sysroot etc. to CC
variable especially in cross-compile environments like yocto project
where CC varable is composed of cross-compiler name and some needed
options for it to work in a relocatable environment.
Therefore separate out the compiler name from rest of the options in CC,
then add the options via second argument to Popen() API
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Link: https://lore.kernel.org/r/20221205025534.150006-1-raj.khem@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In BTF, tracepoint definitions have the "btf_trace_" prefix. The
off-cpu profiler needs to check the signature of the sched_switch event
using that definition. But there's a typo (s/bpf/btf/) so it failed
always.
Fixes: b36888f71c ("perf record: Handle argument change in sched_switch")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: bpf@vger.kernel.org
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20221208182636.524139-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The event group test checks group creation for combinations of hw, sw
and uncore PMU events. Some of the uncore pmus may require additional
permission to access the counters.
For example, in case of hv_24x7, partition need to have permissions to
access hv_24x7 pmu counters. If not, event_open will fail. Hence add a
sanity check to see if event_open succeeds before proceeding with the
test.
Fixes: 9d9b22beda ("perf test: Add event group test for events in multiple PMUs")
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20221207165815.774-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some distros have older versions of libtraceevent where
TEP_FIELD_IS_RELATIVE and its associated semantics are not present, so
we need to check if the version has it, it was introduced in
libtraceevent 1.5.0.
Reported-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
libtraceevent is now out-of-date and it is better to depend on the
system version. Remove this code that is no longer depended upon by
any builds.
Committer notes:
Removed the removed tools/lib/traceevent/ from tools/perf/MANIFEST, so
that 'make perf-tar-src-pkg' works.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20221130062935.2219247-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
line variables.
If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
build, don't compile in libtraceevent and libtracefs support.
This also disables CONFIG_TRACE that controls "perf trace".
CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles,
HAVE_LIBTRACEEVENT is used in C code.
Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
commands kmem, kwork, lock, sched and timechart are removed. The
majority of commands continue to work including "perf test".
Committer notes:
Fixed up a tools/perf/util/Build reject and added:
#include <traceevent/event-parse.h>
to tools/perf/util/scripting-engines/trace-event-perl.c.
Committer testing:
$ rpm -qi libtraceevent-devel
Name : libtraceevent-devel
Version : 1.5.3
Release : 2.fc36
Architecture: x86_64
Install Date: Mon 25 Jul 2022 03:20:19 PM -03
Group : Unspecified
Size : 27728
License : LGPLv2+ and GPLv2+
Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm
Build Date : Fri 15 Apr 2022 10:57:01 AM -03
Build Host : buildvm-x86-05.iad2.fedoraproject.org
Packager : Fedora Project
Vendor : Fedora Project
URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
Bug URL : https://bugz.fedoraproject.org/libtraceevent
Summary : Development headers of libtraceevent
Description :
Development headers of libtraceevent-libs
$
Default build:
$ ldd ~/bin/perf | grep tracee
libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
$
# perf trace -e sched:* --max-events 10
0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
#
Had to tweak tools/perf/util/setup.py to make sure the python binding
shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
present in CFLAGS.
Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:
- Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y
- perf-$(CONFIG_LIBTRACEEVENT) += scripts/
- bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y
- The python binding needed some fixups and util/trace-event.c can't be
built and linked with the python binding shared object, so remove it
in tools/perf/util/setup.py and exclude it from the list of
dependencies in the python/perf.so Makefile.perf target.
Building without libtraceevent-devel installed uncovered more build
failures:
- The python binding tools/perf/util/python.c was assuming that
traceevent/parse-events.h was always available, which was the case
when we defaulted to using the in-kernel tools/lib/traceevent/ files,
now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
the other parts of it that deal with tracepoints.
- We have to ifdef the rules in the Build files with
CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when
setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't
detect libtraceevent-devel installed in the system. Simplification here
to avoid these two ways of disabling builtin-trace.c and not having
CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean
way.
From Athira:
<quote>
tools/perf/arch/powerpc/util/Build
-perf-y += kvm-stat.o
+perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
</quote>
Then, ditto for arm64 and s390, detected by container cross build tests.
- s/390 uses test__checkevent_tracepoint() that is now only available if
HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifder HAVE_LIBTRACEEVENT.
Also from Athira:
<quote>
With this change, I could successfully compile in these environment:
- Without libtraceevent-devel installed
- With libtraceevent-devel installed
- With “make NO_LIBTRACEEVENT=1”
</quote>
Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
consistency with other libraries detected in tools/perf/.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the 'MetricExpr' json value is passed from the json
file to the pmu-events.c. This change introduces an expression
tree that is parsed into. The parsing is done largely by using
operator overloading and python's 'eval' function. Two advantages
in doing this are:
1) Broken metrics fail at compile time rather than relying on
`perf test` to detect. `perf test` remains relevant for checking
event encoding and actual metric use.
2) The conversion to a string from the tree can minimize the metric's
string size, for example, preferring 1e6 over 1000000, avoiding
multiplication by 1 and removing unnecessary whitespace. On x86
this reduces the string size by 2,930bytes (0.07%).
In future changes it would be possible to programmatically
generate the json expressions (a single line of text and so a
pain to write manually) for an architecture using the expression
tree. This could avoid copy-pasting metrics for all architecture
variants.
v4. Doesn't simplify "0*SLOTS" to 0, as the pattern is used to fix
Intel metrics with topdown events.
v3. Avoids generic types on standard types like set that aren't
supported until Python 3.9, fixing an issue with Python 3.6
reported-by John Garry. v3 also fixes minor pylint issues and adds
a call to Simplify on the read expression tree.
v2. Improvements to type information.
Committer notes:
Added one-line fixer from Ian, see first Link: tag below.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/CAP-5=fWa=zNK_ecpWGoGggHCQx7z-oW0eGMQf19Maywg0QK=4g@mail.gmail.com
Link: https://lore.kernel.org/r/20221207055908.1385448-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In print_counter_aggrdata(), it skips some events that has no aggregate
count. It's actually for system-wide per-thread mode and merged uncore
and hybrid events.
Let's update the condition to check them explicitly.
Fixes: 91f85f98da ("perf stat: Display event stats using aggr counts")
Reported-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221206175804.391387-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If LIBTRACEEVENT_DYNAMIC is enabled then avoid the install step for
the plugins. If disabled correct DESTDIR so that the plugins are
installed under <lib>/traceevent/plugins.
Fixes: ef019df01e ("perf build: Install libtraceevent locally when building")
Reported-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20221205225940.3079667-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is used in bpf_lock_contention.c and builtin-lock.c will be made
CONFIG_LIBTRACEEVENT=y conditional, so move it to machine.c, that is
always available.
This makes those 4 global variables for sched and lock text start and
end to move to 'struct machine' too, as conceivably we can have that
info for several machine instances, say some 'perf diff' like tool.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Multiple events in a group can belong to one or more PMUs, however
there are some limitations.
One of the limitations is that perf doesn't allow creating a group of
events from different hw PMUs.
Write a simple test to create various combinations of hw, sw and uncore
PMU events and verify group creation succeeds or fails as expected.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20221206043237.12159-3-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'pmus' list variable is defined as static variable under pmu.c file.
Introduce a new pmus.c file and migrate this variable to it. Also make
it non static so that it can be accessed from outside.
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: carsten.haitzler@arm.com
Link: https://lore.kernel.org/r/20221206043237.12159-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Avoid libtraceevent dependency for tep_is_bigendian or trace-event.h
dependency for bigendian. Add a new host_is_bigendian to util.h, using
the compiler defined __BYTE_ORDER__ when available.
Committer notes:
Added:
#else /* !__BYTE_ORDER__ */
On that nested #ifdef block, as per Namhyung's suggestion.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20221130062935.2219247-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove git reference by changing GIT_COMPAT_UTIL_H to __PERF_UTIL_H.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20221130062935.2219247-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In this context, 'os' is already a pointer so the extra dereference
isn't required. This fixes the following test failure on aarch64:
$ ./perf test "json output" -vvv
92: perf stat JSON output linter :
--- start ---
Checking json output: no args Test failed for input:
...
Fatal error: glibc detected an invalid stdio handle
---- end ----
perf stat JSON output linter: FAILED!
Fixes: e7f4da3122 ("perf stat: Pass struct outstate to printout()")
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221130111521.334152-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a metric produces more than one values, it missed to print the opening
bracket.
Fixes: ab6baaae27 ("perf stat: Fix JSON output in metric-only mode")
Reported-by: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221202190447.1588680-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Compute the headers to be installed from their source headers and make
each have its own build target to install it. Using dependencies
avoids headers being reinstalled and getting a new timestamp which
then causes files that depend on the header to be rebuilt.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20221202045743.2639466-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>