Commit Graph

11836 Commits

Author SHA1 Message Date
Jin Yao 28904f4dce perf streams: Calculate the sum of total streams hits
We have used callchain_node->hit to measure the hot level of one stream.
This patch calculates the sum of hits of total streams.

Thus in next patch, we can use following formula to report hot percent
for one stream.

hot percent = callchain_node->hit / sum of total hits

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20201009022845.13141-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 13:34:06 -03:00
Jin Yao fa79aa6485 perf streams: Link stream pair
In previous patch, we have created an evsel_streams for one event, and
top N hottest streams will be saved in a stream array in evsel_streams.

This patch compares total streams among two evsel_streams.

Once two streams are fully matched, they will be linked as a pair. From
the pair, we can know which streams are matched.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20201009022845.13141-5-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 13:32:36 -03:00
Jin Yao 47ef8398c3 perf streams: Compare two streams
Stream is the branch history which is aggregated by the branch records
from perf samples. Now we support the callchain as stream.

If the callchain entries of one stream are fully matched with the
callchain entries of another stream, we think two streams are matched.

For example,

   cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
   -----------------------                 -----------------------
             main div.c:39                           main div.c:39
             main div.c:44                           main div.c:44

Above two streams are matched (we don't consider the case that source
code is changed).

The matching logic is, compare the chain string first. If it's not
matched, fallback to dso address comparison.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20201009022845.13141-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 13:31:56 -03:00
Jin Yao dd1d841810 perf streams: Get the evsel_streams by evsel_idx
In previous patch, we have created evsel_streams array.

This patch returns the specified evsel_streams according to the
evsel_idx.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20201009022845.13141-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 13:30:13 -03:00
Jin Yao 480accbb17 perf streams: Introduce branch history "streams"
We define a stream as the branch history which is aggregated by the
branch records from perf samples. For example, the callchains aggregated
from the branch records are considered as streams.  By browsing the hot
stream, we can understand the hot code path.

Now we only support the callchain for stream. For measuring the hot
level for a stream, we use the callchain_node->hit, higher is hotter.

There may be many callchains sampled so we only focus on the top N
hottest callchains. N is a user defined parameter or predefined default
value (nr_streams_max).

This patch creates an evsel_streams array per event, and saves the top N
hottest streams in a stream array.

So now we can get the per-event top N hottest streams.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20201009022845.13141-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 13:27:28 -03:00
Andi Kleen 6556a75bec perf intel-pt: Improve PT documentation slightly
Document the higher level --insn-trace etc. perf script options.

Include the howto how to build xed into the manpage

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20201014035346.4772-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 13:14:40 -03:00
Andi Kleen 0997a2662f perf tools: Add support for exclusive groups/events
Peter suggested that using the exclusive mode in perf could avoid some
problems with bad scheduling of groups. Exclusive is implemented in the
kernel, but wasn't exposed by the perf tool, so hard to use without
custom low level API users.

Add support for marking groups or events with :e for exclusive in the
perf tool.  The implementation is basically the same as the existing
pinned attribute.

Committer testing:

  # perf test "parse event"
   6: Parse event definition strings                                  : Ok
  # perf test -v "parse event" |& grep :u*e
  running test 56 'instructions:uep'
  running test 57 '{cycles,cache-misses,branch-misses}:e'
  #
  #
  # grep "model name" -m1 /proc/cpuinfo
  model name	: AMD Ryzen 9 3900X 12-Core Processor
  #
  # perf stat -a -e '{cycles,cache-misses,branch-misses}:e' sleep 1

   Performance counter stats for 'system wide':

       <not counted>      cycles                                                        (0.00%)
       <not counted>      cache-misses                                                  (0.00%)
       <not counted>      branch-misses                                                 (0.00%)

         1.001269893 seconds time elapsed

  Some events weren't counted. Try disabling the NMI watchdog:
  	echo 0 > /proc/sys/kernel/nmi_watchdog
  	perf stat ...
  	echo 1 > /proc/sys/kernel/nmi_watchdog
  # echo 0 > /proc/sys/kernel/nmi_watchdog
  # perf stat -a -e '{cycles,cache-misses,branch-misses}:e' sleep 1

   Performance counter stats for 'system wide':

       1,298,663,141      cycles
          30,962,215      cache-misses
           5,325,150      branch-misses

         1.001474934 seconds time elapsed

  #
  # The output for asking for precise events on AMD needs to improve, it
  # supposedly works only for system wide or per CPU
  #
  # perf stat -a -e '{cycles,cache-misses,branch-misses}:uep' sleep 1
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles).
  /bin/dmesg | grep -i perf may provide additional information.

  # perf stat -a -e '{cycles,cache-misses,branch-misses}:ue' sleep 1

   Performance counter stats for 'system wide':

         746,363,126      cycles
          16,881,611      cache-misses
           2,871,259      branch-misses

         1.001636066 seconds time elapsed

  #

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20201014144255.22699-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 12:24:28 -03:00
Jiri Olsa 78b2c50c5d perf test: Add build id shell test
Add a test for the build id cache that adds a binary with sha1 and md5
build ids and verifies it's added properly.

The test updates build id cache with 'perf record' and 'perf buildid-cache -a'.

Committer testing:

  # perf test "build id"
  82: build id cache operations                                       : Ok
  #
  # perf test -v "build id"
  82: build id cache operations                                       :
  --- start ---
  test child forked, pid 447218
  test binaries: /tmp/perf.ex.SHA1.B8I /tmp/perf.ex.MD5.7Nv
  Adding d1abc1eb7568358cf23c959566f23462461834d1 /tmp/perf.ex.SHA1.B8I: Ok
  build id: d1abc1eb7568358cf23c959566f23462461834d1
  link: /tmp/perf.debug.sS2/.build-id/d1/abc1eb7568358cf23c959566f23462461834d1
  file: /tmp/perf.debug.sS2/.build-id/d1/../../tmp/perf.ex.SHA1.B8I/d1abc1eb7568358cf23c959566f23462461834d1/elf
  OK for /tmp/perf.ex.SHA1.B8I
  Adding a50e350e97c43b4708d09bcd85ebfff7 /tmp/perf.ex.MD5.7Nv: Ok
  build id: a50e350e97c43b4708d09bcd85ebfff7
  link: /tmp/perf.debug.IuW/.build-id/a5/0e350e97c43b4708d09bcd85ebfff7
  file: /tmp/perf.debug.IuW/.build-id/a5/../../tmp/perf.ex.MD5.7Nv/a50e350e97c43b4708d09bcd85ebfff7/elf
  OK for /tmp/perf.ex.MD5.7Nv
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.034 MB /tmp/perf.data.xrH ]
  build id: d1abc1eb7568358cf23c959566f23462461834d1
  link: /tmp/perf.debug.eGR/.build-id/d1/abc1eb7568358cf23c959566f23462461834d1
  file: /tmp/perf.debug.eGR/.build-id/d1/../../tmp/perf.ex.SHA1.B8I/d1abc1eb7568358cf23c959566f23462461834d1/elf
  OK for /tmp/perf.ex.SHA1.B8I
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.034 MB /tmp/perf.data.cbE ]
  build id: a50e350e97c43b4708d09bcd85ebfff7
  link: /tmp/perf.debug.82t/.build-id/a5/0e350e97c43b4708d09bcd85ebfff7
  file: /tmp/perf.debug.82t/.build-id/a5/../../tmp/perf.ex.MD5.7Nv/a50e350e97c43b4708d09bcd85ebfff7/elf
  OK for /tmp/perf.ex.MD5.7Nv
  test child finished with 0
  ---- end ----
  build id cache operations: Ok
  #

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20201013192441.1299447-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 11:28:52 -03:00
Jiri Olsa e9ad94381c perf tools: Align buildid list output for short build ids
With shorter md5 build ids we need to align their paths properly with
other build ids:

  $ perf buildid-list
  17f4e448cc746582ea1881528deb549f7fdb3fd5 [kernel.kallsyms]
  a50e350e97c43b4708d09bcd85ebfff7         .../tools/perf/buildid-ex-md5
  1805c738c8f3ec0f47b7ea09080c28f34d18a82b /usr/lib64/ld-2.31.so
  $

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20201013192441.1299447-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 11:28:52 -03:00
Jiri Olsa b0a323c7f0 perf tools: Add size to 'struct perf_record_header_build_id'
We do not store size with build ids in perf data, but there's enough
space to do it. Adding misc bit PERF_RECORD_MISC_BUILD_ID_SIZE to mark
build id event with size.

With this fix the dso with md5 build id will have correct build id data
and will be usable for debuginfod processing if needed (coming in
following patches).

Committer notes:

Use %zu with size_t to fix this error on 32-bit arches:

  util/header.c: In function '__event_process_build_id':
  util/header.c:2105:3: error: format '%lu' expects argument of type 'long unsigned int', but argument 6 has type 'size_t' [-Werror=format=]
     pr_debug("build id event received for %s: %s [%lu]\n",
     ^

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20201013192441.1299447-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 11:28:12 -03:00
Jiri Olsa 39be8d0115 perf tools: Pass build_id object to dso__build_id_equal()
Passing build_id object to dso__build_id_equal(), so we can properly
check build id with different size than sha1.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20201013192441.1299447-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 09:25:36 -03:00
Jiri Olsa 8dfdf440d3 perf tools: Pass build_id object to dso__set_build_id()
Passing build_id object to dso__set_build_id(), so it's easier
to initialize dos's build id object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20201013192441.1299447-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 08:46:42 -03:00
Jiri Olsa bf5411695a perf tools: Pass build_id object to build_id__sprintf()
Passing build_id object to build_id__sprintf function, so it can operate
with the proper size of build id.

This will create proper md5 build id readable names,
like following:

  a50e350e97c43b4708d09bcd85ebfff7

instead of:

  a50e350e97c43b4708d09bcd85ebfff700000000

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20201013192441.1299447-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 08:46:22 -03:00
Jiri Olsa 3ff1b8c8cc perf tools: Pass build id object to sysfs__read_build_id()
Passing build id object to sysfs__read_build_id function, so it can
populate the size of the build_id object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20201013192441.1299447-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 08:46:02 -03:00
Jiri Olsa f766819cd5 perf tools: Pass build_id object to filename__read_build_id()
Pass a build_id object to filename__read_build_id function, so it can
populate the size of the build_id object.

Changing filename__read_build_id() code for both ELF/non-ELF code.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20201013192441.1299447-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 08:45:16 -03:00
Jiri Olsa 0aba7f036a perf tools: Use build_id object in dso
Replace build_id byte array with struct build_id object and all the code
that references it.

The objective is to carry size together with build id array, so it's
better to keep both together.

This is preparatory change for following patches, and there's no
functional change.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20201013192441.1299447-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 08:44:47 -03:00
Arnaldo Carvalho de Melo 79bbbabd22 perf config: Export the perf_config_from_file() function
We'll use it to ask for extra config files to be loaded, profile like
stuff that will be used first to make 'perf trace' mimic 'strace' output
via a 'perf strace' command that just sets up 'perf trace' output.

At some point it'll be used for regression tests, where we'll run some
simple commands like:

  perf strace ls > perf-strace.output
  strace ls > strace.output

And then do some mutable syscall arg aware diff like tool to deal with
arguments for things like mmap, that change at each execution, to be
first ignored and then properly tracked when used accoss multiple
syscalls.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 17:03:19 -03:00
James Clark 79373082fa perf python: Autodetect python3 binary
Some distros don't come with python2 and only have python3 available.
This causes the "'import perf' in python" self test to fail.

This change adds python3 to the list of possible python versions
that are autodetected but maintains the priorities for
'python2' and 'python' detection. Python3 has the lowest priority.

Committer notes:

On a fedora system without python2 packages the 'perf test python'
continues to work:

  # python2
  bash: python2: command not found...
  Similar command is: 'python'
  # rpm -qa | grep python2
  #

That "Similar command" gives the clue:

  # rpm -qf /usr/bin/python
  python-unversioned-command-3.8.5-5.fc32.noarch
  # rpm -ql python-unversioned-command
  /usr/bin/python
  /usr/share/man/man1/python.1.gz
  #

With it in place the 'python' binary is found and perf builds the python
binding using python3:

  # perf test -v python
  19: 'import perf' in python                                         :
  --- start ---
  test child forked, pid 379988
  python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python' "
  test child finished with 0
  ---- end ----
  'import perf' in python: Ok
  #

Looking at that path:

  # ls -la /tmp/build/perf/python
  total 1864
  drwxrwxr-x.  2 acme acme      60 Oct 13 16:20 .
  drwxrwxr-x. 18 acme acme    4420 Oct 13 16:28 ..
  -rwxrwxr-x.  1 acme acme 1907216 Oct 13 16:28 perf.cpython-38-x86_64-linux-gnu.so
  #

And:

  # ldd ~/bin/perf | grep python
  	libpython3.8.so.1.0 => /lib64/libpython3.8.so.1.0 (0x00007f5471187000)
  #

As soon as we remove it:

  # rpm -e python-unversioned-command-3.8.5-5.fc32.noarch
  # hash -r
  # python
  bash: python: command not found...
  Install package 'python-unversioned-command' to provide command 'python'? [N/y] n
  #

And rebuilding perf now doesn't find python in the system:

  make: Entering directory '/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j24' parallel build
  <SNIP>
  Makefile.config:786: No python interpreter was found: disables Python support - please install python-devel/python-dev
  <SNIP>

After this patch:

  $ rpm -qi python-unversioned-command
  package python-unversioned-command is not installed
  $
  $ python
  bash: python: command not found...
  Install package 'python-unversioned-command' to provide command 'python'? [N/y] ^C
  $
  $ m
  make: Entering directory '/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j24' parallel build
  <SNIP>
    CC       /tmp/build/perf/tests/attr.o
    CC       /tmp/build/perf/tests/python-use.o
    DESCEND  plugins
    GEN      /tmp/build/perf/python/perf.so
    INSTALL  trace_plugins
    LD       /tmp/build/perf/tests/perf-in.o
    LD       /tmp/build/perf/perf-in.o
    LINK     /tmp/build/perf/perf
  <SNIP>
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  19: 'import perf' in python                                         : Ok
  $ ldd ~/bin/perf | grep python
  	libpython3.8.so.1.0 => /lib64/libpython3.8.so.1.0 (0x00007f2c8c708000)
  $ ls -la /tmp/build/perf/python
  total 1864
  drwxrwxr-x.  2 acme acme      60 Oct 13 16:20 .
  drwxrwxr-x. 18 acme acme    4420 Oct 13 16:31 ..
  -rwxrwxr-x.  1 acme acme 1907216 Oct 13 16:31 perf.cpython-38-x86_64-linux-gnu.so
  $

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LPU-Reference: 20201005080645.6588-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 16:25:57 -03:00
Arnaldo Carvalho de Melo 0fd0f00fdb perf tests: Show python test script in verbose mode
To help figure out where it is getting the binding.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 16:22:03 -03:00
Vasily Gorbik 6cf4ecf5c5 perf build: Allow nested externs to enable BUILD_BUG() usage
Currently BUILD_BUG() macro is expanded to smth like the following:

   do {
           extern void __compiletime_assert_0(void)
                   __attribute__((error("BUILD_BUG failed")));
           if (!(!(1)))
                   __compiletime_assert_0();
   } while (0);

If used in a function body this obviously would produce build errors
with -Wnested-externs and -Werror.

To enable BUILD_BUG() usage in tools/arch/x86/lib/insn.c which perf
includes in intel-pt-decoder, build perf without -Wnested-externs.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> # build tested
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/patch-1.thread-251403.git-2514037e9477.your-ad-here.call-01602244460-ext-7088@work.hours
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 16:07:24 -03:00
Jiri Slaby f3013f7ed4 perf trace: Fix off by ones in memset() after realloc() in arches using libaudit
'perf trace ls' started crashing after commit d21cb73a90 on
!HAVE_SYSCALL_TABLE_SUPPORT configs (armv7l here) like this:

  0  strlen () at ../sysdeps/arm/armv6t2/strlen.S:126
  1  0xb6800780 in __vfprintf_internal (s=0xbeff9908, s@entry=0xbeff9900, format=0xa27160 "]: %s()", ap=..., mode_flags=<optimized out>) at vfprintf-internal.c:1688
  ...
  5  0x0056ecdc in fprintf (__fmt=0xa27160 "]: %s()", __stream=<optimized out>) at /usr/include/bits/stdio2.h:100
  6  trace__sys_exit (trace=trace@entry=0xbeffc710, evsel=evsel@entry=0xd968d0, event=<optimized out>, sample=sample@entry=0xbeffc3e8) at builtin-trace.c:2475
  7  0x00566d40 in trace__handle_event (sample=0xbeffc3e8, event=<optimized out>, trace=0xbeffc710) at builtin-trace.c:3122
  ...
  15 main (argc=2, argv=0xbefff6e8) at perf.c:538

It is because memset in trace__read_syscall_info zeroes wrong memory:

1) when initializing for the first time, it does not reset the last id.

2) in other cases, it resets the last id of previous buffer.

ad 1) it causes the crash above as sc->name used in the fprintf above
      contains garbage.

ad 2) it sets nonexistent from true back to false for id 11 here. Not
      sure, what the consequences are.

So fix it by introducing a special case for the initial initialization
and do the right +1 in both cases.

Fixes: d21cb73a90 ("perf trace: Grow the syscall table as needed when using libaudit")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20201001093419.15761-1-jslaby@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 13:57:41 -03:00
Leo Yan edac75a2f8 perf c2c: Update usage for showing memory events
Since commit b027cc6fdf ("perf c2c: Fix 'perf c2c record -e list' to
show the default events used"), "perf c2c" tool can show the memory
events properly, it's no reason to still suggest user to use the
command "perf mem record -e list" for showing events.

This patch updates the usage for showing memory events with command
"perf c2c record -e list".

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20201011121022.22409-1-leo.yan@linaro.org
2020-10-13 13:15:38 -03:00
Arnaldo Carvalho de Melo dbaa1b3d9a Merge branch 'perf/urgent' into perf/core
To pick fixes that missed v5.9.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 13:02:20 -03:00
Joel Fernandes (Google) dc000c4593 perf sched: Show start of latency as well
The 'perf sched latency' tool is really useful at showing worst-case
latencies that task encountered since wakeup. However it shows only the
end of the latency. Often times the start of a latency is interesting as
it can show what else was going on at the time to cause the latency. I
certainly myself spending a lot of time backtracking to the start of the
latency in "perf sched script" which wastes a lot of time.

This patch therefore adds a new column "Max delay start". Considering
this, also rename "Maximum delay at" to "Max delay end" as its easier to
understand.

Example of the new output:

  ----------------------------------------------------------------------------------------------------------------------------------
   Task                  | Runtime ms  | Switches | Avg delay ms  | Max delay ms   | Max delay start         | Max delay end       |
  ----------------------------------------------------------------------------------------------------------------------------------
   MediaScannerSer:11936 |  651.296 ms |    67978 | avg: 0.113 ms | max: 77.250 ms | max start: 477.691360 s | max end: 477.768610 s
   audio@2.0-servi:(3)   |    0.000 ms |     3440 | avg: 0.034 ms | max: 72.267 ms | max start: 477.697051 s | max end: 477.769318 s
   AudioOut_1D:8112      |    0.000 ms |     2588 | avg: 0.083 ms | max: 64.020 ms | max start: 477.710740 s | max end: 477.774760 s
   Time-limited te:14973 | 7966.090 ms |    24807 | avg: 0.073 ms | max: 15.563 ms | max start: 477.162746 s | max end: 477.178309 s
   surfaceflinger:8049   |    9.680 ms |      603 | avg: 0.063 ms | max: 13.275 ms | max start: 476.931791 s | max end: 476.945067 s
   HeapTaskDaemon:(3)    | 1588.830 ms |     7040 | avg: 0.065 ms | max:  6.880 ms | max start: 473.666043 s | max end: 473.672922 s
   mount-passthrou:(3)   | 1370.809 ms |    68904 | avg: 0.011 ms | max:  6.524 ms | max start: 478.090630 s | max end: 478.097154 s
   ReferenceQueueD:(3)   |   11.794 ms |     1725 | avg: 0.014 ms | max:  6.521 ms | max start: 476.119782 s | max end: 476.126303 s
   writer:14077          |   18.410 ms |     1427 | avg: 0.036 ms | max:  6.131 ms | max start: 474.169675 s | max end: 474.175805 s

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20200925235634.4089867-1-joel@joelfernandes.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 11:01:42 -03:00
Sandipan Das 70830f974e perf vendor events: Fix typos in power8 PMU events
This replaces the incorrectly spelled word "localtion" with "location"
in some power8 PMU event descriptions.

Fixes: 2a81fa3bb5 ("perf vendor events: Add power8 PMU events")
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lore.kernel.org/lkml/20201012050205.328523-1-sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 11:01:42 -03:00
Namhyung Kim bf7ef5ddb0 perf bench: Run inject-build-id with --buildid-all option too
For comparison, it now runs the benchmark twice - one if regular -b and
another for --buildid-all.

  $ perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 21.002 msec (+- 0.172 msec)
    Average time per event: 2.059 usec (+- 0.017 usec)
    Average memory usage: 8169 KB (+- 0 KB)
    Average build-id-all injection took: 19.543 msec (+- 0.124 msec)
    Average time per event: 1.916 usec (+- 0.012 usec)
    Average memory usage: 7348 KB (+- 0 KB)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20201012070214.2074921-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 11:01:42 -03:00
Namhyung Kim 27c9c3424f perf inject: Add --buildid-all option
Like 'perf record', we can even more speedup build-id processing by just
using all DSOs.  Then we don't need to look at all the sample events
anymore.  The following patch will update 'perf bench' to show the result
of the --buildid-all option too.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Original-patch-by: Stephane Eranian <eranian@google.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20201012070214.2074921-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 11:01:42 -03:00
Namhyung Kim e7b60c5a0c perf inject: Do not load map/dso when injecting build-id
No need to load symbols in a DSO when injecting build-id.  I guess the
reason was to check the DSO is a special file like anon files.  Use some
helper functions in map.c to check them before reading build-id.  Also
pass sample event's cpumode to a new build-id event.

It brought a speedup in the benchmark of 25 -> 21 msec on my laptop.
Also the memory usage (Max RSS) went down by ~200 KB.

  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 21.389 msec (+- 0.138 msec)
    Average time per event: 2.097 usec (+- 0.014 usec)
    Average memory usage: 8225 KB (+- 0 KB)

Committer notes:

Before:

  $ perf stat -r5 perf bench internals inject-build-id > /dev/null

   Performance counter stats for 'perf bench internals inject-build-id' (5 runs):

            4,020.56 msec task-clock:u              #    1.271 CPUs utilized            ( +-  0.74% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             123,354      page-faults:u             #    0.031 M/sec                    ( +-  0.81% )
       7,119,951,568      cycles:u                  #    1.771 GHz                      ( +-  1.74% )  (83.27%)
         230,086,969      stalled-cycles-frontend:u #    3.23% frontend cycles idle     ( +-  1.97% )  (83.41%)
       1,168,298,765      stalled-cycles-backend:u  #   16.41% backend cycles idle      ( +-  1.13% )  (83.44%)
      11,173,083,669      instructions:u            #    1.57  insn per cycle
                                                    #    0.10  stalled cycles per insn  ( +-  1.58% )  (83.31%)
       2,413,908,936      branches:u                #  600.392 M/sec                    ( +-  1.69% )  (83.26%)
          46,576,289      branch-misses:u           #    1.93% of all branches          ( +-  2.20% )  (83.31%)

              3.1638 +- 0.0309 seconds time elapsed  ( +-  0.98% )

  $

After:

  $ perf stat -r5 perf bench internals inject-build-id > /dev/null

   Performance counter stats for 'perf bench internals inject-build-id' (5 runs):

            2,379.94 msec task-clock:u              #    1.473 CPUs utilized            ( +-  0.18% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
              62,584      page-faults:u             #    0.026 M/sec                    ( +-  0.07% )
       2,372,389,668      cycles:u                  #    0.997 GHz                      ( +-  0.29% )  (83.14%)
         106,937,862      stalled-cycles-frontend:u #    4.51% frontend cycles idle     ( +-  4.89% )  (83.20%)
         581,697,915      stalled-cycles-backend:u  #   24.52% backend cycles idle      ( +-  0.71% )  (83.47%)
       3,659,692,199      instructions:u            #    1.54  insn per cycle
                                                    #    0.16  stalled cycles per insn  ( +-  0.10% )  (83.63%)
         791,372,961      branches:u                #  332.518 M/sec                    ( +-  0.27% )  (83.39%)
          10,648,083      branch-misses:u           #    1.35% of all branches          ( +-  0.22% )  (83.16%)

             1.61570 +- 0.00172 seconds time elapsed  ( +-  0.11% )

  $

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Original-patch-by: Stephane Eranian <eranian@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20201012070214.2074921-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 11:01:37 -03:00
Namhyung Kim 336c95b297 perf inject: Enter namespace when reading build-id
It should be in a proper mnt namespace when accessing the file.

I think this had no problem since the build-id was actually read from
map__load() -> dso__load() already.  But I'd like to change it in the
following commit.

Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20201012070214.2074921-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 10:59:42 -03:00
Namhyung Kim 2946ecedd0 perf inject: Add missing callbacks in perf_tool
I found some events (like PERF_RECORD_CGROUP) are not copied by perf
inject due to the missing callbacks.  Let's add them.

While at it, I've changed the order of the callbacks to match with
struct perf_tool so that we can compare them easily.

Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20201012070214.2074921-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 10:59:42 -03:00
Namhyung Kim 0bf02a0d80 perf bench: Add build-id injection benchmark
Sometimes I can see that 'perf record' piped with 'perf inject' take a
long time processing build-ids.

So introduce a inject-build-id benchmark to the internals benchmark
suite to measure its overhead regularly.

It runs the 'perf inject' command internally and feeds the given number
of synthesized events (MMAP2 + SAMPLE basically).

  Usage: perf bench internals inject-build-id <options>

    -i, --iterations <n>  Number of iterations used to compute average (default: 100)
    -m, --nr-mmaps <n>    Number of mmap events for each iteration (default: 100)
    -n, --nr-samples <n>  Number of sample events per mmap event (default: 100)
    -v, --verbose         be more verbose (show iteration count, DSO name, etc)

By default, it measures average processing time of 100 MMAP2 events
and 10000 SAMPLE events.  Below is a result on my laptop.

  $ perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 25.789 msec (+- 0.202 msec)
    Average time per event: 2.528 usec (+- 0.020 usec)
    Average memory usage: 8411 KB (+- 7 KB)

Committer testing:

  $ perf bench
  Usage:
  	perf bench [<common options>] <collection> <benchmark> [<options>]

          # List of all available benchmark collections:

           sched: Scheduler and IPC benchmarks
         syscall: System call benchmarks
             mem: Memory access benchmarks
            numa: NUMA scheduling and MM benchmarks
           futex: Futex stressing benchmarks
           epoll: Epoll stressing benchmarks
       internals: Perf-internals benchmarks
             all: All benchmarks

  $ perf bench internals

          # List of available benchmarks for collection 'internals':

      synthesize: Benchmark perf event synthesis
  kallsyms-parse: Benchmark kallsyms parsing
  inject-build-id: Benchmark build-id injection

  $ perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.202 msec (+- 0.059 msec)
    Average time per event: 1.392 usec (+- 0.006 usec)
    Average memory usage: 12650 KB (+- 10 KB)
    Average build-id-all injection took: 12.831 msec (+- 0.071 msec)
    Average time per event: 1.258 usec (+- 0.007 usec)
    Average memory usage: 11895 KB (+- 10 KB)
  $

  $ perf stat -r5 perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.380 msec (+- 0.056 msec)
    Average time per event: 1.410 usec (+- 0.006 usec)
    Average memory usage: 12608 KB (+- 11 KB)
    Average build-id-all injection took: 11.889 msec (+- 0.064 msec)
    Average time per event: 1.166 usec (+- 0.006 usec)
    Average memory usage: 11838 KB (+- 10 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.246 msec (+- 0.065 msec)
    Average time per event: 1.397 usec (+- 0.006 usec)
    Average memory usage: 12744 KB (+- 10 KB)
    Average build-id-all injection took: 12.019 msec (+- 0.066 msec)
    Average time per event: 1.178 usec (+- 0.006 usec)
    Average memory usage: 11963 KB (+- 10 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.321 msec (+- 0.067 msec)
    Average time per event: 1.404 usec (+- 0.007 usec)
    Average memory usage: 12690 KB (+- 10 KB)
    Average build-id-all injection took: 11.909 msec (+- 0.041 msec)
    Average time per event: 1.168 usec (+- 0.004 usec)
    Average memory usage: 11938 KB (+- 10 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.287 msec (+- 0.059 msec)
    Average time per event: 1.401 usec (+- 0.006 usec)
    Average memory usage: 12864 KB (+- 10 KB)
    Average build-id-all injection took: 11.862 msec (+- 0.058 msec)
    Average time per event: 1.163 usec (+- 0.006 usec)
    Average memory usage: 12103 KB (+- 10 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.402 msec (+- 0.053 msec)
    Average time per event: 1.412 usec (+- 0.005 usec)
    Average memory usage: 12876 KB (+- 10 KB)
    Average build-id-all injection took: 11.826 msec (+- 0.061 msec)
    Average time per event: 1.159 usec (+- 0.006 usec)
    Average memory usage: 12111 KB (+- 10 KB)

   Performance counter stats for 'perf bench internals inject-build-id' (5 runs):

            4,267.48 msec task-clock:u              #    1.502 CPUs utilized            ( +-  0.14% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             102,092      page-faults:u             #    0.024 M/sec                    ( +-  0.08% )
       3,894,589,578      cycles:u                  #    0.913 GHz                      ( +-  0.19% )  (83.49%)
         140,078,421      stalled-cycles-frontend:u #    3.60% frontend cycles idle     ( +-  0.77% )  (83.34%)
         948,581,189      stalled-cycles-backend:u  #   24.36% backend cycles idle      ( +-  0.46% )  (83.25%)
       5,835,587,719      instructions:u            #    1.50  insn per cycle
                                                    #    0.16  stalled cycles per insn  ( +-  0.21% )  (83.24%)
       1,267,423,636      branches:u                #  296.996 M/sec                    ( +-  0.22% )  (83.12%)
          17,484,290      branch-misses:u           #    1.38% of all branches          ( +-  0.12% )  (83.55%)

             2.84176 +- 0.00222 seconds time elapsed  ( +-  0.08% )

  $

Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20201012070214.2074921-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 10:59:42 -03:00
Vasily Gorbik ab0a40ea88 perf build: Allow nested externs to enable BUILD_BUG() usage
Currently the BUILD_BUG() macro is expanded to the following:

   do {
           extern void __compiletime_assert_0(void)
                   __attribute__((error("BUILD_BUG failed")));
           if (!(!(1)))
                   __compiletime_assert_0();
   } while (0);

If used in a function body this would obviously produce build errors
with -Wnested-externs and -Werror.

To enable BUILD_BUG() usage in tools/arch/x86/lib/insn.c which perf
includes in intel-pt-decoder, build perf without -Wnested-externs.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> # build tested
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/patch-1.thread-251403.git-2514037e9477.your-ad-here.call-01602244460-ext-7088@work.hours
2020-10-13 12:08:32 +02:00
Linus Torvalds 22230cd2c5 Merge branch 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat mount cleanups from Al Viro:
 "The last remnants of mount(2) compat buried by Christoph.

  Buried into NFS, that is.

  Generally I'm less enthusiastic about "let's use in_compat_syscall()
  deep in call chain" kind of approach than Christoph seems to be, but
  in this case it's warranted - that had been an NFS-specific wart,
  hopefully not to be repeated in any other filesystems (read: any new
  filesystem introducing non-text mount options will get NAKed even if
  it doesn't mess the layout up).

  IOW, not worth trying to grow an infrastructure that would avoid that
  use of in_compat_syscall()..."

* 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: remove compat_sys_mount
  fs,nfs: lift compat nfs4 mount data handling into the nfs code
  nfs: simplify nfs4_parse_monolithic
2020-10-12 16:44:57 -07:00
Linus Torvalds 85ed13e78d Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat iovec cleanups from Al Viro:
 "Christoph's series around import_iovec() and compat variant thereof"

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  security/keys: remove compat_keyctl_instantiate_key_iov
  mm: remove compat_process_vm_{readv,writev}
  fs: remove compat_sys_vmsplice
  fs: remove the compat readv/writev syscalls
  fs: remove various compat readv/writev helpers
  iov_iter: transparently handle compat iovecs in import_iovec
  iov_iter: refactor rw_copy_check_uvector and import_iovec
  iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c
  compat.h: fix a spelling error in <linux/compat.h>
2020-10-12 16:35:51 -07:00
Linus Torvalds ca1b66922a * Extend the recovery from MCE in kernel space also to processes which
encounter an MCE in kernel space but while copying from user memory by
 sending them a SIGBUS on return to user space and umapping the faulty
 memory, by Tony Luck and Youquan Song.
 
 * memcpy_mcsafe() rework by splitting the functionality into
 copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables
 support for new hardware which can recover from a machine check
 encountered during a fast string copy and makes that the default and
 lets the older hardware which does not support that advance recovery,
 opt in to use the old, fragile, slow variant, by Dan Williams.
 
 * New AMD hw enablement, by Yazen Ghannam and Akshay Gupta.
 
 * Do not use MSR-tracing accessors in #MC context and flag any fault
 while accessing MCA architectural MSRs as an architectural violation
 with the hope that such hw/fw misdesigns are caught early during the hw
 eval phase and they don't make it into production.
 
 * Misc fixes, improvements and cleanups, as always.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl+EIpUACgkQEsHwGGHe
 VUouoBAAgwb+NkWZtIqGImV4f+LOyFjhTR/r/7ZyiijXdbhOIuAdc/jQM31mQxug
 sX2jxaRYnf1n6SLA0ggX99gwr2deRQ/hsNf5Abw55GC+Z1dOxpGL0k59A3ELl1IR
 H9KYmCAFQIHvzfk38qcdND73XHcgthQoXFBOG9wAPAdgDWnaiWt6lcLAq8OiJTmp
 D8pInAYhcnL8YXwMGyQQ1KkFn9HwydoWDsK5Ff2shaw2/+dMQqd1zetenbVtjhLb
 iNYGvV7Bi/RQ8PyMbzmtTWa4kwQJAHC2gptkGxty//2ADGVBbqUQdqF9TjIWCNy5
 V6Ldv5zo0/1s7DOzji3htzqkSs/K1Ea6d2LtZjejkJipHKV5x068UC6Fu+PlfS2D
 VZfcICeapU4G2F3Zvks2DlZ7dVTbHCvoI78Qi7bBgczPUVmk6iqah4xuQaiHyBJc
 kTFDA4Nnf/026GpoWRiFry9vqdnHBZyLet5A6Y+SoWF0FbhYnCVPpq4MnussYoav
 lUIi9ZZav6X2RZp9DDM1f9d5xubtKq0DKt93wvzqAhjK0T2DikckJ+riOYkI6N8t
 fHCBNUkdfgyMzJUTBPAzYQ7RmjbjKWJi7xWP0oz6+GqOJkQfSTVC5/2yEffbb3ya
 whYRS6iklbl7yshzaOeecXsZcAeK2oGPfoHg34WkHFgXdF5mNgA=
 =u1Wg
 -----END PGP SIGNATURE-----

Merge tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS updates from Borislav Petkov:

 - Extend the recovery from MCE in kernel space also to processes which
   encounter an MCE in kernel space but while copying from user memory
   by sending them a SIGBUS on return to user space and umapping the
   faulty memory, by Tony Luck and Youquan Song.

 - memcpy_mcsafe() rework by splitting the functionality into
   copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables
   support for new hardware which can recover from a machine check
   encountered during a fast string copy and makes that the default and
   lets the older hardware which does not support that advance recovery,
   opt in to use the old, fragile, slow variant, by Dan Williams.

 - New AMD hw enablement, by Yazen Ghannam and Akshay Gupta.

 - Do not use MSR-tracing accessors in #MC context and flag any fault
   while accessing MCA architectural MSRs as an architectural violation
   with the hope that such hw/fw misdesigns are caught early during the
   hw eval phase and they don't make it into production.

 - Misc fixes, improvements and cleanups, as always.

* tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Allow for copy_mc_fragile symbol checksum to be generated
  x86/mce: Decode a kernel instruction to determine if it is copying from user
  x86/mce: Recover from poison found while copying from user space
  x86/mce: Avoid tail copy when machine check terminated a copy from user
  x86/mce: Add _ASM_EXTABLE_CPY for copy user access
  x86/mce: Provide method to find out the type of an exception handler
  x86/mce: Pass pointer to saved pt_regs to severity calculation routines
  x86/copy_mc: Introduce copy_mc_enhanced_fast_string()
  x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()
  x86/mce: Drop AMD-specific "DEFERRED" case from Intel severity rule list
  x86/mce: Add Skylake quirk for patrol scrub reported errors
  RAS/CEC: Convert to DEFINE_SHOW_ATTRIBUTE()
  x86/mce: Annotate mce_rd/wrmsrl() with noinstr
  x86/mce/dev-mcelog: Do not update kflags on AMD systems
  x86/mce: Stop mce_reign() from re-computing severity for every CPU
  x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR
  x86/mce: Increase maximum number of banks to 64
  x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()
  x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
  RAS/CEC: Fix cec_init() prototype
2020-10-12 10:14:38 -07:00
Dan Williams ec6347bb43 x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()
In reaction to a proposal to introduce a memcpy_mcsafe_fast()
implementation Linus points out that memcpy_mcsafe() is poorly named
relative to communicating the scope of the interface. Specifically what
addresses are valid to pass as source, destination, and what faults /
exceptions are handled.

Of particular concern is that even though x86 might be able to handle
the semantics of copy_mc_to_user() with its common copy_user_generic()
implementation other archs likely need / want an explicit path for this
case:

  On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
  >
  > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote:
  > >
  > > However now I see that copy_user_generic() works for the wrong reason.
  > > It works because the exception on the source address due to poison
  > > looks no different than a write fault on the user address to the
  > > caller, it's still just a short copy. So it makes copy_to_user() work
  > > for the wrong reason relative to the name.
  >
  > Right.
  >
  > And it won't work that way on other architectures. On x86, we have a
  > generic function that can take faults on either side, and we use it
  > for both cases (and for the "in_user" case too), but that's an
  > artifact of the architecture oddity.
  >
  > In fact, it's probably wrong even on x86 - because it can hide bugs -
  > but writing those things is painful enough that everybody prefers
  > having just one function.

Replace a single top-level memcpy_mcsafe() with either
copy_mc_to_user(), or copy_mc_to_kernel().

Introduce an x86 copy_mc_fragile() name as the rename for the
low-level x86 implementation formerly named memcpy_mcsafe(). It is used
as the slow / careful backend that is supplanted by a fast
copy_mc_generic() in a follow-on patch.

One side-effect of this reorganization is that separating copy_mc_64.S
to its own file means that perf no longer needs to track dependencies
for its memcpy_64.S benchmarks.

 [ bp: Massage a bit. ]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: <stable@vger.kernel.org>
Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
2020-10-06 11:18:04 +02:00
Christoph Hellwig c3973b401e mm: remove compat_process_vm_{readv,writev}
Now that import_iovec handles compat iovecs, the native syscalls
can be used for the compat case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:15 -04:00
Christoph Hellwig 598b3cec83 fs: remove compat_sys_vmsplice
Now that import_iovec handles compat iovecs, the native vmsplice syscall
can be used for the compat case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:15 -04:00
Christoph Hellwig 5f764d624a fs: remove the compat readv/writev syscalls
Now that import_iovec handles compat iovecs, the native readv and writev
syscalls can be used for the compat case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:14 -04:00
Jiri Olsa 6fcd5ddc3b perf python scripting: Fix printable strings in python3 scripts
Hagen reported broken strings in python3 tracepoint scripts:

  make PYTHON=python3
  perf record -e sched:sched_switch -a -- sleep 5
  perf script --gen-script py
  perf script -s ./perf-script.py

  [..]
  sched__sched_switch      7 563231.759525792        0 swapper   prev_comm=bytearray(b'swapper/7\x00\x00\x00\x00\x00\x00\x00'), prev_pid=0, prev_prio=120, prev_state=, next_comm=bytearray(b'mutex-thread-co\x00'),

The problem is in the is_printable_array function that does not take the
zero byte into account and claim such string as not printable, so the
code will create byte array instead of string.

Committer testing:

After this fix:

sched__sched_switch 3 484522.497072626  1158680 kworker/3:0-eve  prev_comm=kworker/3:0, prev_pid=1158680, prev_prio=120, prev_state=I, next_comm=swapper/3, next_pid=0, next_prio=120
Sample: {addr=0, cpu=3, datasrc=84410401, datasrc_decode=N/A|SNP N/A|TLB N/A|LCK N/A, ip=18446744071841817196, period=1, phys_addr=0, pid=1158680, tid=1158680, time=484522497072626, transaction=0, values=[(0, 0)], weight=0}

sched__sched_switch 4 484522.497085610  1225814 perf             prev_comm=perf, prev_pid=1225814, prev_prio=120, prev_state=, next_comm=migration/4, next_pid=30, next_prio=0
Sample: {addr=0, cpu=4, datasrc=84410401, datasrc_decode=N/A|SNP N/A|TLB N/A|LCK N/A, ip=18446744071841817196, period=1, phys_addr=0, pid=1225814, tid=1225814, time=484522497085610, transaction=0, values=[(0, 0)], weight=0}

Fixes: 249de6e074 ("perf script python: Fix string vs byte array resolving")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200928201135.3633850-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-01 12:10:56 -03:00
Arnaldo Carvalho de Melo 388968d864 perf trace: Use the autogenerated mmap 'prot' string/id table
No change in behaviour:

  # perf trace -e mmap sleep 1
       0.000 ( 0.009 ms): sleep/751870 mmap(len: 143317, prot: READ, flags: PRIVATE, fd: 3)                  = 0x7fa96d0f7000
       0.028 ( 0.004 ms): sleep/751870 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS)           = 0x7fa96d0f5000
       0.037 ( 0.005 ms): sleep/751870 mmap(len: 1872744, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3)       = 0x7fa96cf2b000
       0.044 ( 0.011 ms): sleep/751870 mmap(addr: 0x7fa96cf50000, len: 1376256, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x25000) = 0x7fa96cf50000
       0.056 ( 0.007 ms): sleep/751870 mmap(addr: 0x7fa96d0a0000, len: 307200, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x175000) = 0x7fa96d0a0000
       0.064 ( 0.007 ms): sleep/751870 mmap(addr: 0x7fa96d0eb000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bf000) = 0x7fa96d0eb000
       0.075 ( 0.005 ms): sleep/751870 mmap(addr: 0x7fa96d0f1000, len: 13160, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7fa96d0f1000
       0.253 ( 0.005 ms): sleep/751870 mmap(len: 218049136, prot: READ, flags: PRIVATE, fd: 3)               = 0x7fa95ff38000
  #
  #
  # set -o vi
  # strace -e mmap sleep 1
  mmap(NULL, 143317, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f333bd83000
  mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f333bd81000
  mmap(NULL, 1872744, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f333bbb7000
  mmap(0x7f333bbdc000, 1376256, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7f333bbdc000
  mmap(0x7f333bd2c000, 307200, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x175000) = 0x7f333bd2c000
  mmap(0x7f333bd77000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bf000) = 0x7f333bd77000
  mmap(0x7f333bd7d000, 13160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f333bd7d000
  mmap(NULL, 218049136, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f332ebc4000
  +++ exited with 0 +++
  #

And you can as well tweak 'perf trace's output to more closely match
strace's:

  # perf config trace.show_arg_names=no
  # perf config trace.show_duration=no
  # perf config trace.show_prefix=yes
  # perf config trace.show_timestamp=no
  # perf config trace.show_zeros=yes
  # perf config trace.no_inherit=yes
  # perf trace -e mmap sleep 1
  mmap(NULL, 143317, PROT_READ, MAP_PRIVATE, 3, 0)                      = 0x7f0d287ca000
  mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS)     = 0x7f0d287c8000
  mmap(NULL, 1872744, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0)       = 0x7f0d285fe000
  mmap(0x7f0d28623000, 1376256, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7f0d28623000
  mmap(0x7f0d28773000, 307200, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x175000) = 0x7f0d28773000
  mmap(0x7f0d287be000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bf000) = 0x7f0d287be000
  mmap(0x7f0d287c4000, 13160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS) = 0x7f0d287c4000
  mmap(NULL, 218049136, PROT_READ, MAP_PRIVATE, 3, 0)                   = 0x7f0d1b60b000
  #

  # perf config | grep ^trace
  trace.show_arg_names=no
  trace.show_duration=no
  trace.show_prefix=yes
  trace.show_timestamp=no
  trace.show_zeros=yes
  trace.no_inherit=yes
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-01 11:35:01 -03:00
Arnaldo Carvalho de Melo 08fc476214 tools beauty: Add script to generate table of mmap's 'prot' argument
Will be wired up in the following csets:

  $ tools/perf/trace/beauty/mmap_prot.sh
  static const char *mmap_prot[] = {
  	[ilog2(0x1) + 1] = "READ",
  #ifndef PROT_READ
  #define PROT_READ 0x1
  #endif
  	[ilog2(0x2) + 1] = "WRITE",
  #ifndef PROT_WRITE
  #define PROT_WRITE 0x2
  #endif
  	[ilog2(0x4) + 1] = "EXEC",
  #ifndef PROT_EXEC
  #define PROT_EXEC 0x4
  #endif
  	[ilog2(0x8) + 1] = "SEM",
  #ifndef PROT_SEM
  #define PROT_SEM 0x8
  #endif
  	[ilog2(0x01000000) + 1] = "GROWSDOWN",
  #ifndef PROT_GROWSDOWN
  #define PROT_GROWSDOWN 0x01000000
  #endif
  	[ilog2(0x02000000) + 1] = "GROWSUP",
  #ifndef PROT_GROWSUP
  #define PROT_GROWSUP 0x02000000
  #endif
  };
  $
  $
  $
  $ tools/perf/trace/beauty/mmap_prot.sh alpha
  static const char *mmap_prot[] = {
  	[ilog2(0x4) + 1] = "EXEC",
  #ifndef PROT_EXEC
  #define PROT_EXEC 0x4
  #endif
  	[ilog2(0x01000000) + 1] = "GROWSDOWN",
  #ifndef PROT_GROWSDOWN
  #define PROT_GROWSDOWN 0x01000000
  #endif
  	[ilog2(0x02000000) + 1] = "GROWSUP",
  #ifndef PROT_GROWSUP
  #define PROT_GROWSUP 0x02000000
  #endif
  	[ilog2(0x1) + 1] = "READ",
  #ifndef PROT_READ
  #define PROT_READ 0x1
  #endif
  	[ilog2(0x8) + 1] = "SEM",
  #ifndef PROT_SEM
  #define PROT_SEM 0x8
  #endif
  	[ilog2(0x2) + 1] = "WRITE",
  #ifndef PROT_WRITE
  #define PROT_WRITE 0x2
  #endif
  };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-01 11:14:22 -03:00
Arnaldo Carvalho de Melo 61693228b6 perf beauty mmap_flags: Conditionaly define the mmap flags
So that in older systems we get it in the mmap flags scnprintf routines:

  $ tools/perf/trace/beauty/mmap_flags.sh  | head -9 2> /dev/null
  static const char *mmap_flags[] = {
  	[ilog2(0x40) + 1] = "32BIT",
  #ifndef MAP_32BIT
  #define MAP_32BIT 0x40
  #endif
  	[ilog2(0x01) + 1] = "SHARED",
  #ifndef MAP_SHARED
  #define MAP_SHARED 0x01
  #endif
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-30 09:34:20 -03:00
Arnaldo Carvalho de Melo 9012e3dda2 perf trace beauty: Add script to autogenerate mremap's flags args string/id table
It'll also conditionally generate the defines, so that if we don't have
those when building a new tool tarball in an older systems, we get
those, and we need them sometimes in the actual scnprintf routine, such
as when checking if a flags means we have an extra arg, like with
MREMAP_FIXED.

  $ tools/perf/trace/beauty/mremap_flags.sh
  static const char *mremap_flags[] = {
  	[ilog2(1) + 1] = "MAYMOVE",
  #ifndef MREMAP_MAYMOVE
  #define MREMAP_MAYMOVE 1
  #endif
  	[ilog2(2) + 1] = "FIXED",
  #ifndef MREMAP_FIXED
  #define MREMAP_FIXED 2
  #endif
  	[ilog2(4) + 1] = "DONTUNMAP",
  #ifndef MREMAP_DONTUNMAP
  #define MREMAP_DONTUNMAP 4
  #endif
  };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-29 18:07:27 -03:00
Arnaldo Carvalho de Melo d758d5d474 perf tools: Separate the checking of headers only used to build beautification tables
Some headers are not used in building the tools directly, but instead to
generate tables that then gets source code included to do id->string and
string->id lookups for things like syscall flags and commands.

We were adding it directly to tools/include/ and this sometimes gets in
the way of building using system headers, lets untangle this a bit.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-29 08:56:38 -03:00
Ian Rogers a55b7bb1c1 perf test: Fix msan uninitialized use.
Ensure 'st' is initialized before an error branch is taken.
Fixes test "67: Parse and process metrics" with LLVM msan:

  ==6757==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x5570edae947d in rblist__exit tools/perf/util/rblist.c:114:2
    #1 0x5570edb1c6e8 in runtime_stat__exit tools/perf/util/stat-shadow.c:141:2
    #2 0x5570ed92cfae in __compute_metric tools/perf/tests/parse-metric.c:187:2
    #3 0x5570ed92cb74 in compute_metric tools/perf/tests/parse-metric.c:196:9
    #4 0x5570ed92c6d8 in test_recursion_fail tools/perf/tests/parse-metric.c:318:2
    #5 0x5570ed92b8c8 in test__parse_metric tools/perf/tests/parse-metric.c:356:2
    #6 0x5570ed8de8c1 in run_test tools/perf/tests/builtin-test.c:410:9
    #7 0x5570ed8ddadf in test_and_print tools/perf/tests/builtin-test.c:440:9
    #8 0x5570ed8dca04 in __cmd_test tools/perf/tests/builtin-test.c:661:4
    #9 0x5570ed8dbc07 in cmd_test tools/perf/tests/builtin-test.c:807:9
    #10 0x5570ed7326cc in run_builtin tools/perf/perf.c:313:11
    #11 0x5570ed731639 in handle_internal_command tools/perf/perf.c:365:8
    #12 0x5570ed7323cd in run_argv tools/perf/perf.c:409:2
    #13 0x5570ed731076 in main tools/perf/perf.c:539:3

Fixes: commit f5a56570a3 ("perf test: Fix memory leaks in parse-metric test")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20200923210655.4143682-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 09:24:01 -03:00
Ian Rogers aa98d8482c perf parse-events: Reduce casts around bp_addr
perf_event_attr bp_addr is a u64. parse-events.y parses it as a u64, but
casts it to a void* and then parse-events.c casts it back to a u64.
Rather than all the casts, change the type of the address to be a u64.

This removes an issue noted in:

  https://lore.kernel.org/lkml/20200903184359.GC3495158@kernel.org/

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200925003903.561568-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 09:22:39 -03:00
Namhyung Kim 40b74c30ff perf test: Add expand cgroup event test
It'll expand given events for cgroups A, B and C.

  $ perf test -v expansion
  69: Event expansion for cgroups                      :
  --- start ---
  test child forked, pid 983140
  metric expr 1 / IPC for CPI
  metric expr instructions / cycles for IPC
  found event instructions
  found event cycles
  adding {instructions,cycles}:W
  copying metric event for cgroup 'A': instructions (idx=0)
  copying metric event for cgroup 'B': instructions (idx=0)
  copying metric event for cgroup 'C': instructions (idx=0)
  test child finished with 0
  ---- end ----
  Event expansion for cgroups: Ok

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200924124455.336326-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 09:21:05 -03:00
Namhyung Kim 89fb1ca2ab perf tools: Allow creation of cgroup without open
This is a preparation for a test case of expanding events for multiple
cgroups.  Instead of using real system cgroup, the test will use fake
cgroups so it needs a way to have them without a open file descriptor.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200924124455.336326-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 09:18:06 -03:00
Namhyung Kim b214ba8c42 perf tools: Copy metric events properly when expand cgroups
The metricgroup__copy_metric_events() is to handle metrics events when
expanding event for cgroups.  As the metric events keep pointers to
evsel, it should be refreshed when events are cloned during the
operation.

The perf_stat__collect_metric_expr() is also called in case an event has
a metric directly.

During the copy, it references evsel by index as the evlist now has
cloned evsels for the given cgroup.

Also kernel test robot found an issue in the python module import so add
empty implementations of those two functions to fix it.

Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200924124455.336326-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 09:16:21 -03:00
Namhyung Kim d1c5a0e86a perf stat: Add --for-each-cgroup option
The --for-each-cgroup option is a syntax sugar to monitor large number
of cgroups easily.  Current command line requires to list all the events
and cgroups even if users want to monitor same events for each cgroup.
This patch addresses that usage by copying given events for each cgroup
on user's behalf.

For instance, if they want to monitor 6 events for 200 cgroups each they
should write 1200 event names (with -e) AND 1200 cgroup names (with -G)
on the command line.  But with this change, they can just specify 6
events and 200 cgroups with a new option.

A simpler example below: It wants to measure 3 events for 2 cgroups ('A'
and 'B').  The result is that total 6 events are counted like below.

  $ perf stat -a -e cpu-clock,cycles,instructions --for-each-cgroup A,B sleep 1

   Performance counter stats for 'system wide':

              988.18 msec cpu-clock                 A #    0.987 CPUs utilized
       3,153,761,702      cycles                    A #    3.200 GHz                      (100.00%)
       8,067,769,847      instructions              A #    2.57  insn per cycle           (100.00%)
              982.71 msec cpu-clock                 B #    0.982 CPUs utilized
       3,136,093,298      cycles                    B #    3.182 GHz                      (99.99%)
       8,109,619,327      instructions              B #    2.58  insn per cycle           (99.99%)

         1.001228054 seconds time elapsed

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200924124455.336326-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 09:07:08 -03:00
Namhyung Kim 7fedd9b84b perf evsel: Add evsel__clone() function
The evsel__clone() is to create an exactly same evsel from same
attributes.  The function assumes the given evsel is not configured
yet so it cares fields set during event parsing.  Those fields are now
moved together as Jiri suggested.  Note that metric events will be
handled by later patch.

It will be used by perf stat to generate separate events for each
cgroup.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200924124455.336326-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 08:55:48 -03:00
Jin Yao b5ff7f2799 perf vendor events: Update SkylakeX events to v1.21
- Update SkylakeX events to v1.21.
- Update SkylakeX JSON metrics from TMAM 4.0.

Other fixes:

- Add NO_NMI_WATCHDOG metric constraint to Backend_Bound
- Fix misspelled error

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/20200922031918.3723-1-yao.jin@linux.intel.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 08:46:47 -03:00
Jin Yao 038d3b53c2 perf vendor events intel: Update CascadelakeX events to v1.08
- Update CascadelakeX events to v1.08.
- Update CascadelakeX JSON metrics from TMAM 4.0.

Other fixes:

- Add NO_NMI_WATCHDOG metric constraint to Backend_Bound
- Change 'MB/sec' to 'MB' in UNC_M_PMM_BANDWIDTH.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: https://lore.kernel.org/lkml/20200922031918.3723-1-yao.jin@linux.intel.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 08:46:37 -03:00
David S. Miller 6d772f328d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-09-23

The following pull-request contains BPF updates for your *net-next* tree.

We've added 95 non-merge commits during the last 22 day(s) which contain
a total of 124 files changed, 4211 insertions(+), 2040 deletions(-).

The main changes are:

1) Full multi function support in libbpf, from Andrii.

2) Refactoring of function argument checks, from Lorenz.

3) Make bpf_tail_call compatible with functions (subprograms), from Maciej.

4) Program metadata support, from YiFei.

5) bpf iterator optimizations, from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23 13:11:11 -07:00
Hagen Paul Pfeifer 69f48c7040 perf script: Add min, max to futex-contention output, in addition to avg
Average is quite informative, but the outliners - especially max - are
also of interest.

Before:

  mutex-locker[793299] lock 5637ec61e080 contended 3400 times, 446 avg ns
  mutex-locker[793301] lock 5637ec61e080 contended 3563 times, 385 avg ns
  mutex-locker[793300] lock 5637ec61e080 contended 3110 times, 1855 avg ns

After:

  mutex-locker[795251] lock 55b14e6dd080 contended 3853 times, 1279 avg ns [max: 12270 ns, min 340 ns]
  mutex-locker[795253] lock 55b14e6dd080 contended 2911 times, 518 avg ns [max: 51660261 ns, min 347 ns]
  mutex-locker[795252] lock 55b14e6dd080 contended 3843 times, 385 avg ns [max: 24323998 ns, min 338 ns]

Committer testing:

  [root@five ~]# perf script record futex-contention -a
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.877 MB perf.data (923 samples) ]

  [root@five ~]# perf evlist
  syscalls:sys_enter_futex
  syscalls:sys_exit_futex
  dummy:HG
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  #

Before:

  [root@five ~]# perf script report futex-contention
  JS Helper[2457] lock 55fe0cf82610 contended 4 times, 6657 avg ns
  ibus-daemon[2975] lock 56227f6d0210 contended 4 times, 1020 avg ns
  chromium-browse[1905801] lock 7ffe573f5088 contended 8 times, 108463 avg ns
  gnome-shell[2240] lock 55fe0cf82678 contended 1 times, 8616 avg ns
  gnome-shel:cs0[2292] lock 55fe0d0ab768 contended 3 times, 606016034 avg ns
  JS Helper[2458] lock 55fe0cf82690 contended 1 times, 1167840 avg ns
  chromium-browse[1905470] lock 7ffe573f5358 contended 1 times, 551504 avg ns
  chromium-browse[1905948] lock 7ffe573f5358 contended 1 times, 577422 avg ns
  gnome-shell[2240] lock 55fe0cf82660 contended 6 times, 202696 avg ns
  pool[2602] lock 7fd600008ef0 contended 1 times, 500046007 avg ns
  chromium-browse[1905801] lock 7ffe573f5128 contended 4 times, 285083 avg ns
  JS Helper[2460] lock 55fe0cf82690 contended 1 times, 680877 avg ns
  JS Helper[2459] lock 55fe0cf82610 contended 7 times, 4224 avg ns
  chromium-browse[1905434] lock 7ffe573f5358 contended 1 times, 697038 avg ns
  chromium-browse[212592] lock 7ffe573f53c8 contended 4 times, 460601 avg ns
  gnome-shel:cs0[2292] lock 55fe0d0ab76c contended 2 times, 601237648 avg ns
  JS Helper[2460] lock 55fe0cf82610 contended 4 times, 3340 avg ns
  JS Helper[2462] lock 55fe0cf82694 contended 1 times, 237275 avg ns
  chromium-browse[1905605] lock 7ffe573f5358 contended 2 times, 634555 avg ns
  chromium-browse[1905992] lock 7ffe573f5358 contended 1 times, 583965 avg ns
  chromium-browse[1905647] lock 7ffe573f5368 contended 8 times, 549800 avg ns
  JS Helper[2462] lock 55fe0cf82610 contended 2 times, 4694 avg ns
  JS Helper[2461] lock 55fe0cf82694 contended 1 times, 257793 avg ns
  JS Helper[2456] lock 55fe0cf82690 contended 1 times, 677771 avg ns
  JS Helper[2463] lock 55fe0cf82610 contended 3 times, 5139 avg ns
  gdbus[2980] lock 56227f6d0210 contended 2 times, 2465 avg ns
  gnome-shell[2240] lock 55fe0cf82664 contended 5 times, 8036 avg ns
  chromium-browse[1906308] lock 7ffe573f5358 contended 1 times, 210735 avg ns
  JS Helper[2463] lock 55fe0cf82694 contended 1 times, 251531 avg ns
  chromium-browse[1905801] lock 7ffe573f4f58 contended 4 times, 399927 avg ns
  [root@five ~]#

After:

  [root@five ~]# perf script report futex-contention
  JS Helper[2457] lock 55fe0cf82610 contended 4 times, 6657 avg ns [max: 11502 ns, min 792 ns]
  ibus-daemon[2975] lock 56227f6d0210 contended 4 times, 1020 avg ns [max: 1813 ns, min 581 ns]
  chromium-browse[1905801] lock 7ffe573f5088 contended 8 times, 108463 avg ns [max: 380103 ns, min 57989 ns]
  gnome-shell[2240] lock 55fe0cf82678 contended 1 times, 8616 avg ns [max: 8616 ns, min 8616 ns]
  gnome-shel:cs0[2292] lock 55fe0d0ab768 contended 3 times, 606016034 avg ns [max: 611295960 ns, min 600191357 ns]
  JS Helper[2458] lock 55fe0cf82690 contended 1 times, 1167840 avg ns [max: 1167840 ns, min 1167840 ns]
  chromium-browse[1905470] lock 7ffe573f5358 contended 1 times, 551504 avg ns [max: 551504 ns, min 551504 ns]
  chromium-browse[1905948] lock 7ffe573f5358 contended 1 times, 577422 avg ns [max: 577422 ns, min 577422 ns]
  gnome-shell[2240] lock 55fe0cf82660 contended 6 times, 202696 avg ns [max: 398998 ns, min 5050 ns]
  pool[2602] lock 7fd600008ef0 contended 1 times, 500046007 avg ns [max: 500046007 ns, min 500046007 ns]
  chromium-browse[1905801] lock 7ffe573f5128 contended 4 times, 285083 avg ns [max: 389531 ns, min 76183 ns]
  JS Helper[2460] lock 55fe0cf82690 contended 1 times, 680877 avg ns [max: 680877 ns, min 680877 ns]
  JS Helper[2459] lock 55fe0cf82610 contended 7 times, 4224 avg ns [max: 12724 ns, min 1012 ns]
  chromium-browse[1905434] lock 7ffe573f5358 contended 1 times, 697038 avg ns [max: 697038 ns, min 697038 ns]
  chromium-browse[212592] lock 7ffe573f53c8 contended 4 times, 460601 avg ns [max: 594956 ns, min 232996 ns]
  gnome-shel:cs0[2292] lock 55fe0d0ab76c contended 2 times, 601237648 avg ns [max: 601255863 ns, min 601219434 ns]
  JS Helper[2460] lock 55fe0cf82610 contended 4 times, 3340 avg ns [max: 9168 ns, min 962 ns]
  JS Helper[2462] lock 55fe0cf82694 contended 1 times, 237275 avg ns [max: 237275 ns, min 237275 ns]
  chromium-browse[1905605] lock 7ffe573f5358 contended 2 times, 634555 avg ns [max: 1024060 ns, min 245050 ns]
  chromium-browse[1905992] lock 7ffe573f5358 contended 1 times, 583965 avg ns [max: 583965 ns, min 583965 ns]
  chromium-browse[1905647] lock 7ffe573f5368 contended 8 times, 549800 avg ns [max: 775293 ns, min 258375 ns]
  JS Helper[2462] lock 55fe0cf82610 contended 2 times, 4694 avg ns [max: 8556 ns, min 832 ns]
  JS Helper[2461] lock 55fe0cf82694 contended 1 times, 257793 avg ns [max: 257793 ns, min 257793 ns]
  JS Helper[2456] lock 55fe0cf82690 contended 1 times, 677771 avg ns [max: 677771 ns, min 677771 ns]
  JS Helper[2463] lock 55fe0cf82610 contended 3 times, 5139 avg ns [max: 6873 ns, min 931 ns]
  gdbus[2980] lock 56227f6d0210 contended 2 times, 2465 avg ns [max: 4188 ns, min 742 ns]
  gnome-shell[2240] lock 55fe0cf82664 contended 5 times, 8036 avg ns [max: 13105 ns, min 401 ns]
  chromium-browse[1906308] lock 7ffe573f5358 contended 1 times, 210735 avg ns [max: 210735 ns, min 210735 ns]
  JS Helper[2463] lock 55fe0cf82694 contended 1 times, 251531 avg ns [max: 251531 ns, min 251531 ns]
  chromium-browse[1905801] lock 7ffe573f4f58 contended 4 times, 399927 avg ns [max: 476904 ns, min 178495 ns]
  [root@five ~]#

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lore.kernel.org/lkml/20200922200922.1306034-1-hagen@jauu.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-23 12:58:53 -03:00
Hagen Paul Pfeifer 2a684fcb60 perf script: Autopep8 futex-contention
10 years leaves its mark! Python has evolved and so has its style guide.
Even with vim it is getting hard to follow the no longer valid
guidelines (spaces vs. tabs).

Autopep8 this code to modernize it!

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Link: http://lore.kernel.org/lkml/20200921201928.799498-1-hagen@jauu.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-23 12:58:53 -03:00
Jin Yao 002a3d690f perf stat: Skip duration_time in setup_system_wide
Some metrics (such as DRAM_BW_Use) consists of uncore events and
duration_time. For uncore events, counter->core.system_wide is true. But
for duration_time, counter->core.system_wide is false so
target.system_wide is set to false.

Then 'enable_on_exec' is set in perf_event_attr of uncore event.  Kernel
will return error when trying to open the uncore event.

This patch skips the duration_time in setup_system_wide then
target.system_wide will be set to true for the evlist of uncore events +
duration_time.

Before (tested on skylake desktop):

  # perf stat -M DRAM_BW_Use -- sleep 1
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (arb/event=0x84,umask=0x1/).
  /bin/dmesg | grep -i perf may provide additional information.

After:

  # perf stat -M DRAM_BW_Use -- sleep 1

   Performance counter stats for 'system wide':

                169      arb/event=0x84,umask=0x1/ #     0.00 DRAM_BW_Use
             40,427      arb/event=0x81,umask=0x1/
      1,000,902,197 ns   duration_time

        1.000902197 seconds time elapsed

Fixes: e3ba76deef ("perf tools: Force uncore events to system wide monitoring")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200922015004.30114-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-23 12:58:53 -03:00
Christoph Hellwig 028abd9222 fs: remove compat_sys_mount
compat_sys_mount is identical to the regular sys_mount now, so remove it
and use the native version everywhere.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-22 23:45:57 -04:00
David S. Miller 3ab0a7a0c3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Two minor conflicts:

1) net/ipv4/route.c, adding a new local variable while
   moving another local variable and removing it's
   initial assignment.

2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes.
   One pretty prints the port mode differently, whilst another
   changes the driver to try and obtain the port mode from
   the port node rather than the switch node.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-22 16:45:34 -07:00
Leo Yan d110162caf perf tsc: Support cap_user_time_short for event TIME_CONV
The synthesized event TIME_CONV doesn't contain the complete parameters
for counters, this will lead to wrong conversion between counter cycles
and timestamp.

This patch extends event TIME_CONV to record flags 'cap_user_time_zero'
which is used to indicate the counter parameters are valid or not, if
not will directly return 0 for timestamp calculation.  And record the
flag 'cap_user_time_short' and its relevant fields 'time_cycles' and
'time_mask' for cycle calibration.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Gasson <nick.gasson@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steve Maclean <steve.maclean@microsoft.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zou Wei <zou_wei@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200914115311.2201-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-22 13:46:40 -03:00
Leo Yan 78a93d4cec perf tsc: Calculate timestamp with cap_user_time_short
The perf mmap'ed buffer contains the flag 'cap_user_time_short' and two
extra fields 'time_cycles' and 'time_mask', perf tool needs to know them
for handling the counter wrapping case.

This patch is to reads out the relevant parameters from the head of the
first mmap'ed page and stores into the structure 'perf_tsc_conversion',
if the flag 'cap_user_time_short' has been set, it will firstly
calibrate cycle value for timestamp calculation.

Committer testing:

Before/after:

  # perf test tsc
  70: Convert perf time to TSC                                        : Ok
  #
  # perf test -v tsc
  70: Convert perf time to TSC                                        :
  --- start ---
  test child forked, pid 11059
  mmap size 528384B
  1st event perf time 996384576521 tsc 3850532906613
  rdtsc          time 996384578455 tsc 3850532913950
  2nd event perf time 996384578845 tsc 3850532915428
  test child finished with 0
  ---- end ----
  Convert perf time to TSC: Ok
  #

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Gasson <nick.gasson@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steve Maclean <steve.maclean@microsoft.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zou Wei <zou_wei@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200914115311.2201-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-22 13:45:21 -03:00
Leo Yan 4979e86141 perf tsc: Add rdtsc() for Arm64
The system register CNTVCT_EL0 can be used to retrieve the counter from
user space.  Add rdtsc() for Arm64.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Gasson <nick.gasson@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steve Maclean <steve.maclean@microsoft.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zou Wei <zou_wei@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200914115311.2201-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-22 13:44:16 -03:00
Leo Yan 03fca3af51 perf tsc: Move out common functions from x86
Functions perf_read_tsc_conversion() and perf_event__synth_time_conv()
should work as common functions rather than x86 specific, so move these
two functions out from arch/x86 folder and place them into util/tsc.c.

Since the function perf_event__synth_time_conv() will be linked in
util/tsc.c, remove its weak version.

Committer testing:

Before/after:

  # perf test tsc
  70: Convert perf time to TSC                                        : Ok
  #
  # perf test -v tsc
  70: Convert perf time to TSC                                        :
  --- start ---
  test child forked, pid 8520
  mmap size 528384B
  1st event perf time 592110439891 tsc 2317172044331
  rdtsc          time 592110441915 tsc 2317172052010
  2nd event perf time 592110442336 tsc 2317172053605
  test child finished with 0
  ---- end ----
  Convert perf time to TSC: Ok
  #

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Gasson <nick.gasson@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steve Maclean <steve.maclean@microsoft.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zou Wei <zou_wei@huawei.com>
Link: http://lore.kernel.org/lkml/20200914115311.2201-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-22 13:38:33 -03:00
Masami Hiramatsu 7cd5738d0d perf probe: Fall back to debuginfod query if debuginfo and source not found locally
Since 'perf probe' heavily depends on debuginfo, debuginfod gives us
many benefits on the 'perf probe' command on remote machine.

Especially, this will be helpful for the embedded devices which will not
have enough storage, or boot with a cross-build kernel whose source code
is in the host machine.

This will work as similar to commit c7a14fdcb3 ("perf build-ids:
Fall back to debuginfod query if debuginfo not found")

Tested with:

  (host) $ cd PATH/TO/KBUILD/DIR/
  (host) $ debuginfod -F .
  ...

  (remote) # perf probe -L vfs_read
  Failed to find the path for the kernel: No such file or directory
    Error: Failed to show lines.

  (remote) # export DEBUGINFOD_URLS="http://$HOST_IP:8002/"
  (remote) # perf probe -L vfs_read
  <vfs_read@...>
        0  ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
           {
        2         ssize_t ret;

                  if (!(file->f_mode & FMODE_READ))
                          return -EBADF;
        6         if (!(file->f_mode & FMODE_CAN_READ))
                          return -EINVAL;
        8         if (unlikely(!access_ok(buf, count)))
                          return -EFAULT;

       11         ret = rw_verify_area(READ, file, pos, count);
       12         if (ret)
                          return ret;
                  if (count > MAX_RW_COUNT)
  ...

  (remote) # perf probe -a "vfs_read count"
  Added new event:
    probe:vfs_read       (on vfs_read with count)

  (remote) # perf probe -l
    probe:vfs_read       (on vfs_read@ksrc/linux/fs/read_write.c with count)

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Frank Ch. Eigler <fche@redhat.com>
Cc: Aaron Merey <amerey@redhat.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Link: http://lore.kernel.org/lkml/160041610083.912668.13659563860278615846.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-18 09:20:47 -03:00
Masami Hiramatsu ac7a75d1fb perf probe: Fix to adjust symbol address with correct reloc_sym address
'perf probe' uses ref_reloc_sym to adjust symbol offset address from
debuginfo address or ref_reloc_sym based address, but that is misusing
reloc_sym->addr and reloc_sym->unrelocated_addr.  If map is not
relocated (map->reloc == 0), we can use reloc_sym->addr as unrelocated
address instead of reloc_sym->unrelocated_addr.

This usually does not happen. If we have a non-stripped ELF binary, we
will use it for map and debuginfo, if not, we use only kallsyms without
debuginfo. Thus, the map is always relocated (ELF and DWARF binary) or
not relocated (kallsyms).

However, if we allow the combination of debuginfo and kallsyms based map
(like using debuginfod), we have to check the map->reloc and choose the
collect address of reloc_sym.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Frank Ch. Eigler <fche@redhat.com>
Cc: Aaron Merey <amerey@redhat.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Link: http://lore.kernel.org/lkml/160041609047.912668.14314639291419159274.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-18 09:19:03 -03:00
Ian Rogers dcc81be0fc perf metricgroup: Fix uncore metric expressions
A metric like DRAM_BW_Use has on SkylakeX events uncore_imc/cas_count_read/
and uncore_imc/case_count_write/.

These events open 6 events per socket with pmu names of
uncore_imc_[0-5].

The current metric setup code in find_evsel_group assumes one ID will
map to 1 event to be recorded in metric_events.

For events with multiple matches, the first event is recorded in
metric_events (avoiding matching >1 event with the same name) and the
evlist_used updated so that duplicate events aren't removed when the
evlist has unused events removed.

Before this change:

  $ /tmp/perf/perf stat -M DRAM_BW_Use -a -- sleep 1

   Performance counter stats for 'system wide':

               41.14 MiB  uncore_imc/cas_count_read/
       1,002,614,251 ns   duration_time

         1.002614251 seconds time elapsed

After this change:

  $ /tmp/perf/perf stat -M DRAM_BW_Use -a -- sleep 1

   Performance counter stats for 'system wide':

              157.47 MiB  uncore_imc/cas_count_read/ #     0.00 DRAM_BW_Use
              126.97 MiB  uncore_imc/cas_count_write/
       1,003,019,728 ns   duration_time

Erroneous duplication introduced in:
commit 2440689d62 ("perf metricgroup: Remove duped metric group events").

Fixes: ded80bda8b ("perf expr: Migrate expr ids table to a hashmap").
Reported-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200917201807.4090224-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 17:37:11 -03:00
Adrian Hunter 7d537a8d2e perf intel-pt: Fix "context_switch event has no tid" error
A context_switch event can have no tid because pids can be detached from
a task while the task is still running (in do_exit()). Note this won't
happen with per-task contexts because then tracing stops at
perf_event_exit_task()

If a task with no tid gets preempted, or a dying task gets preempted and
its parent releases it, when it subsequently gets switched back in,
Intel PT will not be able to determine what task is running and prints
an error "context_switch event has no tid". However, it is not really an
error because the task is in kernel space and the decoder can continue
to decode successfully. Fix by changing the error to be only a logged
message, and make allowance for tid == -1.

Example:

  Using 5.9-rc4 with Preemptible Kernel (Low-Latency Desktop) e.g.
  $ uname -r
  5.9.0-rc4
  $ grep PREEMPT .config
  # CONFIG_PREEMPT_NONE is not set
  # CONFIG_PREEMPT_VOLUNTARY is not set
  CONFIG_PREEMPT=y
  CONFIG_PREEMPT_COUNT=y
  CONFIG_PREEMPTION=y
  CONFIG_PREEMPT_RCU=y
  CONFIG_PREEMPT_NOTIFIERS=y
  CONFIG_DRM_I915_PREEMPT_TIMEOUT=640
  CONFIG_DEBUG_PREEMPT=y
  # CONFIG_PREEMPT_TRACER is not set
  # CONFIG_PREEMPTIRQ_DELAY_TEST is not set

Before:

  $ cat forkit.c

  #include <sys/types.h>
  #include <unistd.h>
  #include <sys/wait.h>

  int main()
  {
          pid_t child;
          int status = 0;

          child = fork();
          if (child == 0)
                  return 123;
          wait(&status);
          return 0;
  }

  $ gcc -o forkit forkit.c
  $ sudo ~/bin/perf record --kcore -a -m,64M -e intel_pt/cyc/k &
  [1] 11016
  $ taskset 2 ./forkit
  $ sudo pkill perf
  $ [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 17.262 MB perf.data ]

  [1]+  Terminated              sudo ~/bin/perf record --kcore -a -m,64M -e intel_pt/cyc/k
  $ sudo ~/bin/perf script --show-task-events --show-switch-events --itrace=iqqe-o -C 1 --ns | grep -C 2 forkit
  context_switch event has no tid
           taskset 11019 [001] 66663.270045029:          1 instructions:k:  ffffffffb1d9f844 strnlen_user+0xb4 ([kernel.kallsyms])
           taskset 11019 [001] 66663.270201816:          1 instructions:k:  ffffffffb1a83121 unmap_page_range+0x561 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270327553: PERF_RECORD_COMM exec: forkit:11019/11019
            forkit 11019 [001] 66663.270420028:          1 instructions:k:  ffffffffb1db9537 __clear_user+0x27 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270648704:          1 instructions:k:  ffffffffb18829e6 do_user_addr_fault+0xf6 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270833163:          1 instructions:k:  ffffffffb230a825 irqentry_exit_to_user_mode+0x15 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271092359:          1 instructions:k:  ffffffffb1aea3d9 lock_page_memcg+0x9 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271207092: PERF_RECORD_FORK(11020:11020):(11019:11019)
            forkit 11019 [001] 66663.271234775: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid: 11020/11020
            forkit 11020 [001] 66663.271238407: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11019/11019
            forkit 11020 [001] 66663.271312066:          1 instructions:k:  ffffffffb1a88140 handle_mm_fault+0x10 ([kernel.kallsyms])
            forkit 11020 [001] 66663.271476225: PERF_RECORD_EXIT(11020:11020):(11019:11019)
            forkit 11020 [001] 66663.271497488: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 11019/11019
            forkit 11019 [001] 66663.271500523: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11020/11020
            forkit 11019 [001] 66663.271517241:          1 instructions:k:  ffffffffb24012cd error_entry+0x6d ([kernel.kallsyms])
            forkit 11019 [001] 66663.271664080: PERF_RECORD_EXIT(11019:11019):(1386:1386)

After:

  $ sudo ~/bin/perf script --show-task-events --show-switch-events --itrace=iqqe-o -C 1 --ns | grep -C 2 forkit
           taskset 11019 [001] 66663.270045029:          1 instructions:k:  ffffffffb1d9f844 strnlen_user+0xb4 ([kernel.kallsyms])
           taskset 11019 [001] 66663.270201816:          1 instructions:k:  ffffffffb1a83121 unmap_page_range+0x561 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270327553: PERF_RECORD_COMM exec: forkit:11019/11019
            forkit 11019 [001] 66663.270420028:          1 instructions:k:  ffffffffb1db9537 __clear_user+0x27 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270648704:          1 instructions:k:  ffffffffb18829e6 do_user_addr_fault+0xf6 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270833163:          1 instructions:k:  ffffffffb230a825 irqentry_exit_to_user_mode+0x15 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271092359:          1 instructions:k:  ffffffffb1aea3d9 lock_page_memcg+0x9 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271207092: PERF_RECORD_FORK(11020:11020):(11019:11019)
            forkit 11019 [001] 66663.271234775: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid: 11020/11020
            forkit 11020 [001] 66663.271238407: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11019/11019
            forkit 11020 [001] 66663.271312066:          1 instructions:k:  ffffffffb1a88140 handle_mm_fault+0x10 ([kernel.kallsyms])
            forkit 11020 [001] 66663.271476225: PERF_RECORD_EXIT(11020:11020):(11019:11019)
            forkit 11020 [001] 66663.271497488: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 11019/11019
            forkit 11019 [001] 66663.271500523: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11020/11020
            forkit 11019 [001] 66663.271517241:          1 instructions:k:  ffffffffb24012cd error_entry+0x6d ([kernel.kallsyms])
            forkit 11019 [001] 66663.271664080: PERF_RECORD_EXIT(11019:11019):(1386:1386)
            forkit 11019 [001] 66663.271688752: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:    -1/-1
               :-1    -1 [001] 66663.271692086: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11019/11019
                :-1    -1 [001] 66663.271707466:          1 instructions:k:  ffffffffb18eb096 update_load_avg+0x306 ([kernel.kallsyms])

Fixes: 86c2786994 ("perf intel-pt: Add support for PERF_RECORD_SWITCH")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Link: http://lore.kernel.org/lkml/20200909084923.9096-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 16:08:43 -03:00
Adrian Hunter fc18380fb9 perf script: Display negative tid in non-sample events
The kernel can release tasks while they are still running. This can
result in a task having no tid, in which case perf records a tid of -1.
Improve the perf script output in that case.

Example:

Before:

  # cat ./autoreap.c

  #include <sys/types.h>
  #include <unistd.h>
  #include <sys/wait.h>
  #include <signal.h>

  struct sigaction act = {
          .sa_handler = SIG_IGN,
  };

  int main()
  {
          pid_t child;
          int status = 0;

          sigaction(SIGCHLD, &act, NULL);
          child = fork();
          if (child == 0)
                  return 123;
          wait(&status);
          return 0;
  }

  # gcc -o autoreap autoreap.c
  # ./perf record -a -e dummy --switch-events ./autoreap
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.948 MB perf.data ]
  # ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1'
           swapper     0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
              perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189
          autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189)
          autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25191/25191
          autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
           swapper     0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid:    11/11
       rcu_preempt    11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
       rcu_preempt    11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    11/11
          autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189)
  PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 4294967295/4294967295
           swapper     0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
          autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188)
          autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25188/25188

After:

  # ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1'
           swapper     0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
              perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189
          autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189)
          autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25191/25191
          autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
           swapper     0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid:    11/11
       rcu_preempt    11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
       rcu_preempt    11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    11/11
          autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189)
               :-1    -1 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    -1/-1
           swapper     0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
          autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188)
          autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25188/25188

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Link: http://lore.kernel.org/lkml/20200909084923.9096-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 16:06:22 -03:00
Zejiang Tang 99f638173e perf docs: Improve help information in perf.txt
perf has many undocumented options, such as:-vv, --exec-path,
--html-path, -p, --paginate,--no-pager, --debugfs-dir, --buildid-dir,
--list-cmds, --list-opts.

Add entris for these options in perf.txt.

Signed-off-by: Zejiang Tang <tangzejiang@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/1599645194-8438-1-git-send-email-tangzejiang@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 16:03:31 -03:00
YueHaibing a803fbe61d perf metric: Remove duplicate include
Remove duplicate header which is included twice.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200915081541.41004-1-yuehaibing@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 15:48:49 -03:00
Andi Kleen 328781df86 perf tools: Add documentation for topdown metrics
Add some documentation how to use the topdown metrics in ring 3.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-5-kan.liang@linux.intel.com
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 15:48:31 -03:00
Andi Kleen 55c36a9fc2 perf stat: Support new per thread TopDown metrics
Icelake has support for reporting per thread TopDown metrics.

These are reported differently than the previous TopDown support,
each metric is standalone, but scaled to pipeline "slots".

We don't need to do anything special for HyperThreading anymore.
Teach perf stat --topdown to handle these new metrics and
print them in the same way as the previous TopDown metrics.

The restrictions of only being able to report information per core is
gone.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-4-kan.liang@linux.intel.com
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 15:48:08 -03:00
Kan Liang acb65150a4 perf record: Support sample-read topdown metric group
With the hardware TopDown metrics feature, sample-read feature should be
supported for a topdown group, e.g., sample a non-topdown event and read
a topdown metric group. But the current perf record code errors out.

For a topdown metric group, the slots event must be the leader of the
group, but the leader slots event doesn't support sampling.

To support sample-read the topdown metric group, use the 2nd event of
the group as the "leader" for the purposes of sampling.

Only the platform with Topdown metic feature supports sample-read the
topdown group. Add arch_topdown_sample_read() to indicate whether the
topdown group supports sample-read.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 15:47:58 -03:00
Kan Liang 687986bbeb perf tools: Rename group to topdown
The group.h/c only include TopDown group related functions. The name
"group" is too generic and inaccurate. Use the name "topdown" to replace
it.

Move topdown related functions to a dedicated file, topdown.c.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 15:47:55 -03:00
Jiri Olsa c57f5eaa09 perf machine: Add machine__for_each_dso() function
Add the machine__for_each_dso() to iterate over all dso objects defined
for the within a machine object. It will be used in the MMAP3 patch
series.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200913210313.1985612-22-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 15:47:12 -03:00
Arnaldo Carvalho de Melo 056c172201 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 15:45:05 -03:00
Namhyung Kim 0f1b550e29 perf parse-event: Release cpu_map refcount if evsel alloc failed
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200917060219.1287863-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 13:28:15 -03:00
Namhyung Kim 5d680be3b0 perf parse-event: Fix cpu map refcounting
Like evlist cpu map, evsel's cpu map should have a proper refcount.

As it's created with a refcount, we don't need to get an extra count.
Thanks to Arnaldo for the simpler suggestion.

This, together with the following patch, fixes the following ASAN
report:

  Direct leak of 840 byte(s) in 70 object(s) allocated from:
    #0 0x7fe36703f628 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x107628)
    #1 0x559fbbf611ca in cpu_map__trim_new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:79
    #2 0x559fbbf6229c in perf_cpu_map__new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:237
    #3 0x559fbbcc6c6d in __add_event util/parse-events.c:357
    #4 0x559fbbcc6c6d in add_event_tool util/parse-events.c:408
    #5 0x559fbbcc6c6d in parse_events_add_tool util/parse-events.c:1414
    #6 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
    #7 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
    #8 0x559fbbcc95da in __parse_events util/parse-events.c:2141
    #9 0x559fbbc2788b in check_parse_id tests/pmu-events.c:406
    #10 0x559fbbc2788b in check_parse_id tests/pmu-events.c:393
    #11 0x559fbbc2788b in check_parse_fake tests/pmu-events.c:436
    #12 0x559fbbc2788b in metric_parse_fake tests/pmu-events.c:553
    #13 0x559fbbc27e2d in test_parsing_fake tests/pmu-events.c:599
    #14 0x559fbbc27e2d in test_parsing_fake tests/pmu-events.c:574
    #15 0x559fbbc0109b in run_test tests/builtin-test.c:410
    #16 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
    #17 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
    #18 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
    #19 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #20 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #21 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #22 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #23 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308

And I've failed which commit introduced this bug as the code was
heavily changed since then. ;-/

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200917060219.1287863-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 13:25:35 -03:00
Qi Liu ce9c13f31b perf stat: Fix the ratio comments of miss-events
'perf stat' displays miss ratio of L1-dcache, L1-icache, dTLB cache,
iTLB cache and LL-cache. Take L1-dcache for example, miss ratio is
caculated as "L1-dcache-load-misses/L1-dcache-loads". So "of all
L1-dcache hits" is unsuitable to describe it, and "of all L1-dcache
accesses" seems better.

The comments of L1-icache, dTLB cache, iTLB cache and LL-cache are
fixed in the same way.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1600253331-10535-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-16 10:54:02 -03:00
Namhyung Kim d26383dcb2 perf test: Free formats for perf pmu parse test
The following leaks were detected by ASAN:

  Indirect leak of 360 byte(s) in 9 object(s) allocated from:
    #0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
    #1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333
    #2 0x560578f752fc in perf_pmu_parse util/pmu.y:59
    #3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73
    #4 0x560578e07045 in test__pmu tests/pmu.c:155
    #5 0x560578de109b in run_test tests/builtin-test.c:410
    #6 0x560578de109b in test_and_print tests/builtin-test.c:440
    #7 0x560578de401a in __cmd_test tests/builtin-test.c:661
    #8 0x560578de401a in cmd_test tests/builtin-test.c:807
    #9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: cff7f956ec ("perf tests: Move pmu tests into separate object")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 09:22:42 -03:00
Namhyung Kim 6f47ed6cd1 perf metric: Do not free metric when failed to resolve
It's dangerous to free the original metric when it's called from
resolve_metric() as it's already in the metric_list and might have other
resources too.  Instead, it'd better let them bail out and be released
properly at the later stage.

So add a check when it's called from metricgroup__add_metric() and
release it.  Also make sure that mp is set properly.

Fixes: 83de0b7d53 ("perf metric: Collect referenced metrics in struct metric_ref_node")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 09:22:21 -03:00
Namhyung Kim 27adafcda3 perf metric: Free metric when it failed to resolve
The metricgroup__add_metric() can find multiple match for a metric group
and it's possible to fail.  Also it can fail in the middle like in
resolve_metric() even for single metric.

In those cases, the intermediate list and ids will be leaked like:

  Direct leak of 3 byte(s) in 1 object(s) allocated from:
    #0 0x7f4c938f40b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
    #1 0x55f7e71c1bef in __add_metric util/metricgroup.c:683
    #2 0x55f7e71c31d0 in add_metric util/metricgroup.c:906
    #3 0x55f7e71c3844 in metricgroup__add_metric util/metricgroup.c:940
    #4 0x55f7e71c488d in metricgroup__add_metric_list util/metricgroup.c:993
    #5 0x55f7e71c488d in parse_groups util/metricgroup.c:1045
    #6 0x55f7e71c60a4 in metricgroup__parse_groups_test util/metricgroup.c:1087
    #7 0x55f7e71235ae in __compute_metric tests/parse-metric.c:164
    #8 0x55f7e7124650 in compute_metric tests/parse-metric.c:196
    #9 0x55f7e7124650 in test_recursion_fail tests/parse-metric.c:318
    #10 0x55f7e7124650 in test__parse_metric tests/parse-metric.c:356
    #11 0x55f7e70be09b in run_test tests/builtin-test.c:410
    #12 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
    #13 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
    #14 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
    #15 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #16 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #17 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #18 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #19 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 83de0b7d53 ("perf metric: Collect referenced metrics in struct metric_ref_node")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 09:21:49 -03:00
Namhyung Kim 437822bf38 perf metric: Release expr_parse_ctx after testing
The test_generic_metric() missed to release entries in the pctx.  Asan
reported following leak (and more):

  Direct leak of 128 byte(s) in 1 object(s) allocated from:
    #0 0x7f4c9396980e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
    #1 0x55f7e748cc14 in hashmap_grow (/home/namhyung/project/linux/tools/perf/perf+0x90cc14)
    #2 0x55f7e748d497 in hashmap__insert (/home/namhyung/project/linux/tools/perf/perf+0x90d497)
    #3 0x55f7e7341667 in hashmap__set /home/namhyung/project/linux/tools/perf/util/hashmap.h:111
    #4 0x55f7e7341667 in expr__add_ref util/expr.c:120
    #5 0x55f7e7292436 in prepare_metric util/stat-shadow.c:783
    #6 0x55f7e729556d in test_generic_metric util/stat-shadow.c:858
    #7 0x55f7e712390b in compute_single tests/parse-metric.c:128
    #8 0x55f7e712390b in __compute_metric tests/parse-metric.c:180
    #9 0x55f7e712446d in compute_metric tests/parse-metric.c:196
    #10 0x55f7e712446d in test_dcache_l2 tests/parse-metric.c:295
    #11 0x55f7e712446d in test__parse_metric tests/parse-metric.c:355
    #12 0x55f7e70be09b in run_test tests/builtin-test.c:410
    #13 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
    #14 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
    #15 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
    #16 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #17 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #18 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #19 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #20 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 6d432c4c8a ("perf tools: Add test_generic_metric function")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 09:21:22 -03:00
Namhyung Kim f5a56570a3 perf test: Fix memory leaks in parse-metric test
It didn't release resources when there's an error so the
test_recursion_fail() will leak some memory.

Fixes: 0a507af9c6 ("perf tests: Add parse metric test for ipc metric")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 09:20:11 -03:00
Namhyung Kim b12eea5ad8 perf parse-event: Fix memory leak in evsel->unit
The evsel->unit borrows a pointer of pmu event or alias instead of
owns a string.  But tool event (duration_time) passes a result of
strdup() caused a leak.

It was found by ASAN during metric test:

  Direct leak of 210 byte(s) in 70 object(s) allocated from:
    #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
    #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
    #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
    #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
    #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
    #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
    #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
    #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
    #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
    #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
    #10 0x559fbbc0109b in run_test tests/builtin-test.c:410
    #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
    #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
    #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
    #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: f0fbb114e3 ("perf stat: Implement duration_time as a proper event")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 09:18:56 -03:00
Namhyung Kim bfd1b83d75 perf evlist: Fix cpu/thread map leak
Asan reported leak of cpu and thread maps as they have one more refcount
than released.  I found that after setting evlist maps it should release
it's refcount.

It seems to be broken from the beginning so I chose the original commit
as the culprit.  But not sure how it's applied to stable trees since
there are many changes in the code after that.

Fixes: 7e2ed09753 ("perf evlist: Store pointer to the cpu and thread maps")
Fixes: 4112eb1899 ("perf evlist: Default to syswide target when no thread/cpu maps set")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 08:59:26 -03:00
Namhyung Kim b033ab11ad perf metric: Fix some memory leaks - part 2
The metric_event_delete() missed to free expr->metric_events and it
should free an expr when metric_refs allocation failed.

Fixes: 4ea2896715 ("perf metric: Collect referenced metrics in struct metric_expr")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 08:58:44 -03:00
Namhyung Kim 4f57a1ed74 perf metric: Fix some memory leaks
I found some memory leaks while reading the metric code.  Some are real
and others only occur in the error path.  When it failed during metric
or event parsing, it should release all resources properly.

Fixes: b18f3e3650 ("perf stat: Support JSON metrics in perf stat")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 08:58:03 -03:00
Namhyung Kim 22fe5a25b5 perf test: Free aliases for PMU event map aliases test
The aliases were never released causing the following leaks:

  Indirect leak of 1224 byte(s) in 9 object(s) allocated from:
    #0 0x7feefb830628 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x107628)
    #1 0x56332c8f1b62 in __perf_pmu__new_alias util/pmu.c:322
    #2 0x56332c8f401f in pmu_add_cpu_aliases_map util/pmu.c:778
    #3 0x56332c792ce9 in __test__pmu_event_aliases tests/pmu-events.c:295
    #4 0x56332c792ce9 in test_aliases tests/pmu-events.c:367
    #5 0x56332c76a09b in run_test tests/builtin-test.c:410
    #6 0x56332c76a09b in test_and_print tests/builtin-test.c:440
    #7 0x56332c76ce69 in __cmd_test tests/builtin-test.c:695
    #8 0x56332c76ce69 in cmd_test tests/builtin-test.c:807
    #9 0x56332c7d2214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #10 0x56332c6701a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #11 0x56332c6701a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #12 0x56332c6701a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #13 0x7feefb359cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 956a78356c ("perf test: Test pmu-events aliases")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-11-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 08:56:50 -03:00
Henry Burns 56f3a1cdaf perf vendor events amd: Remove trailing commas
The amdzen2/core.json and amdzen/core.json vendor events files have the
occasional trailing comma. Since that goes against the JSON standard,
lets remove it.

Signed-off-by: Henry Burns <henrywolfeburns@gmail.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Link: http://lore.kernel.org/lkml/20200915004125.971-1-henrywolfeburns@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 08:53:25 -03:00
Ian Rogers 880a784344 perf test: Leader sampling shouldn't clear sample period
Add test that a sibling with leader sampling doesn't have its period
cleared.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200912025655.1337192-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-14 19:35:22 -03:00
Ian Rogers 3b0a18c1aa perf record: Don't clear event's period if set by a term
If events in a group explicitly set a frequency or period with leader
sampling, don't disable the samples on those events.

Prior to 5.8:

  perf record -e '{cycles/period=12345000/,instructions/period=6789000/}:S'

would clear the attributes then apply the config terms. In commit
5f34278867 leader sampling configuration was moved to after applying the
config terms, in the example, making the instructions' event have its period
cleared.

This change makes it so that sampling is only disabled if configuration
terms aren't present.

Committer testing:

Before:

  # perf record -e '{cycles/period=1/,instructions/period=2/}:S' sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.051 MB perf.data (6 samples) ]
  #
  # perf evlist -v
  cycles/period=1/: size: 120, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, disabled: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  instructions/period=2/: size: 120, config: 0x1, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, sample_id_all: 1, exclude_guest: 1
  #

After:

  # perf record -e '{cycles/period=1/,instructions/period=2/}:S' sleep 0.0001
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.052 MB perf.data (4 samples) ]
  # perf evlist -v
  cycles/period=1/: size: 120, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, disabled: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  instructions/period=2/: size: 120, config: 0x1, { sample_period, sample_freq }: 2, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, sample_id_all: 1, exclude_guest: 1
  #

Fixes: 5f34278867 ("perf evlist: Move leader-sampling configuration")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20200912025655.1337192-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-14 19:35:12 -03:00
Stephane Eranian ae5dcc8abe perf record: Prevent override of attr->sample_period for libpfm4 events
Before:

  $ perf record -c 10000 --pfm-events=cycles:period=77777

Would yield a cycles event with period=10000, instead of 77777.

the event string and perf record initializing the event.
This was due to an ordering issue between libpfm4 parsing

events with attr->sample_period != 0 by the time
intent of the author.
perf_evsel__config() is invoked. This seems to have been the
This patch fixes the problem by preventing override for

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20200912025655.1337192-3-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-14 18:44:35 -03:00
David Sharp ce4326d275 perf record: Set PERF_RECORD_PERIOD if attr->freq is set.
evsel__config() would only set PERF_RECORD_PERIOD if it set attr->freq
from perf record options. When it is set by libpfm events, it would not
get set. This changes evsel__config to see if attr->freq is set outside
of whether or not it changes attr->freq itself.

Signed-off-by: David Sharp <dhsharp@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: david sharp <dhsharp@google.com>
Link: http://lore.kernel.org/lkml/20200912025655.1337192-2-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-14 18:44:35 -03:00
Ian Rogers d2c73501a7 perf bench: Fix 2 memory sanitizer warnings
Memory sanitizer warns if a write is performed where the memory being
read for the write is uninitialized. Avoid this warning by initializing
the memory.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200912053725.1405857-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-14 18:30:26 -03:00
Jiri Olsa 8a39e8c4d9 perf test: Fix the "signal" test inline assembly
When compiling with DEBUG=1 on Fedora 32 I'm getting crash for 'perf
test signal':

  Program received signal SIGSEGV, Segmentation fault.
  0x0000000000c68548 in __test_function ()
  (gdb) bt
  #0  0x0000000000c68548 in __test_function ()
  #1  0x00000000004d62e9 in test_function () at tests/bp_signal.c:61
  #2  0x00000000004d689a in test__bp_signal (test=0xa8e280 <generic_ ...
  #3  0x00000000004b7d49 in run_test (test=0xa8e280 <generic_tests+1 ...
  #4  0x00000000004b7e7f in test_and_print (t=0xa8e280 <generic_test ...
  #5  0x00000000004b8927 in __cmd_test (argc=1, argv=0x7fffffffdce0, ...
  ...

It's caused by the symbol __test_function being in the ".bss" section:

  $ readelf -a ./perf | less
    [Nr] Name              Type             Address           Offset
         Size              EntSize          Flags  Link  Info  Align
    ...
    [28] .bss              NOBITS           0000000000c356a0  008346a0
         00000000000511f8  0000000000000000  WA       0     0     32

  $ nm perf | grep __test_function
  0000000000c68548 B __test_function

I guess most of the time we're just lucky the inline asm ended up in the
".text" section, so making it specific explicit with push and pop
section clauses.

  $ readelf -a ./perf | less
    [Nr] Name              Type             Address           Offset
         Size              EntSize          Flags  Link  Info  Align
    ...
    [13] .text             PROGBITS         0000000000431240  00031240
         0000000000306faa  0000000000000000  AX       0     0     16

  $ nm perf | grep __test_function
  00000000004d62c8 T __test_function

Committer testing:

  $ readelf -wi ~/bin/perf | grep producer -m1
    <c>   DW_AT_producer    : (indirect string, offset: 0x254a): GNU C99 10.2.1 20200723 (Red Hat 10.2.1-1) -mtune=generic -march=x86-64 -ggdb3 -std=gnu99 -fno-omit-frame-pointer -funwind-tables -fstack-protector-all
                                                                                                                                         ^^^^^
                                                                                                                                         ^^^^^
                                                                                                                                         ^^^^^
  $

Before:

  $ perf test signal
  20: Breakpoint overflow signal handler                    : FAILED!
  $

After:

  $ perf test signal
  20: Breakpoint overflow signal handler                    : Ok
  $

Fixes: 8fd34e1cce ("perf test: Improve bp_signal")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200911130005.1842138-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-14 18:26:45 -03:00
Jiri Olsa 8366f0d268 perf tests: Call test_attr__open() directly
There's no longer need to call test_attr__open() from
sys_perf_event_open(), because both 'perf record' and 'perf stat' call
evsel__open_cpu(), so we can call it directly from there and not polute
the perf-sys.h header.

Committer testing:

Before and after:

  # perf test attr
  17: Setup struct perf_event_attr                                    : Ok
  49: Synthesize attr update                                          : Ok
  # perf test -v attr
  17: Setup struct perf_event_attr                                    :
  --- start ---
  test child forked, pid 2170868
  running '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-any_ret'
  unsupp  '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-any_ret'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-C0'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-graph-fp'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-period'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-group-sampling'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-freq'
  running '/home/acme/libexec/perf-core/tests/attr/test-stat-detailed-3'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-k'
  unsupp  '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-k'
  running '/home/acme/libexec/perf-core/tests/attr/test-stat-group1'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-u'
  unsupp  '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-u'
  running '/home/acme/libexec/perf-core/tests/attr/test-stat-basic'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-any_call'
  unsupp  '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-any_call'
  running '/home/acme/libexec/perf-core/tests/attr/test-stat-default'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-graph-dwarf'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-no-buffering'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-raw'
  running '/home/acme/libexec/perf-core/tests/attr/test-stat-detailed-2'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-count'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-data'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-any'
  unsupp  '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-any'
  running '/home/acme/libexec/perf-core/tests/attr/test-stat-group'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-branch-any'
  unsupp  '/home/acme/libexec/perf-core/tests/attr/test-record-branch-any'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-graph-default'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-no-samples'
  running '/home/acme/libexec/perf-core/tests/attr/test-stat-C0'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-no-inherit'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-ind_call'
  unsupp  '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-ind_call'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-basic'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-group1'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-pfm-period'
  unsupp  '/home/acme/libexec/perf-core/tests/attr/test-record-pfm-period'
  running '/home/acme/libexec/perf-core/tests/attr/test-stat-detailed-1'
  running '/home/acme/libexec/perf-core/tests/attr/test-stat-no-inherit'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-hv'
  unsupp  '/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-hv'
  running '/home/acme/libexec/perf-core/tests/attr/test-record-group'
  test child finished with 0
  ---- end ----
  Setup struct perf_event_attr: Ok
  49: Synthesize attr update                                          :
  --- start ---
  test child forked, pid 2171004
  test child finished with 0
  ---- end ----
  Synthesize attr update: Ok
  #

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200827193201.GB127372@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-10 11:55:37 -03:00
Kajol Jain b1f815c479 perf vendor events power9: Add hv_24x7 core level metric events
This patch adds hv_24x7 core level events in nest_metric.json file and
also add PerChip/PerCore field in metric events.

Result:

power9 platform:

command:# ./perf stat --metric-only -M PowerBUS_Frequency -C 0 -I 1000
     1.000070601                        1.9                        2.0
     2.000253881                        2.0                        1.9
     3.000364810                        2.0                        2.0

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200907064133.75090-6-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-10 09:19:53 -03:00
Kajol Jain f5a489dc81 perf metricgroup: Pass pmu_event structure as a parameter for arch_get_runtimeparam()
This patch adds passing of  pmu_event as a parameter in function
'arch_get_runtimeparam' which can be used to get details like if the
event is percore/perchip.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200907064133.75090-5-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-10 09:18:59 -03:00