4329 Commits

Author SHA1 Message Date
Taeung Song 360e071b18 perf tools: Use zfree() to avoid keeping dangling pointers
The cases changed in this patch are for when we free but keep the
pointer to the freed area, which is not always a good idea.

Be more defensive and zero the pointer, so that possible use-after-free
bugs do not take more time to be detected.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-5-git-send-email-treeze.taeung@gmail.com
[ rewrote commit log ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 09:41:12 -03:00
Taeung Song 506fde11a3 perf tools: Use zfree() instead of ad hoc equivalent
We have zfree(&ptr) for this very common pattern:

   free(ptr);
   ptr = NULL;

So use it in a few more places.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-4-git-send-email-treeze.taeung@gmail.com
[ rewrote commit log ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 09:41:11 -03:00
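The free-and-clear pattern that commit 506fde11a3 describes can be sketched as follows. This is a minimal illustration, not perf's actual zfree() definition: the macro name zfree_sketch, the struct named_thing and the helper named_thing__exit() are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Illustrative stand-in for zfree(): free the object and clear the
 * caller's pointer in one step. It takes the ADDRESS of the pointer.
 * This is a sketch, not perf's actual definition.
 */
#define zfree_sketch(pptr)      \
	do {                    \
		free(*(pptr));  \
		*(pptr) = NULL; \
	} while (0)

/* Hypothetical structure that owns a heap-allocated string. */
struct named_thing {
	char *name;
};

/* Release the owned string; afterwards thing->name is NULL, not dangling. */
static void named_thing__exit(struct named_thing *thing)
{
	zfree_sketch(&thing->name);
}
```

Because the pointer is reset to NULL, a second call to named_thing__exit() is harmless (free(NULL) is a no-op), and a later accidental dereference fails immediately instead of reading freed memory.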
Taeung Song 5aa365f298 perf tools: Add missing check for failure in a zalloc() call
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 09:41:11 -03:00
Taeung Song 75fc5ae5cc perf tools: Only increase index if perf_evsel__new_idx() succeeds
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 09:41:10 -03:00
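The idea behind commit 75fc5ae5cc, advancing an index only after the constructor has succeeded, can be sketched like this. The types and functions below (evsel_sketch, evsel_sketch__new_idx(), add_event()) are hypothetical stand-ins, not perf's real API, and the fail argument only simulates an allocation failure.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical event selector carrying the index it was created with. */
struct evsel_sketch {
	int idx;
};

/* Hypothetical constructor: returns NULL on failure (simulated by `fail`). */
static struct evsel_sketch *evsel_sketch__new_idx(int idx, int fail)
{
	struct evsel_sketch *evsel;

	if (fail)
		return NULL;

	evsel = calloc(1, sizeof(*evsel));
	if (evsel != NULL)
		evsel->idx = idx;
	return evsel;
}

/* Create an event; *idx advances only when the constructor succeeded. */
static struct evsel_sketch *add_event(int *idx, int fail)
{
	struct evsel_sketch *evsel = evsel_sketch__new_idx(*idx, fail);

	if (evsel == NULL)
		return NULL;	/* leave *idx untouched on failure */

	(*idx)++;
	return evsel;
}
```

With this ordering a failed creation does not leave a gap in the index sequence, so the next successful event reuses the index that the failed one would have taken.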
Victor Kamensky 9b20065351 perf symbols: Take into account symfs setting when reading file build ID
After commit 5baecbcd9c ("perf symbols: we can now read separate
debug-info files based on a build ID"), when the --symfs option is used,
perf fails to pick up symbols for a file that has the same name on the host
and in the sysroot specified by --symfs.  One can see a message like this:

  bin/bash with build id 26f0062cb6950d4d1ab0fd9c43eae8b10ca42062 not found, continuing without symbols

It happens because the code added by 5baecbcd9c opens files directly by
dso->long_name without considering symbol_conf.symfs, and as a result
picks the one from the host. It reads that file's build ID, and later, even
when the code finds the proper file in the directory pointed to by --symfs,
perf ignores it because the build ID mismatches.

The fix is to use __symbol__join_symfs to adjust the file name according to
the --symfs setting. If no --symfs is passed, the operation is a no-op and
picks the same host file as before.

Also note that in later trees, after the 5baecbcd9c commit, an additional
check for '!dso->has_build_id' was added, so to observe the error condition
'perf record' should be run with --no-buildid, so that perf.data itself does
not have a build ID for the target binary in its buildid section and 'perf
report' passes the '!dso->has_build_id' condition. Alternatively, the target
binary should not have a build ID while the same binary on the host does;
again '!dso->has_build_id' passes in this case and an incorrect build ID
could be read if --symfs is used.

Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Phlipot <cphlipot0@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: xe-linux-external@cisco.com
Fixes: 5baecbcd9c ("perf symbols: we can now read separate debug-info files based on a build ID")
Link: http://lkml.kernel.org/r/1486424908-17094-1-git-send-email-kamensky@cisco.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 09:28:55 -03:00
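The path adjustment that commit 9b20065351 relies on can be illustrated with a small sketch. The helper join_symfs_sketch() below is hypothetical and only assumes that the --symfs root is prepended to the DSO path, with an empty root meaning that no --symfs was given; it is not the real __symbol__join_symfs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: prefix `path` with the symfs root. An empty
 * `symfs` string leaves the path unchanged (the no-op case).
 * Returns 0 on success, -1 if the result did not fit in `buf`.
 */
static int join_symfs_sketch(char *buf, size_t size,
			     const char *symfs, const char *path)
{
	int n = snprintf(buf, size, "%s%s", symfs, path);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Reading the build ID from the joined path rather than from the bare path is what keeps the host's /bin/bash from being picked when a sysroot copy was intended.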
Andi Kleen f23610245c perf list: Add debug support for outputing alias string
For debugging and testing it is useful to see the converted alias
string. Add support to perf stat/record and perf list to print the alias
conversion. The text string is saved in the alias structure.  For perf
stat/record it is folded into the normal -v. For perf list -v was taken,
so we use --debug.

Before:

% perf list
...
cache:
  l1d.replacement
       [L1D data line replacements]
  l1d_pend_miss.fb_full
       [Cycles a demand request was blocked due to Fill Buffers inavailability]

After

% perf list --debug
...
cache:
  l1d.replacement
       [L1D data line replacements]
        cpu/umask=0x1,period=2000003,event=0x51/
  l1d_pend_miss.fb_full
       [Cycles a demand request was blocked due to Fill Buffers inavailability]
        cpu/umask=0x2,period=2000003,cmask=1,event=0x48/

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-6-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 08:55:04 -03:00
Andi Kleen 231bb2aa32 perf pmu: Support event aliases for non cpu// pmus
The code for handling PMU aliases without specifying the PMU was
hardcoded to support only the cpu PMU.

This patch extends it to work for all PMUs. We always duplicate the
event for all PMUs that have a matching alias.  This allows an alias to
be expanded automatically for all instances of a PMU (so for example
you can monitor all cache boxes with a single event).

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 08:55:04 -03:00
Andi Kleen 15b22ed369 perf pmu: Support per pmu json aliases
Add support for registering json aliases per PMU. Any alias with a unit
matching the prefix is registered to the PMU.  Uncore has multiple
instances of most units, so all these aliases get registered for each
individual PMU (this is important later to run the event on every
instance of the PMU).

To avoid printing the events multiple times in perf list, filter out
duplicated events during printing.

v2: Rely on uncore_ prefix already in unit
v3: Document why calls were reordered

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 08:55:03 -03:00
Andi Kleen fedb2b5182 perf jevents: Add support for parsing uncore json files
Handle the "Unit" field, which is needed to find the right PMU for an
event. We call it "pmu" and convert it to the perf pmu name with an
uncore prefix.

Handle the "ExtSel" field, which just extends the event mask with an
additional bit.

Handle the "Filter" field which adds parameters to the main event
to configure filtering.

Handle the "Unit" field which declares the unit the values should be
scaled to (similar to what the kernel exports)

Set up the "perpkg" field for uncore events so that perf knows they are
per package (similar to what the kernel exports)

Then output the fields into the pmu-events data structures which are
compiled into perf.

Filter out zero fields, except for the event itself.

v2: Fix compilation. Add uncore_ prefix at pre-processing time.
    Move eventcode change to separate patch.

v3: Remove extra __maybe_unused

v4: don't duplicate aliases for cpu pmu events

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 08:55:03 -03:00
He Kuang 4d416436f3 perf bpf: Add missing newline in debug messages
These two debug messages are missing the trailing newline.

Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bintian Wang <bintian.wang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20170207073412.26983-2-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 08:55:02 -03:00
Arnaldo Carvalho de Melo e06094ab67 Merge remote-tracking branch 'tip/perf/urgent' into perf/core
To pick fixes that are affecting tests of new 'perf diff' features in
perf/core.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-06 11:09:26 -03:00
Krister Johansen aa33b9b9a2 perf callchain: Reference count maps
If dso__load_kcore frees all of the existing maps, but one has already
been attached to a callchain cursor node, then we can get a SIGSEGV in
any function that happens to try to use this invalid cursor.  Use the
existing map refcount mechanism to forestall cleanup of a map until the
cursor iterates past the node.

Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@kernel.org
Fixes: 84c2cafa28 ("perf tools: Reference count struct map")
Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-02 11:39:09 -03:00
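The lifetime rule from commit aa33b9b9a2, where a callchain cursor node holds its own reference to a map, can be sketched as below. All names (map_sketch, cursor_node_sketch and their helpers) are hypothetical, the counter is a plain int rather than an atomic one, and the freed_flag field exists only so the example can observe when the map is released.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted map. */
struct map_sketch {
	int refcnt;
	int *freed_flag;	/* set to 1 when the map is released */
};

/* Allocate a map holding one reference, owned by the caller. */
static struct map_sketch *map_sketch__new(int *freed_flag)
{
	struct map_sketch *map = calloc(1, sizeof(*map));

	if (map != NULL) {
		map->refcnt = 1;
		map->freed_flag = freed_flag;
	}
	return map;
}

/* Take an additional reference; NULL is passed through unchanged. */
static struct map_sketch *map_sketch__get(struct map_sketch *map)
{
	if (map != NULL)
		map->refcnt++;
	return map;
}

/* Drop a reference and free the map when the last one goes away. */
static void map_sketch__put(struct map_sketch *map)
{
	if (map != NULL && --map->refcnt == 0) {
		if (map->freed_flag != NULL)
			*map->freed_flag = 1;
		free(map);
	}
}

/* Hypothetical cursor node that keeps its map alive while it points at it. */
struct cursor_node_sketch {
	struct map_sketch *map;
};

/* Point the node at `map`, taking the new reference before dropping the old. */
static void cursor_node__set_map(struct cursor_node_sketch *node,
				 struct map_sketch *map)
{
	struct map_sketch *old = node->map;

	node->map = map_sketch__get(map);
	map_sketch__put(old);
}
```

Even if the original owner drops its reference while the cursor still points at the map, the map stays valid until the node moves on or is cleared.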
Namhyung Kim a1c9f97f0b perf diff: Fix -o/--order option behavior (again)
Commit 21e6d84286 ("perf diff: Use perf_hpp__register_sort_field
interface") changed list_add() to perf_hpp__register_sort_field().

This resulted in a behavior change since the field was added to the tail
instead of the head.  So the -o option is mostly ignored due to its
order in the list.

This patch fixes it by adding perf_hpp__prepend_sort_field().

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 21e6d84286 ("perf diff: Use perf_hpp__register_sort_field interface")
Link: http://lkml.kernel.org/r/20170118051457.30946-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-02 11:39:09 -03:00
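The behavior change described in commit a1c9f97f0b comes down to whether a sort field is added at the head or at the tail of a list. A minimal sketch with a hand-rolled singly linked list follows; the types and function names are hypothetical and only mirror the append/prepend distinction, not perf's perf_hpp__* implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sort field and list, kept deliberately small. */
struct field_sketch {
	const char *name;
	struct field_sketch *next;
};

struct field_list_sketch {
	struct field_sketch *head;
	struct field_sketch *tail;
};

/* Add at the tail: the field is consulted after every existing field. */
static void field_list__append(struct field_list_sketch *list,
			       struct field_sketch *field)
{
	field->next = NULL;
	if (list->tail != NULL)
		list->tail->next = field;
	else
		list->head = field;
	list->tail = field;
}

/* Add at the head: the field becomes the primary sort key. */
static void field_list__prepend(struct field_list_sketch *list,
				struct field_sketch *field)
{
	field->next = list->head;
	list->head = field;
	if (list->tail == NULL)
		list->tail = field;
}
```

A field that must take priority, such as the one selected by -o, has to be prepended; appending it leaves it behind the default fields, where it rarely affects the final order.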
Taeung Song 43d41deb71 perf tools: Create for_each_event macro for tracepoints iteration
Similar to for_each_subsystem and for_each_event in util/parse-events.c,
add new macro 'for_each_event' for easy iteration over the tracepoints
in order to be more compact and readable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485862711-20216-2-git-send-email-treeze.taeung@gmail.com
[ Slight change to keep existing style for checking strcmp() return ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-31 16:20:08 -03:00
Joe Stringer 9a9c733d68 tools perf util: Make rm_rf(path) argument const
rm_rf() doesn't modify its path argument, and a future caller will pass
a string constant into it to delete.

Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-5-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-31 16:20:07 -03:00
Krister Johansen 9c68ae98c6 perf callchain: Reference count maps
If dso__load_kcore frees all of the existing maps, but one has already
been attached to a callchain cursor node, then we can get a SIGSEGV in
any function that happens to try to use this invalid cursor.  Use the
existing map refcount mechanism to forestall cleanup of a map until the
cursor iterates past the node.

Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@kernel.org
Fixes: 84c2cafa28 ("perf tools: Reference count struct map")
Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-31 16:19:06 -03:00
Arnaldo Carvalho de Melo ecc4c5614b perf tools: Propagate perf_config() errors
Previously these were being ignored, sometimes silently.

Stop doing that, emitting debug messages and handling the errors.

Testing it:

  $ cat ~/.perfconfig
  cat: /home/acme/.perfconfig: No such file or directory
  $ perf stat -e cycles usleep 1

   Performance counter stats for 'usleep 1':

           938,996      cycles:u

       0.003813731 seconds time elapsed

  $ perf top --stdio
  Error:
  You may not have permission to collect system-wide stats.

  Consider tweaking /proc/sys/kernel/perf_event_paranoid,
  <SNIP>
  [ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
  [acme@jouet linux]$ perf report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  # Overhead  Command  Shared Object      Symbol
  # ........  .......  .................  .........................
    71.77%  usleep   libc-2.24.so       [.] _dl_addr
    27.07%  usleep   ld-2.24.so         [.] _dl_next_ld_env_entry
     1.13%  usleep   [kernel.kallsyms]  [k] page_fault
  $
  $ touch ~/.perfconfig
  $ ls -la ~/.perfconfig
  -rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig
  $
  $ perf stat -e instructions usleep 1

   Performance counter stats for 'usleep 1':

           244,610      instructions:u

       0.000805383 seconds time elapsed

  $
  [root@jouet ~]# chown acme.acme ~/.perfconfig
  [root@jouet ~]# perf stat -e cycles usleep 1
    Warning: File /root/.perfconfig not owned by current user or root, ignoring it.

   Performance counter stats for 'usleep 1':

           937,615      cycles

       0.000836931 seconds time elapsed
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-j2rq96so6xdqlr8p8rd6a3jx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-27 12:23:33 -03:00
Arnaldo Carvalho de Melo afc45cf52c perf config: Do not consider an error not to have any perfconfig file
While propagating the errors from perf_config(), which were being
completely ignored, everything stopped working for people without a
~/.perfconfig file, because the perf_config_set__init() was considering
an error not to have a .perfconfig file, duh, fix it by checking the
errno after the failed stat() call.

It should also not return an error when it says it is ignoring the file,
and an empty file should not return an error either.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 8beeb00f2c ("perf config: Use new perf_config_set__init() to initialize config set")
Link: http://lkml.kernel.org/n/tip-ygpbab3apbs6l8wr97xedwks@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-27 10:28:34 -03:00
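The errno check that commit afc45cf52c describes can be sketched as follows. The function config_file_check_sketch() is a hypothetical helper, not perf_config_set__init() itself; it assumes that a missing or empty file simply means "nothing to load", and it leaves out the ownership check mentioned in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>

/*
 * Hypothetical check run before parsing a config file.
 * Returns 0 when there is no error (including "no such file" and
 * "empty file"), or a negative errno value on a real failure.
 * *usable is set to 1 only when the file exists and is non-empty.
 */
static int config_file_check_sketch(const char *path, int *usable)
{
	struct stat st;

	*usable = 0;

	if (stat(path, &st) < 0) {
		if (errno == ENOENT)
			return 0;	/* no config file: not an error */
		return -errno;		/* anything else is propagated */
	}

	if (st.st_size == 0)
		return 0;		/* empty file: nothing to parse */

	*usable = 1;
	return 0;
}
```

Separating "is there anything to parse" from "did something go wrong" lets callers propagate genuine errors without breaking users who never created a ~/.perfconfig.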
Ingo Molnar e2cf00c257 Merge tag 'perf-core-for-mingo-4.11-20170126' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull the latest perf/core updates from Arnaldo Carvalho de Melo:

New features:

 - Introduce 'perf ftrace' a perf front end to the kernel's ftrace
   function and function_graph tracer, defaulting to the "function_graph"
   tracer, more work will be done in reviving this effort, forward porting
   it from its initial patch submission (Namhyung Kim)

 - Add 'e' and 'c' hotkeys to expand/collapse call chains for a single
   hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)

Fixes:

 - Fix wrong register name for arm64, used in 'perf probe' (He Kuang)

 - Fix map offsets in relocation in libbpf (Joe Stringer)

 - Fix looking up dwarf unwind stack info (Matija Glavinic Pecotic)

Infrastructure changes:

 - libbpf prog functions sync with what is exported via uapi (Joe Stringer)

Trivial changes:

 - Remove unnecessary checks and assignments in 'perf probe's
   try_to_find_absolute_address() (Markus Elfring)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-26 16:20:59 +01:00
Namhyung Kim a7619aef6d perf util: Add more debug message on failure path
It's helpful for debugging on tracing features.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremy Eder <jeder@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>,
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-rjysr9ljiesymgk4qblteaty@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:43:00 -03:00
Namhyung Kim cd4ceb6343 perf util: Save pid-cmdline mapping into tracing header
Current trace info data lacks the saved cmdline mapping which is needed
for pevent to find out the comm of a task.  Add this and bump up the
version number so that perf can determine its presence when reading.

This mostly corresponds to trace.dat file version 6, but still
lacks the 4 bytes for the number of cpus and the 10 bytes of type
string - and I think we don't need those anyway.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremy Eder <jeder@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>,
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
[ Change version test from == to >= ]
Link: http://lkml.kernel.org/n/tip-vaooqpxsikxbb3359p0corcb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:59 -03:00
Arnaldo Carvalho de Melo 0a87e7bc6c perf scripting perl: Do not die() when not founding event for a type
Handle it just like the other cases, i.e. print a debug message and
ignore the sample.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-t7kzlm3cxyvbd7d9n9554ai9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:59 -03:00
Markus Elfring d1d0e29cb7 perf probe: Delete an unnecessary assignment in try_to_find_absolute_address()
Remove an error code assignment which is redundant in an if branch for
the handling of a memory allocation failure because the same value was
set for the local variable "err" before.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/0ede09ec-79b6-c8bd-5b20-02c63ed98aab@users.sourceforge.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:56 -03:00
Markus Elfring 42e233cacc perf probe: Delete an unnecessary check in try_to_find_absolute_address()
Remove a condition check which is unnecessary at the end
because this source code place should usually only be reached
with a non-zero pointer.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/a3f2473b-6383-a326-bce0-b826423608b8@users.sourceforge.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:55 -03:00
Ingo Molnar 47cd95a632 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-25 15:52:46 +01:00
Matija Glavinic Pecotic 9343e45bf6 perf unwind: Fix looking up dwarf unwind stack info
Using perf with call graph method dwarf fails to provide backtrace
support for stripped binary even though .gnu_debuglink points to *.dbg
flavor with properly populated debug symbols.

Problem is reproduced on ARM (v7, v8), kernels 3.14.y, 4.4.y and
4.10.rc3.  Perf is configured with libunwind, and unwind dwarf support
[1]. Test code (stress_bt.c) can be found on [2].

Running (explicitly disable other unwinding methods):

  $ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
	-fno-asynchronous-unwind-tables stress_bt.c
  $ perf record -N --call-graph dwarf ./stress_bt
  $ perf report

results in a properly generated call graph. Stripping the binary and running
it results in a missing call graph, although a call graph is still expected:

  $ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
	  -fno-asynchronous-unwind-tables stress_bt.c
  $ objcopy --only-keep-debug stress_bt stress_bt.dbg
  $ objcopy --strip-debug stress_bt
  $ objcopy --add-gnu-debuglink=stress_bt.dbg stress_bt
  $ perf record -N --call-graph dwarf ./stress_bt
  $ perf report

The problem is that perf doesn't try to read the symbols pointed to by the
gnu debuglink.  This patch adds checking for, and reading of, the symbols
from debuglink and symsrc.  The order of the checks is to first check within
the dso, then check whether symsrc is defined and try to read from it.
Finally, debuglink is checked. Default locations of debug files are
discussed in [3] and [4]. Comments on the RFC are in [5].

[1] https://wiki.linaro.org/LEG/Engineering/TOOLS/perf-callstack-unwinding
[2] [1]#Backtrace_stress_application
[3] https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
[4] https://sourceware.org/binutils/docs/binutils/objcopy.html
[5] https://lkml.org/lkml/2016/8/22/473

Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/d309d40a-463f-482b-68e1-1465326efdc1@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-18 12:29:52 -03:00
Soramichi AKIYAMA d94386f28a perf evlist: Fix typo in deliver_sample()
This patch fixes a typo: s/delievery/delivery/

Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170117222233.dfd92de0ad701e7c53396950@m.soramichi.jp
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17 11:36:45 -03:00
Soramichi AKIYAMA d25ed5d9fa perf tools: Move two variables usied in libperf from perf.c
The use_browser and perf_version_string variables are both declared in
perf.c but they are also referenced by other functions of libperf.a.

Therefore a user linking their own main() with libperf.a must declare those
two variables in their files even if the files never use the browser or
the version information.

This patch fixes this issue by moving use_browser and
perf_version_string out of perf.c to some other files.

Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170117002237.c1aec0ce3b4d675dca018deb@m.soramichi.jp
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17 11:36:45 -03:00
Masami Hiramatsu 613f050d68 perf probe: Fix to probe on gcc generated functions in modules
Fix probing on gcc-generated functions in modules. Since
probing on a module is based on its symbol name, the name should
be adjusted to the actual symbol.

E.g. without this fix, perf probe shows a probe definition
on a non-existent symbol, as below.

  $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -F in_range*
  in_range.isra.12
  $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
  p:probe/in_range nf_nat:in_range+0

With this fix, perf probe correctly shows a probe on
gcc-generated symbol.

  $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
  p:probe/in_range nf_nat:in_range.isra.12+0

This also fixes the same problem on online modules, as below.

  $ perf probe -m i915 -D assert_plane
  p:probe/assert_plane i915:assert_plane.constprop.134+0

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411450673.9978.14905987549651656075.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 15:43:04 -03:00
Masami Hiramatsu 3e96dac7c9 perf probe: Add error checks to offline probe post-processing
Add error check codes on post processing and improve it for offline
probe events as:

 - post processing fails if no matched symbol found in map(-ENOENT)
   or strdup() failed(-ENOMEM).

 - Even if the symbol name is the same, it updates symbol address
   and offset.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411443738.9978.4617979132625405545.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 15:35:25 -03:00
Masami Hiramatsu d2d4edbebe perf probe: Fix to show correct locations for events on modules
Fix to show correct locations for events on modules by relocating given
address instead of retrying after failure.

This happens when the module text size is big enough, bigger than
sh_addr, because the original code retries with given address + sh_addr
if it failed to find CU DIE at the given address.

Any address smaller than sh_addr always fails and it retries with the
correct address, but addresses bigger than sh_addr will get a CU DIE
which is on the given address (not adjusted by sh_addr).

In my environment(x86-64), the sh_addr of ".text" section is 0x10030.
Since i915 is a huge kernel module, we can see this issue as below.

  $ grep "[Tt] .*\[i915\]" /proc/kallsyms | sort | head -n1
  ffffffffc0270000 t i915_switcheroo_can_switch	[i915]

ffffffffc0270000 + 0x10030 = ffffffffc0280030, so we'll check
symbols across this boundary.

  $ grep "[Tt] .*\[i915\]" /proc/kallsyms | grep -B1 ^ffffffffc028\
  | head -n 2
  ffffffffc027ff80 t haswell_init_clock_gating	[i915]
  ffffffffc0280110 t valleyview_init_clock_gating	[i915]

So set up probes on both functions and see what happens.

  $ sudo ./perf probe -m i915 -a haswell_init_clock_gating \
        -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

  	perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  $ sudo ./perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on i915_vga_set_decode:4@gpu/drm/i915/i915_drv.c in i915)

As you can see, haswell_init_clock_gating is correctly shown,
but valleyview_init_clock_gating is not.

With this patch, both events are shown correctly.

  $ sudo ./perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)

Committer notes:

In my case:

  # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

	  perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  # perf probe -l
    probe:haswell_init_clock_gating (on i915_getparam+432@gpu/drm/i915/i915_drv.c in i915)
    probe:valleyview_init_clock_gating (on __i915_printk+240@gpu/drm/i915/i915_drv.c in i915)
  #

  # readelf -SW /lib/modules/4.9.0+/build/vmlinux | egrep -w '.text|Name'
   [Nr] Name   Type      Address          Off    Size   ES Flg Lk Inf Al
   [ 1] .text  PROGBITS  ffffffff81000000 200000 822fd3 00  AX  0   0 4096
  #

  So both are b0rked, now with the fix:

  # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

	perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  # perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
  #

Both look correct.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411436777.9978.1440275861947194930.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 15:14:06 -03:00
Andi Kleen d02fc6bcd5 perf pmu: Factor out scale conversion code
Move the scale factor parsing code into its own function so that it can be
reused in an upcoming patch.

v2: Return error in case strdup returns NULL.
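
A minimal sketch of what such a factored-out helper could look like, assuming
a newline-terminated scale string. The name convert_scale_sketch() and the
-EINVAL case are assumptions for illustration, and the locale handling a real
implementation needs is left out:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the factored-out helper: parse a scale
 * string such as "2.5\n" into a double. Locale handling is omitted.
 */
int convert_scale_sketch(const char *scale, double *sval)
{
	char *copy, *end;
	size_t len;

	assert(scale != NULL && sval != NULL);

	copy = strdup(scale);
	if (copy == NULL)
		return -ENOMEM;	/* the v2 note: report the failed strdup() */

	/* strip one trailing newline, if present */
	len = strlen(copy);
	if (len > 0 && copy[len - 1] == '\n')
		copy[len - 1] = '\0';

	*sval = strtod(copy, &end);

	/* reject empty input and trailing junk (an assumption of this sketch) */
	if (end == copy || *end != '\0') {
		free(copy);
		return -EINVAL;
	}

	free(copy);
	return 0;
}
```

Keeping the parsing in one function means every caller gets the same error
handling for the strdup() failure.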

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170103150833.6694-2-andi@firstfloor.org
[ Keep returning -ENOMEM when strdup() fails in perf_pmu__parse_scale()/convert_scale() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 14:59:15 -03:00
Jiri Olsa 0c5824498e perf record: Add switch-output size warning
Add a switch-output size warning if the requested size is lower than the
wakeup ring buffer size.

  $ perf record --switch-output=1K ls
  WARNING: switch-output data size lower than wakeup kernel buffer size (258K) expect bigger perf.data sizes
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1483955520-29063-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:02 -03:00
Jiri Olsa 9808143ba2 perf tools: Add unit_number__scnprintf function
Add unit_number__scnprintf function to display size units and use it in
-m option info message.

Before:
  $ perf record -m 10M ls
  rounding mmap pages size to 16777216 bytes (4096 pages)
  ...

After:
  $ perf record -m 10M ls
  rounding mmap pages size to 16M (4096 pages)
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1483955520-29063-2-git-send-email-jolsa@kernel.org
[ Rename it to unit_number__scnprintf for consistency ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Soramichi Akiyama e978be9ea2 perf evlist: Fix typo in perf_evlist__start_workload()
This patch fixes a typo: s/enable to/unable to/

Signed-off-by: Soramichi AKIYAMA <akiyama@m.soramichi.jp>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: bcf3145fbe ("perf evlist: Enhance perf_evlist__start_workload()")
Link: http://lkml.kernel.org/r/20170110200006.e1f7a766b4faf1f107ae2e1b@m.soramichi.jp
[ Wasn't applying, fixed it up by hand, added Fixes: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Arnaldo Carvalho de Melo 7d132caaf9 perf machine: Add a kallsyms loading constructor
To reduce the boilerplate for searching for functions in the running
kernel and modules.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-93iqzayafpaxaguoiwjqezgz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:00 -03:00
Masami Hiramatsu 8a937a25a7 perf probe: Fix to probe on gcc generated symbols for offline kernel
Fix perf-probe to show probe definition on gcc generated symbols for
offline kernel (including cross-arch kernel image).

gcc sometimes optimizes functions and generates new symbols with suffixes
such as ".constprop.N" or ".isra.N". Since those symbol names are not
recorded in DWARF, we have to find the correct generated symbols in the
offline ELF binary to probe on them (kallsyms doesn't correct it).  For
the online kernel or uprobes we don't need this because those are rebased
on _text, or a section relative address.

E.g. Without this:

  $ perf probe -k build-arm/vmlinux -F __slab_alloc*
  __slab_alloc.constprop.9
  $ perf probe -k build-arm/vmlinux -D __slab_alloc
  p:probe/__slab_alloc __slab_alloc+0

If you put the above definition on the target machine, it should fail
because there is no __slab_alloc in kallsyms.

With this fix, perf probe shows correct probe definition on
__slab_alloc.constprop.9:

  $ perf probe -k build-arm/vmlinux -D __slab_alloc
  p:probe/__slab_alloc __slab_alloc.constprop.9+0
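
The matching idea can be sketched as follows. This is only an illustration,
not the perf implementation: the function name is hypothetical, and treating
any '.' suffix as a gcc generated variant is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical check: does the ELF symbol 'sym' name the function
 * 'name' itself or a gcc generated variant such as
 * "name.constprop.N" or "name.isra.N"?
 */
bool is_gcc_variant_of(const char *sym, const char *name)
{
	size_t len;

	assert(sym != NULL && name != NULL);

	len = strlen(name);
	if (strncmp(sym, name, len) != 0)
		return false;

	/* exact match, or the function name followed by a '.' suffix */
	return sym[len] == '\0' || sym[len] == '.';
}
```

Requiring the '.' right after the name keeps unrelated functions that merely
share a prefix, such as a hypothetical __slab_alloc_node, from matching.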

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148350060434.19001.11864836288580083501.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-04 11:44:22 -03:00
Masami Hiramatsu eebc509b20 perf probe: Fix --funcs to show correct symbols for offline module
Fix the --funcs (-F) option to show correct symbols for an offline module.
Since perf-probe previously used machine__findnew_module_map() for an
offline module, even if the user passed a module file (with full path)
built for another architecture, perf-probe always tried to load the symbol
map for the current kernel module.

This fix uses dso__new_map() to load the map from the given binary, in the
same way as a map for user applications.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148350053478.19001.15435255244512631545.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-04 11:15:09 -03:00
Arnaldo Carvalho de Melo 7934c98a6e perf symbols: Robustify reading of build-id from sysfs
Markus reported that perf segfaults when reading /sys/kernel/notes from
a kernel linked with GNU gold, due to what looks like a gold bug, so do
some bounds checking to avoid crashing in that case.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Report-Link: http://lkml.kernel.org/r/20161219161821.GA294@x4
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ryhgs6a6jxvz207j2636w31c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-03 16:11:13 -03:00
Masami Hiramatsu 1f2ed153b9 perf probe: Fix to get correct modname from elf header
Since 'perf probe' supports cross-arch probes, it is possible to analyze
a kernel image for a different arch, which may have a different
bits-per-long.

In that case, it fails to get the module name because it uses the
MOD_NAME_OFFSET macro based on the host machine bits-per-long, instead
of the target arch bits-per-long.

This fixes the above issue by choosing the modname offset based on the
target arch's bit width. This is OK because the Linux kernel uses the LP64
model on 64-bit archs.

E.g. without this (on x86_64, and target module is arm32):

  $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
  p:probe/configfs_lookup :configfs_lookup+0
                          ^-Here is an empty module name.

With this fix, you can see correct module name:

  $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
  p:probe/configfs_lookup configfs:configfs_lookup+0
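
A sketch of choosing the offset from the ELF class of the target. The layout
used here is an assumption for illustration: a 4-byte state enum at the start
of 'struct module', followed by a list head of two pointers, then the name.
The function and enum names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Same values as ELFCLASS32 and ELFCLASS64 in e_ident[EI_CLASS] */
enum elf_class_sketch { CLASS32_SKETCH = 1, CLASS64_SKETCH = 2 };

/*
 * Hypothetical helper: offset of the module name for the target's
 * pointer size, independent of the host's sizeof(long).
 */
size_t mod_name_offset_sketch(int elf_class)
{
	size_t state_size = 4;	/* assumed size of the leading enum */
	size_t ptr_size;

	assert(elf_class == CLASS32_SKETCH || elf_class == CLASS64_SKETCH);

	ptr_size = (elf_class == CLASS64_SKETCH) ? 8 : 4;

	/* pad the enum to pointer alignment, then skip two pointers */
	return ((state_size + ptr_size - 1) / ptr_size) * ptr_size +
	       2 * ptr_size;
}
```

Under these assumptions the offset is 12 for a 32-bit target and 24 for a
64-bit one, so using the host value on an arm32 module reads the name from
the wrong place.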

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148337043836.6752.383495516397005695.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-02 14:09:17 -03:00
Kan Liang ed6c166cc7 perf diff: Do not overwrite valid build id
Fixes a perf diff regression which was introduced by commit
5baecbcd9c ("perf symbols: we can now read separate debug-info files
based on a build ID")

The binary name can be the same when perf diff compares different
binaries. The build id is used to distinguish between them.
However, the previous patch assumes that the same binary name has the same
build id, so it overwrites the build id according to the binary name,
regardless of whether the build id is already set.

Check has_build_id in dso__load. If the build id is already set, use it.

Before the fix:

  $ perf diff 1.perf.data 2.perf.data
  # Event 'cycles'
  #
  # Baseline    Delta  Shared Object     Symbol
  # ........  .......  ................  .............................
  #
    99.83%  -99.80%  tchain_edit       [.] f2
     0.12%  +99.81%  tchain_edit       [.] f3
     0.02%   -0.01%  [ixgbe]           [k] ixgbe_read_reg

After the fix:

  $ perf diff 1.perf.data 2.perf.data
  # Event 'cycles'
  #
  # Baseline    Delta  Shared Object     Symbol
  # ........  .......  ................  .............................
  #
    99.83%   +0.10%  tchain_edit       [.] f3
     0.12%   -0.08%  tchain_edit       [.] f2
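
The guard described above can be sketched like this. The struct and function
names are hypothetical and only model the has_build_id check, not the real
dso__load() code:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BUILD_ID_SIZE_SKETCH 20

/* Hypothetical, reduced model of a dso: only the build id fields */
struct dso_sketch {
	bool has_build_id;
	unsigned char build_id[BUILD_ID_SIZE_SKETCH];
};

/*
 * Set the build id found via the binary name only if none is set
 * yet. Returns true if the build id was written.
 */
bool dso_set_build_id_if_unset(struct dso_sketch *dso,
			       const unsigned char *id)
{
	assert(dso != NULL && id != NULL);

	if (dso->has_build_id)
		return false;	/* keep the id already recorded */

	memcpy(dso->build_id, id, BUILD_ID_SIZE_SKETCH);
	dso->has_build_id = true;
	return true;
}
```

With the guard, two binaries that share a name but come from different
perf.data files keep their own build ids.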

Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
CC: Dima Kogan <dima@secretsauce.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 5baecbcd9c ("perf symbols: we can now read separate debug-info files based on a build ID")
Link: http://lkml.kernel.org/r/1481642984-13593-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-20 12:00:38 -03:00
Ravi Bangoria edee44be59 perf annotate: Don't throw error for zero length symbols
'perf report --tui' exits with an error when it finds a sample on a zero
length symbol (i.e. addr == sym->start == sym->end). These are actually
valid samples, so don't exit the TUI and show the report with such symbols.

Reported-and-Tested-by: Anton Blanchard <anton@samba.org>
Link: https://lkml.org/lkml/2016/10/8/189
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org # v4.9+
Link: http://lkml.kernel.org/r/1479804050-5028-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-20 12:00:32 -03:00
Ravi Bangoria e216874cc1 perf annotate: Fix jump target outside of function address range
If a jump target is outside of the function range, perf does not handle it
correctly. In particular, when the target address is lower than the function
start address, the target offset is negative. But the target is declared
unsigned, so the negative number wraps around to its 2's complement
representation. See the example below. Here the target of the 'jmpq'
instruction at 34cf8 is 34ac0, which is lower than the function start
address (34cf0).

        34ac0 - 34cf0 = -0x230 = 0xfffffffffffffdd0

Objdump output:

  0000000000034cf0 <__sigaction>:
  __GI___sigaction():
    34cf0: lea    -0x20(%rdi),%eax
    34cf3: cmp    $0x1,%eax
    34cf6: jbe    34d00 <__sigaction+0x10>
    34cf8: jmpq   34ac0 <__GI___libc_sigaction>
    34cfd: nopl   (%rax)
    34d00: mov    0x386161(%rip),%rax        # 3bae68 <_DYNAMIC+0x2e8>
    34d07: movl   $0x16,%fs:(%rax)
    34d0e: mov    $0xffffffff,%eax
    34d13: retq

perf annotate before applying patch:

  __GI___sigaction  /usr/lib64/libc-2.22.so
           lea    -0x20(%rdi),%eax
           cmp    $0x1,%eax
        v  jbe    10
        v  jmpq   fffffffffffffdd0
           nop
    10:    mov    _DYNAMIC+0x2e8,%rax
           movl   $0x16,%fs:(%rax)
           mov    $0xffffffff,%eax
           retq

perf annotate after applying patch:

  __GI___sigaction  /usr/lib64/libc-2.22.so
           lea    -0x20(%rdi),%eax
           cmp    $0x1,%eax
        v  jbe    10
        ^  jmpq   34ac0 <__GI___libc_sigaction>
           nop
    10:    mov    _DYNAMIC+0x2e8,%rax
           movl   $0x16,%fs:(%rax)
           mov    $0xffffffff,%eax
           retq
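
The wraparound in the example can be reproduced with a small sketch. The
function names are hypothetical, and the range check only illustrates how an
out-of-function target could be detected. It is not the code from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset computed in unsigned arithmetic, as before the patch */
uint64_t jump_offset_unsigned(uint64_t target, uint64_t func_start)
{
	/* wraps modulo 2^64 when target < func_start */
	return target - func_start;
}

/* Hypothetical check: is the target inside [start, end)? */
bool jump_target_in_function(uint64_t target, uint64_t start, uint64_t end)
{
	assert(start <= end);
	return target >= start && target < end;
}
```

For the values above, jump_offset_unsigned(0x34ac0, 0x34cf0) yields
0xfffffffffffffdd0, which is the bogus operand shown before the patch.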

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1480953407-7605-3-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:46 -03:00
Ravi Bangoria 3ee2eb6da2 perf annotate: Support jump instruction with target as second operand
Architectures like PowerPC have jump instructions that include the target
address as a second operand. For example, 'bne cr7,0xc0000000000f6154'.
Add support for such instructions in perf annotate.

objdump o/p:
  c0000000000f6140:   ld     r9,1032(r31)
  c0000000000f6144:   cmpdi  cr7,r9,0
  c0000000000f6148:   bne    cr7,0xc0000000000f6154
  c0000000000f614c:   ld     r9,2312(r30)
  c0000000000f6150:   std    r9,1032(r31)
  c0000000000f6154:   ld     r9,88(r31)

Corresponding perf annotate o/p:

Before patch:
         ld     r9,1032(r31)
         cmpdi  cr7,r9,0
      v  bne    3ffffffffff09f2c
         ld     r9,2312(r30)
         std    r9,1032(r31)
  74:    ld     r9,88(r31)

After patch:
         ld     r9,1032(r31)
         cmpdi  cr7,r9,0
      v  bne    74
         ld     r9,2312(r30)
         std    r9,1032(r31)
  74:    ld     r9,88(r31)
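
A sketch of picking the target out of the operand string. The function name
is hypothetical, and taking everything after the first comma as the target is
a simplification for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: if the operands contain a comma, the target
 * is the second operand ("cr7,0xc0000000000f6154"), otherwise it is
 * the only operand ("34d00"). The address is parsed as hex.
 */
uint64_t parse_jump_target_sketch(const char *operands)
{
	const char *comma;

	assert(operands != NULL);

	comma = strchr(operands, ',');
	return strtoull(comma ? comma + 1 : operands, NULL, 16);
}
```

Subtracting the function start from the parsed address then gives the offset
shown in the After output. For the listing above, the '74:' label implies a
function start of 0xc0000000000f60e0.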

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1480953407-7605-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:46 -03:00
Jiri Olsa a359c17a7e perf evsel: Allow to ignore missing pid
Add the perf_evsel::ignore_missing_cpu_thread bool.

When set to true, it allows perf to ignore the error the perf event
syscall returns for a missing pid.

We remove the missing thread id from the thread_map, so the rest of the
processing, like ioctl and mmap, won't get disturbed by a -1 fd.

The reason for supporting this is to ease monitoring of a group of pids
that 'disappear' before perf opens their event. This currently leads
perf to report an error and exit, and makes perf record's -u option
unusable under certain setups.

With this change we allow this race and ignore such a failure with the
following warning:

  WARNING: Ignored open failure for pid 8605
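
The decision can be sketched as below. The function name is hypothetical.
Checking for ESRCH and requiring more than one thread in the map are
assumptions of this sketch, not details taken from the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical predicate: should a failed open for one thread be
 * ignored instead of aborting?
 */
bool ignore_missing_thread_sketch(bool ignore_flag, int err, int nr_threads)
{
	assert(nr_threads >= 0);

	if (!ignore_flag)
		return false;	/* feature not requested */

	if (err != -ESRCH)
		return false;	/* only "no such process" is ignorable */

	/* assumed guard: never drop the last thread in the map */
	return nr_threads > 1;
}
```

When the predicate holds, the caller would remove the thread from the map,
print the warning and carry on with the remaining threads.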

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20161213074622.GA3084@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:46 -03:00
Jiri Olsa 38af91f01d perf thread_map: Add thread_map__remove function
Add the thread_map__remove function to remove a thread from a thread map.

Also add an automated test.

Committer notes:

Testing it:

  # perf test "Remove thread map"
  39: Remove thread map                          : Ok
  # perf test -v "Remove thread map"
  39: Remove thread map                          :
  --- start ---
  test child forked, pid 4483
  2 threads: 4482, 4483
  1 thread: 4483
  0 thread:
  test child finished with 0
  ---- end ----
  Remove thread map: Ok
  #
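
The removal exercised by the test can be sketched on a fixed-size map. The
struct, the function name and the fixed capacity are hypothetical. The real
thread map is more involved:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical, reduced thread map with a fixed capacity */
struct thread_map_sketch {
	int nr;
	pid_t pids[8];
};

/* Remove the entry at idx and close the gap; 0 on success */
int thread_map_remove_sketch(struct thread_map_sketch *map, int idx)
{
	assert(map != NULL);

	if (map->nr < 1 || idx < 0 || idx >= map->nr)
		return -EINVAL;

	/* shift the entries after idx down by one slot */
	memmove(&map->pids[idx], &map->pids[idx + 1],
		(size_t)(map->nr - idx - 1) * sizeof(map->pids[0]));
	map->nr--;
	return 0;
}
```

Starting from the two threads 4482 and 4483, removing index 0 twice leaves
first 4483 and then an empty map, as in the verbose test output.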

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481538943-21874-4-git-send-email-jolsa@kernel.org
[ Added stdlib.h, to get the free() declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:45 -03:00
Jiri Olsa 83c2e4f396 perf evsel: Use variable instead of repeating lengthy FD macro
It's more readable and will make the following patches easier.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481538943-21874-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:45 -03:00
Namhyung Kim 571f1eb9b9 perf callchain: Introduce callchain_cursor__copy()
The callchain_cursor__copy() function saves the current callchain
captured by a cursor.  It'll be used to keep callchains when switching
to the idle task on each cpu.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161206034010.6499-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:33 -03:00
Ravi Bangoria bec60e50af perf annotate: Show raw form for jump instruction with indirect target
For jump instructions that do not include the target address as a direct operand,
show the original disassembled line. This is needed for certain
powerpc jump instructions that take the target address from a register (such as bctr,
btar, ...).

Before:
     ld     r12,32088(r12)
     mtctr  r12
  v  bctr   ffffffffffffca2c
     std    r2,24(r1)
     addis  r12,r2,-1

After:
     ld     r12,32088(r12)
     mtctr  r12
  v  bctr
     std    r2,24(r1)
     addis  r12,r2,-1

Committer notes:

Testing it using a perf.data file and vmlinux for powerpc64,
cross-annotating it on a x86_64 workstation:

Before:

  .__bpf_prog_run  vmlinux.powerpc
         │        std    r10,512(r9)                      ▒
         │        lbz    r9,0(r31)                        ▒
         │        rldicr r9,r9,3,60                       ▒
         │        ldx    r9,r30,r9                        ▒
         │        mtctr  r9                               ▒
  100.00 │      ↓ bctr   3fffffffffe01510                 ▒
         │        lwa    r10,4(r31)                       ▒
         │        lwz    r9,0(r31)                        ▒
  <SNIP>
  Invalid jump offset: 3fffffffffe01510

After:

  .__bpf_prog_run  vmlinux.powerpc
         │        std    r10,512(r9)                      ▒
         │        lbz    r9,0(r31)                        ▒
         │        rldicr r9,r9,3,60                       ▒
         │        ldx    r9,r30,r9                        ▒
         │        mtctr  r9                               ▒
  100.00 │      ↓ bctr                                    ▒
         │        lwa    r10,4(r31)                       ▒
         │        lwz    r9,0(r31)                        ▒
  <SNIP>
  Invalid jump offset: 3fffffffffe01510

This, in turn, uncovers another problem with jumps without operands: the
ENTER/-> operation, used to jump to the target, still uses the bogus
target :-)

BTW, this was the file used for the above tests:

  [acme@jouet ravi_bangoria]$ perf report --header-only -i perf.data.f22vm.powerdev
  # ========
  # captured on: Thu Nov 24 12:40:38 2016
  # hostname : pdev-f22-qemu
  # os release : 4.4.10-200.fc22.ppc64
  # perf version : 4.9.rc1.g6298ce
  # arch : ppc64
  # nrcpus online : 48
  # nrcpus avail : 48
  # cpudesc : POWER7 (architected), altivec supported
  # cpuid : 74,513
  # total memory : 4158976 kB
  # cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a
  # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 1, c
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
  # missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE
  # ========
  #
  [acme@jouet ravi_bangoria]$

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1480953407-7605-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 17:21:57 -03:00
Wang Nan edd695b032 perf clang: Compile BPF script using builtin clang support
After this patch, perf uses builtin clang support to build BPF
scripts. It no longer depends on an external clang, but falls back to it
if for some reason the builtin compiling framework fails.

Test:

  $ type clang
  -bash: type: clang: not found
  $ cat ~/.perfconfig
  $ echo '#define LINUX_VERSION_CODE 0x040700' > ./test.c
  $ cat ./tools/perf/tests/bpf-script-example.c >> ./test.c
  $ ./perf record -v --dry-run -e ./test.c 2>&1 | grep builtin
  bpf: successfull builtin compilation
  $

cflags can't be passed yet, so kernel headers can't be included for now.
This will be fixed by following commits.

Committer notes:

Make sure '-v' comes before the '-e ./test.c' in the command line, otherwise the
'verbose' variable will not be set when the bpf event is parsed and thus the
pr_debug indicating a 'successfull builtin compilation' will not be output, as
the debug level (1) will be greater than what 'verbose' has at that point (0).
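
The ordering effect can be modelled with a small sketch. It assumes options
are handled left to right and that the event is parsed as soon as '-e' is
seen. The function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* A debug message of the given level is shown only if verbose >= level */
bool debug_would_print(int verbose, int level)
{
	return verbose >= level;
}

/*
 * Walk the options in order: '-v' raises verbose, '-e' parses the
 * event and would emit the level 1 message at that moment.
 */
bool compile_message_shown(const char *const args[], int nargs)
{
	int verbose = 0;
	bool shown = false;

	assert(args != NULL && nargs >= 0);

	for (int i = 0; i < nargs; i++) {
		if (strcmp(args[i], "-v") == 0)
			verbose++;
		else if (strcmp(args[i], "-e") == 0)
			shown = debug_would_print(verbose, 1);
	}

	return shown;
}
```

In this model "-v -e" shows the message while "-e -v" does not, because
verbose is still 0 when the event is parsed.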

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-16-wangnan0@huawei.com
[ Spell check/reflow successfull pr_debug string ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:45 -03:00