Commit Graph

4186 Commits

Author SHA1 Message Date
Wang Nan 258e4bfcbd tools: Pass arg to fdarray__filter's call back function
Before this patch there's no way to pass arguments to fdarray__filter's
call back function.

This improvement will be used by 'perf record' to support unmapping ring
buffer for both main evlist and overwrite evlist. Without this patch
there's no way to track overwrite evlist from 'struct fdarray'.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464183898-174512-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-30 12:41:46 -03:00
Wang Nan 5a5ddeb6e3 perf evlist: Choose correct reading direction according to evlist->backward
Now we have evlist->backward to indicate the mmap direction. Make
perf_evlist__mmap_read() choose right direction automatically.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464183898-174512-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-30 12:41:45 -03:00
Wang Nan e10e4ef63b perf evlist: Check 'base' pointer before checking refcnt when put a mmap
evlist->mmap[i]->refcnt could be 0 if an evlist has no evsel or if all
evsels don't match the evlist during mmap. For example, when all evsels
are overwritable but the evlist itself is normal. To avoid crashing,
perf should check 'base' pointer before checking refcnt, and raise bug
only when base is not NULL.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464183898-174512-2-git-send-email-wangnan0@huawei.com
[ Renamed 'mmap' variable, it is reserved in old distros such as Ubuntu 12.04, breaking the build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-30 12:41:45 -03:00
Wang Nan f3058a1c19 perf evlist: Don't poll and mmap overwritable events
There's no need to receive events from overwritable ring buffer.
Instead, perf should make them run in background until some external
event of interest takes place.  This patch makes ignores normal events from
overwrite evlists.

Overwritable events must be mapped readonly and backward, so if evlist
and evsel doesn't match (evsel->overwrite is true but either evlist is
read/write or evlist is not backward, and vice versa), skip mapping it.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464056944-166978-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-30 12:41:45 -03:00
Arnaldo Carvalho de Melo 792d48b4cf perf tools: Per event max-stack settings
The tooling counterpart, now it is possible to do:

  # perf record -e sched:sched_switch/max-stack=10/ -e cycles/call-graph=dwarf,max-stack=4/ -e cpu-cycles/call-graph=dwarf,max-stack=1024/ usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.052 MB perf.data (5 samples) ]
  # perf evlist -v
  sched:sched_switch: type: 2, size: 112, config: 0x110, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, sample_max_stack: 10
  cycles/call-graph=dwarf,max-stack=4/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 4
  cpu-cycles/call-graph=dwarf,max-stack=1024/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 1024
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

Using just /max-stack=N/ means /call-graph=fp,max-stack=N/, that should
be further configurable by means of some .perfconfig knob.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-30 12:41:44 -03:00
Andi Kleen 480ca357fd perf thread: Adopt get_main_thread from db-export.c
Move the get_main_thread function from db-export.c to thread.c so that
it can be used elsewhere.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464051145-19968-2-git-send-email-andi@firstfloor.org
[ Removed leftover bits from db-export.h ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-30 12:41:43 -03:00
Wang Nan 5ea5888b2f perf ctf: Convert invalid chars in a string before set value
We observed some crazy apps on Android set their comm to unprintable
string. For example:

  # cat /proc/10607/task/*/comm
  tencent.qqmusic
  ...
  Binder_2
  日志输出线  <-- Chinese word 'log output thread'
  WifiManager
  ...

'perf data convert' fails to convert perf.data with such string to CTF format.

For example:

  # cat << EOF > ./badguy.c
  #include <sys/prctl.h>
  int main(int argc, char *argv[])
  {
         prctl(PR_SET_NAME, "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf");
         while(1)
                 sleep(1);
         return 0;
  }
  EOF
  # gcc ./badguy.c
  # perf record -e sched:* ./a.out
  # perf data convert --to-ctf ./bad.ctf
  CTF stream 4 flush failed
  [ perf data convert: Converted 'perf.data' into CTF data './bad.ctf' ]
  [ perf data convert: Converted and wrote 0.008 MB (78 samples)  ]
  # babeltrace ./bad.ctf/
  [error] Packet size (18446744073709551615 bits) is larger than remaining file size (262144 bits).
  [error] Stream index creation error.
  [error] Open file stream error.
  [warning] [Context] Cannot open_trace of format ctf at path ./bad.ctf.
  [warning] [Context] cannot open trace "./bad.ctf" from ./bad.ctf/ for reading.
  [error] Cannot open any trace for reading.

  [error] opening trace "./bad.ctf/" for reading.

  [error] none of the specified trace paths could be opened.

This patch converts unprintable characters to hexadecimal word.

After applying this patch the above test works correctly:

  # ~/perf data convert --to-ctf ./good.ctf
  [ perf data convert: Converted 'perf.data' into CTF data './good.ctf' ]
  [ perf data convert: Converted and wrote 0.008 MB (78 samples) ]
  # babeltrace ./good.ctf
  ..
  [23:14:35.491665268] (+0.000001100) sched:sched_wakeup: { cpu_id = 4 }, { perf_ip = 0xFFFFFFFF810AEF33, perf_tid = 0, perf_pid = 0, perf_id = 5123, perf_period = 1, common_type = 270, common_flags = 45, common_preempt_count = 4, common_pid = 0, comm = "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf", pid = 1057, prio = 120, success = 1, target_cpu = 4 }
  [23:14:35.491666230] (+0.000000962) sched:sched_wakeup: { cpu_id = 4 }, { perf_ip = 0xFFFFFFFF810AEF33, perf_tid = 0, perf_pid = 0, perf_id = 5122, perf_period = 1, common_type = 270, common_flags = 45, common_preempt_count = 4, common_pid = 0, comm = "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf", pid = 1057, prio = 120, success = 1, target_cpu = 4 }
  ..

Committer note:

To build perf with libabeltrace, use:

  $ mkdir -p /tmp/build/perf
  $ make LIBBABELTRACE=1 LIBBABELTRACE_DIR=/usr/local O=/tmp/build/perf -C tools/perf install-bin

Or equivalent (no O=, fixup LIBBABELTRACE_DIR, etc).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464348951-179595-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-27 12:08:40 -03:00
Wang Nan 3dc6c1d54f perf record: Fix crash when kptr is restricted
Before this patch, a simple 'perf record' could fail if kptr_restrict is
set to 1 (for normal user) or 2 (for root):

  # perf record ls
  WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
  check /proc/sys/kernel/kptr_restrict.

  Samples in kernel functions may not be resolved if a suitable vmlinux
  file is not found in the buildid cache or in the vmlinux path.

  Samples in kernel modules won't be resolved at all.

  If some relocation was applied (e.g. kexec) symbols may be misresolved
  even with a suitable vmlinux or kallsyms file.

  Segmentation fault (core dumped)

This patch skips perf_event__synthesize_kernel_mmap() when kptr is not
available.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes: 45e9005690 ("perf machine: Do not bail out if not managing to read ref reloc symbol")
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464081688-167940-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-27 09:41:39 -03:00
Wang Nan 38272dc4f1 perf symbols: Check kptr_restrict for root
If kptr_restrict is set to 2, even root is not allowed to see pointers.
This patch checks kptr_restrict even if euid == 0. For root, report
error if kptr_restrict is 2.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464081688-167940-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-27 09:41:23 -03:00
Linus Torvalds bdc6b758e4 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Mostly tooling and PMU driver fixes, but also a number of late updates
  such as the reworking of the call-chain size limiting logic to make
  call-graph recording more robust, plus tooling side changes for the
  new 'backwards ring-buffer' extension to the perf ring-buffer"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  perf record: Read from backward ring buffer
  perf record: Rename variable to make code clear
  perf record: Prevent reading invalid data in record__mmap_read
  perf evlist: Add API to pause/resume
  perf trace: Use the ptr->name beautifier as default for "filename" args
  perf trace: Use the fd->name beautifier as default for "fd" args
  perf report: Add srcline_from/to branch sort keys
  perf evsel: Record fd into perf_mmap
  perf evsel: Add overwrite attribute and check write_backward
  perf tools: Set buildid dir under symfs when --symfs is provided
  perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced
  perf annotate: Sort list of recognised instructions
  perf annotate: Fix identification of ARM blt and bls instructions
  perf tools: Fix usage of max_stack sysctl
  perf callchain: Stop validating callchains by the max_stack sysctl
  perf trace: Fix exit_group() formatting
  perf top: Use machine->kptr_restrict_warned
  perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1
  perf machine: Do not bail out if not managing to read ref reloc symbol
  perf/x86/intel/p4: Trival indentation fix, remove space
  ...
2016-05-25 17:05:40 -07:00
Wang Nan 3a62a7b820 perf record: Read from backward ring buffer
Introduce rb_find_range() to find start and end position from a backward
ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-5-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-23 18:22:48 -03:00
Wang Nan 65aea23387 perf evlist: Add API to pause/resume
perf_evlist__toggle_{pause,resume}() are introduced to pause/resume
events in an evlist. Utilize PERF_EVENT_IOC_PAUSE_OUTPUT ioctl.

Following commits use them to ensure overwrite ring buffer is paused
before reading.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-2-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Return -1, like all other ioctl() usage in evlist.c, rename 'pause'
  arg to avoid breaking the build on ubuntu 12.04 and other old systems ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-23 18:22:00 -03:00
Andi Kleen 508be0dfe6 perf report: Add srcline_from/to branch sort keys
Add "srcline_from" and "srcline_to" branch sort keys that allow to show
the source lines of a branch.

That makes it much easier to track down where particular branches happen
in the program, for example to examine branch mispredictions, or to
associate it with cycle counts:

  % perf record -b -e cycles:p ./tcall
  % perf report --sort srcline_from,srcline_to,mispredict
  ...
    15.10%  tcall.c:18       tcall.c:10       N
    14.83%  tcall.c:11       tcall.c:5        N
    14.12%  tcall.c:7        tcall.c:12       N
    14.04%  tcall.c:12       tcall.c:5        N
    12.42%  tcall.c:17       tcall.c:18       N
    12.39%  tcall.c:7        tcall.c:13       N
    12.27%  tcall.c:13       tcall.c:17       N
  ...

  % perf report --sort srcline_from,srcline_to,cycles
  ...
    17.12%  tcall.c:18       tcall.c:11       1
    17.01%  tcall.c:12       tcall.c:6        1
    16.98%  tcall.c:11       tcall.c:6        1
    15.91%  tcall.c:17       tcall.c:18       1
     6.38%  tcall.c:7        tcall.c:17       7
     4.80%  tcall.c:7        tcall.c:12       8
     4.21%  tcall.c:7        tcall.c:17       8
     2.67%  tcall.c:7        tcall.c:12       7
     2.62%  tcall.c:7        tcall.c:12       10
     2.10%  tcall.c:7        tcall.c:17       9
     1.58%  tcall.c:7        tcall.c:12       6
     1.44%  tcall.c:7        tcall.c:12       5
     1.38%  tcall.c:7        tcall.c:12       9
     1.06%  tcall.c:7        tcall.c:17       13
     1.05%  tcall.c:7        tcall.c:12       4
     1.01%  tcall.c:7        tcall.c:17       6

Open issues:

- Some kernel symbols get misresolved.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1463775308-32748-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-23 11:25:16 -03:00
Wang Nan d4c6fb36ac perf evsel: Record fd into perf_mmap
Add a fd field into struct perf_mmap so that perf can track the mmap fd.

This feature will be used for toggling overwrite ring buffers.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463762315-155689-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 14:56:58 -03:00
Wang Nan b90dc17a5d perf evsel: Add overwrite attribute and check write_backward
Add 'overwrite' attribute to evsel to mark whether this event is
overwritable. The following commits will support syntax like:

  # perf record -e cycles/overwrite/ ...

An overwritable evsel requires kernel support for the
perf_event_attr.write_backward ring buffer feature.

Add it to perf_missing_feature.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463762315-155689-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 14:54:23 -03:00
Ingo Molnar 408cf67707 perf/core improvements and fixes:
User visible:
 
 - We should not use the current value of the kernel.perf_event_max_stack as the
   default value for --max-stack in tools that can process perf.data files, they
   will only match if that sysctl wasn't changed from its default value at the
   time the perf.data file was recorded, fix it.
 
   This fixes a bug where a 'perf record -a --call-graph dwarf ; perf report'
   produces a glibc invalid free backtrace (Arnaldo Carvalho de Melo)
 
 - Provide a better warning when running 'perf trace' on a system where the
   kernel.kptr_restrict is set to 1, similar to the one produced by 'perf record',
   noticed on ubuntu 16.04 where this is the default kptr_restrict setting.
   (Arnaldo Carvalho de Melo)
 
 - Fix ordering of instructions in the annotation code, noticed when annotating
   ARM binaries, now that table is auto-ordered at first use, to avoid more such
   problems (Chris Ryder)
 
 - Set buildid dir under symfs when --symfs is provided (He Kuang)
 
 - Fix the 'exit_group()' syscall output in 'perf trace' (Arnaldo Carvalho de Melo)
 
 - Only auto set call-graph to "dwarf" in 'perf trace' when syscalls are being
   traced (Arnaldo Carvalho de Melo)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXPyWGAAoJENZQFvNTUqpAb0IQAJnk65WicWUuMppPic3owIb7
 KWudqyRtGY+3Rq1HkuWr49IeC0sEAQw5OAP3fiwJuwBap/+7VtIkKFzGr/+3N1lC
 lncj8Pmap52SraqKmQbR9GV5fqr6/I1MW28uEFLvI0bBYkGRm55QX8wjGlM7AMUu
 m47ok/y07An6OFSJoXV+Sn/gbRJVOayjw/1o9cYy6s/6j2ZV3gN8rUiuHVwtfIhd
 OKoxGPQUX1RVmWrzy6aG718JM8pt1tdQ6zd7AgEBbqvevT8TMxTgE8EzSw3TyQXL
 +O3CF91gmWm7SE8LxKJ0VhGazMJ9IFj/HcK6XV4BAJAX7IDCBmoRJdrGvbDIG6WC
 wHoBkMlpGkq2Y1eBsYQzbmj0ZE80jx8qvi6GETOnTTmaUyCBB3HI+1yUwYZIKVVN
 aqZUPfbJBlJU8MMj1SMS/QU30TKtdeCA3QSJEQC6jf8AinLgqIs2a+3FshHzQ1Tq
 2l55ySdPTpcUiTK9demBJP5GZTHSr8UfIXXZvuD2pu3zrf/5YdWxOP11HppreffA
 vG9C71HJrAbiSlFmjWOF9qra//hN4DD3c7F5ikBePWGk1Hvf0NfjO4hDZUMva4yn
 bXsB203+qe/R2G2sT8OfCYV1UnEp+lOMSDJXzLv7YS4rD8Nq/tPKmMXcNV7WIHWO
 cLf0Ati8z4DBGXFGaTsI
 =mOmQ
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-20160520' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

- We should not use the current value of the kernel.perf_event_max_stack as the
  default value for --max-stack in tools that can process perf.data files, they
  will only match if that sysctl wasn't changed from its default value at the
  time the perf.data file was recorded, fix it.

  This fixes a bug where a 'perf record -a --call-graph dwarf ; perf report'
  produces a glibc invalid free backtrace (Arnaldo Carvalho de Melo)

- Provide a better warning when running 'perf trace' on a system where the
  kernel.kptr_restrict is set to 1, similar to the one produced by 'perf record',
  noticed on ubuntu 16.04 where this is the default kptr_restrict setting.
  (Arnaldo Carvalho de Melo)

- Fix ordering of instructions in the annotation code, noticed when annotating
  ARM binaries, now that table is auto-ordered at first use, to avoid more such
  problems (Chris Ryder)

- Set buildid dir under symfs when --symfs is provided (He Kuang)

- Fix the 'exit_group()' syscall output in 'perf trace' (Arnaldo Carvalho de Melo)

- Only auto set call-graph to "dwarf" in 'perf trace' when syscalls are being
  traced (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-20 19:37:43 +02:00
Linus Torvalds c04a588029 powerpc updates for 4.7
Highlights:
  - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
  - Live patching support for ppc64le (also merged via livepatching.git)
 
 Various cleanups & minor fixes from:
  - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
    Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart
    Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael
    Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta,
    Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin
    Rothberg, Vipin K Parashar.
 
 General:
  - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot
  - Fix branching to OOL handlers in relocatable kernel from Hari Bathini
  - Add support for userspace Power9 copy/paste from Chris Smart
  - Always use STRICT_MM_TYPECHECKS from Michael Ellerman
  - Add mask of possible MMU features from Michael Ellerman
 
 PCI:
  - Enable pass through of NVLink to guests from Alexey Kardashevskiy
  - Cleanups in preparation for powernv PCI hotplug from Gavin Shan
  - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
  - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
  - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G. Piccoli
  - Remove the dependency on EEH struct in DDW mechanism from Guilherme G. Piccoli
 
 selftests:
  - Test cp_abort during context switch from Chris Smart
  - Add several tests for transactional memory support from Rashmica Gupta
 
 perf:
  - Add support for sampling interrupt register state from Anju T
  - Add support for unwinding perf-stackdump from Chandan Kumar
 
 cxl:
  - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud
  - Allow initialization on timebase sync failures from Frederic Barrat
  - Increase timeout for detection of AFU mmio hang from Frederic Barrat
  - Handle num_of_processes larger than can fit in the SPA from Ian Munsie
  - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie
  - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie
  - Check periodically the coherent platform function's state from Christophe Lombard
 
 Freescale:
  - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum
    workaround, and a kconfig dependency fix."
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXPsGzAAoJEFHr6jzI4aWAVoAP/iKdrDe0eYHlVAE9SqnbsiZs
 lgDxdsC8P3fsmP1G9o/HkKhC82zHl/La8Ztz8dtqa+LkSzbfliWP1ztJsI7GsBFo
 tyCKzWnX9Rwvd3meHu/o/SQ29TNLm/PbPyyRqpj5QPbJ8XCXkAXR7ZZZqjvcMsJW
 /AgIr7Cgf53tl9oZzzl/c7CnNHhMq+NBdA71vhWtUx+T97wfJEGyKW6HhZyHDbEU
 iAki7fu77ZpEqC/Fh9swf0dCGBJ+a132NoMVo0AdV7EQLznUYlQpQEqa+1PyHZOP
 /ArOzf2mDg6m3PfCo1eiB07v8PnVZ3llEUbVAJNg3GUxbE4SHrqq/kwm0iElm3p/
 DvFxerCwdX9vmskJX4wDs+pSZRabXYj9XVMptsgFzA4joWrqqb7mBHqaort88YcY
 YSljEt1bHyXmiJ+dBya40qARsWUkCVN7ZgEzdxckq0KI3w7g2tqpqIbO2lClWT6t
 B3GpqQ4jp34+d1M14FB91fIGK7tMvOhSInE0Mv9+tPvRsepXqiiU/SwdAtRlr3m2
 zs/K+4FYcVjJ3Rmpgc+tI38PbZxHe212I35YN6L1LP+4ZfAtzz0NyKdooTIBtkbO
 19pX4WbBjKq8zK+YutrySncBIrbnI6VjW51vtRhgVKZliPFO/6zKagyU6FbxM+E5
 udQES+t3F/9gvtxgxtDe
 =YvyQ
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights:
   - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
   - Live patching support for ppc64le (also merged via livepatching.git)

  Various cleanups & minor fixes from:
   - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
     Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie,
     Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring,
     Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras,
     Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung
     Bauermann, Valentin Rothberg, Vipin K Parashar.

  General:
   - Update LMB associativity index during DLPAR add/remove from Nathan
     Fontenot
   - Fix branching to OOL handlers in relocatable kernel from Hari Bathini
   - Add support for userspace Power9 copy/paste from Chris Smart
   - Always use STRICT_MM_TYPECHECKS from Michael Ellerman
   - Add mask of possible MMU features from Michael Ellerman

  PCI:
   - Enable pass through of NVLink to guests from Alexey Kardashevskiy
   - Cleanups in preparation for powernv PCI hotplug from Gavin Shan
   - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
   - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
   - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
     from Guilherme G Piccoli
   - Remove the dependency on EEH struct in DDW mechanism from Guilherme
     G Piccoli

  selftests:
   - Test cp_abort during context switch from Chris Smart
   - Add several tests for transactional memory support from Rashmica
     Gupta

  perf:
   - Add support for sampling interrupt register state from Anju T
   - Add support for unwinding perf-stackdump from Chandan Kumar

  cxl:
   - Configure the PSL for two CAPI ports on POWER8NVL from Philippe
     Bergheaud
   - Allow initialization on timebase sync failures from Frederic Barrat
   - Increase timeout for detection of AFU mmio hang from Frederic
     Barrat
   - Handle num_of_processes larger than can fit in the SPA from Ian
     Munsie
   - Ensure PSL interrupt is configured for contexts with no AFU IRQs
     from Ian Munsie
   - Add kernel API to allow a context to operate with relocate disabled
     from Ian Munsie
   - Check periodically the coherent platform function's state from
     Christophe Lombard

  Freescale:
   - Updates from Scott: "Contains 86xx fixes, minor device tree fixes,
     an erratum workaround, and a kconfig dependency fix."

* tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits)
  powerpc/86xx: Fix PCI interrupt map definition
  powerpc/86xx: Move pci1 definition to the include file
  powerpc/fsl: Fix build of the dtb embedded kernel images
  powerpc/fsl: Fix rcpm compatible string
  powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC
  powerpc/fsl-pci: Add a workaround for PCI 5 errata
  powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb
  powerpc/powernv/npu: Add PE to PHB's list
  powerpc/powernv: Fix insufficient memory allocation
  powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism
  Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
  powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner()
  powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover()
  powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover()
  powerpc/eeh: Don't report error in eeh_pe_reset_and_recover()
  Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()"
  powerpc/powernv/npu: Enable NVLink pass through
  powerpc/powernv/npu: Rework TCE Kill handling
  powerpc/powernv/npu: Add set/unset window helpers
  powerpc/powernv/ioda2: Export debug helper pe_level_printk()
  ...
2016-05-20 10:12:41 -07:00
He Kuang a706670900 perf tools: Set buildid dir under symfs when --symfs is provided
This patch moves the reference of buildid dir to 'symfs/.debug' and
skips the local buildid dir when '--symfs' is given, so that every
single file opened by perf is relative to symfs directory now.

Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1463658462-85131-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:58 -03:00
Chris Ryder 7e4c149813 perf annotate: Sort list of recognised instructions
Currently the list of instructions recognised by perf annotate has to be
explicitly written in sorted order. This makes it easy to make mistakes
when adding new instructions. Sort the list of instructions on first
access.

Signed-off-by: Chris Ryder <chris.ryder@arm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/4268febaf32f47f322c166fb2fe98cfec7041e11.1463676839.git.chris.ryder@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:57 -03:00
Chris Ryder 58c0400176 perf annotate: Fix identification of ARM blt and bls instructions
The ARM blt and bls instructions are not correctly identified when
parsing assembly because the list of recognised instructions must be
sorted by name. Swap the ordering of blt and bls.

Signed-off-by: Chris Ryder <chris.ryder@arm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/560e196b7c79b7ff853caae13d8719a31479cb1a.1463676839.git.chris.ryder@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:57 -03:00
Arnaldo Carvalho de Melo fe176085a4 perf tools: Fix usage of max_stack sysctl
We cannot limit processing stacks from the current value of the sysctl,
as we may be processing perf.data files, possibly from other machines.

Instead use the old PERF_MAX_STACK_DEPTH, the sysctl default, that can
be overriden using --max-stack or equivalent.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Fixes: 4cb93446c5 ("perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack")
Link: http://lkml.kernel.org/n/tip-eqeutsr7n7wy0c36z24ytvii@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:56 -03:00
Arnaldo Carvalho de Melo bf8bddbf19 perf callchain: Stop validating callchains by the max_stack sysctl
As thread__resolve_callchain_sample can be used for handling perf.data
files, that could've been recorded with a large max_stack sysctl setting
than what the system used for analysis has set.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-2995bt2g5yq2m05vga4kip6m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:56 -03:00
Arnaldo Carvalho de Melo e77a07425f perf top: Use machine->kptr_restrict_warned
Its now there, no need to have it too.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-y18oeou494uy11im7u9to0dx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:55 -03:00
Arnaldo Carvalho de Melo caf8a0d049 perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1
Hook into the libtraceevent plugin kernel symbol resolver to warn the
user that that can't happen with kptr_restrict=1.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9gc412xx1gl0lvqj1d1xwlyb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:54 -03:00
Arnaldo Carvalho de Melo 45e9005690 perf machine: Do not bail out if not managing to read ref reloc symbol
This means the user can't access /proc/kallsyms, for instance, because
/proc/sys/kernel/kptr_restrict is set to 1.

Instead leave the ref_reloc_sym as NULL and code using it will cope.

This allows 'perf trace' to work on such systems for !root, the only
issue would be when trying to resolve kernel symbols, which happens,
for instance, in some libtracevent plugins.  A warning for that case
will be provided in the next patch in this series.

Noticed in Ubuntu 16.04, that comes with kptr_restrict=1.

Reported-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-knpu3z4iyp2dxpdfm798fac4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:54 -03:00
Ingo Molnar 21f77d231f perf/core improvements and fixes:
User visible:
 
 - Honour the kernel.perf_event_max_stack knob more precisely by not counting
   PERF_CONTEXT_{KERNEL,USER} when deciding when to stop adding entries to
   the perf_sample->ip_callchain[] array (Arnaldo Carvalho de Melo)
 
 - Fix identation of 'stalled-backend-cycles' in 'perf stat' (Namhyung Kim)
 
 - Update runtime using 'cpu-clock' event in 'perf stat' (Namhyung Kim)
 
 - Use 'cpu-clock' for cpu targets in 'perf stat' (Namhyung Kim)
 
 - Avoid fractional digits for integer scales in 'perf stat' (Andi Kleen)
 
 - Store vdso buildid unconditionally, as it appears in callchains and
   we're not checking those when creating the build-id table, so we
   end up not being able to resolve VDSO symbols when doing analysis
   on a different machine than the one where recording was done, possibly
   of a different arch even (arm -> x86_64) (He Kuang)
 
 Infrastructure:
 
 - Generalize max_stack sysctl handler, will be used for configuring
   multiple kernel knobs related to callchains (Arnaldo Carvalho de Melo)
 
 Cleanups:
 
 - Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE, to stop using
   open coded strings (Masami Hiramatsu)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXOn7eAAoJENZQFvNTUqpAsOAP/3f/XJekPQAnMcKRBp2noCuj
 nRu1kBltVJyP8iOU5PKSJwel4F9ykNNMl+/rzzxHDo13IM8uc+HnZOJZ6e9mJIJ1
 xqjdqM4EDlYYoFApJzCjTK6CMlevCazosdQT1bbmMDYVPc2uQR/GnutFrzqf/Plg
 hEougIGtfrdy85g95CRdxpy2yMwDK4EwsiDRm9ib1hnuamQZl97buWemBVqSJmLY
 p82E2aMU5Fv5+B8AO4I7V88ZmgpmryjxpM+LjffgNUDSKsSHrlG4NiQ3znV1bgst
 Rc++w78+qxoIozOu6/IX8eSI2L/1eyM/yQ6Qre0KuvYXCl+NopTAYSSJlaA4tyHF
 c55z7HucuyATN3PrFRHlbWUT/RMIVC0j0lnZOc7SJLl90hJQ+nv0iZcbYwMbeHu1
 3LGlcd9jDwQYiClbaT9ATxZJ8B9An0/k/HJdatbAHN0wRomP2Ozz/qD2nmEbUwpV
 sCyLOo/LJkvVkuUjSg6ZiOArNIk4iTSPSAUV+SAL6YOEOZMAX5ISUJQ174+zFC9a
 gqtVsCXvwLIsndXb8ys1r9/fit/MUci0OzKX3SG1K765+E4Bk23KcAgMNbM/a7lp
 ZmHDXMC+yBYcnYNnaxkp7c55CWUlKGOeR4e+KmB99KoeIleYgPhD2UM5beo61TmN
 yUEPtiiFiZmTRkiAu83R
 =7OdF
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-20160516' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

- Honour the kernel.perf_event_max_stack knob more precisely by not counting
  PERF_CONTEXT_{KERNEL,USER} when deciding when to stop adding entries to
  the perf_sample->ip_callchain[] array (Arnaldo Carvalho de Melo)

- Fix identation of 'stalled-backend-cycles' in 'perf stat' (Namhyung Kim)

- Update runtime using 'cpu-clock' event in 'perf stat' (Namhyung Kim)

- Use 'cpu-clock' for cpu targets in 'perf stat' (Namhyung Kim)

- Avoid fractional digits for integer scales in 'perf stat' (Andi Kleen)

- Store vdso buildid unconditionally, as it appears in callchains and
  we're not checking those when creating the build-id table, so we
  end up not being able to resolve VDSO symbols when doing analysis
  on a different machine than the one where recording was done, possibly
  of a different arch even (arm -> x86_64) (He Kuang)

Infrastructure changes:

- Generalize max_stack sysctl handler, will be used for configuring
  multiple kernel knobs related to callchains (Arnaldo Carvalho de Melo)

Cleanups:

- Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE, to stop using
  open coded strings (Masami Hiramatsu)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-20 08:20:14 +02:00
Linus Torvalds 16bf834805 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (21 commits)
  gitignore: fix wording
  mfd: ab8500-debugfs: fix "between" in printk
  memstick: trivial fix of spelling mistake on management
  cpupowerutils: bench: fix "average"
  treewide: Fix typos in printk
  IB/mlx4: printk fix
  pinctrl: sirf/atlas7: fix printk spelling
  serial: mctrl_gpio: Grammar s/lines GPIOs/line GPIOs/, /sets/set/
  w1: comment spelling s/minmum/minimum/
  Blackfin: comment spelling s/divsor/divisor/
  metag: Fix misspellings in comments.
  ia64: Fix misspellings in comments.
  hexagon: Fix misspellings in comments.
  tools/perf: Fix misspellings in comments.
  cris: Fix misspellings in comments.
  c6x: Fix misspellings in comments.
  blackfin: Fix misspelling of 'register' in comment.
  avr32: Fix misspelling of 'definitions' in comment.
  treewide: Fix typos in printk
  Doc: treewide : Fix typos in DocBook/filesystem.xml
  ...
2016-05-17 17:05:30 -07:00
Arnaldo Carvalho de Melo a29d5c9b81 perf tools: Separate accounting of contexts and real addresses in a stack trace
The perf_sample->ip_callchain->nr value includes all the entries in the
ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
while what the user expects is that what is in the kernel.perf_event_max_stack
sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
honoured in terms of IP addresses in the stack trace.

So match the kernel support and validate chain->nr taking into account
both kernel.perf_event_max_stack and kernel.perf_event_max_contexts_per_stack.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-mgx0jpzfdq4uq4abfa40byu0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:54 -03:00
Masami Hiramatsu 0a77582f04 perf symbols: Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE
Instead of using a raw string, use DSO__NAME_KALLSYMS and
DSO__NAME_KCORE macros for kallsyms and kcore.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160515031935.4017.50971.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:48 -03:00
Namhyung Kim daf4f4786e perf stat: Update runtime using cpu-clock event
Currently only the task-clock event updates the runtime_nsec so it
cannot show the metric when using cpu-clock events.  However cpu clock
works basically same as task-clock, so no need to not update the runtime
IMHO.

Before:

  # perf stat -a -e cpu-clock,context-switches,page-faults,cycles sleep 0.1

    Performance counter stats for 'system wide':

         1217.759506      cpu-clock (msec)
                  93      context-switches
                  61      page-faults
          18,958,022      cycles

         0.101393794 seconds time elapsed

After:

   Performance counter stats for 'system wide':

         1220.471884      cpu-clock (msec)          #   12.013 CPUs utilized
                 118      context-switches          #    0.097 K/sec
                  59      page-faults               #    0.048 K/sec
          17,941,247      cycles                    #    0.015 GHz

         0.101594777 seconds time elapsed

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1463119263-5569-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:46 -03:00
Namhyung Kim b0404be8d6 perf stat: Fix indentation of stalled backend cycle
The commit 140aeadc1f ("perf stat: Abstract stat metrics printing")
changed how shadow metrics are printed, but it missed to update the
width of the stalled backend cycles event to 7.2% like others.  This
resulted in misaligned output like below:

  Performance counter stats for 'pwd':

          0.638313      task-clock (msec)         #    0.567 CPUs utilized
                 0      context-switches          #    0.000 K/sec
                 0      cpu-migrations            #    0.000 K/sec
                54      page-faults               #    0.085 M/sec
           885,600      cycles                    #    1.387 GHz
           558,438      stalled-cycles-frontend   #   63.06% frontend cycles idle
           431,355      stalled-cycles-backend    #  48.71% backend cycles idle
           674,956      instructions              #    0.76  insn per cycle
                                                  #    0.83  stalled cycles per insn
           130,380      branches                  #  204.257 M/sec
     <not counted>      branch-misses

       0.001125426 seconds time elapsed

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 140aeadc1f ("perf stat: Abstract stat metrics printing")
Link: http://lkml.kernel.org/r/1463119263-5569-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:45 -03:00
He Kuang 6ae98ba611 perf symbols: Store vdso buildid unconditionally
When unwinding callchains on a different machine, vdso info should be
available so the unwind process won't be interrupted if address falls
into vdso region. But in most cases, the addresses of sample events are
not in vdso range, the buildid of a zero hit vdso won't be stored into
perf.data.

This patch stores vdso buildid regardless of whether the vdso is hit or
not.

Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1463042596-61703-3-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:45 -03:00
Linus Torvalds 36db171cc7 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Bigger kernel side changes:

   - Add backwards writing capability to the perf ring-buffer code,
     which is preparation for future advanced features like robust
     'overwrite support' and snapshot mode.  (Wang Nan)

   - Add pause and resume ioctls for the perf ringbuffer (Wang Nan)

   - x86 Intel cstate code cleanups and reorgnization (Thomas Gleixner)

   - x86 Intel uncore and CPU PMU driver updates (Kan Liang, Peter
     Zijlstra)

   - x86 AUX (Intel PT) related enhancements and updates (Alexander
     Shishkin)

   - x86 MSR PMU driver enhancements and updates (Huang Rui)

   - ... and lots of other changes spread out over 40+ commits.

  Biggest tooling side changes:

   - 'perf trace' features and enhancements.  (Arnaldo Carvalho de Melo)

   - BPF tooling updates (Wang Nan)

   - 'perf sched' updates (Jiri Olsa)

   - 'perf probe' updates (Masami Hiramatsu)

   - ... plus 200+ other enhancements, fixes and cleanups to tools/

  The merge commits, the shortlog and the changelogs contain a lot more
  details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (249 commits)
  perf/core: Disable the event on a truncated AUX record
  perf/x86/intel/pt: Generate PMI in the STOP region as well
  perf buildid-cache: Use lsdir() for looking up buildid caches
  perf symbols: Use lsdir() for the search in kcore cache directory
  perf tools: Use SBUILD_ID_SIZE where applicable
  perf tools: Fix lsdir to set errno correctly
  perf trace: Move seccomp args beautifiers to tools/perf/trace/beauty/
  perf trace: Move flock op beautifier to tools/perf/trace/beauty/
  perf build: Add build-test for debug-frame on arm/arm64
  perf build: Add build-test for libunwind cross-platforms support
  perf script: Fix export of callchains with recursion in db-export
  perf script: Fix callchain addresses in db-export
  perf script: Fix symbol insertion behavior in db-export
  perf symbols: Add dso__insert_symbol function
  perf scripting python: Use Py_FatalError instead of die()
  perf tools: Remove xrealloc and ALLOC_GROW
  perf help: Do not use ALLOC_GROW in add_cmd_list
  perf pmu: Make pmu_formats_string to check return value of strbuf
  perf header: Make topology checkers to check return value of strbuf
  perf tools: Make alias handler to check return value of strbuf
  ...
2016-05-16 14:08:43 -07:00
Arnaldo Carvalho de Melo 08094828b7 perf evsel: Handle EACCESS + perf_event_paranoid=2 in fallback()
Now with the default for the kernel.perf_event_paranoid sysctl being 2 [1]
we need to fall back to :u, i.e. to set perf_event_attr.exclude_kernel
to 1.

Before:

  [acme@jouet linux]$ perf record usleep 1
  Error:
  You may not have permission to collect stats.

  Consider tweaking /proc/sys/kernel/perf_event_paranoid,
  which controls use of the performance events system by
  unprivileged users (without CAP_SYS_ADMIN).

  The current value is 2:

    -1: Allow use of (almost) all events by all users
  >= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
  >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
  >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
  [acme@jouet linux]$

After:

  [acme@jouet linux]$ perf record usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.016 MB perf.data (7 samples) ]
  [acme@jouet linux]$ perf evlist
  cycles:u
  [acme@jouet linux]$ perf evlist -v
  cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
  [acme@jouet linux]$

And if the user turns on verbose mode, an explanation will appear:

  [acme@jouet linux]$ perf record -v usleep 1
  Warning:
  kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples
  mmap size 528384B
  [ perf record: Woken up 1 times to write data ]
  Looking at the vmlinux_path (8 entries long)
  Using /lib/modules/4.6.0-rc7+/build/vmlinux for symbols
  [ perf record: Captured and wrote 0.016 MB perf.data (7 samples) ]
  [acme@jouet linux]$

[1] 0161028b7c ("perf/core: Change the default paranoia level to 2")

Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b20jmx4dxt5hpaa9t2rroi0o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12 16:13:16 -03:00
Arnaldo Carvalho de Melo 7d173913a6 perf evsel: Improve EPERM error handling in open_strerror()
We were showing a hardcoded default value for the kernel.perf_event_paranoid
sysctl, now that it became more paranoid (1 -> 2 [1]), this would need to be
updated, instead show the current value:

  [acme@jouet linux]$ perf record ls
  Error:
  You may not have permission to collect stats.

  Consider tweaking /proc/sys/kernel/perf_event_paranoid,
  which controls use of the performance events system by
  unprivileged users (without CAP_SYS_ADMIN).

  The current value is 2:

    -1: Allow use of (almost) all events by all users
  >= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
  >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
  >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
  [acme@jouet linux]$

[1] 0161028b7c ("perf/core: Change the default paranoia level to 2")

Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-0gc4rdpg8d025r5not8s8028@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12 15:44:55 -03:00
Arnaldo Carvalho de Melo 4924734570 perf probe: Check if dwarf_getlocations() is available
If not, tell the user that:

  config/Makefile:273: Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157

And return -ENOTSUPP in die_get_var_range(), failing features that
need it, like the one pointed out above.

This fixes the build on older systems, such as Ubuntu 12.04.5.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vinson Lee <vlee@freedesktop.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9l7luqkq4gfnx7vrklkq4obs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12 11:26:59 -03:00
Arnaldo Carvalho de Melo 22a9f41b55 perf tools: Use readdir() instead of deprecated readdir_r()
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case when parsing tracepoint event definitions, to
avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it
instead of readdir_r().

See: http://man7.org/linux/man-pages/man3/readdir.3.html

"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe.  In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."

Noticed while building on a Fedora Rawhide docker container.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wddn49r6bz6wq4ee3dxbl7lo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12 11:26:58 -03:00
Arnaldo Carvalho de Melo 7839b9f32e perf thread_map: Use readdir() instead of deprecated readdir_r()
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case in thread_map, so, to avoid breaking the build
with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r().

See: http://man7.org/linux/man-pages/man3/readdir.3.html

"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe.  In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."

Noticed while building on a Fedora Rawhide docker container.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-del8h2a0f40z75j4r42l96l0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12 11:26:58 -03:00
Arnaldo Carvalho de Melo 2515e61483 perf tools: Use readdir() instead of deprecated readdir_r()
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case when synthesizing events for pre-existing threads
by traversing /proc, so, to avoid breaking the build with glibc-2.23.90
(upcoming 2.24), use it instead of readdir_r().

See: http://man7.org/linux/man-pages/man3/readdir.3.html

"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe.  In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."

Noticed while building on a Fedora Rawhide docker container.

   CC       /tmp/build/perf/util/event.o
  util/event.c: In function '__event__synthesize_thread':
  util/event.c:466:2: error: 'readdir_r' is deprecated [-Werror=deprecated-declarations]
    while (!readdir_r(tasks, &dirent, &next) && next) {
    ^~~~~
  In file included from /usr/include/features.h:368:0,
                   from /usr/include/stdint.h:25,
                   from /usr/lib/gcc/x86_64-redhat-linux/6.0.0/include/stdint.h:9,
                   from /git/linux/tools/include/linux/types.h:6,
                   from util/event.c:1:
  /usr/include/dirent.h:189:12: note: declared here

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i1vj7nyjp2p750rirxgrfd3c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12 10:22:54 -03:00
Masami Hiramatsu d65444d2fb perf buildid-cache: Use lsdir() for looking up buildid caches
Use new lsdir() for looking up buildid caches. This changes logic a bit
to ignore all dot files, since the build-id cache must not start with
dot.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160511135217.23943.94596.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11 13:06:08 -03:00
Masami Hiramatsu c48903b816 perf symbols: Use lsdir() for the search in kcore cache directory
Use lsdir() to search in kcore cache directory. This also avoids
checking hidden dot directory entries, because kcore cache directories
must always have the name from timestamps when taking the kcore
snapshots, and it never start with dot.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160511135208.23943.68071.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11 13:06:07 -03:00
Masami Hiramatsu b5d8bbe860 perf tools: Use SBUILD_ID_SIZE where applicable
Use the existing SBUILD_ID_SIZE macro instead of the equivalent
BUILD_ID_SIZE * 2 + 1 expression for allocating a buffer for build-id
strings.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160511135159.23943.57120.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11 13:06:06 -03:00
Masami Hiramatsu 357a54f32a perf tools: Fix lsdir to set errno correctly
Fix lsdir() to set correct positive error number (ENOMEM).  Since
"errno" must have a positive error number instead of negative number,
fix lsdir to set it correctly.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: e1ce726e1d ("perf tools: Add lsdir() helper to read a directory")
Link: http://lkml.kernel.org/r/20160511135127.23943.40644.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11 13:06:05 -03:00
Chris Phlipot 83302e79b1 perf script: Fix export of callchains with recursion in db-export
When an IP with an unresolved symbol occurs in the callchain more than
once (ie. recursion), then duplicate symbols can be created because
the callchain nodes are never updated after they are first created.

To fix this issue we call dso__find_symbol whenever we encounter a NULL
symbol, in case we already added a symbol at that IP since we started
traversing the callchain.

This change prevents duplicate symbols from being exported when duplicate
IPs are present in the callchain.

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-5-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11 12:24:58 -03:00
Chris Phlipot 7a2544c004 perf script: Fix callchain addresses in db-export
Remove the call to map_ip() to adjust al.addr, because it has already
been called when assembling the callchain, in:

  thread__resolve_callchain_sample(perf_sample)
      add_callchain_ip(ip = perf_sample->callchain->ips[j])
          thread__find_addr_location(addr = ip)
              thread__find_addr_map(addr) {
                  al->addr = addr
                  if (al->map)
                      al->addr = al->map->map_ip(al->map, al->addr);
              }

Calling it a second time can result in incorrect addresses being used.
This can have effects such as duplicate symbols being created and
exported.

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-4-git-send-email-cphlipot0@gmail.com
[ Show the callchain where it is done, to help reviewing this change down the line ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11 12:24:58 -03:00
Chris Phlipot bd0a51dd27 perf script: Fix symbol insertion behavior in db-export
Use the dso__insert_symbol function instead of symbols__insert() in
order to properly update the dso symbol cache.

If the cache is not updated, then duplicate symbols can be
unintentionally created, inserted, and exported.

This change prevents duplicate symbols from being exported due to
dso__find_symbol() using a stale symbol cache.

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-3-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11 12:24:57 -03:00
Chris Phlipot ae93a6c708 perf symbols: Add dso__insert_symbol function
The current method for inserting symbols is to use the symbols__insert()
function. However symbols__insert() does not update the dso symbol
cache.  This causes problems in the following scenario:

1. symbol not found at addr using dso__find_symbol

2. symbol inserted at addr using the existing symbols__insert function

3. symbol still not found at addr using dso__find_symbol() because cache isn't
   updated. This is undesired behavior.

The undesired behavior in (3) is addressed by creating a new function,
dso__insert_symbol() to both insert the symbol and update the symbol
cache if necessary.

If dso__insert_symbol() is used in (2) instead of symbols__insert(),
then the undesired behavior in (3) is avoided.

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11 12:24:57 -03:00
Arnaldo Carvalho de Melo 62665dff75 perf scripting python: Use Py_FatalError instead of die()
It probably is equivalent, but that seems to be the "pythonic" way of
dieing? Anyway, one less die() in the tools/perf codebase.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Chris Phlipot <cphlipot0@gmail.com>
Link: http://lkml.kernel.org/n/tip-nlzgepdv2818zs4e7faif9tu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11 12:24:57 -03:00
Ingo Molnar 38f5d8b32f Merge tag 'perf-core-for-mingo-20160510' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

- Recording 'dwarf' callchains do not need DWARF unwinding support (He Kuang)

- Print recently added perf_event_attr.write_backward bit flag in -vv
  verbose mode (Arnaldo Carvalho de Melo)

- Fix incorrect python db-export error message in 'perf script' (Chris Phlipot)

- Fix handling of zero-length symbols (Chris Phlipot)

- perf stat: Scale values by unit before metrics (Andi Kleen)

Infrastructure changes:

- Rewrite strbuf not to die(), making tools using it to check its
  return value instead (Masami Hiramatsu)

- Support reading from backward ring buffer, add a 'perf test' entry
  for it (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-11 16:56:58 +02:00
Ingo Molnar d2950158d0 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-11 16:56:38 +02:00
Namhyung Kim e9d848cb65 perf diff: Fix duplicated output column
The commit b97511c5bc ("perf tools: Add overhead/overhead_children
keys defaults via string") moved initialization of column headers but it
missed to check the sort__mode.  As 'perf diff' doesn't call
perf_hpp__init(), the setup_overhead() also should not be called.

Before:

  # Baseline    Delta  Children  Overhead  Shared Object        Symbol
  # ........  .......  ........  ........  ...................  .......................
  #
      28.48%  -28.47%    28.48%    28.48%  [kernel.vmlinux ]    [k] intel_idle
      11.51%  -11.47%    11.51%    11.51%  libxul.so            [.] 0x0000000001a360f7
       3.49%   -3.49%     3.49%     3.49%  [kernel.vmlinux]     [k] generic_exec_single
       2.91%   -2.89%     2.91%     2.91%  libdbus-1.so.3.8.11  [.] 0x000000000000cdc2
       2.86%   -2.85%     2.86%     2.86%  libxcb.so.1.1.0      [.] 0x000000000000c890
       2.44%   -2.39%     2.44%     2.44%  [kernel.vmlinux]     [k] perf_event_aux_ctx

After:

  # Baseline    Delta  Shared Object        Symbol
  # ........  .......  ...................  .......................
  #
      28.48%  -28.47%  [kernel.vmlinux]     [k] intel_idle
      11.51%  -11.47%  libxul.so            [.] 0x0000000001a360f7
       3.49%   -3.49%  [kernel.vmlinux]     [k] generic_exec_single
       2.91%   -2.89%  libdbus-1.so.3.8.11  [.] 0x000000000000cdc2
       2.86%   -2.85%  libxcb.so.1.1.0      [.] 0x000000000000c890
       2.44%   -2.39%  [kernel.vmlinux]     [k] perf_event_aux_ctx

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: <stable@vger.kernel.org> # 4.5+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: b97511c5bc ("perf tools: Add overhead/overhead_children keys defaults via string")
Link: http://lkml.kernel.org/r/1462890384-12486-2-git-send-email-acme@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-11 16:55:32 +02:00
Naveen N. Rao f47822078d perf tools: Fix perf regs mask generation
On some architectures (powerpc in particular), the number of registers
exceeds what can be represented in an integer bitmask. Ensure we
generate the proper bitmask on such platforms.

Fixes: 71ad0f5e4 ("perf tools: Support for DWARF CFI unwinding on post processing")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:06 +10:00
Masami Hiramatsu 452e840125 perf tools: Remove xrealloc and ALLOC_GROW
Remove unused xrealloc() and ALLOC_GROW() from libperf.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054801.6158.6204.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10 11:58:27 -03:00
Masami Hiramatsu 682f4f035e perf help: Do not use ALLOC_GROW in add_cmd_list
Replace ALLOC_GROW with normal realloc code in add_cmd_list() so that it
can handle errors directly.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054752.6158.30562.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10 11:58:09 -03:00
Masami Hiramatsu 11db4e29bb perf pmu: Make pmu_formats_string to check return value of strbuf
Make pmu_formats_string() to check return value of strbuf APIs so that
it can detect errors in it.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054744.6158.37810.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10 11:57:52 -03:00
Masami Hiramatsu 642aadaa32 perf header: Make topology checkers to check return value of strbuf
Make topology checkers to check the return value of strbuf APIs so that
it can detect errors in it.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054735.6158.98650.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10 11:57:22 -03:00
Masami Hiramatsu 70a6898fdc perf tools: Make alias handler to check return value of strbuf
Make alias handler and sq_quote_argv to check the return value of strbuf
APIs.

In sq_quote_argv() calls die(), but this fix handles strbuf failure as a
special case and returns to caller, since the caller - handle_alias()
also has to check the return value of other strbuf APIs and those checks
can be merged to one if() statement.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054725.6158.84597.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10 11:56:52 -03:00
Masami Hiramatsu bf4d5f25c9 perf probe: Check the return value of strbuf APIs
Check the return value of strbuf APIs in perf-probe
related code, so that it can handle errors in strbuf.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054707.6158.69861.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10 11:53:34 -03:00
Masami Hiramatsu 5cea57f30a perf tools: Rewrite strbuf not to die()
Rewrite strbuf implementation not to use die() nor xrealloc().  Instead
of die(), now most of the API returns error code or 0 if succeeded.

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054658.6158.24080.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10 11:27:58 -03:00
Chris Phlipot 9c7b37cd63 perf symbols: Fix handling of zero-length symbols.
This change introduces a fix to symbols__find, so that it is able to
find symbols of length zero (where start == end).

The current code has the following problem:

- The current implementation of symbols__find is unable to find any symbols
  of length zero.

- The db-export framework explicitly creates zero length symbols at
  locations where no symbol currently exists.

The combination of the two above behaviors results in behavior similar
to the example below.

1. addr_location is created for a sample, but symbol is unable to be
   resolved.

2. db export creates an "unknown" symbol of length zero at that address
   and inserts it into the dso.

3. A new sample comes in at the same address, but symbol__find is unable
   to find the zero length symbol, so it is still unresolved.

4. db export sees the symbol is unresolved, and allocated a duplicate
   symbol, even though it already did this in step 2.

This behavior continues every time an address without symbol information
is seen, which causes a very large number of these symbols to be
allocated.

The effect of this fix can be observed by looking at the contents of an
exported database before/after the fix (generated with
scripts/python/export-to-postgresql.py)

Ex.
BEFORE THE CHANGE:

  example_db=# select count(*) from symbols;
   count
  --------
   900213
  (1 row)

  example_db=# select count(*) from symbols where symbols.name='unknown';
   count
  --------
   897355
  (1 row)

  example_db=# select count(*) from symbols where symbols.name!='unknown';
   count
  -------
    2858
  (1 row)

AFTER THE CHANGE:

  example_db=# select count(*) from symbols;
   count
  -------
   25217
  (1 row)

  example_db=# select count(*) from symbols where name='unknown';
   count
  -------
   22359
  (1 row)

  example_db=# select count(*) from symbols where name!='unknown';
   count
  -------
    2858
  (1 row)

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462612620-25008-1-git-send-email-cphlipot0@gmail.com
[ Moved the test to later in the rb_tree tests, as this not the likely case ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09 18:40:03 -03:00
Arnaldo Carvalho de Melo 0a241ef4a2 perf evsel: Print state of perf_event_attr.write_backward
Now we can see if it is set when using verbose mode in various tools,
such as 'perf test':

  # perf test -vv back
  45: Test backward reading from ring buffer                   :
  --- start ---
  <SNIP>
  ------------------------------------------------------------
  perf_event_attr:
    type                             2
    size                             112
    config                           0x98
    { sample_period, sample_freq }   1
    sample_type                      IP|TID|TIME|CPU|PERIOD|RAW
    disabled                         1
    mmap                             1
    comm                             1
    task                             1
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    write_backward                   1
  ------------------------------------------------------------
  sys_perf_event_open: pid 20911  cpu -1  group_fd -1  flags 0x8
  <SNIP>
  ---- end ----
  Test backward reading from ring buffer: Ok
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kxv05kv9qwl5of7rzfeiiwbv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09 18:11:27 -03:00
Wang Nan e24c7520ea perf tools: Support reading from backward ring buffer
perf_evlist__mmap_read_backward() is introduced for reading backward
ring buffer. Since direction for reading such ring buffer is different
from the direction kernel writing to it, and since user need to fetch
most recent record from it, a perf_evlist__mmap_read_catchup() is
introduced to move the reading pointer to the end of the buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1462758471-89706-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09 17:20:53 -03:00
Chris Phlipot aff633406c perf script: Fix incorrect python db-export error message
Fix the error message printed when attempting and failing to create the
call path root incorrectly references the call return process.

This change fixes the message to properly reference the failure to
create the call path root.

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462612620-25008-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09 14:08:39 -03:00
Andi Kleen f340c5fc93 perf stat: Scale values by unit before metrics
Scale values by unit before passing them to the metrics printing
functions.  This is needed for TopDown, because it needs to scale the
slots correctly by pipeline width / SMTness.

For existing metrics it shouldn't make any difference, as those
generally use events that don't have any units.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462489447-31832-8-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09 13:42:09 -03:00
He Kuang 841e3558b2 perf callchain: Recording 'dwarf' callchains do not need DWARF unwinding support
There is no need to check for DWARF unwinding support when using the
'dwarf' callchain record method, as this will only ask the kernel to
collect stack dumps for later DWARF CFI processing, which can be done in
another machine, where the support for DWARF unwinding need to be
present.

Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1462525154-125656-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09 13:29:36 -03:00
Chris Phlipot 2c15f5eb04 perf script: Expose usage of the callchain db export via the python api
This change allows python scripts to be able to utilize the recent
changes to the db export api allowing the export of call_paths derived
from sampled callchains. These call paths are also now associated with
the samples from which they were derived.

- This feature is enabled by setting "perf_db_export_callchains" to true

- When enabled, samples that have callchain information will have the
  callchains exported via call_path_table

- The call_path_id field is added to sample_table to enable association of
  samples with the corresponding callchain stored in the call paths
  table. A call_path_id of 0 will be exported if there is no
  corresponding callchain.

- When "perf_db_export_callchains" and "perf_db_export_calls" are both
  set to True, the call path root data structure will be shared. This
  prevents duplicating of data and call path ids that would result from
  building two separate call path trees in memory.

- The call_return_processor structure definition was relocated to the header
  file to make its contents visible to db-export.c. This enables the
  sharing of call path trees between the two features, as mentioned
  above.

This change is visible to python scripts using the python db export api.

The change is backwards compatible with scripts written against the
previous API, assuming that the scripts model the sample_table function
after the one in export-to-postgresql.py script by allowing for
additional arguments to be added in the future. ie. using *x as the
final argument of the sample_table function.

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-6-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06 13:00:54 -03:00
Chris Phlipot 568850eaad perf script: Add call path id to exported sample in db export
The exported sample now contains a reference to the call_path_id that
represents its callchain.

While callchains themselves are nice to have, being able to associate
them with samples makes them much more useful, and can allow for such
things as determining how much cumulative time is spent in a particular
function. This information is normally possible to get from the call
return processor. However, when doing normal sampling, call/return
information is not available, thus necessitating the need for
associating samples directly with call paths.

This commit include changes to db-export layer to make this information
available for subsequent patches in this change set, but by itself, does
not make any changes visible to the user.

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-5-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06 13:00:53 -03:00
Chris Phlipot 0a3eba3ad6 perf script: Enable db export to output sampled callchains
This change enables the db export api to export callchains. This is
accomplished by adding callchains obtained from samples to the
call_path_root structure and exporting them via the current call path
export API.

While the current API does support exporting call paths, this is not
supported when sampling. This commit addresses that missing feature by
allowing the export of call paths when callchains are present in
samples.

Summary:

- This feature is activated by initializing the call_path_root member
  inside the db_export structure to a non-null value.

- Callchains are resolved with thread__resolve_callchain() and then stored
  and exported by adding a call path under call path root.
- Symbol and DSO for each callchain node are exported via db_ids_from_al()

This commit puts in place infrastructure to be used by subsequent commits,
and by itself, does not introduce any user-visible changes.

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-4-git-send-email-cphlipot0@gmail.com
[ Made adjustments suggested by Adrian Hunter, see thread via this cset's Link: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06 13:00:52 -03:00
Chris Phlipot 451db12617 perf tools: Refactor code to move call path handling out of thread-stack
Move the call path handling code out of thread-stack.c and
thread-stack.h to allow other components that are not part of
thread-stack to create call paths.

Summary:

- Create call-path.c and call-path.h and add them to the build.

- Move all call path related code out of thread-stack.c and thread-stack.h
  and into call-path.c and call-path.h.

- A small subset of structures and functions are now visible through
  call-path.h, which is required for thread-stack.c to continue to
  compile.

This change is a prerequisite for subsequent patches in this change set
and by itself contains no user-visible changes.

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-3-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06 13:00:43 -03:00
Chris Phlipot 9919a65ec5 perf callchain: Fix incorrect ordering of entries
The existing implementation of thread__resolve_callchain, under certain
circumstances, can assemble callchain entries in the incorrect order.

The callchain entries are resolved incorrectly for a sample when all of
the following conditions are met:

1. callchain_param.order is set to ORDER_CALLER

2. thread__resolve_callchain_sample is able to resolve callchain entries
   for the sample.

3. unwind__get_entries is also able to resolve callchain entries for the
   sample.

The fix is accomplished by reversing the order in which
thread__resolve_callchain_sample and unwind__get_entries are called when
callchain_param.order is set to ORDER_CALLER.

Unwind specific code from thread__resolve_callchain is also moved into a
new static function to improve readability of the fix.

How to Reproduce the Existing Bug:

Modifying perf script to print call trees in the opposite order or
applying the remaining patches from this series and comparing the
results output from export-to-postgtresql.py are the easiest ways to see
the bug, however it can still be seen in current builds using perf
report.

Here is how i can reproduce the bug using perf report:

  # perf record --call-graph=dwarf stress -c 1 -t 5

when i run this command:

  # perf report --call-graph=flat,0,0,callee

This callchain, containing kernel (handle_irq_event, etc) and userspace
samples (__libc_start_main, etc) is contained in the output, which looks
correct (callee order):

                gen8_irq_handler
                handle_irq_event_percpu
                handle_irq_event
                handle_edge_irq
                handle_irq
                do_IRQ
                ret_from_intr
                __random
                rand
                0x558f2a04dded
                0x558f2a04c774
                __libc_start_main
                0x558f2a04dcd9

Now run this command using caller order:

  # perf report --call-graph=flat,0,0,caller

It is expected to see the exact reverse of the above when using caller
order (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the
bottom) in the output, but it is nowhere to be found.

instead you see this:

                ret_from_intr
                do_IRQ
                handle_irq
                handle_edge_irq
                handle_irq_event
                handle_irq_event_percpu
                gen8_irq_handler
                0x558f2a04dcd9
                __libc_start_main
                0x558f2a04c774
                0x558f2a04dded
                rand
                __random

Notice how internally the kernel symbols are reversed and the user space
symbols are reversed, but the kernel symbols still appear above the user
space symbols.

if this patch is applied and perf script is re-run, you will see the
expected output (with "0x558f2a04dcd9" at the top and "gen8_irq_handler"
at the bottom):

                0x558f2a04dcd9
                __libc_start_main
                0x558f2a04c774
                0x558f2a04dded
                rand
                __random
                ret_from_intr
                do_IRQ
                handle_irq
                handle_edge_irq
                handle_irq_event
                handle_irq_event_percpu
                gen8_irq_handler

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06 08:59:47 -03:00
Wang Nan b6b85dad30 perf evlist: Rename variable in perf_mmap__read()
In perf_mmap__read(), give better names to pointers. Original name 'old'
and 'head' directly related to pointers in ring buffer control page. For
backward ring buffer, the meaning of 'head' point is not 'the first byte
of free space', but 'the first byte of the last record'. To reduce
confusion, rename 'old' to 'start', 'head' to 'end'.  'start' -> 'end'
is the direction the records should be read from.

Change parameter order.

Change 'overwrite' to 'check_messup'. When reading from 'head', no need
to check messup for for backward ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461723563-67451-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:04:04 -03:00
Wang Nan 0f4ccd1181 perf evlist: Extract perf_mmap__read()
Extract event reader from perf_evlist__mmap_read() to perf__mmap_read().
Future commit will feed it with manually computed 'head' and 'old'
pointers.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461723563-67451-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:04:03 -03:00
Naveen N. Rao 0b3c2264ae perf symbols: Fix kallsyms perf test on ppc64le
ppc64le functions have a Global Entry Point (GEP) and a Local Entry
Point (LEP). While placing a probe, we always prefer the LEP since it
catches function calls through both the GEP and the LEP. In order to do
this, we fixup the function entry points during elf symbol table lookup
to point to the LEPs. This works, but breaks 'perf test kallsyms' since
the symbols loaded from the symbol table (pointing to the LEP) do not
match the symbols in kallsyms.

To fix this, we do not adjust all the symbols during symbol table load.
Instead, we note down st_other in a newly introduced arch-specific
member of perf symbol structure, and later use this to adjust the probe
trace point.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: Mark Wielaard <mjw@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/6be7c2b17e370100c2f79dd444509df7929bdd3e.1460451721.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:04:03 -03:00
Jiri Olsa 7cecb7fe83 perf hists: Move sort__has_comm into struct perf_hpp_list
Now we have sort dimensions private for struct hists,
we need to make dimension booleans hists specific as
well.

Moving sort__has_comm into struct perf_hpp_list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:04:02 -03:00
Jiri Olsa fa82911a1b perf hists: Move sort__has_thread into struct perf_hpp_list
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.

Moving sort__has_thread into struct perf_hpp_list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:04:01 -03:00
Jiri Olsa 35a634f76c perf hists: Move sort__has_socket into struct perf_hpp_list
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.

Moving sort__has_socket into struct perf_hpp_list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:04:01 -03:00
Jiri Olsa 69849fc5d2 perf hists: Move sort__has_dso into struct perf_hpp_list
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.

Moving sort__has_dso into struct perf_hpp_list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:04:00 -03:00
Jiri Olsa 2e0453af4e perf hists: Move sort__has_sym into struct perf_hpp_list
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.

Moving sort__has_sym into struct perf_hpp_list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:03:59 -03:00
Jiri Olsa de7e6a7c8b perf hists: Move sort__has_parent into struct perf_hpp_list
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.

Moving sort__has_parent into struct perf_hpp_list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:03:59 -03:00
Jiri Olsa 52225036fa perf hists: Move sort__need_collapse into struct perf_hpp_list
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.

Moving sort__need_collapse into struct perf_hpp_list.

Adding hists__has macro to easily access this info perf struct hists
object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:03:58 -03:00
Arnaldo Carvalho de Melo f58c253564 perf tools: Add template for generating rbtree resort class
Sometimes we want to sort an existing rbtree by a different key,
introduce a template for that, that needs only to be provided the
rbtree root and the number of entries in it.

To do that a new rbtree will be created with extra space for each entry,
where possibly pre-calculated keys will be stored to be used in the
resort process and also later, when using the newly sorted rbtree.

Please check the following two changesets to see it in use for resorting
stats for threads and its syscalls in 'perf trace --summary'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9l6e1q34lmf3wwdeewstyakg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:03:55 -03:00
Arnaldo Carvalho de Melo d2c1103440 perf machine: Introduce number of threads member
To be used, for instance, for pre-allocating an rb_tree array for
sorting by other keys besides the current pid one.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ja0ifkwue7ttjhbwijn6g6eu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05 21:03:55 -03:00
Wang Nan 3dcc4436fa perf tools: Introduce trigger class
Use 'trigger' to model operations which need to be executed when an
event (a signal, for example) is observed.

States and transits:

 OFF--(on)--> READY --(hit)--> HIT
		^               |
		|            (ready)
		|               |
		 \_____________/

is_hit and is_ready are two key functions to query the state of a
trigger. is_hit means the event already happen; is_ready means the
trigger is waiting for the event.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461178794-40467-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-28 09:58:58 -03:00
Masami Hiramatsu 909b0360ae perf probe: Use strbuf for making strings
Replace many fixed-length char array with strbuf to stringify
perf_probe_event and probe_trace_event etc.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160427183713.23446.97377.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-28 09:58:58 -03:00
Arnaldo Carvalho de Melo 81d64f46d4 perf evsel: Remove two extraneous ending newlines in open_strerror()
The error messages returned by this method should not have an ending
newline, fix the two cases where it was.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-8af0pazzhzl3dluuh8p7ar7p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-28 09:58:58 -03:00
Arnaldo Carvalho de Melo de46d5268c perf evsel: Handle ENOMEM for perf_event_max_stack + PERF_SAMPLE_CALLCHAIN
When the kernel allows tweaking perf_event_max_stack and the event being
setup has PERF_SAMPLE_CALLCHAIN in its perf_event_attr.sample_type, tell
the user that tweaking /proc/sys/kernel/perf_event_max_stack may solve
the problem.

Before:

  # echo 32000 > /proc/sys/kernel/perf_event_max_stack
  # perf record -g usleep 1
  Error:
  The sys_perf_event_open() syscall returned with 12 (Cannot allocate memory) for event (cycles:ppp).
  /bin/dmesg may provide additional information.
  No CONFIG_PERF_EVENTS=y kernel support configured?

  #

After:

  # echo 64000 > /proc/sys/kernel/perf_event_max_stack
  # perf record -g usleep 1
  Error:
  Not enough memory to setup event with callchain.
  Hint: Try tweaking /proc/sys/kernel/perf_event_max_stack
  Hint: Current value: 64000
  #

Suggested-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ebv0orelj1s1ye857vhb82ov@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-28 09:58:58 -03:00
Arnaldo Carvalho de Melo 4cb93446c5 perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack
There is an upper limit to what tooling considers a valid callchain,
and it was tied to the hardcoded value in the kernel,
PERF_MAX_STACK_DEPTH (127), now that this can be tuned via a sysctl,
make it read it and use that as the upper limit, falling back to
PERF_MAX_STACK_DEPTH for kernels where this sysctl isn't present.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-yjqsd30nnkogvj5oyx9ghir9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-27 10:29:07 -03:00
Ravi Bangoria c61fb959df perf probe: Fix module probe issue if no dwarf support
Perf is not able to register probe in kernel module when dwarf supprt
is not there(and so it goes for symtab). Perf passes full path of
module where only module name is required which is causing the problem.
This patch fixes this issue.

Before applying patch:

  $ dpkg -s libdw-dev
  dpkg-query: package 'libdw-dev' is not installed and no information is...

  $ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init
  Added new event:
    probe:kprobe_init (on kprobe_init in /linux/samples/kprobes/kprobe_example.ko)

  You can now use it in all perf tools, such as:

  perf record -e probe:kprobe_init -aR sleep 1

  $ sudo cat /sys/kernel/debug/tracing/kprobe_events
  p:probe/kprobe_init /linux/samples/kprobes/kprobe_example.ko:kprobe_init

  $ sudo ./perf record -a -e probe:kprobe_init
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.105 MB perf.data ]

  $ sudo ./perf script 	# No output here

After applying patch:

  $ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init
  Added new event:
    probe:kprobe_init    (on kprobe_init in kprobe_example)

  You can now use it in all perf tools, such as:

  perf record -e probe:kprobe_init -aR sleep 1

  $ sudo cat /sys/kernel/debug/tracing/kprobe_events
  p:probe/kprobe_init kprobe_example:kprobe_init

  $ sudo ./perf record -a -e probe:kprobe_init
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.105 MB perf.data (2 samples) ]

  $ sudo ./perf script
  insmod 13990 [002]  5961.216833: probe:kprobe_init: ...
  insmod 13995 [002]  5962.889384: probe:kprobe_init: ...

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1461680741-12517-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-26 13:15:01 -03:00
Ravi Bangoria 63a29613d7 perf probe: Fix offline module name missmatch issue
Perf can add a probe on kernel module which has not been loaded yet.

The current implementation finds the module name from path. But if the
filename is different from the actual module name then perf fails to
register a probe while loading module because of mismatch in the names.

For example, samples/kobject/kobject-example.ko is loaded as
kobject_example.

Before applying patch:

  $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
    Added new event:
      probe:foo_show       (on foo_show in kobject-example)

    You can now use it in all perf tools, such as:

    perf record -e probe:foo_show -aR sleep 1

  $ cat /sys/kernel/debug/tracing/kprobe_events
    p:probe/foo_show kobject-example:foo_show

  $ insmod kobject-example.ko

  $ lsmod
    Module                  Size  Used by
    kobject_example        16384  0

  Generate read to /sys/kernel/kobject_example/foo while recording data
  with below command
  $ sudo ./perf record -e probe:foo_show -a
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.093 MB perf.data ]

  $./perf report --stdio -F overhead,comm,dso,sym
    Error:
    The perf.data.old file has no samples!

After applying patch:

  $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
    Added new event:
      probe:foo_show       (on foo_show in kobject_example)

    You can now use it in all perf tools, such as:

    perf record -e probe:foo_show -aR sleep 1

  $ sudo cat /sys/kernel/debug/tracing/kprobe_events
    p:probe/foo_show kobject_example:foo_show

  $ insmod kobject-example.ko

  $ lsmod
    Module                  Size  Used by
    kobject_example        16384  0

  Generate read to /sys/kernel/kobject_example/foo while recording data
  with below command
  $ sudo ./perf record -e probe:foo_show -a
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.097 MB perf.data (8 samples) ]

  $ sudo ./perf report  --stdio -F overhead,comm,dso,sym
    ...
    # Samples: 8  of event 'probe:foo_show'
    # Event count (approx.): 8
    #
    # Overhead  Command  Shared Object      Symbol
    # ........  .......  .................  ............
    #
       100.00%  cat      [kobject_example]  [k] foo_show

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1461680741-12517-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-26 13:15:01 -03:00
Arnaldo Carvalho de Melo 2f3027ac28 perf thread: Introduce method to set comm from /proc/pid/self
Will be used for lazy comm loading in 'perf trace'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-7ogbkuoka1y2qsmcckqxvl5m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-26 13:15:00 -03:00
Masami Hiramatsu 2a12ec13cc perf probe: Set default kprobe group name if it is not given
Set kprobe group name as "probe" if it is not given.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160426090413.11891.95640.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-26 13:14:58 -03:00
Masami Hiramatsu 6ed0720a74 perf probe: Let probe_file__add_event return 0 if succeeded
Since other methods return 0 if succeeded (or filedesc), let
probe_file__add_event() return 0 instead of the length of written bytes.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160426090303.11891.18232.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-26 13:14:58 -03:00
Masami Hiramatsu e1ce726e1d perf tools: Add lsdir() helper to read a directory
As a utility function, add lsdir() which reads given directory and store
entry name into a strlist.  lsdir accepts a filter function so that user
can filter out unneeded entries.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160426090242.11891.79014.stgit@devbox
[ Do not use the 'dirname' it is used in some distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-26 13:14:55 -03:00
Masami Hiramatsu 062d6c2aec perf probe: Close target file on error path
Fix a bug to close target elf file in get_text_start_address().

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160426064737.1443.44093.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-26 10:56:08 -03:00
Wang Nan b04b702375 perf evlist: Enforce ring buffer reading
Don't read broken data after 'head' pointer.

Following commits will feed perf_evlist__mmap_read() with some 'head'
pointers not maintained by kernel. If 'head' pointer breaks an event, we
should avoid reading from the broken event. This can happen in backward
ring buffer.

For example:

                              old     head
                                |     |
                                V     V
     +---+------+----------+----+-----+--+
     |..E|D....D|C........C|B..B|A....|E.|
     +---+------+----------+----+-----+--+

'old' pointer points to the beginning of 'A' and trying read from it,
but 'A' has been overwritten. In this case, don't try to read from 'A',
simply return NULL.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461637738-62722-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-26 10:56:08 -03:00
Kan Liang 09623d7946 perf hists: Clear dummy entry accumulated period
The accumulated period for dummy entry should also be 0.  Otherwise, the
total overhead could be overcounted.

  $ perf record -e '{LLC-load-misses,cpu/instructions/}' --call-graph=lbr ./tchain
  $ perf report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 21K of event 'anon group { LLC-load-misses, cpu/instructions/ }'
  # Event count (approx.): 16313667937
  #
  #         Children              Self  Command      Shared Object     Symbol
  # ................  ................  ...........  ................  ............................
  #
    4769.98%   0.01%     0.00%   0.01%  tchain_edit  [kernel.vmlinux]  [k] update_fast_timekeeper
    4356.18%   0.01%     0.00%   0.01%  tchain_edit  [kernel.vmlinux]  [k] trigger_load_balance
    3181.12%   0.01%     0.00%   0.01%  tchain_edit  [kernel.vmlinux]  [k] irq_work_tick
    1592.37%   0.00%     0.00%   0.00%  tchain_edit  [kernel.vmlinux]  [k] cpu_needs_another_gp

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1461565689-5862-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-25 20:35:59 -03:00
Colin Ian King c066489305 perf intel-pt: Fix off-by-one comparison on maximum code
The check for the maximum code is off-by-one; the current comparison of
a code that is INTEL_PT_ERR_MAX will cause the strlcpy to perform an out
of bounds array access on the intel_pt_err_msgs array.

Fix this with a >= comparison.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461524203-10224-1-git-send-email-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-25 20:35:59 -03:00
Eric Engestrom 3b556bced4 perf tools: Remove duplicate const qualifier
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461577678-29517-1-git-send-email-eric.engestrom@imgtec.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-25 18:12:25 -03:00
Arnaldo Carvalho de Melo a213b92e15 perf evlist: Decode perf_event_attr->branch_sample_type
While trying to use --call-graph lbr in 'perf trace', since we only are
interested in the callchain for userspace, up to the callchain, I found
that 'perf evlist' is not decoding the branch_sample_type field, fix it.

Before:

  # perf record --call-graph lbr usleep 1
  # perf evlist -v
  cycles:ppp: size: 112, { sample_period, sample_freq }: 4000,
  sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK,
  disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1,
  precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1,
  comm_exec: 1, branch_sample_type: 51201
                ^^^^^^^^^^^^^^^^^^^^^^^^^

After:

  # perf evlist -v
  cycles:ppp: size: 112, { sample_period, sample_freq }: 4000,
  sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK,
  disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1,
  precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1,
  comm_exec: 1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-hozai7974u0ulgx13k96fcaw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-25 16:48:01 -03:00
Andrey Ryabinin 70a2cba972 perf buildid: Fix off-by-one in write_buildid()
write_buildid() increments 'name_len' with intention to take into
account trailing zero byte. However, 'name_len' was already incremented
in machine__write_buildid_table() before.  So this leads to
out-of-bounds read in do_write():

  $ ./perf record sleep 0
  [ perf record: Woken up 1 times to write data ]
  =================================================================
  ==15899==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000099fc92 at pc 0x7f1aa9c7eab5 bp 0x7fff940f84d0 sp 0x7fff940f7c78
  READ of size 19 at 0x00000099fc92 thread T0
      #0 0x7f1aa9c7eab4  (/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/libasan.so.2+0x44ab4)
      #1 0x649c5b in do_write util/header.c:67
      #2 0x649c5b in write_padded util/header.c:82
      #3 0x57e8bc in write_buildid util/build-id.c:239
      #4 0x57e8bc in machine__write_buildid_table util/build-id.c:278
  ...

  0x00000099fc92 is located 0 bytes to the right of global variable '*.LC99' defined in 'util/symbol.c' (0x99fc80) of size 18
    '*.LC99' is ascii string '[kernel.kallsyms]'
  ...

  Shadow bytes around the buggy address:
    0x00008012bf80: f9 f9 f9 f9 00 00 00 00 00 00 03 f9 f9 f9 f9 f9
  =>0x00008012bf90: 00 00[02]f9 f9 f9 f9 f9 00 00 00 00 00 05 f9 f9
    0x00008012bfa0: f9 f9 f9 f9 00 03 f9 f9 f9 f9 f9 f9 00 00 00 00

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461053847-5633-1-git-send-email-aryabinin@virtuozzo.com
[ Remove the off-by one at the origin, to keep len(s) == strlen(s) assumption ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-25 12:49:16 -03:00
Ingo Molnar 67d61296ff Merge tag 'perf-core-for-mingo-20160419' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

Build fixes:

- Fix 'perf trace' build when DWARF unwind isn't available (Arnaldo Carvalho de Melo)

- Remove x86 references from arch-neutral Build, fixing it in !x86 arches,
  reported as breaking the build for powerpc64le in linux-next (Arnaldo Carvalho de Melo)

Infrastructure changes:

- Do memset() variable 'st' using the correct size in the jit code (Colin Ian King)

- Fix postgresql ubuntu 'perf script' install instructions (Chris Phlipot)

- Use callchain_param more thoroughly when checking how callchains were
  configured, eventually will be the only way to look for callchain parameters
  (Arnaldo Carvalho de Melo)

- Fix some issues in the 'perf test kallsyms' entry (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23 14:50:39 +02:00
Ingo Molnar 65cbbd037b Merge branch 'perf/urgent' into perf/core, to resolve conflict
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23 14:12:10 +02:00
Arnaldo Carvalho de Melo e02092b9a9 perf symbols: Allow loading kallsyms without considering kcore files
Before the support for using /proc/kcore was introduced, the kallsyms
routines used /proc/modules and the first 'perf test' entry expected
finding maps for each module in the system, which is not the case with
the kcore code. Provide a way to ignore kcore files so that the test can
have its expectations met.

Improving the test to cover kcore files as well needs to be done.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ek5urnu103dlhfk4l6pcw041@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-19 12:38:56 -03:00
Arnaldo Carvalho de Melo 2cc4666927 perf build: Remove x86 references from arch-neutral Build
It will already be dealt with generating the syscalltbl.c file in the
x86 arch specific Build files, namely via 'archheaders'.

This fixes the build on !x86 arches, as reported for powerpcle

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 1b700c9975 ("perf tools: Build syscall table .c header from kernel's syscall_64.tbl")
Link: http://lkml.kernel.org/r/20160415212831.GT9056@kernel.org
[ Removed the syscalltbl.o altogether, as per Jiri's suggestion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-19 12:37:02 -03:00
Colin Ian King f56ebf20d0 perf jit: memset() variable 'st' using the correct size
The current code is memsetting the 'struct stat' variable 'st' with the size of
'stat' (which turns out to be 1 byte) rather than the size of variable 'sz'.

Committer notes:

sizeof(function) isn't valid, the result depends on the compiler used, with
gcc, enabling pedantic warnings we get:

  $ cat sizeof_function.c
  #include <sys/types.h>
  #include <sys/stat.h>
  #include <unistd.h>
  #include <stdio.h>

  int main(void)
  {
	  printf("sizeof(stat)=%zd, stat=%p\n", sizeof(stat), stat);
	  return 0;
  }
  $ readelf -sW sizeof_function | grep -w stat
      49: 0000000000400630    16 FUNC    WEAK   HIDDEN    13 stat
  $ cc -pedantic sizeof_function.c   -o sizeof_function
  sizeof_function.c: In function ‘main’:
  sizeof_function.c:8:46: warning: invalid application of ‘sizeof’ to a function type [-Wpointer-arith]
    printf("sizeof(stat)=%zd, stat=%p\n", sizeof(stat), stat);
                                              ^
  $ ./sizeof_function
  sizeof(stat)=1, stat=0x400630
  $

  Standard C, section 6.5.3.4:

  "The sizeof operator shall not be applied to an expression that has function
   type or an incomplete type, to the parenthesized name of such a type,
   or to an expression that designates a bit-field member."

  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
Link: http://lkml.kernel.org/r/1461020838-9260-1-git-send-email-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-19 12:37:01 -03:00
Ingo Molnar a19cad6d66 perf/urgent fix:
- Fix segfault tracing transactions in Intel PT (Adrian Hunter)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXFOtpAAoJENZQFvNTUqpAQK4P/1DZttzXWtDQAy38z85JnANu
 aC9LHUtmPdREF51ynAjJ55jbwkEiOV9rQaLMnIXgIw7S3Qv2SDtcqhK4qECg9XO/
 5SxI3Qp4+sp7SZZDAG5LxP8Oy/lfhGlXr8++gEr7yk3bar5wGnnWrAvY66/uuxyK
 ELLQ5z/TOPWCKksZxxwC3ZufGdA4CNzEEBmbJ+B9kYNnzH61Z1Rix1GUQM1iYD64
 jbD7zt0JkPJFhCbRsuqG46Yyt57wcOuz7Ve4a7twf4RMu04FeKheOlEB1eVdk22U
 q9NnTWNNcgRpaKsKpaWe+8LFFQ8jSxynClcgmbnNtXBtsPlNQSqGj2nObtcZSwcs
 pRatlwqIPjKj92/VK9GWer5tkdeAqaRTWkP5dxQqzzUAjbUQ7O+u/jMH3S67yWpv
 6SbhYqF0DpCTDnX25yhbNWSgpOlQTDN0HASqXzWCnnZ0MXDGSb01nbe1pSa4kAtX
 OGtAe5ZNP60IJTfb5njFh7AsrEX3ITNVWwgS08gVjsl2QjGN4DogRCW2CCJmDpaN
 q2ZKx3uKLSJziqLfbTonvwzEep1bSLRaue6RZQTW0YSp1SWcVRGb7pjWkhpPgTpc
 HQM1GRyHElhWTLh4CNDcIOc8vgY3qdDKcqUJzQWNSWHfmqMW4mIvEVUyvX1e0e1U
 o34sy/O2JBCgMTOnGDYL
 =F7NR
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-20160418' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull a perf/urgent fix from Arnaldo Carvalho de Melo:

- Fix segfault tracing transactions in Intel PT (Adrian Hunter)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-19 08:41:59 +02:00
Arnaldo Carvalho de Melo 30234f0925 perf callchain: Set callchain_param.enabled when parsing --call-graph
Trying to move in the direction of using callchain_param for all
callchain parameters, eventually ditching them from symbol_conf.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kixllia6r26mz45ng056zq7z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-18 11:53:07 -03:00
Arnaldo Carvalho de Melo acf2abbd0b perf evsel: Add missign class prefix to has_branch_stack method
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5i07ivw1yjsweb7gztr255jd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-18 11:17:09 -03:00
Adrian Hunter 1342e0b7a6 perf intel-pt: Fix segfault tracing transactions
Tracing a workload that uses transactions gave a seg fault as follows:

  perf record -e intel_pt// workload
  perf report
  Program received signal SIGSEGV, Segmentation fault.
  0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
  	at util/intel-pt.c:929
  929 ptq->last_branch_rb->nr = 0;
  (gdb) p ptq->last_branch_rb
  $1 = (struct branch_stack *) 0x0
  (gdb) up
  1148 intel_pt_reset_last_branch_rb(ptq);
  (gdb) l
  1143 if (ret)
  1144 pr_err("Intel Processor Trace: failed to deliver transaction event
  1145 ret);
  1146
  1147 if (pt->synth_opts.callchain)
  1148 intel_pt_reset_last_branch_rb(ptq);
  1149
  1150 return ret;
  1151 }
  1152
  (gdb) p pt->synth_opts.callchain
  $2 = true
  (gdb)
  (gdb) bt
   #0 0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
   #1 0x000000000054c1e0 in intel_pt_synth_transaction_sample (ptq=0x1a36110)
   #2 0x000000000054c5b2 in intel_pt_sample (ptq=0x1a36110)

Caused by checking the 'callchain' flag when it should have been the
'last_branch' flag.  Fix that.

Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: f14445ee72 ("perf intel-pt: Support generating branch stack")
Link: http://lkml.kernel.org/r/1460977068-11566-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-18 11:00:56 -03:00
Adam Buchbinder bd1a0be515 tools/perf: Fix misspellings in comments.
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-04-18 12:45:53 +02:00
Arnaldo Carvalho de Melo f5e7150cd9 perf evlist: Expose perf_event_mlock_kb_in_pages() helper
When the user doesn't set --mmap-pages, perf_evlist__mmap() will do it
by reading the maximum possible for a non-root user from the
/proc/sys/kernel/perf_event_mlock_kb file.

Expose that function so that 'perf trace' can, for root users, to bump
mmap-pages to a higher value for root, based on the contents of this
proc file.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xay69plylwibpb3l4isrpl1k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-15 17:46:31 -03:00
Arnaldo Carvalho de Melo 0883e820a0 perf record: Export record_opts based callchain parsing helper
To be able to call it outside option parsing, like when setting a
default --call-graph parameter in 'perf trace' when just --min-stack is
used.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xay69plylwibpb3l4isrpl1k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-15 16:37:17 -03:00
Arnaldo Carvalho de Melo 25da4fab5f perf evsel: Move fprintf methods to separate source file
They still use functions that would drag more stuff to the python
binding, where these fprintf methods are not used, so separate it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xfp0mgq3hh3px61di6ixi1jk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-14 19:46:58 -03:00
Arnaldo Carvalho de Melo d327e60cfa perf tools: Remove addr_location argument to sample__fprintf_callchain
Not used at all, nuke it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jf2w8ce8nl3wso3vuodg5jci@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-14 19:46:57 -03:00
Arnaldo Carvalho de Melo 6f736735e3 perf evsel: Require that callchains be resolved before calling fprintf_{sym,callchain}
This way the print routine merely does printing, not requiring access to
the resolving machinery, which helps disentangling the object files and
easing creating subsets with a limited functionality set.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2ti2jbra8fypdfawwwm3aee3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-14 19:46:56 -03:00
Arnaldo Carvalho de Melo bfbba189b6 perf symbols: Move fprintf routines to separate object file
To disentangle symbol printing from all the code related to symbol
tables, resolution of addresses to symbols, etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-eik9g3hbtdc7ddv57f1d4v3p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-14 19:46:53 -03:00
Arnaldo Carvalho de Melo de446b40d5 perf evsel: Remove symbol_conf usage
# perf test -v python
  16: Try 'import perf' in python, checking link problems      :
  --- start ---
  test child forked, pid 672
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  ImportError: /tmp/build/perf/python/perf.so: undefined symbol:
  symbol_conf
  test child finished with -1
  ---- end ----
  Try 'import perf' in python, checking link problems: FAILED!
  #

To fix it just pass a parameter to perf_evsel__fprintf_sym telling if
callchains should be printed.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-comrsr20bsnr8bg0n6rfwv12@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-14 14:56:06 -03:00
Arnaldo Carvalho de Melo 91d7b2de31 perf callchain: Start moving away from global per thread cursors
The recent perf_evsel__fprintf_callchain() move to evsel.c added several
new symbol requirements to the python binding, for instance:

  # perf test -v python
  16: Try 'import perf' in python, checking link problems      :
  --- start ---
  test child forked, pid 18030
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  ImportError: /tmp/build/perf/python/perf.so: undefined symbol:
  callchain_cursor
  test child finished with -1
  ---- end ----
  Try 'import perf' in python, checking link problems: FAILED!
  #

This would require linking against callchain.c to access to the global
callchain_cursor variables.

Since lots of functions already receive as a parameter a
callchain_cursor struct pointer, make that be the case for some more
function so that we can start phasing out usage of yet another global
variable.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-djko3097eyg2rn66v2qcqfvn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-14 14:48:07 -03:00
Taeung Song 20105ca124 perf config: Introduce perf_config_set class
This infrastructure code was designed for upcoming features of
'perf config'.

That collect config key-value pairs from user and system config files
(i.e. user wide ~/.perfconfig and system wide $(sysconfdir)/perfconfig)
to manage perf's configs.

Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1460620401-23430-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-14 09:00:42 -03:00
Wang Nan 040f9915e9 perf data: Add perf_data_file__switch() helper
perf_data_file__switch() closes current output file, renames it, then
open a new one to continue recording. It will be used by 'perf record'
to split output into multiple perf.data files.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460535673-159866-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-14 08:57:54 -03:00
Wang Nan b26dc73018 perf session: Make ordered_events reusable
ordered_events__free() leaves linked lists and timestamps not cleared,
so unable to be reused after ordered_events__free(). Which is inconvenient
after 'perf record' supports generating multiple perf.data output and
process build-ids for each of them.

Use ordered_events__reinit() for this.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460535673-159866-2-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-14 08:57:54 -03:00
Wang Nan 4532f64297 perf ordered_events: Introduce reinit()
'perf record' will use this when outputting multiple perf.data files.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460535673-159866-2-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-14 08:57:54 -03:00
Arnaldo Carvalho de Melo e20ab86e51 perf evsel: Move some methods from session.[ch] to evsel.[ch]
Those were converted to be evsel methods long ago, move the
source to where it belongs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vja8rjmkw3gd5ungaeyb5s2j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-13 10:11:52 -03:00
Jiri Olsa 097be0f503 perf thread_map: Make new_by_tid_str constructor public
It will be used in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-13 10:11:51 -03:00
Jiri Olsa e632aa69c9 perf cpu_map: Add has() method
Adding cpu_map__has() to return bool of cpu presence in cpus map.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-13 10:11:50 -03:00
Jiri Olsa 3407df8bbc perf thread_map: Add has() method
Adding thread_map__has() to return bool of pid presence in threads map.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-13 10:11:50 -03:00
Ingo Molnar bed9441ba7 perf/core improvements:
- Automagically create a 'bpf-output' event, easing the setup of BPF
   C "scripts" that produce output via the perf ring buffer. Now it is
   just a matter of calling any perf tool, such as 'trace', with a C
   source file that references the __bpf_stdout__ output channel and
   that channel will be created and connected to the script:
 
   # trace -e nanosleep --event test_bpf_stdout.c usleep 1
     0.013 ( 0.013 ms): usleep/2818 nanosleep(rqtp: 0x7ffcead45f40                                        ) ...
     0.013 (         ): __bpf_stdout__:Raise a BPF event!..)
     0.015 (         ): perf_bpf_probe:func_begin:(ffffffff81112460))
     0.261 (         ): __bpf_stdout__:Raise a BPF event!..)
     0.262 (         ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
     0.264 ( 0.264 ms): usleep/2818  ... [continued]: nanosleep()) = 0
   #
 
   Further work is needed to reduce the number of lines in a perf bpf C source
   file, this being the part where we greatly reduce the command line setup (Wang Nan)
 
 - 'perf trace' now supports callchains, with 'trace --call-graph dwarf' using
   libunwind, just like 'perf top', to ask the kernel for stack dumps for CFI
   processing. This reduces the overhead by asking just for userspace callchains
   and also only for the syscall exit tracepoint (raw_syscalls:sys_exit)
   (Milian Wolff, Arnaldo Carvalho de Melo)
 
   Try it with, for instance:
 
      # perf trace --call dwarf ping 127.0.0.1
 
   An excerpt of a system wide 'perf trace --call dwarf" session is at:
 
    https://fedorapeople.org/~acme/perf/perf-trace--call-graph-dwarf--all-cpus.txt
 
   You may need to bump the number of mmap pages, using -m/--mmap-pages,
   but on a Broadwell machine the defaults allowed system wide tracing to
   work without losing that many records, experiment with just some
   syscalls, like:
 
     # perf trace --call dwarf -e nanosleep,futex
 
   All the targets available for 'perf record', 'perf top' (--pid, --tid, --cpu,
   etc) should work. Also --duration may be interesting to try.
 
   To get filenames from in various syscalls pointer args (open, ettc), add this
   to the mix:
 
   # perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
 
   Making this work is next in line:
 
      # trace --call dwarf --ev sched:sched_switch/call-graph=fp/ usleep 1
 
   I.e. honouring per-tracepoint callchains in 'perf trace' in addition to
   in raw_syscalls:sys_exit.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXDFQyAAoJENZQFvNTUqpAZCsP/2Q1Q8XfNpNZm7+JPrZDYUkm
 KBxR3WSP8l46G8hJO2SBKHgXDv6EOCsfL/lvtLv18IHrz9pSTLZFPgl3a889iOnz
 /d2pC/ydlDQ9yPR28cELb7gKMB0OF+rUqdZIWBqSM84LnvsYHgY6CntEIejfc2wf
 jiVYHkug2dcUOmfgFpV4Jp3m6J8Okf9w9+/W4n+mkcS6o9WJvKCCiTMWoOwDDkyQ
 gfEGN7YJt2iYLg4AhsG9ZJa+XKye53znjodpFNLCVbozXbZ4YSEbogR0qKJksHfH
 5uabD2bEu2y0LiC9694xp5FLFM9tGML3Nr0JAkq6Jd230Ho4XyUy+/ZD0Lq0BHnv
 HdIR7T4+wUYVKUf/ZW8gbPShR63UJ6qrgfLE8yZMxG0WKzh3XIQtg/BcxLw8XPi8
 aF/IQt/om2KXPVEZv6SjNMp9DdmydeZ4KPrA9q2BGhbQzC2Ast7e6pHKouxbRrpb
 mOSfLgDcqPFp75ZpIbFatKdg6S8VNKtFgF8wWAGrACtLboKa5PDS3El56BSNx2IA
 6pexLuhaD8ndwvHP1F6nQQAHvFn5q4FKEg2fU0Pq8VnUN8SxrCvVjZZR3SjM+tGy
 V5GHzJ7GTn9Cm2fwllrD/tndzPWQsbFA0UuLZPwVoxq2Lt2HC0YG30+SupsAVZrx
 fCANHt3ci+qU1OCQAlIP
 =jBSC
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-20160411' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements from Arnaldo Carvalho de Melo:

User visible changes:

- Automagically create a 'bpf-output' event, easing the setup of BPF
  C "scripts" that produce output via the perf ring buffer. Now it is
  just a matter of calling any perf tool, such as 'trace', with a C
  source file that references the __bpf_stdout__ output channel and
  that channel will be created and connected to the script:

  # trace -e nanosleep --event test_bpf_stdout.c usleep 1
    0.013 ( 0.013 ms): usleep/2818 nanosleep(rqtp: 0x7ffcead45f40                                        ) ...
    0.013 (         ): __bpf_stdout__:Raise a BPF event!..)
    0.015 (         ): perf_bpf_probe:func_begin:(ffffffff81112460))
    0.261 (         ): __bpf_stdout__:Raise a BPF event!..)
    0.262 (         ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
    0.264 ( 0.264 ms): usleep/2818  ... [continued]: nanosleep()) = 0
  #

  Further work is needed to reduce the number of lines in a perf bpf C source
  file, this being the part where we greatly reduce the command line setup (Wang Nan)

- 'perf trace' now supports callchains, with 'trace --call-graph dwarf' using
  libunwind, just like 'perf top', to ask the kernel for stack dumps for CFI
  processing. This reduces the overhead by asking just for userspace callchains
  and also only for the syscall exit tracepoint (raw_syscalls:sys_exit)
  (Milian Wolff, Arnaldo Carvalho de Melo)

  Try it with, for instance:

     # perf trace --call dwarf ping 127.0.0.1

  An excerpt of a system wide 'perf trace --call dwarf" session is at:

   https://fedorapeople.org/~acme/perf/perf-trace--call-graph-dwarf--all-cpus.txt

  You may need to bump the number of mmap pages, using -m/--mmap-pages,
  but on a Broadwell machine the defaults allowed system wide tracing to
  work without losing that many records, experiment with just some
  syscalls, like:

    # perf trace --call dwarf -e nanosleep,futex

  All the targets available for 'perf record', 'perf top' (--pid, --tid, --cpu,
  etc) should work. Also --duration may be interesting to try.

  To get filenames from in various syscalls pointer args (open, ettc), add this
  to the mix:

  # perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'

  Making this work is next in line:

     # trace --call dwarf --ev sched:sched_switch/call-graph=fp/ usleep 1

  I.e. honouring per-tracepoint callchains in 'perf trace' in addition to
  in raw_syscalls:sys_exit.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 09:02:07 +02:00
Ingo Molnar aeaae7d612 perf/core improvements and fixes:
User visible:
 
 - Beautify more syscall arguments in 'perf trace', using the type column in
   tracepoint /format fields to attach, for instance, a pid_t resolver to the
   thread COMM, also attach a mode_t beautifier in the same fashion
   (Arnaldo Carvalho de Melo)
 
 - Build the syscall table id <-> name resolver using the same .tbl file
   used in the kernel to generate headers, to avoid the delay in getting
   new syscalls supported in the audit-libs external dependency, done so
   far only for x86_64 (Arnaldo Carvalho de Melo)
 
 - Improve the documentation of event specifications (Andi Kleen)
 
 - Process update events in 'perf script', fixing up this use case:
 
     # perf stat -a -I 1000 -e cycles record | perf script -s script.py
 
 - Shared object symbol adjustment fixes, fixing symbol resolution in
   Android (Wang Nan)
 
 Infrastructure:
 
 - Add dedicated unwind addr_space member into thread struct, to allow
   tools to use thread->priv, noticed while working on having callchains
   in 'perf trace' (Jiri Olsa)
 
 Build fixes:
 
 - Fix the build in Ubuntu 12.04 (Arnaldo Carvalho de Melo, Vinson Lee)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXB646AAoJENZQFvNTUqpAvroP/1BE/cViqrM3v351qkPDh2Dw
 Mat7ZDTDUxsZ2XqzJQTsRiao1oi2ALF1Kyh3UzgRLqsDuahQHhwpm4M9pegtgDsu
 I77yGGFVw/rdZLdJGHrGXE8HeOrvW6NKk7HqmnMlwpLpNkMfPVpqlq82/rUHcA46
 F98/7DTG/PNstbIxuCgV3fI1aB8/v+lVGLw+S7vnf/Xuba4MNdqTQNEzkXaAubzv
 c2I6YBwSO3v+oVgu3Vr679gGxNx3P/iGkgIb2WqJxezNiZUVdh4McJ4Dg3/mm5hj
 Q1scLuW2MGXy3WaQW4i7sjoAwlOX+4kkUO8l5yk67GXKLwstnx4SWJNyeupMxDcp
 UKl2mPcZIPaCPjR7aFnR6Xvxv71QkEEgeQlvR1gwAZeONBtZ9kOFD2jJfK5tbwyS
 GG3rF5dB4KJqjr31tDcGYxuti4ggiJIvi/ujyrKvJUl7JChkJVDbqENOK35qistR
 iHV1nSEQM1O2+NNaH4Ise9WYjQOH5tERc9L9XOpaR9x6KDsz1TR7bqqBu8HdASJs
 v9x02RaPjDc/x3MUrhZ1sEDOqEqLO2eHhq1stnFTz4x9r7k+YPySS43FtnARB1CZ
 H8Pc21gyxt1pzO0ogvnLoji1715+DaSaEWas+vrfzJ1cJ4Tf6v5zYEuOfYJhDQrC
 dK75s0KDR6DCO/cwUEr1
 =AF/P
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-20160408' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

- Beautify more syscall arguments in 'perf trace', using the type column in
  tracepoint /format fields to attach, for instance, a pid_t resolver to the
  thread COMM, also attach a mode_t beautifier in the same fashion
  (Arnaldo Carvalho de Melo)

- Build the syscall table id <-> name resolver using the same .tbl file
  used in the kernel to generate headers, to avoid the delay in getting
  new syscalls supported in the audit-libs external dependency, done so
  far only for x86_64 (Arnaldo Carvalho de Melo)

- Improve the documentation of event specifications (Andi Kleen)

- Process update events in 'perf script', fixing up this use case:

    # perf stat -a -I 1000 -e cycles record | perf script -s script.py

- Shared object symbol adjustment fixes, fixing symbol resolution in
  Android (Wang Nan)

Infrastructure changes:

- Add dedicated unwind addr_space member into thread struct, to allow
  tools to use thread->priv, noticed while working on having callchains
  in 'perf trace' (Jiri Olsa)

Build fixes:

- Fix the build in Ubuntu 12.04 (Arnaldo Carvalho de Melo, Vinson Lee)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 08:57:50 +02:00
Ingo Molnar 889fac6d67 Linux 4.6-rc3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXCva8AAoJEHm+PkMAQRiGXBoIAIkrjxdbuT2nS9A3tHwkiFXa
 6/Th1UjbNaoLuZ+MckQHayAD9NcWY9lVjOUmFsSiSWMCQK/rTWDl8x5ITputrY2V
 VuhrJCwI7huEtu6GpRaJaUgwtdOjhIHz1Ue2MCdNIbKX3l+LjVyyJ9Vo8rruvZcR
 fC7kiivH04fYX58oQ+SHymCg54ny3qJEPT8i4+g26686m11hvZLI3UAs2PAn6ut+
 atCjxdQ4yLN3DWsbjuA7wYGWhTgFloxL4TIoisuOUc3FXnSi/ivIbXZvu4lUfisz
 LA2JBhfII3AEMBWG9xfGbXPijJTT4q7yNlTD0oYcnMtAt/Roh2F04asqB1LetEY=
 =bri6
 -----END PGP SIGNATURE-----

Merge tag 'v4.6-rc3' into perf/core, to refresh the tree

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 08:57:03 +02:00
Arnaldo Carvalho de Melo fd4be13067 perf evsel: Allow unresolved symbol names to be printed as addresses
The fprintf_sym() and fprintf_callchain() methods now allow users to
change the existing behaviour of showing "[unknown]" as the name of
unresolved symbols to instead show "[0x123456]", i.e. its address.

The current patch doesn't change tools to use this facility, the results
from 'perf trace' and 'perf script' cotinue like:

70.109 ( 0.001 ms): qemu-system-x8/10153 poll(ufds: 0x7f2d93ffe870, nfds: 1) = 0 Timeout
                                   [unknown] (/usr/lib64/libc-2.22.so)
                                   [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                   [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                   [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                   start_thread+0xca (/usr/lib64/libpthread-2.22.so)
                                   __clone+0x6d (/usr/lib64/libc-2.22.so)

The next patch will make 'perf trace' use the new formatting.

Suggested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-11 22:18:24 -03:00
Arnaldo Carvalho de Melo 01e0d50c3f perf evsel: Rename config_callgraph() to config_callchain() and make it public
The rename is for consistency with the parameter name.

Make it public for fine grained control of which evsels should have
callchains enabled, like, for instance, will be done in the next
changesets in 'perf trace', to enable callchains just on the
"raw_syscalls:sys_exit" tracepoint.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-og8vup111rn357g4yagus3ao@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-11 22:18:23 -03:00
Arnaldo Carvalho de Melo 22c8a376b5 perf evlist: Add (reset,set)_sample_bit methods
For fiddling with sample_type fields in all evsels in an evlist.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-dg6yavctt0hzl2tsgfb43qsr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-11 22:18:21 -03:00
Arnaldo Carvalho de Melo e68ae9cf7d perf evsel: Do not use globals in config()
Instead receive a callchain_param pointer to configure callchain
aspects, not doing so if NULL is passed.

This will allow fine grained control over which evsels in an evlist
gets callchains enabled.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2mupip6khc92mh5x4nw9to82@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-11 22:18:20 -03:00
Arnaldo Carvalho de Melo ea4539652e perf evsel: Introduce fprintf_callchain() method out of fprintf_sym()
In 'perf trace' we're just interested in printing callchains, and we
don't want to use the symbol_conf.use_callchain, so move the callchain
part to a new method.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kcn3romzivcpxb3u75s9nz33@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-11 22:18:18 -03:00
Arnaldo Carvalho de Melo ff0c107806 perf evsel: Rename print_ip() to fprintf_sym()
As it receives a FILE, and its more than just the IP, which can even be
requested not to be printed.

For consistency with other similar methods in tools/perf/, name it as
perf_evsel__fprintf_sym() and make it return the number of bytes
printed, just like 'fprintf(3)'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-84gawlqa3lhk63nf0t9vnqnn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-11 22:18:17 -03:00
Arnaldo Carvalho de Melo db3617f362 perf evsel: Allow passing a left alignment when printing a symbol
For callchains, etc where we want it to align just below the syscall
name, for instance, in 'perf trace'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uk9ekchd67651c625ltaur5y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-11 22:18:15 -03:00
Milian Wolff 6186de9a49 perf evsel: Allow specifying a file to output in perf_evsel__print_ip
As this function will be used in 'perf trace'.

Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/n/tip-8x297v9utnxq77onikevvlse@git.kernel.org
[ Split from a larger patch ]
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
2016-04-11 22:18:14 -03:00
Wang Nan 72c0809856 perf bpf: Automatically create bpf-output event __bpf_stdout__
This patch removes the need to set a bpf-output event in cmdline.  By
referencing a map named '__bpf_stdout__', perf automatically creates an
event for it.

For example:

  # perf record -e ./test_bpf_trace.c usleep 100000
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ]
  # perf script
           usleep  4639 [000] 261895.307826:        0            __bpf_stdout__:  ffffffff810eb9a1 ...
       BPF output: 0000: 52 61 69 73 65 20 61 20  Raise a
                   0008: 42 50 46 20 65 76 65 6e  BPF even
                   0010: 74 21 00 00              t!..
       BPF string: "Raise a BPF event!"

           usleep  4639 [000] 261895.407883:        0            __bpf_stdout__:  ffffffff8105d609 ...
       BPF output: 0000: 52 61 69 73 65 20 61 20  Raise a
                   0008: 42 50 46 20 65 76 65 6e  BPF even
                   0010: 74 21 00 00              t!..
       BPF string: "Raise a BPF event!"

  perf record -e ./test_bpf_trace.c usleep 100000

  equals to:

  perf record -e bpf-output/no-inherit=1,name=__bpf_stdout__/ \
              -e ./test_bpf_trace.c/map:__bpf_stdout__.event=__bpf_stdout__/ \
              usleep 100000

Where test_bpf_trace.c is:

  /************************ BEGIN **************************/
  #include <uapi/linux/bpf.h>
  struct bpf_map_def {
         unsigned int type;
         unsigned int key_size;
         unsigned int value_size;
         unsigned int max_entries;
  };
  #define SEC(NAME) __attribute__((section(NAME), used))
  static u64 (*ktime_get_ns)(void) =
         (void *)BPF_FUNC_ktime_get_ns;
  static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
         (void *)BPF_FUNC_trace_printk;
  static int (*get_smp_processor_id)(void) =
         (void *)BPF_FUNC_get_smp_processor_id;
  static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
         (void *)BPF_FUNC_perf_event_output;

  struct bpf_map_def SEC("maps") __bpf_stdout__ = {
         .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
         .key_size = sizeof(int),
         .value_size = sizeof(u32),
         .max_entries = __NR_CPUS__,
  };

  static inline int __attribute__((always_inline))
  func(void *ctx, int type)
  {
	char output_str[] = "Raise a BPF event!";
	char err_str[] = "BAD %d\n";
	int err;

        err = perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(),
			        &output_str, sizeof(output_str));
	if (err)
		trace_printk(err_str, sizeof(err_str), err);
        return 1;
  }
  SEC("func_begin=sys_nanosleep")
  int func_begin(void *ctx) {return func(ctx, 1);}
  SEC("func_end=sys_nanosleep%return")
  int func_end(void *ctx) { return func(ctx, 2);}
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  /************************* END ***************************/

Committer note:

Testing with 'perf trace':

  # trace -e nanosleep --ev test_bpf_stdout.c usleep 1
     0.007 ( 0.007 ms): usleep/729 nanosleep(rqtp: 0x7ffc5bbc5fe0) ...
     0.007 (         ): __bpf_stdout__:Raise a BPF event!..)
     0.008 (         ): perf_bpf_probe:func_begin:(ffffffff81112460))
     0.069 (         ): __bpf_stdout__:Raise a BPF event!..)
     0.070 (         ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
     0.072 ( 0.072 ms): usleep/729  ... [continued]: nanosleep()) = 0
  #

Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460128045-97310-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-11 22:18:04 -03:00
Wang Nan d78885739a perf bpf: Clone bpf stdout events in multiple bpf scripts
This patch allows cloning bpf-output event configuration among multiple
bpf scripts. If there exist a map named '__bpf_output__' and not
configured using 'map:__bpf_output__.event=', this patch clones the
configuration of another '__bpf_stdout__' map. For example, following
command:

  # perf trace --ev bpf-output/no-inherit,name=evt/ \
               --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \
               --ev ./test_bpf_trace2.c usleep 100000

equals to:

  # perf trace --ev bpf-output/no-inherit,name=evt/ \
               --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/  \
               --ev ./test_bpf_trace2.c/map:__bpf_stdout__.event=evt/ \
               usleep 100000

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460128045-97310-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-11 22:17:45 -03:00
Arnaldo Carvalho de Melo bfc279f3d2 perf tools: Use readdir() instead of deprecated readdir_r()
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case when parsing tracepoint event definitions, to
avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it
instead of readdir_r().

See: http://man7.org/linux/man-pages/man3/readdir.3.html

"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe.  In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."

Noticed while building on a Fedora Rawhide docker container.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wddn49r6bz6wq4ee3dxbl7lo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08 11:53:02 -03:00
Arnaldo Carvalho de Melo 7093b4c963 perf tools: Use readdir() instead of deprecated readdir_r()
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case when synthesizing events for pre-existing threads
by traversing /proc, so, to avoid breaking the build with glibc-2.23.90
(upcoming 2.24), use it instead of readdir_r().

See: http://man7.org/linux/man-pages/man3/readdir.3.html

"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe.  In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."

Noticed while building on a Fedora Rawhide docker container.

   CC       /tmp/build/perf/util/event.o
  util/event.c: In function '__event__synthesize_thread':
  util/event.c:466:2: error: 'readdir_r' is deprecated [-Werror=deprecated-declarations]
    while (!readdir_r(tasks, &dirent, &next) && next) {
    ^~~~~
  In file included from /usr/include/features.h:368:0,
                   from /usr/include/stdint.h:25,
                   from /usr/lib/gcc/x86_64-redhat-linux/6.0.0/include/stdint.h:9,
                   from /git/linux/tools/include/linux/types.h:6,
                   from util/event.c:1:
  /usr/include/dirent.h:189:12: note: declared here

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i1vj7nyjp2p750rirxgrfd3c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08 11:32:15 -03:00
Arnaldo Carvalho de Melo 3354cf7110 perf thread_map: Use readdir() instead of deprecated readdir_r()
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case in thread_map, so, to avoid breaking the build
with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r().

See: http://man7.org/linux/man-pages/man3/readdir.3.html

"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe.  In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."

Noticed while building on a Fedora Rawhide docker container.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-del8h2a0f40z75j4r42l96l0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08 11:31:24 -03:00
Wang Nan 99e87f7bb7 perf symbols: Adjust symbol for shared objects
He Kuang reported a problem that perf fails to get correct symbol on
Android platform in [1]. The problem can be reproduced on normal x86_64
platform. I will describe the reproducing steps in detail at the end of
commit message.

The reason of this problem is the missing of symbol adjustment for normal
shared objects. In most of the cases skipping adjustment is okay. However,
when '.text' section have different 'address' and 'offset' the result is wrong.
I checked all shared objects in my working platform, only wine dll objects and
debug objects (in .debug) have this problem. However, it is common on Android.
For example:

 $ readelf -S ./libsurfaceflinger.so | grep \.text
   [10] .text             PROGBITS         0000000000029030  00012030

This patch enables symbol adjustment for dynamic objects so the symbol
address got from elfutils would be adjusted correctly.

Now nearly all types of ELF files should adjust symbols. Makes
ss->adjust_symbols default to true.

Steps to reproduce the problem:

  $ cat ./Makefile
  PWD := $(shell pwd)
  LDFLAGS += "-Wl,-rpath=$(PWD)"
  CFLAGS += -g
  main: main.c libbuggy.so
  libbuggy.so: buggy.c
	gcc -g -shared -fPIC -Wl,-Ttext-segment=0x200000 $< -o $@
  clean:
	rm -rf main libbuggy.so *.o

  $ cat ./buggy.c
  int fib(int x)
  {
      return (x == 0) ? 1 : (x == 1) ? 1 : fib(x - 1) + fib(x - 2);
  }

  $ cat ./main.c
  #include <stdio.h>

  extern int fib(int x);
  int main()
  {
     int i;

     for (i = 0; i < 40; i++)
         printf("%d\n", fib(i));
     return 0;
 }

 $ make
 $ perf record ./main
 ...
 $ perf report --stdio
 # Overhead  Command  Shared Object      Symbol
 # ........  .......  .................  ...............................
 #
     14.97%  main     libbuggy.so        [.] 0x000000000000066c
      8.68%  main     libbuggy.so        [.] 0x00000000000006aa
      8.52%  main     libbuggy.so        [.] fib@plt
      7.95%  main     libbuggy.so        [.] 0x0000000000000664
      5.94%  main     libbuggy.so        [.] 0x00000000000006a9
      5.35%  main     libbuggy.so        [.] 0x0000000000000678
 ...

The correct result should be (after this patch):

  # Overhead  Command  Shared Object      Symbol
  # ........  .......  .................  ...............................
  #
      91.47%  main     libbuggy.so        [.] fib
       8.52%  main     libbuggy.so        [.] fib@plt
       0.00%  main     [kernel.kallsyms]  [k] kmem_cache_free

[1] http://lkml.kernel.org/g/1452567507-54013-1-git-send-email-hekuang@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460024671-64774-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08 09:58:15 -03:00
Wang Nan a58f7033ba perf symbols: Record text offset in dso to calculate objdump address
In this patch, the offset of '.text' section is stored into dso
and used here to re-calculate address to objdump.

In most of the cases, executable code is in '.text' section, so the
adjustment made to a symbol in dso__load_sym (using
sym.st_value -= shdr.sh_addr - shdr.sh_offset) should equal to
'sym.st_value -= dso->text_offset'. Therefore, adding text_offset back
get objdump address from symbol address (rip). However, it is not true
for kernel and kernel module since there could be multiple executable
sections with different offset. Exclude kernel for this reason.

After this patch, even dso->adjust_symbols is set to true for shared
objects, map__rip_2objdump() and map__objdump_2mem() would return
correct result, so perf behavior of annotate won't be changed.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460024671-64774-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08 09:58:14 -03:00
Arnaldo Carvalho de Melo 1b700c9975 perf tools: Build syscall table .c header from kernel's syscall_64.tbl
We used libaudit to map ids to syscall names and vice-versa, but that
imposes a delay in supporting new syscalls, having to wait for libaudit
to get those new syscalls on its tables.

To remove that delay, for x86_64 initially, grab a copy of
arch/x86/entry/syscalls/syscall_64.tbl and use it to generate those
tables.

Syscalls currently not available in audit-libs:

  # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd
  Error:	Invalid syscall copy_file_range, membarrier, mlock2, pread64, pwrite64, timerfd_create, userfaultfd
  Hint:	try 'perf list syscalls:sys_enter_*'
  Hint:	and: 'man syscalls'
  #

With this patch:

  # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd
    8505.733 ( 0.010 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 36
    8506.688 ( 0.005 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 40
   30023.097 ( 0.025 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63ae382000, count: 4096, pos: 529592320) = 4096
   31268.712 ( 0.028 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afd8b000, count: 4096, pos: 2314133504) = 4096
   31268.854 ( 0.016 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afda2000, count: 4096, pos: 2314137600) = 4096

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-51xfjbxevdsucmnbc4ka5r88@git.kernel.org
[ Added make dep for 'prepare' in 'LIBPERF_IN', fix by Wang Nan to fix parallell build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08 09:58:14 -03:00
Arnaldo Carvalho de Melo 5af56fab2b perf tools: Allow generating per-arch syscall table arrays
Tools should use a mechanism similar to arch/x86/entry/syscalls/ to
generate a header file with the definitions for two variables:

  static const char *syscalltbl_x86_64[] = {
	[0] = "read",
	[1] = "write",
  <SNIP>
	[324] = "membarrier",
	[325] = "mlock2",
	[326] = "copy_file_range",
  };
  static const int syscalltbl_x86_64_max_id = 326;

In a per arch file that should then be included in
tools/perf/util/syscalltbl.c.

First one will be for x86_64.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-02uuamkxgccczdth8komspgp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08 09:58:14 -03:00
Arnaldo Carvalho de Melo fd0db10268 perf trace: Move syscall table id <-> name routines to separate class
We're using libaudit for doing name to id and id to syscall name
translations, but that makes 'perf trace' to have to wait for newer
libaudit versions supporting recently added syscalls, such as
"userfaultfd" at the time of this changeset.

We have all the information right there, in the kernel sources, so move
this code to a separate place, wrapped behind functions that will
progressively use the kernel source files to extract the syscall table
for use in 'perf trace'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i38opd09ow25mmyrvfwnbvkj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08 09:58:13 -03:00
Jiri Olsa e583d70c54 perf tools: Add dedicated unwind addr_space member into thread struct
Milian reported issue with thread::priv, which was double booked by perf
trace and DWARF unwind code. So using those together is impossible at
the moment.

Moving DWARF unwind private data into separate variable so perf trace
can keep using thread::priv.

Reported-and-Tested-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andreas Hollmann <hollmann@in.tum.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460013073-18444-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08 09:58:02 -03:00
Jiri Olsa 7d6a7e7825 perf tools: Introduce trim function
To be used in cases for both sides trim.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andreas Hollmann <hollmann@in.tum.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460013073-18444-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-07 10:21:49 -03:00
Arnaldo Carvalho de Melo 76e20522b7 perf script perl: Do error checking on new backtrace routine
This ended up triggering these warnings when building on Ubuntu 12.04.5:

  util/scripting-engines/trace-event-perl.c: In function 'perl_process_callchain':
  util/scripting-engines/trace-event-perl.c:293:4: error: value computed is not used [-Werror=unused-value]
  util/scripting-engines/trace-event-perl.c:294:4: error: value computed is not used [-Werror=unused-value]
  util/scripting-engines/trace-event-perl.c:295:4: error: value computed is not used [-Werror=unused-value]
  util/scripting-engines/trace-event-perl.c:297:4: error: value computed is not used [-Werror=unused-value]
  util/scripting-engines/trace-event-perl.c:309:4: error: value computed is not used [-Werror=unused-value]
  cc1: all warnings being treated as errors
  mv: cannot stat `/tmp/build/perf/util/scripting-engines/.trace-event-perl.o.tmp': No such file or directory
  make[4]: *** [/tmp/build/perf/util/scripting-engines/trace-event-perl.o] Error 1

Fix it by doing error checking when building the perl data structures
related to callchains.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@gmail.com>
Fixes: f7380c12ec ("perf script perl: Perl scripts now get a backtrace, like the python ones")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06 10:44:28 -03:00
Arnaldo Carvalho de Melo bd0419e2a5 perf probe: Check if dwarf_getlocations() is available
If not, tell the user that:

  config/Makefile:273: Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157

And return -ENOTSUPP in die_get_var_range(), failing features that
need it, like the one pointed out above.

This fixes the build on older systems, such as Ubuntu 12.04.5.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vinson Lee <vlee@freedesktop.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9l7luqkq4gfnx7vrklkq4obs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06 10:44:28 -03:00
Vinson Lee d8e28654f2 perf config: Fix build with older toolchain.
Fix build error on Ubuntu 12.04.5 with GCC 4.6.3.

    CC       util/config.o
  util/config.c: In function ‘perf_buildid_config’:
  util/config.c:384:15: error: declaration of ‘dirname’ shadows a global declaration [-Werror=shadow]

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 9cb5987c82 ("perf config: Rework buildid_dir_command_config to perf_buildid_config")
Link: http://lkml.kernel.org/r/1459807659-9020-1-git-send-email-vlee@freedesktop.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06 10:44:28 -03:00
Linus Torvalds 4c3b73c6a2 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc kernel side fixes:

   - fix event leak
   - fix AMD PMU driver bug
   - fix core event handling bug
   - fix build bug on certain randconfigs

  Plus misc tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd/ibs: Fix pmu::stop() nesting
  perf/core: Don't leak event in the syscall error path
  perf/core: Fix time tracking bug with multiplexing
  perf jit: genelf makes assumptions about endian
  perf hists: Fix determination of a callchain node's childlessness
  perf tools: Add missing initialization of perf_sample.cpumode in synthesized samples
  perf tools: Fix build break on powerpc
  perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL
  perf bench: Fix detached tarball building due to missing 'perf bench memcpy' headers
  perf tests: Fix tarpkg build test error output redirection
2016-04-03 07:22:12 -05:00
Wang Nan d37ba88059 perf bpf: Add sample types for 'bpf-output' event
Before this patch we can see very large time in the events before the
'bpf-output' event. For example:

  # perf trace -vv -T --ev sched:sched_switch \
                      --ev bpf-output/no-inherit,name=evt/ \
                      --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                      usleep 10
  ...
  18446744073709.551 (18446564645918.480 ms): usleep/4157 nanosleep(rqtp: 0x7ffd3f0dc4e0) ...
  18446744073709.551 (         ): evt:Raise a BPF event!..)
  179427791.076 (         ): perf_bpf_probe:func_begin:(ffffffff810eb9a0))
  179427791.081 (         ): sched:sched_switch:usleep:4157 [120] S ==> swapper/2:0 [120])
  ...

We can also see the differences between bpf-output events and
breakpoint events:

For bpf output event:
   sample_type                    IP|TID|RAW|IDENTIFIER

For tracepoint events:
   sample_type                    IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER

This patch fix this differences by adding more sample type for
bpf-output events.

After this patch:

  # perf trace -vv -T --ev sched:sched_switch \
                      --ev bpf-output/no-inherit,name=evt/ \
                      --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                      usleep 10
  ...
  179877370.878 ( 0.003 ms): usleep/5336 nanosleep(rqtp: 0x7ffff866c450) ...
  179877370.878 (         ): evt:Raise a BPF event!..)
  179877370.878 (         ): perf_bpf_probe:func_begin:(ffffffff810eb9a0))
  179877370.882 (         ): sched:sched_switch:usleep:5336 [120] S ==> swapper/4:0 [120])
  179877370.945 (         ): evt:Raise a BPF event!..)
  ...

  # ./perf trace -vv -T --ev sched:sched_switch \
                        --ev bpf-output/no-inherit,name=evt/ \
                        --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                        usleep 10 2>&1 | grep sample_type
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW

The 'IDENTIFIER' info is not required because all events have the same
sample_type.

Committer notes:

Further testing, on top of the changes making 'perf trace' avoid samples
from events without PERF_SAMPLE_TIME:

Before:

  # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10
  <SNIP>
    0.560 ( 0.001 ms): brk(                                                   ) = 0x55e5a1df8000
    18446640227439.430 (18446640227438.859 ms): nanosleep(rqtp: 0x7ffc96643370) ...
    18446640227439.430 (         ): evt:Raise a BPF event!..)
    0.576 (         ): perf_bpf_probe:func_begin:(ffffffff81112460))
    18446640227439.430 (         ): evt:Raise a BPF event!..)
    0.645 (         ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
    0.646 ( 0.076 ms):  ... [continued]: nanosleep()) = 0
  #

After:

  # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10
  <SNIP>
     0.292 ( 0.001 ms): brk(                          ) = 0x55c7cd6e1000
     0.302 ( 0.004 ms): nanosleep(rqtp: 0x7ffedd8bc0f0) ...
     0.302 (         ): evt:Raise a BPF event!..)
     0.303 (         ): perf_bpf_probe:func_begin:(ffffffff81112460))
     0.397 (         ): evt:Raise a BPF event!..)
     0.397 (         ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
     0.398 ( 0.100 ms):  ... [continued]: nanosleep()) = 0

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1459517202-42320-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-01 18:46:25 -03:00
Kan Liang ac0e2cd555 perf tools: Fix PMU term format max value calculation
Currently the max value of format is calculated by the bits number. It
relies on the continuity of the format.

However, uncore event format is not continuous. E.g. uncore qpi event
format can be 0-7,21.

If bit 21 is set, there is parsing issues as below.

  $ perf stat -a -e uncore_qpi_0/event=0x200002,umask=0x8/
  event syntax error: '..pi_0/event=0x200002,umask=0x8/'
                                    \___ value too big for format, maximum is 511

This patch return the real max value by setting all possible bits to 1.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1459365375-14285-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-01 18:46:24 -03:00
Adrian Hunter 2a28e23049 perf jit: Add support for using TSC as a timestamp
Intel PT uses TSC as a timestamp, so add support for using TSC instead
of the monotonic clock.  Use of TSC is selected by an environment
variable "JITDUMP_USE_ARCH_TIMESTAMP" and flagged in the jitdump file
with flag JITDUMP_FLAGS_ARCH_TIMESTAMP.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457426330-30226-1-git-send-email-adrian.hunter@intel.com
[ Added the fixup from He Kuang to make it build on other arches, ]
[ such as aarch64, to avoid inserting this bisectiong breakage upstream ]
Link: http://lkml.kernel.org/r/1459482572-129494-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-01 18:42:55 -03:00
Adrian Hunter 46bc29b970 perf tools: Add time conversion event
Intel PT uses the time members from the perf_event_mmap_page to convert
between TSC and perf time.

Due to a lack of foresight when Intel PT was implemented, those time
members were recorded in the (implementation dependent) AUXTRACE_INFO
event, the structure of which is generally inaccessible outside of the
Intel PT decoder.  However now the conversion between TSC and perf time
is needed when processing a jitdump file when Intel PT has been used for
tracing.

So add a user event to record the time members.  'perf record' will
synthesize the event if the information is available.  And session
processing will put a copy of the event on the session so that tools
like 'perf inject' can easily access it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31 10:52:24 -03:00
Ingo Molnar 643cb15ba0 perf/core improvements and fixes:
User visible:
 
 - Add support for skipping itrace instructions, useful to fast forward
   processor trace (Intel PT, BTS) to right after initialization code at the start
   of a workload (Andi Kleen)
 
 - Add support for backtraces in perl 'perf script's (Dima Kogan)
 
 - Add -U/-K (--all-user/--all-kernel) options to 'perf mem' (Jiri Olsa)
 
 - Make -f/--force option documentation consistent across tools (Jiri Olsa)
 
 Infrastructure:
 
 - Add 'perf test' to check for event times (Jiri Olsa)
 
 - 'perf config' cleanups (Taeung Song)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW++RsAAoJENZQFvNTUqpAxrgP/A8i+A+WqTok8jEvmeCnWiqQ
 b8ZIdiLOM0hrDHrZwSXvmOqyicix1mOaHfS1INm76i9C0Olwh2SRla4/MIYVI5vI
 MAdvXIABgm0ychNh+XtleE16gXRWuFfc/7NBZk3f36bYUXhErOGxFuAUFlK1iOZ3
 0D4rZaJkvVNc2LG01W56uZxueD5R0oCae8ctuVH7cqiCtWH6geIkrOHW2xk6Fj30
 IU48Oq5C0p4q1n2YHpNudpngUH+Sr9GSMKPtRYCtGkuD/AgDzKfIDrqppeIK33Yo
 SXKXkTCK2+cCajCbAFQgLf8xP+GLSgbk2fnQMsSXmEjnng7w0s0bu+Yew33qVjnx
 /Hiyl0kkFD3xuSdmqlQE15ltrB1s1RHtBqK10SxZVQ7MNFLg5lltpMddEf/d/U3Z
 U/jii+RYJKVIRnS/QwqRZdwnsRwTCeGB3KyZU55mM0P0PN3+U+EMTRasWvdyE7ou
 7rSnhKdY9swybZyU0Z084LOWHSIEfYeMD/JofqDOrLpR/3ZKIFbmWAv5kcPp+tXL
 CMoNPY2QIvo7GQXy3Qju8cthi/Td5GO1vr2InYgmm2IqksOT7OgvTssxbfbYOB/M
 ii6wrkT/WILQNoMThoZXm+dFJP+6Lxsu0KbpEVBytnd0tmSgnHVPlFeGLB3Qfblv
 d2mZss5ibOsMpo0f22hm
 =NGjt
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-20160330' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes:

User visible changes:

  - Add support for skipping itrace instructions, useful to fast forward
    processor trace (Intel PT, BTS) to right after initialization code at the start
    of a workload (Andi Kleen)

  - Add support for backtraces in perl 'perf script's (Dima Kogan)

  - Add -U/-K (--all-user/--all-kernel) options to 'perf mem' (Jiri Olsa)

  - Make -f/--force option documentation consistent across tools (Jiri Olsa)

Infrastructure changes:

  - Add 'perf test' to check for event times (Jiri Olsa)

  - 'perf config' cleanups (Taeung Song)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 08:33:43 +02:00
Anton Blanchard 9f56c092b9 perf jit: genelf makes assumptions about endian
Commit 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
incorrectly assumed that PowerPC is big endian only.

Simplify things by consolidating the define of GEN_ELF_ENDIAN and checking
for __BYTE_ORDER == __BIG_ENDIAN.

The PowerPC checks were also incorrect, they do not match what gcc
emits. We should first look for __powerpc64__, then __powerpc__.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Carl Love <cel@us.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Fixes: 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
Link: http://lkml.kernel.org/r/20160329175944.33a211cc@kryten
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 18:12:06 -03:00
Andi Kleen d1706b39f0 perf tools: Add support for skipping itrace instructions
When using 'perf script' to look at PT traces it is often useful to
ignore the initialization code at the beginning.

On larger traces which may have many millions of instructions in
initialization code doing that in a pipeline can be very slow, with perf
script spending a lot of CPU time calling printf and writing data.

This patch adds an extension to the --itrace argument that skips 'n'
events (instructions, branches or transactions) at the beginning. This
is much more efficient.

v2:
Add support for BTS (Adrian Hunter)
Document in itrace.txt
Fix branch check
Check transactions and instructions too

Committer note:

To test intel_pt one needs to make sure VT-x isn't active, i.e.
stopping KVM guests on the test machine, as described by Andi Kleen
at http://lkml.kernel.org/r/20160301234953.GD23621@tassilo.jf.intel.com

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1459187142-20035-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:09 -03:00
Dima Kogan f7380c12ec perf script perl: Perl scripts now get a backtrace, like the python ones
We have some infrastructure to use perl or python to analyze logs
generated by perf.  Prior to this patch, only the python tools had
access to backtrace information.  This patch makes this information
available to perl scripts as well.  Example:

  Let's look at malloc() calls made by the seq utility.  First we
  create a probe point:

      $ perf probe -x /lib/x86_64-linux-gnu/libc.so.6 malloc
      Added new events:
      ...

  Now we run seq, while monitoring malloc() calls with perf

      $ perf record --call-graph=dwarf -e probe_libc:malloc seq 5
      1
      2
      3
      4
      5
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.064 MB perf.data (6 samples) ]

  We can use perf to look at its log to see the malloc calls and the backtrace

      $ perf script
      seq 14195 [000] 1927993.748254: probe_libc:malloc: (7f9ff8edd320) bytes=0x22
                  7f9ff8edd320 malloc (/lib/x86_64-linux-gnu/libc-2.22.so)
                  7f9ff8e8eab0 set_binding_values.part.0 (/lib/x86_64-linux-gnu/libc-2.22.so)
                  7f9ff8e8eda1 __bindtextdomain (/lib/x86_64-linux-gnu/libc-2.22.so)
                        401b22 main (/usr/bin/seq)
                  7f9ff8e82610 __libc_start_main (/lib/x86_64-linux-gnu/libc-2.22.so)
                        402799 _start (/usr/bin/seq)
      ...

  We can also use the scripting facilities.  We create a skeleton perl
  script that simply prints out the events

      $ perf script -g perl
      generated Perl script: perf-script.pl

  We can then use this script to see the malloc() calls with a
  backtrace.  Prior to this patch, the backtrace was not available to
  the perl scripts.

      $ perf script -s perf-script.pl
      probe_libc::malloc  0 1927993.748254260  14195 seq   __probe_ip=140325052863264, bytes=34
              [7f9ff8edd320] malloc
              [7f9ff8e8eab0] set_binding_values.part.0
              [7f9ff8e8eda1] __bindtextdomain
              [401b22] main
              [7f9ff8e82610] __libc_start_main
              [402799] _start
      ...

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/87mvphzld0.fsf@secretsauce.net
Signed-off-by: Dima Kogan <dima@secretsauce.net>
2016-03-30 11:14:09 -03:00
Taeung Song 37194f443a perf config: Rename 'v' to 'home' in set_buildid_dir()
Change the variable name 'v' to 'home' to make it more readable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1459099340-16911-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:09 -03:00
Taeung Song 9cb5987c82 perf config: Rework buildid_dir_command_config to perf_buildid_config
To avoid repeated calling perf_config() remove
buildid_dir_command_config() and add new perf_buildid_config into
perf_default_config.

Because perf_config() is already called with perf_default_config at
main().

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1459099340-16911-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:09 -03:00
Jiri Olsa 592dac6f35 perf tools: Make hists__collapse_insert_entry static
No need to export hists__collapse_insert_entry function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1458823940-24583-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:07 -03:00
Arnaldo Carvalho de Melo 3ea223adcb perf tools: Add missing initialization of perf_sample.cpumode in synthesized samples
In 473398a21d ("perf tools: Add cpumode to struct perf_sample"), I
missed some places where perf_sample fields are directly initialized in
addition to what is done in perf_evsel__parse_sample(), namely when
synthesizing PERF_RECORD_{MMAP*,COMM,FORK,EXIT} for pre-existing threads
and also in intel_pt and intel_bts when synthesizing events from
processor trace, the jitdump code also was affected, fix it.

The problem was noticed with running:

  # perf record -e intel_pt//u true
  # perf script

Where the samples wouldn't get resolved because perf_sample.cpumode
would be left as zero, i.e. PERF_RECORD_MISC_CPUMODE_UNKNOWN, not
resolving as kernel, hypervisor or user cpu modes.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 473398a21d ("perf tools: Add cpumode to struct perf_sample")
Link: http://lkml.kernel.org/n/tip-n5sdauxgk24d5nun8kuuu2mh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-29 20:03:56 -03:00
Linus Torvalds 3fa2fe2ce0 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "This tree contains various perf fixes on the kernel side, plus three
  hw/event-enablement late additions:

   - Intel Memory Bandwidth Monitoring events and handling
   - the AMD Accumulated Power Mechanism reporting facility
   - more IOMMU events

  ... and a final round of perf tooling updates/fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  perf llvm: Use strerror_r instead of the thread unsafe strerror one
  perf llvm: Use realpath to canonicalize paths
  perf tools: Unexport some methods unused outside strbuf.c
  perf probe: No need to use formatting strbuf method
  perf help: Use asprintf instead of adhoc equivalents
  perf tools: Remove unused perf_pathdup, xstrdup functions
  perf tools: Do not include stringify.h from the kernel sources
  tools include: Copy linux/stringify.h from the kernel
  tools lib traceevent: Remove redundant CPU output
  perf tools: Remove needless 'extern' from function prototypes
  perf tools: Simplify die() mechanism
  perf tools: Remove unused DIE_IF macro
  perf script: Remove lots of unused arguments
  perf thread: Rename perf_event__preprocess_sample_addr to thread__resolve
  perf machine: Rename perf_event__preprocess_sample to machine__resolve
  perf tools: Add cpumode to struct perf_sample
  perf tests: Forward the perf_sample in the dwarf unwind test
  perf tools: Remove misplaced __maybe_unused
  perf list: Fix documentation of :ppp
  perf bench numa: Fix assertion for nodes bitfield
  ...
2016-03-24 10:02:14 -07:00
Arnaldo Carvalho de Melo 76267147f2 perf llvm: Use strerror_r instead of the thread unsafe strerror one
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5njrq9dltckgm624omw9ljgu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 17:42:21 -03:00
Arnaldo Carvalho de Melo 78478269d2 perf llvm: Use realpath to canonicalize paths
To kill the last user of make_nonrelative_path(), that gets ditched,
one more panicking function killed.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-3hu56rvyh4q5gxogovb6ko8a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 17:39:19 -03:00
Arnaldo Carvalho de Melo 0741208a7c perf tools: Unexport some methods unused outside strbuf.c
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-nq1wvtky4mpu0nupjyar7sbw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 17:09:53 -03:00
Arnaldo Carvalho de Melo 88fd633cdf perf probe: No need to use formatting strbuf method
We have addch() for chars, add() for fixed size data, and addstr() for
variable length strings, use them.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-0ap02fn2xtvpduj2j6b2o1j4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 16:53:05 -03:00
Arnaldo Carvalho de Melo cf47a8aede perf tools: Remove unused perf_pathdup, xstrdup functions
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-s87zi5d03m6rz622y1z6rlsa@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 15:27:33 -03:00
Arnaldo Carvalho de Melo 531d241063 perf tools: Do not include stringify.h from the kernel sources
Use instead the copy just made to tools/include/linux/.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-q736w12nwy98x5ox2hamp5ow@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 15:21:15 -03:00
Arnaldo Carvalho de Melo 3938bad44e perf tools: Remove needless 'extern' from function prototypes
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-w246stf7ponfamclsai6b9zo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 15:06:35 -03:00
Arnaldo Carvalho de Melo e476343860 perf tools: Simplify die() mechanism
This should die altogether, but for now lets remove a bit of this stuff,
as it is not used at all.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ade3n99xscldhg5mx2vzd8p3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:32:31 -03:00
Arnaldo Carvalho de Melo 236f71eb94 perf tools: Remove unused DIE_IF macro
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-elxg25jd4dhwod4wqbko87qh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:30:42 -03:00
Arnaldo Carvalho de Melo c2740a87ca perf thread: Rename perf_event__preprocess_sample_addr to thread__resolve
Since none of the perf_event fields are used anymore, just the
perf_sample ones, and since this resolves to (map, symbol) from data
structures within struct thread, rename it to thread__resolve and make
the argument ordering similar to the one in machine__resolve().

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2b33hs9bp550tezzlhl4kejh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:03:08 -03:00
Arnaldo Carvalho de Melo bb3eb56622 perf machine: Rename perf_event__preprocess_sample to machine__resolve
Since we only deal with fields in the passed struct perf_sample move
this method to struct machine, that is where the perf_sample fields
will be resolved to a struct addr_location, i.e. thread, map, symbol,
etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-a1ww2lbm2vbuqsv4p7ilubu9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:03:08 -03:00
Arnaldo Carvalho de Melo 473398a21d perf tools: Add cpumode to struct perf_sample
To avoid parsing event->header.misc in many locations.

This will also allow setting perf.sample.{ip,cpumode} in a single place,
from tracepoint fields, as needed by 'perf kvm' with PPC guests, where
the guest hardware counters is not available at the host.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qp3yradhyt6q3wl895b1aat0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:03:07 -03:00
Arnaldo Carvalho de Melo b8f8eb84f4 perf tools: Remove misplaced __maybe_unused
All over the tree.

Cc: David Ahern <dsahern@gmail.com>
cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-8nzhnokxyp8y4v7gf0j00oyb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:03:04 -03:00
Linus Torvalds 26660a4046 Merge branch 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull 'objtool' stack frame validation from Ingo Molnar:
 "This tree adds a new kernel build-time object file validation feature
  (ONFIG_STACK_VALIDATION=y): kernel stack frame correctness validation.
  It was written by and is maintained by Josh Poimboeuf.

  The motivation: there's a category of hard to find kernel bugs, most
  of them in assembly code (but also occasionally in C code), that
  degrades the quality of kernel stack dumps/backtraces.  These bugs are
  hard to detect at the source code level.  Such bugs result in
  incorrect/incomplete backtraces most of time - but can also in some
  rare cases result in crashes or other undefined behavior.

  The build time correctness checking is done via the new 'objtool'
  user-space utility that was written for this purpose and which is
  hosted in the kernel repository in tools/objtool/.  The tool's (very
  simple) UI and source code design is shaped after Git and perf and
  shares quite a bit of infrastructure with tools/perf (which tooling
  infrastructure sharing effort got merged via perf and is already
  upstream).  Objtool follows the well-known kernel coding style.

  Objtool does not try to check .c or .S files, it instead analyzes the
  resulting .o generated machine code from first principles: it decodes
  the instruction stream and interprets it.  (Right now objtool supports
  the x86-64 architecture.)

  From tools/objtool/Documentation/stack-validation.txt:

   "The kernel CONFIG_STACK_VALIDATION option enables a host tool named
    objtool which runs at compile time.  It has a "check" subcommand
    which analyzes every .o file and ensures the validity of its stack
    metadata.  It enforces a set of rules on asm code and C inline
    assembly code so that stack traces can be reliable.

    Currently it only checks frame pointer usage, but there are plans to
    add CFI validation for C files and CFI generation for asm files.

    For each function, it recursively follows all possible code paths
    and validates the correct frame pointer state at each instruction.

    It also follows code paths involving special sections, like
    .altinstructions, __jump_table, and __ex_table, which can add
    alternative execution paths to a given instruction (or set of
    instructions).  Similarly, it knows how to follow switch statements,
    for which gcc sometimes uses jump tables."

  When this new kernel option is enabled (it's disabled by default), the
  tool, if it finds any suspicious assembly code pattern, outputs
  warnings in compiler warning format:

    warning: objtool: rtlwifi_rate_mapping()+0x2e7: frame pointer state mismatch
    warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup
    warning: objtool:__schedule()+0x3c0: duplicate frame pointer save
    warning: objtool:__schedule()+0x3fd: sibling call from callable instruction with changed frame pointer

  ... so that scripts that pick up compiler warnings will notice them.
  All known warnings triggered by the tool are fixed by the tree, most
  of the commits in fact prepare the kernel to be warning-free.  Most of
  them are bugfixes or cleanups that stand on their own, but there are
  also some annotations of 'special' stack frames for justified cases
  such entries to JIT-ed code (BPF) or really special boot time code.

  There are two other long-term motivations behind this tool as well:

   - To improve the quality and reliability of kernel stack frames, so
     that they can be used for optimized live patching.

   - To create independent infrastructure to check the correctness of
     CFI stack frames at build time.  CFI debuginfo is notoriously
     unreliable and we cannot use it in the kernel as-is without extra
     checking done both on the kernel side and on the build side.

  The quality of kernel stack frames matters to debuggability as well,
  so IMO we can merge this without having to consider the live patching
  or CFI debuginfo angle"

* 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  objtool: Only print one warning per function
  objtool: Add several performance improvements
  tools: Copy hashtable.h into tools directory
  objtool: Fix false positive warnings for functions with multiple switch statements
  objtool: Rename some variables and functions
  objtool: Remove superflous INIT_LIST_HEAD
  objtool: Add helper macros for traversing instructions
  objtool: Fix false positive warnings related to sibling calls
  objtool: Compile with debugging symbols
  objtool: Detect infinite recursion
  objtool: Prevent infinite recursion in noreturn detection
  objtool: Detect and warn if libelf is missing and don't break the build
  tools: Support relative directory path for 'O='
  objtool: Support CROSS_COMPILE
  x86/asm/decoder: Use explicitly signed chars
  objtool: Enable stack metadata validation on 64-bit x86
  objtool: Add CONFIG_STACK_VALIDATION option
  objtool: Add tool to perform compile-time stack metadata validation
  x86/kprobes: Mark kretprobe_trampoline() stack frame as non-standard
  sched: Always inline context_switch()
  ...
2016-03-20 18:23:21 -07:00
Wang Nan 73cdf0c6ea perf symbols: Record text offset in dso to calculate objdump address
Store DSO's .text offset into DSO, used for VDSOs and will also be used for
other needs, like handling kernel modules.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-2-git-send-email-wangnan0@huawei.com
[ Extracted from larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-18 14:23:59 -03:00
Sukadev Bhattiprolu 4c9d6c18fd perf test: Remove 'core_id' check in topo test
The topology test case of 'perf test' seems to be broken on my x86
system - due to the comparison of a "core-id" with # of CPUs online.

There are 8 online CPUs:

	$ cat /sys/devices/system/cpu/online
	0-7

but core-ids are not sequential and some core-ids exceed the number
of online CPUs.

	$ cat /sys/devices/system/cpu/cpu?/topology/core_id
	0
	1
	9
	10
	0
	1
	9
	10

Looks like we can safely remove the check.  Output before:

	$ perf --version
	perf version 4.4.rc1.g34258a

	$ perf test -v topo
	36: Test topology in session                                 :
	--- start ---
	test child forked, pid 5906
	templ file: /tmp/perf-test-vCwWG3
	core_id number is too big.You may need to upgrade the perf tool.
	test child interrupted
	---- end ----
	Test topology in session: FAILED!

and after:

	$ perf test -v topo
	36: Test topology in session                                 :
	--- start ---
	test child forked, pid 6532
	templ file: /tmp/perf-test-y10wFJ
	CPU 0, core 0, socket 0
	CPU 1, core 1, socket 0
	CPU 2, core 9, socket 0
	CPU 3, core 10, socket 0
	CPU 4, core 0, socket 1
	CPU 5, core 1, socket 1
	CPU 6, core 9, socket 1
	CPU 7, core 10, socket 1
	test child finished with 0
	---- end ----
	Test topology in session: Ok

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20151203233219.GA27696@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-11 13:45:04 -03:00
Namhyung Kim 078b8d4a40 perf tools: Add sort__has_comm variable
The sort__has_comm variable is to check whether the comm sort key is
given.  This is necessary to support thread filtering in the TUI hists
browser later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1457533253-21419-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:47:19 -03:00
Namhyung Kim f7fb538afe perf tools: Recalc total periods using top-level entries in hierarchy
When hierarchy mode is enabled, each entry in a hierarchy level shares
the period.  IOW an upper level entry's period is the sum of lower level
entries.  Thus perf uses only one of them to calculate the total period
of hists.  It was lowest-level (leaf) entries but it has a problem when
it comes to filters.

If a filter is applied, entries in the same level will be filtered or
not.  But upper level entries still have period of their sum including
filtered one.  So total sum of upper level entries will not be same as
sum of lower level entries.

This resulted in entries having more than 100% of overhead and it can be
produced using perf top with filter(s).

Reported-and-Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:46:13 -03:00
Namhyung Kim 86e3ee5224 perf tools: Remove nr_sort_keys field
The nr_sort_keys field is to carry the number of sort entries in a
hpp_list or hists to determine the depth of indentation of a hist entry.
As it's only used in hierarchy mode and now we have used nr_hpp_node for
this reason, there's no need to keep it anymore.  Let's get rid of it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:46:08 -03:00
Namhyung Kim a515d8ff70 perf tools: Remove hist_entry->fmt field
It's not used anymore and the output format is accessed by the hpp_list
pointer instead when hierarchy is enabled.  Let's get rid of it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:45:59 -03:00
Namhyung Kim aec13a7ec7 perf tools: Fix command line filters in hierarchy mode
When a command-line filter is applied in hierarchy mode, output is
broken especially when filtering on lower level.  The higher level
entries doesn't show up so it's hard to see the results.

Also it needs to handle multi sort keys in a single hierarchy level.

Before:

  $ perf report --hierarchy -s 'cpu,{dso,comm}' --comms swapper --stdio
  ...
  #    Overhead  CPU / Shared Object+Command
  # ...........  ...........................
  #
         13.79%     [kernel.vmlinux]  swapper
      31.71%     000
         13.80%     [kernel.vmlinux]  swapper
          0.43%     [e1000e]          swapper
         11.89%     [kernel.vmlinux]  swapper
          9.18%     [kernel.vmlinux]  swapper

After:

  #    Overhead  CPU / Shared Object+Command
  # ...........  ...............................
  #
      33.09%     003
         13.79%     [kernel.vmlinux]  swapper
      31.71%     000
         13.80%     [kernel.vmlinux]  swapper
          0.43%     [e1000e]          swapper
      21.90%     002
         11.89%     [kernel.vmlinux]  swapper
      13.30%     001
          9.18%     [kernel.vmlinux]  swapper

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:45:48 -03:00
Namhyung Kim 4945cf2aa1 perf tools: Add more sort entry check functions
Those functions are for checkinf if a given perf_hpp_fmt is a
filter-related sort entry.  With hierarchy mode, it needs to check
filters on the hist entries with its own hpp format list.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:45:44 -03:00
Namhyung Kim f4954cfb1c perf tools: Fix hist_entry__filter() for hierarchy
When hierarchy mode is enabled each output format is in a separate hpp
list.  So when applying a filter it should check all formats in the
list.  Currently it only checks a single ->fmt field which was not set
properly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:45:36 -03:00
Jiri Olsa e12b202f8f perf jitdump: Build only on supported archs
Build jitdump only on architectures defined in util/genelf.h file, to avoid
breaking the build on such arches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Mel Gorman <mgorman@suse.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20160310164113.GA11357@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:33:19 -03:00
Jiri Olsa ea8f75f981 perf tools: Omit unnecessary cast in perf_pmu__parse_scale
There's no need to use a const char pointer, we can used char pointer
from the beginning and omit the unnecessary cast.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160308184230.GB7897@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-09 10:42:22 -03:00
Jiri Olsa d7b617f51b perf tools: Pass perf_hpp_list all the way through setup_sort_list
Pass perf_hpp_list all the way through setup_sort_list so that the sort
entry can be added on the arbitrary list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160309100417.GA30910@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-09 10:37:26 -03:00
Chris Phlipot 616df645d7 perf tools: Fix perf script python database export crash
Remove the union in evsel so that the database id and priv pointer can
be used simultainously without conflicting and crashing.

Detailed Description for the fixed bug follows:

perf script crashes with a segmentation fault on user space tool version
4.5.rc7.ge2857b when using the python database export API. It works
properly in 4.4 and prior versions.

the crash fist appeared in:

cfc8874a48 ("perf script: Process cpu/threads maps")

How to reproduce the bug:

Remove any temporary files left over from a previous crash (if you have
already attemped to reproduce the bug):

  $ rm -r test_db-perf-data
  $ dropdb test_db

  $ perf record timeout 1 yes >/dev/null
  $ perf script -s scripts/python/export-to-postgresql.py test_db

  Stack Trace:
  Program received signal SIGSEGV, Segmentation fault.
  __GI___libc_free (mem=0x1) at malloc.c:2929
  2929	malloc.c: No such file or directory.
  (gdb) bt
    at util/stat.c:122
    argv=<optimized out>, prefix=<optimized out>) at builtin-script.c:2231
    argc=argc@entry=4, argv=argv@entry=0x7fffffffdf70) at perf.c:390
    at perf.c:451

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: cfc8874a48 ("perf script: Process cpu/threads maps")
Link: http://lkml.kernel.org/r/1457500314-8912-1-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-09 10:31:02 -03:00
Arnaldo Carvalho de Melo 46dad054a1 perf jitdump: DWARF is also needed
While building on a Docker container for ubuntu and installing package
by package one ends up with:

    MKDIR    /tmp/build/util/
    CC       /tmp/build/util/genelf.o
  util/genelf.c:22:19: fatal error: dwarf.h: No such file or directory
   #include <dwarf.h>
                   ^
  compilation terminated.
  mv: cannot stat '/tmp/build/util/.genelf.o.tmp': No such file or directory

Because the jitdump code needs the DWARF related development packages to
be installed. So make it dependent on that so that the build can succeed
without jitdump support.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-le498robnmxd40237wej3w62@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-09 10:29:03 -03:00
Namhyung Kim 2dbbe9f26c perf hists: Fix indent for multiple hierarchy sort key
When multiple sort keys are used in a single hierarchy, it should indent
using number of hierarchy levels instead of number of sort keys.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-5-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 10:11:20 +01:00
Namhyung Kim a23f37e864 perf hists: Support multiple sort keys in a hierarchy level
This implements having multiple sort keys in a single hierarchy level.
Originally only single sort key is supported for each level, but now
using the group syntax with '{ }', it can set more than one sort key in
one level.  Note that now it needs to quote in order to prevent shell
interpretation.

For example:

  $ perf report --hierarchy -s '{comm,dso},sym'
  ...
  #       Overhead  Command / Shared Object / Symbol
  # ..............  ..........................................
  #
      48.67%        swapper          [kernel.vmlinux]
         34.42%        [k] intel_idle
          1.30%        [k] __tick_nohz_idle_enter
          1.03%        [k] cpuidle_reflect
       8.87%        firefox          libpthread-2.22.so
          6.60%        [.] __GI___libc_recvmsg
          1.18%        [.] pthread_cond_signal@@GLIBC_2.3.2
          1.09%        [.] 0x000000000000ff4b
       6.11%        Xorg             libc-2.22.so
          5.27%        [.] __memcpy_sse2_unaligned

In the above example, the command name and the shared object name are
shown on the same line but the symbol name is on the different line.
Since the first two are grouped by '{}', they are in the same level.

Suggested-and-Tested=by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-4-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 10:11:20 +01:00
Namhyung Kim 1b2dbbf41a perf hists: Use own hpp_list for hierarchy mode
Now each hists has its own hpp lists in hierarchy.  So instead of having
a pointer to a single perf_hpp_fmt in a hist entry, make it point the
hpp_list for its level.  This will be used to support multiple sort keys
in a single hierarchy level.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-3-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 10:11:19 +01:00
Namhyung Kim c3bc0c4368 perf hists: Introduce perf_hpp__setup_hists_formats()
The perf_hpp__setup_hists_formats() is to build hists-specific output
formats (and sort keys).  Currently it's only used in order to build the
output format in a hierarchy with same sort keys, but it could be used
with different sort keys in non-hierarchy mode later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-2-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 10:11:19 +01:00
Namhyung Kim 4b633eba14 perf hists: Add level field to struct perf_hpp_fmt
The level field is to distinguish levels in the hierarchy mode.
Currently each column (perf_hpp_fmt) has a different level.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457103582-28396-2-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 10:11:18 +01:00
Adrian Hunter a23f96ee4d perf tools: Use 64-bit shifts with (TSC) time conversion
Commit b9511cd761 ("perf/x86: Fix time_shift in perf_event_mmap_page")
altered the time conversion algorithms documented in the perf_event.h
header file, to use 64-bit shifts.  That was done to make the code more
future-proof (i.e. some time in the future a 32-bit shift could be
allowed).  Reflect those changes in perf tools.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457005856-6143-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 10:11:18 +01:00
Adrian Hunter 4a018cc479 perf jit: Move clockid validation
Move clockid validation into jit_process() so it can later be made
conditional.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457005856-6143-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 10:11:17 +01:00
Adrian Hunter 570735b33d perf jit: Let jit_process() return errors
In preparation for moving clockid validation into jit_process().

Previously a return value of zero meant the processing had been done and
non-zero meant either the processing was not done (i.e. not the jitdump
file mmap event) or an error occurred.

Change it so that zero means the processing was not done, one means the
processing was done and successful, and negative values are an error.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457005856-6143-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 10:11:17 +01:00
Adrian Hunter 5fb0ac16c5 perf session: Simplify tool stubs
Some of the stubs are identical so just have one function for them.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457005856-6143-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 10:11:17 +01:00
Colin Ian King 07ef757445 perf tools: Explicitly declare inc_group_count as a void function
The return type is not defined, so it defaults to int, however, the
function is not returning anything, so this is clearly not correct. Make
it a void function.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457008214-14393-1-git-send-email-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08 10:11:16 +01:00
Josh Poimboeuf 19072f23d1 x86/asm/decoder: Use explicitly signed chars
When running objtool on a ppc64le host to analyze x86 binaries, it
reports a lot of false warnings like:

  ipc/compat_mq.o: warning: objtool: compat_SyS_mq_open()+0x91: can't find jump dest instruction at .text+0x3a5

The warnings are caused by the x86 instruction decoder setting the wrong
value for the jump instruction's immediate field because it assumes that
"char == signed char", which isn't true for all architectures.  When
converting char to int, gcc sign-extends on x86 but doesn't sign-extend
on ppc64le.

According to the gcc man page, that's a feature, not a bug:

  > Each kind of machine has a default for what "char" should be.  It is
  > either like "unsigned char" by default or like "signed char" by
  > default.
  >
  > Ideally, a portable program should always use "signed char" or
  > "unsigned char" when it depends on the signedness of an object.

Conform to the "standards" by changing the "char" casts to "signed
char".  This results in no actual changes to the object code on x86.

Note: the x86 decoder now lives in three different locations in the
kernel tree, which are all kept in sync via makefile checks and
warnings: in-kernel, perf, and objtool.  This fixes all three locations.
Eventually we should probably try to at least converge the two separate
"tools" locations into a single shared location.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/9dd4161719b20e6def9564646d68bfbe498c549f.1456962210.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-03 16:13:00 +01:00
Andi Kleen fb4605ba47 perf stat: Check for frontend stalled for metrics
Add an extra check for frontend stalled in the metrics.  This avoids an
extra column for the --metric-only case when the CPU does not support
frontend stalled.

v2: Add separate init function

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1456858672-21594-8-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03 11:10:40 -03:00
Arnaldo Carvalho de Melo 9b240637eb perf test: Fix hists related entries
That got broken by d3a72fd818 ("perf report: Fix indentation of
dynamic entries in hierarchy"), by using the evlist in setup_sorting()
without checking if it is NULL, as done in some 'perf test' entries:

  $ find tools/ -name "*.c" | xargs grep 'setup_sorting(NULL);'
  tools/perf/tests/hists_output.c:      setup_sorting(NULL);
  tools/perf/tests/hists_output.c:      setup_sorting(NULL);
  tools/perf/tests/hists_output.c:      setup_sorting(NULL);
  tools/perf/tests/hists_output.c:      setup_sorting(NULL);
  tools/perf/tests/hists_output.c:      setup_sorting(NULL);
  tools/perf/tests/hists_cumulate.c:    setup_sorting(NULL);
  tools/perf/tests/hists_cumulate.c:    setup_sorting(NULL);
  tools/perf/tests/hists_cumulate.c:    setup_sorting(NULL);
  tools/perf/tests/hists_cumulate.c:    setup_sorting(NULL);
  $

Fix it.

Before:

  [root@jouet ~]# perf test
  <SNIP>
  15: Test matching and linking multiple hists                 : FAILED!
  16: Try 'import perf' in python, checking link problems      : Ok
  17: Test breakpoint overflow signal handler                  : Ok
  18: Test breakpoint overflow sampling                        : Ok
  19: Test number of exit event of a simple workload           : Ok
  20: Test software clock events have valid period values      : Ok
  21: Test object code reading                                 : Ok
  22: Test sample parsing                                      : Ok
  23: Test using a dummy software event to keep tracking       : Ok
  24: Test parsing with no sample_id_all bit set               : Ok
  25: Test filtering hist entries                              : FAILED!
  26: Test mmap thread lookup                                  : Ok
  27: Test thread mg sharing                                   : Ok
  28: Test output sorting of hist entries                      : FAILED!
  29: Test cumulation of child hist entries                    : FAILED!
  <SNIP>

After the patch the above failed tests complete successfully.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: d3a72fd818 ("perf report: Fix indentation of dynamic entries in hierarchy")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03 11:10:39 -03:00
Colin Ian King 979ac257b0 perf script: Fix double free on command_line
The 'command_line' variable is free'd twice if db_export__branch_types()
fails. To avoid this, defer the free'ing of 'command_line' to after this
call so that the error return path will just free 'command_line' once.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Javi Merino <javi.merino@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1456875980-25606-1-git-send-email-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03 11:10:37 -03:00
Andi Kleen 44d49a6002 perf stat: Support metrics in --per-core/socket mode
Enable metrics printing in --per-core / --per-socket mode. We need to
save the shadow metrics in a unique place. Always use the first CPU in
the aggregation. Then use the same CPU to retrieve the shadow value
later.

Example output:

  % perf stat --per-core -a ./BC1s

   Performance counter stats for 'system wide':

  S0-C0 2   2966.020381 task-clock (msec) #   2.004 CPUs utilized  (100.00%)
  S0-C0 2            49 context-switches  #   0.017 K/sec          (100.00%)
  S0-C0 2             4 cpu-migrations    #   0.001 K/sec          (100.00%)
  S0-C0 2           467 page-faults       #   0.157 K/sec
  S0-C0 2 4,599,061,773 cycles            #   1.551 GHz            (100.00%)
  S0-C0 2 9,755,886,883 instructions      #   2.12  insn per cycle (100.00%)
  S0-C0 2 1,906,272,125 branches          # 642.704 M/sec          (100.00%)
  S0-C0 2    81,180,867 branch-misses     #   4.26% of all branches
  S0-C1 2   2965.995373 task-clock (msec) #   2.003 CPUs utilized  (100.00%)
  S0-C1 2            62 context-switches  #   0.021 K/sec          (100.00%)
  S0-C1 2             8 cpu-migrations    #   0.003 K/sec          (100.00%)
  S0-C1 2           281 page-faults       #   0.095 K/sec
  S0-C1 2     6,347,290 cycles            #   0.002 GHz            (100.00%)
  S0-C1 2     4,654,156 instructions      #   0.73  insn per cycle (100.00%)
  S0-C1 2       947,121 branches          #   0.319 M/sec          (100.00%)
  S0-C1 2        37,322 branch-misses     #   3.94% of all branches

         1.480409747 seconds time elapsed

v2: Rebase to older patches
v3: Document shadow cpus. Fix aggr_get_id argument. Fix -A shadows (Jiri)

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1456785386-19481-4-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03 11:10:36 -03:00
Andi Kleen 92a61f6412 perf stat: Implement CSV metrics output
Now support CSV output for metrics. With the new output callbacks this
is relatively straight forward by creating new callbacks.

This allows to easily plot metrics from CSV files.

The new line callback needs to know the number of fields to skip them
correctly

Example output before:

  % perf stat -x, true
  0.200687,,task-clock,200687,100.00
  0,,context-switches,200687,100.00
  0,,cpu-migrations,200687,100.00
  40,,page-faults,200687,100.00
  730871,,cycles,203601,100.00
  551056,,stalled-cycles-frontend,203601,100.00
  <not supported>,,stalled-cycles-backend,0,100.00
  385523,,instructions,203601,100.00
  78028,,branches,203601,100.00
  3946,,branch-misses,203601,100.00

After:

  % perf stat -x, true
  .502457,,task-clock,502457,100.00,0.485,CPUs utilized
  0,,context-switches,502457,100.00,0.000,K/sec
  0,,cpu-migrations,502457,100.00,0.000,K/sec
  45,,page-faults,502457,100.00,0.090,M/sec
  644692,,cycles,509102,100.00,1.283,GHz
  423470,,stalled-cycles-frontend,509102,100.00,65.69,frontend cycles idle
  <not supported>,,stalled-cycles-backend,0,100.00,,,,
  492701,,instructions,509102,100.00,0.76,insn per cycle
  ,,,,,0.86,stalled cycles per insn
  97767,,branches,509102,100.00,194.578,M/sec
  4788,,branch-misses,509102,100.00,4.90,of all branches

or easier readable

  $ perf stat  -x, -o x.csv true
  $ column -s, -t x.csv
  0.490635        task-clock              490635 100.00 0.489   CPUs utilized
  0               context-switches        490635 100.00 0.000   K/sec
  0               cpu-migrations          490635 100.00 0.000   K/sec
  45              page-faults             490635 100.00 0.092   M/sec
  629080          cycles                  497698 100.00 1.282   GHz
  409498          stalled-cycles-frontend 497698 100.00 65.09   frontend cycles idle
  <not supported> stalled-cycles-backend  0      100.00
  491424          instructions            497698 100.00 0.78    insn per cycle
                                                        0.83    stalled cycles per insn
  97278           branches                497698 100.00 198.270 M/sec
  4569            branch-misses           497698 100.00 4.70    of all branches

Two new fields are added: metric value and metric name.

v2: Split out function argument changes
v3: Reenable metrics for real.
v4: Fix wrong hunk from refactoring.
v5: Remove extra "noise" printing (Jiri), but add it to the not counted case.
Print empty metrics for not counted.
v6: Avoid outputting metric on empty format.
v7: Print metric at the end
v8: Remove extra run, ena fields
v9: Avoid extra new line for unsupported counters

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1456785386-19481-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03 11:10:36 -03:00
Wang Nan f8dd2d5ff9 perf data: Explicitly set byte order for integer types
After babeltrace commit 5cec03e402aa ("ir: copy variants and sequences
when setting a field path"), 'perf data convert' gets incorrect result
if there's bpf output data. For example:

 # perf data convert --to-ctf ./out.ctf
 # babeltrace ./out.ctf
 [10:44:31.186045346] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E7DD1, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0xC028E32F, [1] = 0x815D0100, [2] = 0x1000000 ] }
 [10:44:31.286101003] (+0.100055657) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF8105B609, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0x35D9F1EB, [1] = 0x15D81, [2] = 0x2 ] }

The expected result of the first sample should be:

 raw_data = [ [0] = 0x2FE328C0, [1] = 0x15D81, [2] = 0x1 ] }

however, 'perf data convert' output big endian value to resuling CTF
file.

The reason is a internal change (or a bug?) of babeltrace.

Before this patch, at the first add_bpf_output_values(), byte order of
all integer type is uncertain (is 0, neither 1234 (le) nor 4321 (be)).
It would be fixed by:

perf_evlist__deliver_sample
 -> process_sample_event
   -> ctf_stream
      ...
      ->bt_ctf_trace_add_stream_class
        ->bt_ctf_field_type_structure_set_byte_order
          ->bt_ctf_field_type_integer_set_byte_order

during creating the stream.

However, the babeltrace commit mentioned above duplicates types in
sequence to prevent potential conflict in following call stack and link
the newly allocated type into the 'raw_data' sequence:

perf_evlist__deliver_sample
 -> process_sample_event
   -> ctf_stream
      ...
      -> bt_ctf_trace_add_stream_class
        -> bt_ctf_stream_class_resolve_types
           ...
           -> bt_ctf_field_type_sequence_copy
             ->bt_ctf_field_type_integer_copy

This happens before byte order setting, so only the newly allocated
type is initialized, the byte order of original type perf choose to
create the first raw_data is still uncertain.

Byte order in CTF output is not related to byte order in perf.data.
Setting it to anything other than BT_CTF_BYTE_ORDER_NATIVE solves this
problem (only BT_CTF_BYTE_ORDER_NATIVE needs to be fixed). To reduce
behavior changing, set byte order according to compiling options.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03 11:10:34 -03:00
Wang Nan 6122d57e9f perf data: Support converting data from bpf_perf_event_output()
bpf_perf_event_output() outputs data through sample->raw_data. This
patch adds support to convert those data into CTF. A python script then
can be used to process output data from BPF programs.

Test result:

  # cat ./test_bpf_output_2.c
  /************************ BEGIN **************************/
  #include <uapi/linux/bpf.h>
  struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
  };
  #define SEC(NAME) __attribute__((section(NAME), used))
  static u64 (*ktime_get_ns)(void) =
 	(void *)BPF_FUNC_ktime_get_ns;
  static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
  static int (*get_smp_processor_id)(void) =
 	(void *)BPF_FUNC_get_smp_processor_id;
  static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
 	(void *)BPF_FUNC_perf_event_output;

  struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(u32),
 	.max_entries = __NR_CPUS__,
  };

  static inline int __attribute__((always_inline))
  func(void *ctx, int type)
  {
 	struct {
 		u64 ktime;
 		int type;
 	} __attribute__((packed)) output_data;
 	char error_data[] = "Error: failed to output\n";
 	int err;

 	output_data.type = type;
 	output_data.ktime = ktime_get_ns();
 	err = perf_event_output(ctx, &channel, get_smp_processor_id(),
 				&output_data, sizeof(output_data));
 	if (err)
 		trace_printk(error_data, sizeof(error_data));
 	return 0;
  }
  SEC("func_begin=sys_nanosleep")
  int func_begin(void *ctx) {return func(ctx, 1);}
  SEC("func_end=sys_nanosleep%return")
  int func_end(void *ctx) { return func(ctx, 2);}
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  /************************* END ***************************/

  # ./perf record -e bpf-output/no-inherit,name=evt/ \
                 -e ./test_bpf_output_2.c/map:channel.event=evt/ \
                 usleep 100000
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ]

  # ./perf script
          usleep 14942 92503.198504: evt:  ffffffff810e0ba1 sys_nanosleep (/lib/modules/4.3.0....
          usleep 14942 92503.298562: evt:  ffffffff810585e9 kretprobe_trampoline_holder (/lib....

  # ./perf data convert --to-ctf ./out.ctf
  [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
  [ perf data convert: Converted and wrote 0.000 MB (2 samples) ]

  # babeltrace ./out.ctf
  [01:41:43.198504134] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E0BA1, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x32C0C07B, [1] = 0x5421, [2] = 0x1 ] }
  [01:41:43.298562257] (+0.100058123) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810585E9, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x38B77FAA, [1] = 0x5421, [2] = 0x2 ] }

  # cat ./test_bpf_output_2.py
  from babeltrace import TraceCollection
  tc = TraceCollection()
  tc.add_trace('./out.ctf', 'ctf')
  d = {1:[], 2:[]}
  for event in tc.events:
     if not event.name.startswith('evt'):
         continue
     raw_data = event['raw_data']
     (time, type) = ((raw_data[0] + (raw_data[1] << 32)), raw_data[2])
     d[type].append(time)
  print(list(map(lambda i: d[2][i] - d[1][i], range(len(d[1])))));

  # python3 ./test_bpf_output_2.py
  [100056879]

Committer note:

Make sure you have python3-devel installed, not python-devel, which may
be for python2, which will lead to some "PyInstance_Type" errors. Also
make sure that you use the right libbabeltrace, because it is shipped
in Fedora, for instance, but an older version.

To build libbabeltrace's python binding one also needs to use:

 ./configure --enable-python-bindings

And then set PYTHONPATH=/usr/local/lib64/python3.4/site-packages/.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-9-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03 11:10:34 -03:00
Jiri Olsa f9a5978ac4 perf tools: Fix locale handling in pmu parsing
Ingo reported regression on display format of big numbers, which is
missing separators (in default perf stat output).

 triton:~/tip> perf stat -a sleep 1
         ...
         127008602      cycles                    #    0.011 GHz
         279538533      stalled-cycles-frontend   #  220.09% frontend cycles idle
         119213269      instructions              #    0.94  insn per cycle

This is caused by recent change:

  perf stat: Check existence of frontend/backed stalled cycles

that added call to pmu_have_event, that subsequently calls
perf_pmu__parse_scale, which has a bug in locale handling.

The lc string returned from setlocale, that we use to store old locale
value, may be allocated in static storage. Getting a dynamic copy to
make it survive another setlocale call.

  $ perf stat ls
         ...
         2,360,602      cycles                    #    3.080 GHz
         2,703,090      instructions              #    1.15  insn per cycle
           546,031      branches                  #  712.511 M/sec

Committer note:

Since the patch introducing the regression didn't made to perf/core,
move it to just before where the regression was introduced, so that we
don't break bisection for this feature.

Reported-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160303095348.GA24511@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03 11:04:54 -03:00
Jiri Olsa 67d5268908 perf tools: Fix python extension build
The util/python-ext-sources file contains source files required to build
the python extension relative to $(srctree)/tools/perf,

Such a file path $(FILE).c is handed over to the python extension build
system, which builds the final object in the
$(PYTHON_EXTBUILD)/tmp/$(FILE).o path.

After the build is done all files from $(PYTHON_EXTBUILD)lib/ are
carried as the result binaries.

Above system fails when we add source file relative to ../lib, which we
do for:

  ../lib/bitmap.c
  ../lib/find_bit.c
  ../lib/hweight.c
  ../lib/rbtree.c

All above objects will be built like:

  $(PYTHON_EXTBUILD)/tmp/../lib/bitmap.c
  $(PYTHON_EXTBUILD)/tmp/../lib/find_bit.c
  $(PYTHON_EXTBUILD)/tmp/../lib/hweight.c
  $(PYTHON_EXTBUILD)/tmp/../lib/rbtree.c

which accidentally happens to be final library path:

  $(PYTHON_EXTBUILD)/lib/

Changing setup.py to pass full paths of source files to Extension build
class and thus keep all built objects under $(PYTHON_EXTBUILD)tmp
directory.

Reported-by: Jeff Bastian <jbastian@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/20160227201350.GB28494@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-29 11:18:25 -03:00
Wang Nan fdf14720fb perf tools: Only set filter for tracepoints events
perf_evlist__set_filter() tries to set filter to every evsel linked in
the evlist. However, since filters can only be applied to tracepoints,
checking type of evsel before calling perf_evsel__set_filter() would be
better.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-26 19:50:01 -03:00
Wang Nan b8cbb34906 perf config: Bring perf_default_config to the very beginning at main()
Before this patch each subcommand calls perf_config() by themself,
reading the default configuration together with subcommand specific
options. If a subcommand doesn't have it own options, it needs to call
'perf_config(perf_default_config, NULL)' to ensure .perfconfig is
loaded.

This patch brings perf_config(perf_default_config, NULL) to the very
start of main(), so subcommands don't need to do it.

After this patch, 'llvm.clang-path' works for 'perf trace'.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-26 19:49:16 -03:00
Namhyung Kim abab5e7fce perf report: Update column width of dynamic entries
The column width of dynamic entries is updated when comparing hist
entries.  However some unique entries can miss the chance to update.  So
move the update to output resort stage to make sure every entry will get
called before display.

To do that, abuse ->sort callback to update the width when the third
argument is NULL.  When resorting entries in normal path, it never be
NULL so it should be fine IMHO.

Before:

  #       Overhead  ptr / bytes_req / gfp_flags
  # ..............  ..........................................
  #
      37.50%        0xffff8803f7669400
         37.50%        448
            37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
      10.42%        0xffff8803f766be00
          8.33%        96
             8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
          2.08%        512
             2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP   <-- here

After:

  #       Overhead  ptr / bytes_req / gfp_flags
  # ..............  .....................................................
  #
      37.50%        0xffff8803f7669400
         37.50%        448
            37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
      10.42%        0xffff8803f766be00
          8.33%        96
             8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
          2.08%        512
             2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP_NOMEMALLOC

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-26 19:38:48 -03:00
Namhyung Kim e049d4a3fa perf hists: Fix dynamic entry display in hierarchy
When dynamic sort key is used it might not show pretty printed output.
This is because the trace output was not set only for the first dynamic
sort key.  During hierarchy_insert_entry() it missed to pass the
trace_output to dynamic entries.  Also even if it did, only first entry
will have it.  Subsequent entries might set it during collapsing stage
but it's not guaranteed.

Before:

  $ perf report --hierarchy --stdio -s ptr,bytes_req,gfp_flags -g none
  #
  #       Overhead  ptr / bytes_req / gfp_flags
  # ..............  ..........................................
  #
      37.50%        0xffff8803f7669400
         37.50%        448
            37.50%        66080
      10.42%        0xffff8803f766be00
          8.33%        96
             8.33%        66080
          2.08%        512
             2.08%        67280

After:

  #
  #       Overhead  ptr / bytes_req / gfp_flags
  # ..............  ..........................................
  #
      37.50%        0xffff8803f7669400
         37.50%        448
            37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
      10.42%        0xffff8803f766be00
          8.33%        96
             8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
          2.08%        512
             2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-26 19:37:38 -03:00
Namhyung Kim d3a72fd818 perf report: Fix indentation of dynamic entries in hierarchy
When dynamic entries are used in the hierarchy mode with multiple
events, the output might not be aligned properly.  In the hierarchy
mode, the each sort column is indented using total number of sort keys.
So it keeps track of number of sort keys when adding them.  However
a dynamic sort key can be added more than once when multiple events have
same field names.  This results in unnecessarily long indentation in the
output.

For example perf kmem records following events:

  $ perf evlist --trace-fields -i perf.data.kmem
  kmem:kmalloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
  kmem:kmalloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
  kmem:kfree: trace_fields: call_site,ptr
  kmem:kmem_cache_alloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
  kmem:kmem_cache_alloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
  kmem:kmem_cache_free: trace_fields: call_site,ptr
  kmem:mm_page_alloc: trace_fields: page,order,gfp_flags,migratetype
  kmem:mm_page_free: trace_fields: page,order

As you can see, many field names shared between kmem events.  So adding
'ptr' dynamic sort key alone will set nr_sort_keys to 6.  And this adds
many unnecessary spaces between columns.

Before:

  $ perf report -i perf.data.kmem --hierarchy -s ptr -g none --stdio
  ...
  #                Overhead                 ptr
  # .......................  ...................................
  #
      99.89%                 0xffff8803ffb79720
       0.06%                 0xffff8803d228a000
       0.03%                 0xffff8803f7678f00
       0.00%                 0xffff880401dc5280
       0.00%                 0xffff880406172380
       0.00%                 0xffff8803ffac3a00
       0.00%                 0xffff8803ffac1600

After:

  # Overhead                 ptr
  # ........  ....................
  #
      99.89%  0xffff8803ffb79720
       0.06%  0xffff8803d228a000
       0.03%  0xffff8803f7678f00
       0.00%  0xffff880401dc5280
       0.00%  0xffff880406172380
       0.00%  0xffff8803ffac3a00
       0.00%  0xffff8803ffac1600

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-26 18:36:11 -03:00
Namhyung Kim 84b6ee8ea3 perf hists: Fix comparing of dynamic entries
When hist_entry__cmp() and hist_entry__collapse() are called, they
should check if the dynamic entry is comparing matching hists only.

Otherwise it might access different hists resulting in incorrect output.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-26 18:35:57 -03:00
Namhyung Kim 79dded8776 perf hists browser: Show message for percent limit
Like the stdio, it should show messages about omitted hierarchy entries.
Please refer the previous commit for more details.

As it needs to check an entry is omitted or not multiple times, add the
has_no_entry field in the hist entry.

Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-26 11:20:36 -03:00
Namhyung Kim a7b5895b91 perf hists: Add more helper functions for the hierarchy mode
The hists__overhead_width() is to calculate width occupied by the
overhead (and others) columns before the sort columns.

The hist_entry__has_hiearchy_children() is to check whether an entry has
lower entries (children) in the hierarchy to be shown in the output.
This means the children should not be filtered out and above the percent
limit.

These two functions will be used to show information when all children
of an entry is omitted by the percent limit (or filter).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-26 11:20:35 -03:00
Taeung Song 8579aca3f9 perf script: Exception handling when the print fmt is empty
After collecting samples for events 'syscalls:', perf-script with python
script doesn't occasionally work generating a segmentation fault.

The reason is that the print fmt is empty and a value of
event->print_fmt.args is NULL, so dereferencing the null pointer results
in a segmentation fault i.e.:

    # perf record -e syscalls:*
    # perf script -g python
    # perf script -s perf-script.py

    in trace_begin
    syscalls__sys_enter_brk  3 79841.832099154  3777 test.sh  syscall_nr=12, brk=0

    ... (omitted) ...

    Segmentation fault (core dumped)

For example, a format of sys_enter_getuid() hasn't
print fmt as below.

    # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_getuid/format
    name: sys_enter_getuid
    ID: 188
    format:
            field:unsigned short common_type;         offset:0; size:2; signed:0;
            field:unsigned char common_flags;         offset:2; size:1; signed:0;
            field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
            field:int common_pid;                     offset:4; size:4; signed:1;
            field:int syscall_nr;                     offset:8; size:4; signed:1;

    print fmt: ""

So add exception handling to avoid this problem.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456413179-12331-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-25 12:54:20 -03:00
Arnaldo Carvalho de Melo bb109acc4a perf tools: Fix parsing of pmu events with empty list of modifiers
In 1d55e8ef34 ("perf tools: Introduce opt_event_config nonterminal") I
removed the unconditional "'/' '/'" for pmu events such as
"intel_pt//" but forgot to use opt_event_config where it expected some
event_config, oops. Fix it.

Noticed when trying to use:

  # perf record -e intel_pt// -a sleep 1
  event syntax error: 'intel_pt//'
                               \___ parser error
  Run 'perf list' for a list of valid events

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -e, --event <event>   event selector. use 'perf list' to list available events
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 1d55e8ef34 ("perf tools: Introduce opt_event_config nonterminal")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-25 10:56:21 -03:00
Namhyung Kim 5d8200ae67 perf hists: Support decaying in hierarchy mode
In the hierarchy mode, hist entries should decay their children too.
Also update hists__delete_entry() to be able to free child entries.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-18-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 20:21:15 -03:00
Namhyung Kim 8e2fc44f46 perf ui/stdio: Align column header for hierarchy output
The hierarchy output mode is to group entries so the existing columns
won't fit to the new output.  Treat all sort keys as a single column and
separate headers by "/".

  #    Overhead  Command / Shared Object
  # ...........  ................................
  #
      15.11%     swapper
         14.97%     [kernel.vmlinux]
          0.09%     [libahci]
          0.05%     [iwlwifi]
  ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-11-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 20:21:12 -03:00
Namhyung Kim ef86d68a08 perf ui/stdio: Implement hierarchy output mode
The hierarchy output mode is to group entries for each level so that
user can see higher level picture more easily.  It also helps to find
out which component is most costly.  The output will look like below:

      15.11%     swapper
         14.97%     [kernel.vmlinux]
          0.09%     [libahci]
          0.05%     [iwlwifi]
      10.29%     irq/33-iwlwifi
          6.45%     [kernel.vmlinux]
          1.41%     [mac80211]
          1.15%     [iwldvm]
          1.14%     [iwlwifi]
          0.14%     [cfg80211]
       4.81%     firefox
          3.92%     libxul.so
          0.34%     [kernel.vmlinux]

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-10-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 20:21:12 -03:00
Namhyung Kim 1f2d72cf32 perf hists: Count number of sort keys
It'll be used for hierarchy output mode to indent entries properly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-9-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 20:21:11 -03:00
Namhyung Kim 70642850fa perf hists: Resort after filtering hierarchy
In hierarchy mode, a filter can affect periods of entries in upper
hierarchy.  So it needs to resort the hists after filter.

For example, let's look at following example:

 Overhead      Command / Shared Object / Symbol
 ------------  --------------------------------
 30.00%        perf
    20.00%        perf
       10.00%        main
        5.00%        pr_debug
        5.00%        memcpy
    10.00%        [kernel.vmlinux]
        8.00%        memset
        2.00%        cpu_idle

If we apply simbol filter for 'mem' it should look like this

 13.00%        perf
     8.00%        [kernel.vmlinux]
        8.00%        memset
     5.00%        perf
        5.00%        memcpy

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 20:21:11 -03:00
Namhyung Kim 155e9afff7 perf hists: Support filtering in hierarchy mode
The hists__filter_hierarchy() function implements filtering in hierarchy
mode.  Now we have hist_entry__filter() so use it for entries in the
hierarchy.  It returns 3 kind of values.

A negative value means that it's not filtered by this type.  It marks
current entry as filtered tentatively so if a lower level entry removes
the filter it also removes the all parent so that we can find the entry
in the output.

Zero means it's filtered out by this type. A positive value means it's
not filtered so it removes the filter and shows in the output.  In these
cases, it moves to next entry since lower level entry won't match by
this type of filter anymore.  Thus all children will be filtered or not
together.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 20:21:10 -03:00
Namhyung Kim 54430101d2 perf hists: Introduce hist_entry__filter()
The hist_entry__filter() function is to filter hist entries using sort
key related info.  This is needed to support hierarchy mode since each
hist entry will be associated with a hpp fmt which has a sort key.  So
each entry should compare to only matching type of filters.

To do that, add the ->se_filter callback field to struct sort_entry.
This callback takes 'type' argument which determines whether it's
matching sort key or not.  It returns -1 for non-matching type, 0 for
filtered entry and 1 for not filtered entries.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-6-git-send-email-namhyung@kernel.org
[ 'socket' is reserved in sys/socket.h, so replace it with 'sk' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 20:19:14 -03:00
Namhyung Kim 8c01872fe3 perf hists: Add helper functions for hierarchy mode
The rb_hierarchy_{next,prev,last} functions are to traverse all hist
entries in a hierarchy.  They will be used by various function which
supports hierarchy output.

As the rb_hierarchy_next() is used to traverse the whole hierarchy, it
sometime needs to visit entries regardless of current folding state.  So
add enum hierarchy_move_dir and pass it to __rb_hierarchy_next() for
those cases.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 16:55:17 -03:00
Namhyung Kim 1a3906a7e6 perf hists: Resort hist entries with hierarchy
For hierarchical output, each entry must be sorted in their rbtree
(hroot) properly.  Add hists__hierarchy_output_resort() to do the job.
Note that those hierarchy entries share the period counts, it'd be
important to update the hists->stats only once (for leaves).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 16:54:09 -03:00
Namhyung Kim aef810ec4e perf hists: Basic support of hierarchical report view
In the hierarchical view, entries will be grouped and sorted on the
first key, and then on the second key, and so on.  Add the
he->hroot_{in,out} fields to keep the lower level entries. Actually this
can share space, in a union, with callchain's 'sorted_root' since the
hroots are only used by non-leaf entries and callchain is only used by
leaf entries.

It also adds the 'parent_he' and 'depth' fields which can be used by browsers.

This patch only implements collapsing part which creates internal
entries for each sort key.  These need to be sorted by output_sort stage
and to be displayed properly in the later patch(es).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 13:35:44 -03:00
Namhyung Kim a9c6e46c04 perf tools: Add helper functions for some sort keys
The 'trace', 'srcline' and 'srcfile' sort keys updates hist entry's
field later.  With the hierarchy mode, those fields are passed to a
matching entry so it needs to identify the sort keys.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 13:05:04 -03:00
Wang Nan c339b1a90e perf tools: Make binary data printer code in trace_event public available
Move code printing binray data from trace_event() to utils.c and allows
passing different printer. Further commits will use this logic to print
bpf output event.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456312845-111583-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 11:38:01 -03:00
Jiri Olsa c19ac91245 perf script: Display data_src values
Adding support to display data_src values, for events with data_src data
in sample.

Example:
  $ perf script
  ...
           rcuos/3    32 [002] ... 68501042 Local RAM hit|SNP None or Hit|TLB L1 or L2 hit|LCK No   ...
           rcuos/3    32 [002] ... 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No                 ...
           swapper     0 [002] ... 68100242 LFB hit|SNP None|TLB L1 or L2 hit|LCK No                ...
           swapper     0 [000] ... 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No                 ...
           swapper     0 [000] ... 50100142 L1 hit|SNP None|TLB L2 miss|LCK No                      ...
           rcuos/3    32 [002] ... 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No                 ...
   plugin-containe 16538 [000] ... 6a100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK Yes                ...
           gkrellm  1736 [000] ... 68100242 LFB hit|SNP None|TLB L1 or L2 hit|LCK No                ...
           gkrellm  1736 [000] ... 6a100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK Yes                ...

                                   ^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                             data_src value                     data_src translation

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-14-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 10:32:11 -03:00
Jiri Olsa 8b0819c8a3 perf tools: Change perf_mem__lck_scnprintf to return nb of displayed bytes
Moving strncat call into scnprintf to easily track number of displayed
bytes. It will be used in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-13-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 10:31:03 -03:00
Jiri Olsa 149d750767 perf tools: Change perf_mem__snp_scnprintf to return nb of displayed bytes
Moving strncat/strcpy calls into scnprintf to easily track number of
displayed bytes. It will be used in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-12-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 10:30:22 -03:00
Jiri Olsa 969075630e perf tools: Change perf_mem__lvl_scnprintf to return nb of displayed bytes
Moving strncat/strcpy calls into scnprintf to easily track number of
displayed bytes. It will be used in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-11-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 10:30:14 -03:00
Jiri Olsa b1a5fbea3d perf tools: Change perf_mem__tlb_scnprintf to return nb of displayed bytes
Moving strncat/strcpy calls into scnprintf to easily track
number of displayed bytes. It will be used in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 10:30:03 -03:00
Jiri Olsa 69a7727592 perf tools: Introduce perf_mem__lck_scnprintf function
Move meminfo's lck display function into mem-events.c object, so it
could be reused later from script code.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 10:29:52 -03:00
Jiri Olsa 2c07af13dc perf tools: Introduce perf_mem__snp_scnprintf function
Move meminfo's snp display function into mem-events.c object, so it
could be reused later from script code.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 10:20:45 -03:00
Jiri Olsa 071e9a1e12 perf tools: Introduce perf_mem__lvl_scnprintf function
Move meminfo's lvl display function into mem-events.c object, so it
could be reused later from script code.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 10:20:28 -03:00
Jiri Olsa 0c877d759d perf tools: Introduce perf_mem__tlb_scnprintf function
Move meminfo's tlb display function into mem-events.c object, so it
could be reused later from script code.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 10:20:08 -03:00
Jiri Olsa 2ba7ac5814 perf mem: Introduce perf_mem_events__name function
Wrap perf_mem_events[].name into perf_mem_events__name() so we could alter the
events name if needed.

This will be handy when changing latency settings for loads event in following
patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 10:11:52 -03:00
Jiri Olsa 54fbad54eb perf mem record: Check for memory events support
Check if current kernel support available memory events and display the
status within -e  list option:

  $ perf mem record -e list
  ldlat-loads  : available
  ldlat-stores : available

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24 10:10:59 -03:00
Arnaldo Carvalho de Melo bea2400621 perf tools: Remove strbuf_{remove,splice}()
No users, nuke them.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kfv2wo8xann8t97wdalttcx7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23 16:21:04 -03:00
Andi Kleen 940db6dcd3 perf tools: Dont stop PMU parsing on alias parse error
When an error happens during alias parsing currently the complete
parsing of all attributes of the PMU is stopped. This is breaks old perf
on a newer kernel that may have not-yet-know alias attributes (such as
.scale or .per-pkg).

Continue when some attribute is unparseable.

This is IMHO a stable candidate and should be backported to older
versions to avoid problems with newer kernels.

v2: Print warnings when something goes wrong.
v3: Change warning to debug output

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org # v3.6+
Link: http://lkml.kernel.org/r/1455749095-18358-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23 12:46:16 -03:00
Jiri Olsa b19a1b6a23 perf tools: Use ARRAY_SIZE in mem sort display functions
There's no need to define extra macros for that.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-13-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23 12:19:10 -03:00
Jiri Olsa ce1e22b08f perf mem: Add -e record option
Adding -e option for perf mem record command, to be able to specify
memory event directly.

Get list of available events:

  $ perf mem record -e list
  ldlat-loads
  ldlat-stores

Monitor ldlat-loads:
  $ perf mem record -e ldlat-loads true

Committer notes:

Further testing:

  # perf mem record -e ldlat-loads true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.020 MB perf.data (10 samples) ]
  # perf evlist
  cpu/mem-loads,ldlat=30/P
  #

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23 12:15:59 -03:00
Jiri Olsa acbe613e0c perf tools: Add monitored events array
It will ease up configuration of memory events and addition of other
memory events in following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23 12:11:06 -03:00
Jiri Olsa d392711095 perf tools: Introduce cl_offset function
It'll be used in following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23 12:09:22 -03:00
Jiri Olsa e95cf700b1 perf tools: Make cl_address global
It'll be used in following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23 12:09:02 -03:00
Wang Nan 03e0a7df3e perf tools: Introduce bpf-output event
Commit a43eec3042 ("bpf: introduce bpf_perf_event_output() helper")
adds a helper to enable a BPF program to output data to a perf ring
buffer through a new type of perf event, PERF_COUNT_SW_BPF_OUTPUT. This
patch enables perf to create events of that type. Now a perf user can
use the following cmdline to receive output data from BPF programs:

  # perf record -a -e bpf-output/no-inherit,name=evt/ \
                    -e ./test_bpf_output.c/map:channel.event=evt/ ls /
  # perf script
     perf 1560 [004] 347747.086295:  evt: ffffffff811fd201 sys_write ...
     perf 1560 [004] 347747.086300:  evt: ffffffff811fd201 sys_write ...
     perf 1560 [004] 347747.086315:  evt: ffffffff811fd201 sys_write ...
            ...

Test result:

  # cat test_bpf_output.c
  /************************ BEGIN **************************/
  #include <uapi/linux/bpf.h>
  struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
  };

  #define SEC(NAME) __attribute__((section(NAME), used))
  static u64 (*ktime_get_ns)(void) =
 	(void *)BPF_FUNC_ktime_get_ns;
  static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
  static int (*get_smp_processor_id)(void) =
 	(void *)BPF_FUNC_get_smp_processor_id;
  static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
 	(void *)BPF_FUNC_perf_event_output;

  struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(u32),
 	.max_entries = __NR_CPUS__,
  };

  SEC("func_write=sys_write")
  int func_write(void *ctx)
  {
 	struct {
 		u64 ktime;
 		int cpuid;
 	} __attribute__((packed)) output_data;
 	char error_data[] = "Error: failed to output: %d\n";

 	output_data.cpuid = get_smp_processor_id();
 	output_data.ktime = ktime_get_ns();
 	int err = perf_event_output(ctx, &channel, get_smp_processor_id(),
 				    &output_data, sizeof(output_data));
 	if (err)
 		trace_printk(error_data, sizeof(error_data), err);
 	return 0;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  /************************ END ***************************/

  # perf record -a -e bpf-output/no-inherit,name=evt/ \
                    -e ./test_bpf_output.c/map:channel.event=evt/ ls /
  # perf script | grep ls
     ls  2242 [003] 347851.557563:   evt: ffffffff811fd201 sys_write ...
     ls  2242 [003] 347851.557571:   evt: ffffffff811fd201 sys_write ...

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-11-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 14:37:21 -03:00
Wang Nan 95088a591e perf tools: Apply tracepoint event definition options to BPF script
Users can pass options to tracepoints defined in the BPF script.  For
example:

  # perf record -e ./test.c/no-inherit/ bash
  # dd if=/dev/zero of=/dev/null count=10000
  # exit
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.022 MB perf.data (139 samples) ]

  (no-inherit works, only the sys_read issued by bash are captured, at
   least 10000 sys_read issued by dd are skipped.)

test.c:

  #define SEC(NAME) __attribute__((section(NAME), used))
  SEC("func=sys_read")
  int bpf_func__sys_read(void *ctx)
  {
      return 1;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;

no-inherit is applied to the kprobe event defined in test.c.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 13:02:44 -03:00
Wang Nan e571e029bd perf tools: Enable indices setting syntax for BPF map
This patch introduces a new syntax to perf event parser:

 # perf record -e './test_bpf_map_3.c/map:channel.value[0,1,2,3...5]=101/' usleep 2

By utilizing the basic facilities in bpf-loader.c which allow setting
different slots in a BPF map separately, the newly introduced syntax
allows perf to control specific elements in a BPF map.

Test result:

  # cat ./test_bpf_map_3.c
  /************************ BEGIN **************************/
  #include <uapi/linux/bpf.h>
  #define SEC(NAME) __attribute__((section(NAME), used))
  struct bpf_map_def {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
  };
  static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
 	(void *)BPF_FUNC_map_lookup_elem;
  static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
  struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(unsigned char),
 	.max_entries = 100,
  };
  SEC("func=hrtimer_nanosleep rqtp->tv_nsec")
  int func(void *ctx, int err, long nsec)
  {
 	char fmt[] = "%ld\n";
 	long usec = nsec * 0x10624dd3 >> 38; // nsec / 1000
 	int key = (int)usec;
 	unsigned char *pval = map_lookup_elem(&channel, &key);

 	if (!pval)
 		return 0;
 	trace_printk(fmt, sizeof(fmt), (unsigned char)*pval);
 	return 0;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  /************************* END ***************************/

Normal case:

  # echo "" > /sys/kernel/debug/tracing/trace
  # ./perf record -e './test_bpf_map_3.c/map:channel.value[0,1,2,3...5]=101/' usleep 2
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data ]
  # cat /sys/kernel/debug/tracing/trace | grep usleep
            usleep-405   [004] d... 2745423.547822: : 101
  # ./perf record -e './test_bpf_map_3.c/map:channel.value[0...9,20...29]=102,map:channel.value[10...19]=103/' usleep 3
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data ]
  # ./perf record -e './test_bpf_map_3.c/map:channel.value[0...9,20...29]=102,map:channel.value[10...19]=103/' usleep 15
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data ]
  # cat /sys/kernel/debug/tracing/trace | grep usleep
            usleep-405   [004] d... 2745423.547822: : 101
            usleep-655   [006] d... 2745434.122814: : 102
            usleep-904   [006] d... 2745439.916264: : 103
  # ./perf record -e './test_bpf_map_3.c/map:channel.value[all]=104/' usleep 99
  # cat /sys/kernel/debug/tracing/trace | grep usleep
            usleep-405   [004] d... 2745423.547822: : 101
            usleep-655   [006] d... 2745434.122814: : 102
            usleep-904   [006] d... 2745439.916264: : 103
            usleep-1537  [003] d... 2745538.053737: : 104

Error case:

  # ./perf record -e './test_bpf_map_3.c/map:channel.value[10...1000]=104/' usleep 99
  event syntax error: '..annel.value[10...1000]=104/'
                                   \___ Index too large
  Hint:	Valid config terms:
      	map:[<arraymap>].value<indices>=[value]
      	map:[<eventmap>].event<indices>=[event]

      	where <indices> is something like [0,3...5] or [all]
      	(add -v to see detail)
  Run 'perf list' for a list of valid events

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -e, --event <event>   event selector. use 'perf list' to list available events

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-9-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 12:59:49 -03:00
Wang Nan 2d055bf253 perf tools: Support setting different slots in a BPF map separately
This patch introduces basic facilities to support config different slots
in a BPF map one by one.

array.nr_ranges and array.ranges are introduced into 'struct
parse_events_term', where ranges is an array of indices range (start,
length) which will be configured by this config term. nr_ranges is the
size of the array. The array is passed to 'struct bpf_map_priv'.  To
indicate the new type of configuration, BPF_MAP_KEY_RANGES is added as a
new key type. bpf_map_config_foreach_key() is extended to iterate over
those indices instead of all possible keys.

Code in this commit will be enabled by following commit which enables
the indices syntax for array configuration.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-8-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 12:48:50 -03:00
Wang Nan 7630b3e28d perf tools: Enable passing event to BPF object
A new syntax is added to the parser so that the user can access
predefined perf events in BPF objects.

After this patch, BPF programs for perf are finally able to utilize
bpf_perf_event_read() introduced in commit 35578d7984 ("bpf: Implement
function bpf_perf_event_read() that get the selected hardware PMU
counter").

Test result:

  # cat test_bpf_map_2.c
  /************************ BEGIN **************************/
  #include <uapi/linux/bpf.h>
  #define SEC(NAME) __attribute__((section(NAME), used))
  struct bpf_map_def {
      unsigned int type;
      unsigned int key_size;
      unsigned int value_size;
      unsigned int max_entries;
  };
  static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
      (void *)BPF_FUNC_trace_printk;
  static int (*get_smp_processor_id)(void) =
      (void *)BPF_FUNC_get_smp_processor_id;
  static int (*perf_event_read)(struct bpf_map_def *, int) =
      (void *)BPF_FUNC_perf_event_read;

  struct bpf_map_def SEC("maps") pmu_map = {
      .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
      .key_size = sizeof(int),
      .value_size = sizeof(int),
      .max_entries = __NR_CPUS__,
  };
  SEC("func_write=sys_write")
  int func_write(void *ctx)
  {
      unsigned long long val;
      char fmt[] = "sys_write:        pmu=%llu\n";
      val = perf_event_read(&pmu_map, get_smp_processor_id());
      trace_printk(fmt, sizeof(fmt), val);
      return 0;
  }

  SEC("func_write_return=sys_write%return")
  int func_write_return(void *ctx)
  {
      unsigned long long val = 0;
      char fmt[] = "sys_write_return: pmu=%llu\n";
      val = perf_event_read(&pmu_map, get_smp_processor_id());
      trace_printk(fmt, sizeof(fmt), val);
      return 0;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  /************************* END ***************************/

Normal case:

  # echo "" > /sys/kernel/debug/tracing/trace
  # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' ls /
  [SNIP]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.013 MB perf.data (7 samples) ]
  # cat /sys/kernel/debug/tracing/trace | grep ls
                ls-17066 [000] d... 938449.863301: : sys_write:        pmu=1157327
                ls-17066 [000] dN.. 938449.863342: : sys_write_return: pmu=1225218
                ls-17066 [000] d... 938449.863349: : sys_write:        pmu=1241922
                ls-17066 [000] dN.. 938449.863369: : sys_write_return: pmu=1267445

Normal case (system wide):

  # echo "" > /sys/kernel/debug/tracing/trace
  # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' -a
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.811 MB perf.data (120 samples) ]

  # cat /sys/kernel/debug/tracing/trace | grep -v '18446744073709551594' | grep -v perf | head -n 20
  [SNIP]
  #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
  #              | |       |   ||||       |         |
             gmain-30828 [002] d... 2740551.068992: : sys_write:        pmu=84373
             gmain-30828 [002] d... 2740551.068992: : sys_write_return: pmu=87696
             gmain-30828 [002] d... 2740551.068996: : sys_write:        pmu=100658
             gmain-30828 [002] d... 2740551.068997: : sys_write_return: pmu=102572

Error case 1:

  # perf record -e './test_bpf_map_2.c' ls /
  [SNIP]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.014 MB perf.data ]
  # cat /sys/kernel/debug/tracing/trace | grep ls
                ls-17115 [007] d... 2724279.665625: : sys_write:        pmu=18446744073709551614
                ls-17115 [007] dN.. 2724279.665651: : sys_write_return: pmu=18446744073709551614
                ls-17115 [007] d... 2724279.665658: : sys_write:        pmu=18446744073709551614
                ls-17115 [007] dN.. 2724279.665677: : sys_write_return: pmu=18446744073709551614

  (18446744073709551614 is 0xfffffffffffffffe (-2))

Error case 2:

  # perf record -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=evt/' -a
  event syntax error: '..ps:pmu_map.event=evt/'
                                    \___ Event not found for map setting

  Hint:	Valid config terms:
       	map:[<arraymap>].value=[value]
       	map:[<eventmap>].event=[event]
  [SNIP]

Error case 3:
  # ls /proc/2348/task/
  2348  2505  2506  2507  2508
  # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' -p 2348
  ERROR: Apply config to BPF failed: Cannot set event to BPF map in multi-thread tracing

Error case 4:
  # perf record -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' ls /
  ERROR: Apply config to BPF failed: Doesn't support inherit event (Hint: use -i to turn off inherit)

Error case 5:
  # perf record -i -e raw_syscalls:sys_enter -e './test_bpf_map_2.c/map:pmu_map.event=raw_syscalls:sys_enter/' ls
  ERROR: Apply config to BPF failed: Can only put raw, hardware and BPF output event into a BPF map

Error case 6:
  # perf record -i -e './test_bpf_map_2.c/map:pmu_map.event=123/' ls /
  event syntax error: '.._map.event=123/'
                                    \___ Incorrect value type for map
  [SNIP]

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-7-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 12:30:50 -03:00
Wang Nan 8690a2a773 perf record: Apply config to BPF objects before recording
bpf__apply_obj_config() is introduced as the core API to apply object
config options to all BPF objects. This patch also does the real work
for setting values for BPF_MAP_TYPE_PERF_ARRAY maps by inserting value
stored in map's private field into the BPF map.

This patch is required because we are not always able to set all BPF
config during parsing. Further patch will set events created by perf to
BPF_MAP_TYPE_PERF_EVENT_ARRAY maps, which is not exist until
perf_evsel__open().

bpf_map_foreach_key() is introduced to iterate over each key needs to be
configured. This function would be extended to support more map types
and different key settings.

In perf record, before start recording, call bpf__apply_config() to turn
on all BPF config options.

Test result:

  # cat ./test_bpf_map_1.c
  /************************ BEGIN **************************/
  #include <uapi/linux/bpf.h>
  #define SEC(NAME) __attribute__((section(NAME), used))
  struct bpf_map_def {
      unsigned int type;
      unsigned int key_size;
      unsigned int value_size;
      unsigned int max_entries;
  };
  static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
      (void *)BPF_FUNC_map_lookup_elem;
  static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
      (void *)BPF_FUNC_trace_printk;
  struct bpf_map_def SEC("maps") channel = {
      .type = BPF_MAP_TYPE_ARRAY,
      .key_size = sizeof(int),
      .value_size = sizeof(int),
      .max_entries = 1,
  };
  SEC("func=sys_nanosleep")
  int func(void *ctx)
  {
      int key = 0;
      char fmt[] = "%d\n";
      int *pval = map_lookup_elem(&channel, &key);
      if (!pval)
          return 0;
      trace_printk(fmt, sizeof(fmt), *pval);
      return 0;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  /************************* END ***************************/

  # echo "" > /sys/kernel/debug/tracing/trace
  # ./perf record -e './test_bpf_map_1.c/map:channel.value=11/' usleep 10
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data ]
  # cat /sys/kernel/debug/tracing/trace
  # tracer: nop
  #
  # entries-in-buffer/entries-written: 1/1   #P:8
  [SNIP]
  #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
  #              | |       |   ||||       |         |
             usleep-18593 [007] d... 2394714.395539: : 11
  # ./perf record -e './test_bpf_map_1.c/map:channel.value=101/' usleep 10
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data ]
  # cat /sys/kernel/debug/tracing/trace
  # tracer: nop
  #
  # entries-in-buffer/entries-written: 1/1   #P:8
  [SNIP]
  #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
  #              | |       |   ||||       |         |
             usleep-18593 [007] d... 2394714.395539: : 11
             usleep-19000 [006] d... 2394831.057840: : 101

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-6-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 12:28:02 -03:00
Wang Nan a34f3be70c perf tools: Enable BPF object configure syntax
This patch adds the final step for BPF map configuration. A new syntax
is appended into parser so user can config BPF objects through '/' '/'
enclosed config terms.

After this patch, following syntax is available:

  # perf record -e ./test_bpf_map_1.c/map:channel.value=10/ ...

It would takes effect after appling following commits.

Test result:

  # cat ./test_bpf_map_1.c
  /************************ BEGIN **************************/
  #include <uapi/linux/bpf.h>
  #define SEC(NAME) __attribute__((section(NAME), used))
  struct bpf_map_def {
      unsigned int type;
      unsigned int key_size;
      unsigned int value_size;
      unsigned int max_entries;
  };
  static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
      (void *)BPF_FUNC_map_lookup_elem;
  static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
      (void *)BPF_FUNC_trace_printk;
  struct bpf_map_def SEC("maps") channel = {
      .type = BPF_MAP_TYPE_ARRAY,
      .key_size = sizeof(int),
      .value_size = sizeof(int),
      .max_entries = 1,
  };
  SEC("func=sys_nanosleep")
  int func(void *ctx)
  {
      int key = 0;
      char fmt[] = "%d\n";
      int *pval = map_lookup_elem(&channel, &key);
      if (!pval)
          return 0;
      trace_printk(fmt, sizeof(fmt), *pval);
      return 0;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  /************************* END ***************************/

 - Normal case:
  # ./perf record -e './test_bpf_map_1.c/map:channel.value=10/' usleep 10
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data ]

 - Error case:

  # ./perf record -e './test_bpf_map_1.c/map:channel.value/' usleep 10
  event syntax error: '..ps:channel:value/'
                                   \___ Config value not set (missing '=')
  Hint:	Valid config term:
         map:[<arraymap>]:value=[value]
         (add -v to see detail)
  Run 'perf list' for a list of valid events

  Usage: perf record [<options>] [<command>]
     or: perf record [<options>] -- <command> [<options>]

     -e, --event <event>   event selector. use 'perf list' to list available events

  # ./perf record -e './test_bpf_map_1.c/xmap:channel.value=10/' usleep 10
  event syntax error: '..pf_map_1.c/xmap:channel.value=10/'
                                    \___ Invalid object config option
  [SNIP]

  # ./perf record -e './test_bpf_map_1.c/map:xchannel.value=10/' usleep 10
  event syntax error: '..p_1.c/map:xchannel.value=10/'
                                    \___ Target map not exist
  [SNIP]

  # ./perf record -e './test_bpf_map_1.c/map:channel.xvalue=10/' usleep 10
  event syntax error: '..ps:channel.xvalue=10/'
                                    \___ Invalid object map config option
  [SNIP]

  # ./perf record -e './test_bpf_map_1.c/map:channel.value=x10/' usleep 10
  event syntax error: '..nnel.value=x10/'
                                    \___ Incorrect value type for map
  [SNIP]

  Change BPF_MAP_TYPE_ARRAY to '1' in test_bpf_map_1.c:

  # ./perf record -e './test_bpf_map_1.c/map:channel.value=10/' usleep 10
  event syntax error: '..ps:channel.value=10/'
                                    \___ Can't use this config term to this type of map

  Hint:	Valid config term:
      	map:[<arraymap>].value=[value]
      	(add -v to see detail)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
[for parser part]
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-5-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 12:20:35 -03:00
Wang Nan 066dacbf2a perf bpf: Add API to set values to map entries in a bpf object
bpf__config_obj() is introduced as a core API to config BPF object after
loading. One configuration option of maps is introduced. After this
patch BPF object can accept assignments like:

  map:my_map.value=1234

(map.my_map.value looks pretty. However, there's a small but hard to fix
problem related to flex's greedy matching. Please see [1].  Choose ':'
to avoid it in a simpler way.)

This patch is more complex than the work it does because the
consideration of extension. In designing BPF map configuration, the
following things should be considered:

 1. Array indices selection: perf should allow user setting different
    value for different slots in an array, with syntax like:
    map:my_map.value[0,3...6]=1234;

 2. A map should be set by different config terms, each for a part
    of it. For example, set each slot to the pid of a thread;

 3. Type of value: integer is not the only valid value type. A perf
    counter can also be put into a map after commit 35578d7984
    ("bpf: Implement function bpf_perf_event_read() that get the
      selected hardware PMU counter")

 4. For a hash table, it should be possible to use a string or other
    value as a key;

 5. It is possible that map configuration is unable to be setup
    during parsing. A perf counter is an example.

Therefore, this patch does the following:

 1. Instead of updating map element during parsing, this patch stores
    map config options in 'struct bpf_map_priv'. Following patches
    will apply those configs at an appropriate time;

 2. Link map operations in a list so a map can have multiple config
    terms attached, so different parts can be configured separately;

 3. Make 'struct bpf_map_priv' extensible so that the following patches
    can add new types of keys and operations;

 4. Use bpf_obj_config__map_funcs array to support more map config options.

Since the patch changing the event parser to parse BPF object config is
relative large, I've put it in another commit. Code in this patch can be
tested after applying the next patch.

[1] http://lkml.kernel.org/g/564ED621.4050500@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-4-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Changes "maps:my_map.value" to "map:my_map.value", improved error messages ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 12:17:48 -03:00
Namhyung Kim 0c0af78d47 perf tools: Fix column width setting on 'trace' sort key
It missed to update column length of the 'trace' sort key in the
hists__calc_col_len() so it might truncate the output.  It calculated
the column length in the ->cmp() callback originally but it doesn't
guarantee it's called always.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1456064558-13086-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 12:06:53 -03:00
Namhyung Kim 2960ed6f8d perf tools: Fix alignment on some sort keys
The srcline, srcfile and trace sort keys can have long entries.  With
commit 89fee70943 ("perf hists: Do column alignment on the format
iterator"), it now aligns output with hist_entry__snprintf_alignment().
So each (possibly long) sort entries don't need to do it themselves.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1456101153-14519-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 12:05:55 -03:00
Namhyung Kim cecaec635d perf tools: Update srcline/file if needed
Normally the hist entry's srcline and/or srcfile is set during sorting.
However sometime it's possible to a hist entry's srcline is not set yet
after the sorting.  This is because the entry is so unique and other
sort keys already make it distinct.  Then the srcline/file sort didn't
have a chance to be called during the sorting.  In that case it has NULL
srcline/srcfile field and shows nothing.

Before:

  $ perf report -s comm,sym,srcline
  ...
  Overhead  Command       Symbol
  -----------------------------------------------------------------
    34.42%  swapper       [k] intel_idle          intel_idle.c:0
     2.44%  perf          [.] __poll_nocancel     (null)
     1.70%  gnome-shell   [k] fw_domains_get      (null)
     1.04%  Xorg          [k] sock_poll           (null)

After:

    34.42%  swapper       [k] intel_idle          intel_idle.c:0
     2.44%  perf          [.] __poll_nocancel     .:0
     1.70%  gnome-shell   [k] fw_domains_get      fw_domains_get+42
     1.04%  Xorg          [k] sock_poll           socket.c:0

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1456101111-14400-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 12:04:34 -03:00
Namhyung Kim 665aa75700 perf tools: Fix segfault on dynamic entries
A dynamic entry is created for each tracepoint event.  When it sets up
the sort key, it checks with existing keys using ->equal() callback.
But it missed to set the ->equal for dynamic entries.  The following
segfault was due to the missing ->equal() callback.

  (gdb) bt
  #0  0x0000000000140003 in ?? ()
  #1  0x0000000000537769 in fmt_equal (b=0x2106980, a=0x21067a0) at ui/hist.c:548
  #2  perf_hpp__setup_output_field (list=0x8c6d80 <perf_hpp_list>) at ui/hist.c:560
  #3  0x00000000004e927e in setup_sorting (evlist=<optimized out>) at util/sort.c:2642
  #4  0x000000000043cf50 in cmd_report (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>)
      at builtin-report.c:932
  #5  0x00000000004865a1 in run_builtin (p=p@entry=0x8bbce0 <commands+192>, argc=argc@entry=7,
      argv=argv@entry=0x7ffd24d56ce0) at perf.c:390
  #6  0x000000000042dc1f in handle_internal_command (argv=0x7ffd24d56ce0, argc=7) at perf.c:451
  #7  run_argv (argv=0x7ffd24d56a70, argcp=0x7ffd24d56a7c) at perf.c:495
  #8  main (argc=7, argv=0x7ffd24d56ce0) at perf.c:620

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1456064558-13086-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 12:01:09 -03:00
Arnaldo Carvalho de Melo 58de6ed0a9 perf tools: Remove duplicate typedef config_term_func_t definition
Older compilers don't like this, for instance, on RHEL6.7:

    CC       /tmp/build/perf/util/parse-events.o
  util/parse-events.c:844: error: redefinition of typedef ‘config_term_func_t’
  util/parse-events.c:353: note: previous declaration of ‘config_term_func_t’ was here

So remove the second definition, that should've been just moved in 43d0b97817
("perf tools: Enable config and setting names for legacy cache events"), not copied.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 43d0b97817 ("perf tools: Enable config and setting names for legacy cache events")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 11:48:44 -03:00
Arnaldo Carvalho de Melo 2c97b0d4a7 perf tools: Fix build on older systems
In RHEL 6.7:

  CC       /tmp/build/perf/util/parse-events.o
  cc1: warnings being treated as errors
  util/parse-events.c: In function ‘parse_events_add_cache’:
  util/parse-events.c:366: error: declaration of ‘error’ shadows a global declaration
  util/util.h:136: error: shadowed declaration is here

Rename it to 'err'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 43d0b97817 ("perf tools: Enable config and setting names for legacy cache events")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22 11:48:44 -03:00
Namhyung Kim bba58cdfaa perf hists: Return error from hists__collapse_resort()
Currently hists__collapse_resort() and hists__collapse_insert_entry()
don't return an error code. Now that callchain_merge() can check for
errors, abort and pass the error to the user.  A later patch can add
more work which also can fail.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:16:06 -03:00
Namhyung Kim dca0d122e4 perf callchain: Check return value of append_chain_children()
Now it can check the error case, so check and pass it to the caller.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:15:01 -03:00
Namhyung Kim f2bb4c5af4 perf callchain: Check return value of split_add_child()
Now create_child() and add_child() return errors so check and pass it
to the caller.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:14:36 -03:00
Namhyung Kim 2d713b809d perf callchain: Add enum match_result for match_chain()
The append_chain() might return either result of match_chain() or other
(error) code.  But match_chain() can return any value in s64 type so
it's hard to check the error case.  Add new enum match_result and make
match_chain() return non-negative values only so that we can check the
error cases.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:14:20 -03:00
Namhyung Kim 8451cbb9b1 perf callchain: Check return value of fill_node()
Memory allocation in the fill_node() can fail so change its return type
to int and check it in add_child() too.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:13:21 -03:00
Namhyung Kim 7565bd39c1 perf callchain: Check return value of add_child()
The create_child() in add_child() can return NULL in case of memory
allocation failure.  So check the return value and bail out.  The proper
error handling will be added later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:12:52 -03:00
Namhyung Kim 467ef10c68 perf hists browser: Fix percentage update on key press
Currently 'perf top --tui' decrements percentage of all entries on any
key press.  This is because it adds total period as new samples are
added to hists.  As perf-top does it currently but added samples are not
passed to the display thread, the percentages are decresing
continuously.

So separate total period stat into a different variable so that it
cannot affect the output total period.  This new total period stats are
used only for calcualating callchain percent limit.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 0f58474ec8 ("perf hists: Update hists' total period when adding entries")
Link: http://lkml.kernel.org/r/1455631723-17345-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:12:51 -03:00
Wang Nan 43d0b97817 perf tools: Enable config and setting names for legacy cache events
This patch allows setting config terms for legacy cache events.
For example:

  # perf stat -e L1-icache-misses/name=valA/ -e branches/name=valB/ ls
  ...
   Performance counter stats for 'ls':

              11299      valA
             451605      valB

        0.000779091 seconds time elapsed

  # perf record -e cache-misses/name=inh/ -e cache-misses/name=noinh,no-inherit/ bash
  # ls
  # exit
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.023 MB perf.data (131 samples) ]
  # perf report --stdio | grep -B 1 'Event count'
  # Samples: 105  of event 'inh'
  # Event count (approx.): 109118
  --
  # Samples: 26  of event 'noinh'
  # Event count (approx.): 48302

A test case is introduced to test this feature.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-14-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:12:51 -03:00
Wang Nan 10bf358a1b perf tools: Enable config raw and numeric events
This patch allows setting config terms for raw and numeric events.
For example:

  # perf stat -e cycles/name=cyc/ ls
  ...
  1821108      cyc
  ...

  # perf stat -e r6530160/name=event/ ls
  ...
  1103195      event
  ...

  # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
  ...
  # perf report --stdio
  ...
  # Samples: 124  of event 'cycles'
  46.61%     0.00%  swapper        [kernel.vmlinux]            [k] cpu_startup_entry
  41.26%     0.00%  swapper        [kernel.vmlinux]            [k] start_secondary
  ...
  # Samples: 91  of event 'evtx'
  ...
  93.76%     0.00%  swapper      [kernel.vmlinux]            [k] cpu_startup_entry
          |
          ---cpu_startup_entry
             |
             |--66.63%--call_cpuidle
             |          cpuidle_enter
             |          |
  ...

3 test cases are introduced to test config terms for symbol, raw and
numeric events.

Committer note:

Further testing shows that we can retrieve the event name using 'perf
evlist -v' and looking at the 'config' perf_event_attr field, i.e.:

  # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.724 MB perf.data (2076 samples) ]
  # perf evlist
  cycles
  evtx
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
evtx: type: 4, size: 112, config: 0x6530160, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
  #

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-13-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:12:50 -03:00
Arnaldo Carvalho de Melo 1d55e8ef34 perf tools: Introduce opt_event_config nonterminal
To remove duplicated code that differs only in using the matching
'/a,b,c/' part or NULL if no event configuration is done ('//' or no
pair of slashes at all).

Will be used by some new targets allowing the configuration of hardware
events, etc.

Lifted part of the 'opt_event_config' nonterminal from a patch by Wang
Nan.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-e3xzpx9cqsmwnaguaxyw6r42@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:12:50 -03:00
Wang Nan e814fddde1 perf tools: Rename and move pmu_event_name to get_config_name
Following commits will make more events obey /name=newname/ options.
This patch makes pmu_event_name() a generic helper.

Makes new get_config_name() accept NULL input to make life easier.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-12-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:12:50 -03:00
Wang Nan 1669e509ea perf stat: Bail out on unsupported event config modifiers
'perf stat' accepts some config terms but doesn't apply them. For
example:

  # perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
  # ls
  # exit

  Performance counter stats for 'bash':

         266258061      instructions/no-inherit/
         266258061      instructions/inherit/

       1.402183915 seconds time elapsed

The result is confusing, because user may expect the first
'instructions' event exclude the 'ls' command.

This patch forbid most of these config terms for 'perf stat'.

Result:

  # ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
  event syntax error: 'instructions/no-inherit/'
                       \___ 'no-inherit' is not usable in 'perf stat'
  ...

We can add blocked config terms back when 'perf stat' really supports them.

This patch also removes unavailable config term from error message:

  # ./perf stat -e 'instructions/badterm/' ls
  event syntax error: 'instructions/badterm/'
                                    \___ unknown term

  valid terms: config,config1,config2,name

  # ./perf stat -e 'cpu/badterm/' ls
  event syntax error: 'cpu/badterm/'
                           \___ unknown term

  valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-11-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:12:49 -03:00
Wang Nan 17cb5f84b8 perf tools: Create config_term_names array
config_term_names[] is introduced for future commits which will be able
to retrieve the config name through the config term.

Utilize this array in parse_events_formats_error_string() so the missing
'{,no-}inherit' terms are added.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:12:48 -03:00
Wang Nan 26dee028d3 perf tools: Fix checking asprintf return value
According to man pages, asprintf returns -1 when failure. This patch
fixes two incorrect return value checker.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: stable@vger.kernel.org # v4.4+
Fixes: ffeb883e56 ("perf tools: Show proper error message for wrong terms of hw/sw events")
Link: http://lkml.kernel.org/r/1455882283-79592-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:12:47 -03:00
Wang Nan 80cdce7666 perf bpf: Rename bpf_prog_priv__clear() to clear_prog_priv()
The name of bpf_prog_priv__clear() doesn't follow perf's naming
convention. bpf_prog_priv__delete() seems to be a better name. However,
bpf_prog_priv__delete() should be a method of 'struct bpf_prog_priv',
but its first parameter is 'struct bpf_program'.

It is callback from libbpf to clear priv structures when destroying a
bpf program. It is actually a method of bpf_program (libbpf object), but
bpf_program__ functions should be provided by libbpf.

This patch removes the prefix of that function.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:12:46 -03:00
Arnaldo Carvalho de Melo d9aade7fd2 perf evlist: Handle -EINVAL for sample_freq > max_sample_rate in strerror_open()
When running the "code reading" test we get:

  # perf test -v "code reading" 2>&1 | tail -5
  Parsing event 'cycles:u'
  perf_evlist__open failed
  test child finished with -1
  ---- end ----
  Test object code reading: FAILED!
  #

And with -vv we get the errno value, -22, i.e. -EINVAL, but we can do
better and handle the case at hand, with this patch it becomes:

  # perf test -v "code reading" 2>&1 | tail -7
  perf_evlist__open() failed!
  Error: Invalid argument.
  Hint:  Check /proc/sys/kernel/perf_event_max_sample_rate.
  Hint:  The current value is 1000 and 4000 is being requested.
  test child finished with -1
  ---- end ----
  Test object code reading: FAILED!
  #

Next patch will make this 'perf test' entry to use perf_evlist__strerror()

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i31ai6kfefn75eapejjokfhc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19 19:12:42 -03:00
Jiri Olsa 85723885fe perf record: Add --all-user/--all-kernel options
Allow user to easily switch all events to user or kernel space with simple
--all-user or --all-kernel options.

This will be handy within perf mem/c2c wrappers to switch easily monitoring
modes.

Committer note:

Testing it:

  # perf record --all-kernel --all-user -a sleep 2
   Error: option `all-user' cannot be used with all-kernel
   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

        --all-user        Configure all used events to run in user space.
        --all-kernel      Configure all used events to run in kernel space.
  # perf record --all-user --all-kernel -a sleep 2
   Error: option `all-kernel' cannot be used with all-user
   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

        --all-kernel      Configure all used events to run in kernel space.
        --all-user        Configure all used events to run in user space.
  # perf record --all-user -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.416 MB perf.data (162 samples) ]
  # perf report | grep '\[k\]'
  # perf record --all-kernel -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.423 MB perf.data (296 samples) ]
  # perf report | grep '\[\.\]'
  #

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-2-git-send-email-jolsa@kernel.org
[ Made those options to be mutually exclusive ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-18 10:48:44 -03:00
Arnaldo Carvalho de Melo a55e566376 perf evlist: Reference count the cpu and thread maps at set_maps()
We were dropping the reference we possibly held but not obtaining one
for the new maps, which we will drop at perf_evlist__delete(), fix it.

This was caught by Steven Noonan in some of the machines which would
produce this output when caught by glibc debug mechanisms:

  $ sudo perf test 21
  21: Test object code reading                                 :***
  Error in `perf': corrupted double-linked list: 0x00000000023ffcd0 ***
  ======= Backtrace: =========
  /usr/lib/libc.so.6(+0x72055)[0x7f25be0f3055]
  /usr/lib/libc.so.6(+0x779b6)[0x7f25be0f89b6]
  /usr/lib/libc.so.6(+0x7a0ed)[0x7f25be0fb0ed]
  /usr/lib/libc.so.6(__libc_calloc+0xba)[0x7f25be0fceda]
  perf(parse_events_lex_init_extra+0x38)[0x4cfff8]
  perf(parse_events+0x55)[0x4a0615]
  perf(perf_evlist__config+0xcf)[0x4eeb2f]
  perf[0x479f82]
  perf(test__code_reading+0x1e)[0x47ad4e]
  perf(cmd_test+0x5dd)[0x46452d]
  perf[0x47f4e3]
  perf(main+0x603)[0x42c723]
  /usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7f25be0a1610]
  perf(_start+0x29)[0x42c859]

Further investigation using valgrind led to the reference count imbalance fixed
in this patch.

Reported-and-Tested-by: Steven Noonan <steven@uplinklabs.net>
Report-Link: http://lkml.kernel.org/r/CAKbGBLjC2Dx5vshxyGmQkcD+VwiAQLbHoXA9i7kvRB2-2opHZQ@mail.gmail.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: f30a79b012 ("perf tools: Add reference counting for cpu_map object")
Link: http://lkml.kernel.org/n/tip-j0u1bdhr47sa511sgg76kb8h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-18 10:48:37 -03:00
Andi Kleen 140aeadc1f perf stat: Abstract stat metrics printing
Abstract the printing of shadow metrics. Instead of every metric calling
fprintf directly and taking care of indentation, use two call backs: one
to print metrics and another to start a new line.

This will allow adding metrics to CSV mode and also using them for other
purposes.

The computation of padding is now done in the central callback, instead
of every metric doing it manually.  This makes it easier to add new
metrics.

v2: Refactor functions, printout now does more. Move
shadow printing. Improve fallback callbacks. Don't
use void * callback data.
v3: Remove unnecessary hunk. Add typedef for new_line
v4: Remove unnecessary hunk. Don't print metrics for CSV/interval
mode yet.  Move printout change to separate patch.
v5: Fix bisect bugs. Avoid bogus frontend cycles printing.
Fix indentation in different aggregation modes.
v6: Delay newline handling

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:13:00 -03:00
Jiri Olsa 720e98b5fa perf tools: Add perf data cache feature
Storing CPU cache details under perf data. It's stored as new
HEADER_CACHE feature and it's displayed under header info with -I
option:

  $ perf report --header-only -I
  ...
  # CPU cache info:
  #  L1 Data                 32K [0-1]
  #  L1 Instruction          32K [0-1]
  #  L1 Data                 32K [2-3]
  #  L1 Instruction          32K [2-3]
  #  L2 Unified             256K [0-1]
  #  L2 Unified             256K [2-3]
  #  L3 Unified            4096K [0-3]
  ...

All distinct caches are stored/displayed.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160216150143.GA7119@krava.brq.redhat.com
[ Fixed leak on process_caches(), s/cache_level/cpu_cache_level/g ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:13:00 -03:00
Jiri Olsa dd629cc097 perf tools: Initialize libapi debug output
Setting libapi debug output functions to use perf functions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1455465826-8426-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:12:59 -03:00
Arnaldo Carvalho de Melo bedbdd4297 perf debug: Rename __eprintf(va_list args) to veprintf
Adhering to the naming convention used when va_args is in a printf like
function, e.g. stdio.h.

Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b5l3wt77ct28dcnriguxtvn6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:12:58 -03:00
Jiri Olsa 607bfbd7ff tools lib api fs: Adopt filename__read_str from perf
We already moved similar functions in here, also it'll be useful for
sysfs__read_str addition in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1455465826-8426-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:12:56 -03:00
Wang Nan 5141d7350d perf data: Fix releasing event_class
A new patch in libbabeltrace [1] reveals a object leak problem in
'perf data' CTF support: perf code never releases the event_class
which is allocated in add_event() and stored in evsel's private field.

If libbabeltrace has the above patch applied, leaking event_class
prevents the writer from being destroyed and flushing metadata. For
example:

  $ perf record ls
  perf.data
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data (12 samples) ]
  $ perf data convert --to-ctf ./out.ctf
  [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
  [ perf data convert: Converted and wrote 0.000 MB (12 samples) ]
  $ cat ./out.ctf/metadata
  $ ls -l  ./out.ctf/metadata
  -rw-r----- 1 w00229757 mm 0 Jan 27 10:49 ./out.ctf/metadata

The correct result should be:
  ...
  $ cat ./out.ctf/metadata
  /* CTF 1.8 */

  trace {
  [SNIP]

  $ ls -l  ./out.ctf/metadata
  -rw-r----- 1 w00229757 mm 2446 Jan 27 10:52 ./out.ctf/metadata

The full story is:

Patch [1] of babeltrace redesigns its reference counting scheme. In that
patch:

 * writer <- trace (bt_ctf_writer_create)
 * trace <- stream_class (bt_ctf_trace_add_stream_class)
 * stream_class <- event_class (bt_ctf_stream_class_add_event_class)
 ('<-' means 'is a parent of')

Holding of event_class causes reference count of corresponding 'writer'
to increase through parent chain. Perf expects that 'writer' is released
(so metadata is flushed) through bt_ctf_writer_put() in
ctf_writer__cleanup(). However, since it never releases event_class, the
reference of 'writer' won't be dropped, so bt_ctf_writer_put() won't
lead to the release of writer.

Before this CTF patch, !(writer <- trace). Even with event_class leaking,
the writer ends up being released.

[1] e6a8e8e474

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454680939-24963-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 17:27:48 -03:00
Arnaldo Carvalho de Melo 2146afc6b4 perf tools: Rename parse_events__free_terms() to parse_events_terms__delete()
To follow convention used in other tools/perf/ areas. Also remove the
need to check if it is NULL before calling the destructor, again, to
follow convention that goes back to free().

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-w6owu7rb8a46gvunlinxaqwx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 17:09:17 -03:00
Wang Nan d20a5f2b27 perf tools: Free the terms list_head in parse_events__free_terms()
Fixing a leak, since code calling parse_events__free_terms() expect it
to free the list_head too.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 17:01:17 -03:00
Arnaldo Carvalho de Melo 682dc24c2a perf tools: Use perf_event_terms__purge() for non-malloced terms
In these two cases, a 'perf test' entry and in the PMU code the
list_head is on the stack, so we can't use perf_event__free_terms()
(soon to be renamed to perf_event_terms__delete()), because it will
free the list_head as well.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-i956ryjhz97gnnqe8iqe7m7s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 16:53:22 -03:00
Arnaldo Carvalho de Melo fc0a2c1d59 perf tools: Introduce parse_events_terms__purge()
Purges 'struct parse_event_term' entries from a list_head.

Some users need this because they don't allocate space for the list
head, it maybe on the stack or embedded into some other struct.

Next patch will convert users that need just purging and then the
perf_events__free_terms() routine will free the list head as well,
finally being renamed to perf_events_terms__delete().

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-4w3zl4ifcl0ed0j4bu3tckqp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 16:53:19 -03:00
Wang Nan a8adfceb38 perf tools: Unlink entries from terms list
We were just freeing them, better unlink and init its nodes to catch
bugs faster if we keep dangling references to them.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch, use list_del_init() instead of list_del() ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 16:51:15 -03:00
Arnaldo Carvalho de Melo 89fee70943 perf hists: Do column alignment on the format iterator
We were doing column alignment in the format function for each cell,
returning a string padded with spaces so that when the next column is
printed the cursor is at its column alignment.

This ends up needlessly printing trailing spaces, do it at the format
iterator, that is where we know if it is needed, i.e. if there is more
columns to be printed.

This eliminates the need for triming lines when doing a dump using 'P'
in the TUI browser and also produces far saner results with things like
piping 'perf report' to 'less'.

Right now only the formatters for sym->name and the 'locked' column
(perf mem report), that are the ones that end up at the end of lines
in the default 'perf report', 'perf top' and 'perf mem report' tools,
the others will be done in a subsequent patch.

In the end the 'width' parameter for the formatters now mean, in
'printf' terms, the 'precision', where before it was the field 'width'.

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-s7iwl2gj23w92l6tibnrcqzr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 12:52:25 -03:00
Arnaldo Carvalho de Melo 37d9bb580a perf tools: Add comment explaining the repsep_snprintf function
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4j67nvlfwbnkg85b969ewnkr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 12:52:20 -03:00
Wang Nan e7ee404757 perf symbols: Fix symbols searching for module in buildid-cache
Before this patch, if a sample is triggered inside a module not in
/lib/modules/`uname -r`/, even if the module is in buildid-cache, 'perf
report' will still be unable to find the correct symbol.  For example:

  # rm -rf ~/.debug/
  # perf buildid-cache -a ./mymodule.ko
  # perf probe -m ./mymodule.ko -a get_mymodule_val
  Added new event:
    probe:get_mymodule_val (on get_mymodule_val in mymodule)

  You can now use it in all perf tools, such as:

 	perf record -e probe:get_mymodule_val -aR sleep 1

  # perf record -e probe:get_mymodule_val cat /proc/mymodule
  mymodule:3
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]

  # perf report --stdio
  [SNIP]
  #
  # Overhead  Command  Shared Object     Symbol
  # ........  .......  ................  ......................
  #
    100.00%  cat      [mymodule]        [k] 0x0000000000000001

  # perf report -vvvv --stdio
  dso__load_sym: adjusting symbol: st_value: 0 sh_addr: 0 sh_offset: 0x70
  symbol__new: get_mymodule_val 0x70-0x8a
  [SNIP]

This is caused by dso__load() -> dso__load_sym(). In dso__load(), kmod
is true only when its file is found in some well know directories. All
files loaded from buildid-cache are treated as user programs. Following
dso__load_sym() set map->pgoff incorrectly.

This patch gives kernel modules in buildid-cache a chance to adjust
value of kmod. After dso__load() get the type of symbols, if it is
buildid, check the last 3 chars of original filename against '.ko', and
adjust the value of kmod if the file is a kernel module.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454680939-24963-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 10:54:47 -03:00