2012-09-27 07:05:56 +08:00
|
|
|
perf-trace(1)
|
|
|
|
=============
|
|
|
|
|
|
|
|
NAME
|
|
|
|
----
|
|
|
|
perf-trace - strace inspired tool
|
|
|
|
|
|
|
|
SYNOPSIS
|
|
|
|
--------
|
|
|
|
[verse]
|
|
|
|
'perf trace'
|
2013-09-29 03:13:01 +08:00
|
|
|
'perf trace record'
|
2012-09-27 07:05:56 +08:00
|
|
|
|
|
|
|
DESCRIPTION
|
|
|
|
-----------
|
|
|
|
This command will show the events associated with the target, initially
|
|
|
|
syscalls, but other system events like pagefaults, task lifetime events,
|
|
|
|
scheduling events, etc.
|
|
|
|
|
2013-09-29 03:13:01 +08:00
|
|
|
This is a live mode tool in addition to working with perf.data files like
|
|
|
|
the other perf tools. Files can be generated using the 'perf record' command
|
|
|
|
but the session needs to include the raw_syscalls events (-e 'raw_syscalls:*').
|
2014-09-09 23:18:50 +08:00
|
|
|
Alternatively, 'perf trace record' can be used as a shortcut to
|
2013-09-29 03:13:01 +08:00
|
|
|
automatically include the raw_syscalls events when writing events to a file.
|
|
|
|
|
|
|
|
The following options apply to perf trace; options to perf trace record are
|
|
|
|
found in the perf record man page.
|
2012-09-27 07:05:56 +08:00
|
|
|
|
|
|
|
OPTIONS
|
|
|
|
-------
|
|
|
|
|
2013-08-21 01:15:45 +08:00
|
|
|
-a::
|
2012-09-27 07:05:56 +08:00
|
|
|
--all-cpus::
|
|
|
|
System-wide collection from all CPUs.
|
|
|
|
|
2013-08-09 23:28:31 +08:00
|
|
|
-e::
|
|
|
|
--expr::
|
2017-01-10 04:26:26 +08:00
|
|
|
--event::
|
|
|
|
List of syscalls and other perf events (tracepoints, HW cache events,
|
perf trace: Support syscall name globbing
So now we can use:
# perf trace -e pkey_*
532.784 ( 0.006 ms): pkey/16018 pkey_alloc(init_val: DISABLE_WRITE) = -1 EINVAL Invalid argument
532.795 ( 0.004 ms): pkey/16018 pkey_mprotect(start: 0x7f380d0a6000, len: 4096, prot: READ|WRITE, pkey: -1) = 0
532.801 ( 0.002 ms): pkey/16018 pkey_free(pkey: -1 ) = -1 EINVAL Invalid argument
^C[root@jouet ~]#
Or '-e epoll*', '-e *msg*', etc.
Combining syscall names with perf events, tracepoints, etc, continues to
be valid, i.e. this is possible:
# perf probe -L sys_nanosleep
<SyS_nanosleep@/home/acme/git/linux/kernel/time/hrtimer.c:0>
0 SYSCALL_DEFINE2(nanosleep, struct timespec __user *, rqtp,
struct timespec __user *, rmtp)
{
struct timespec64 tu;
5 if (get_timespec64(&tu, rqtp))
6 return -EFAULT;
if (!timespec64_valid(&tu))
9 return -EINVAL;
11 current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE;
12 current->restart_block.nanosleep.rmtp = rmtp;
13 return hrtimer_nanosleep(&tu, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
}
# perf probe my_probe="sys_nanosleep:12 rmtp"
Added new event:
probe:my_probe (on sys_nanosleep:12 with rmtp)
You can now use it in all perf tools, such as:
perf record -e probe:my_probe -aR sleep 1
#
# perf trace -e probe:my_probe/max-stack=5/,*sleep sleep 1
0.427 ( 0.003 ms): sleep/16690 nanosleep(rqtp: 0x7ffefc245090) ...
0.430 ( ): probe:my_probe:(ffffffffbd112923) rmtp=0)
sys_nanosleep ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
__nanosleep_nocancel (/usr/lib64/libc-2.25.so)
0.427 (1000.208 ms): sleep/16690 ... [continued]: nanosleep()) = 0
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-elycoi8wy6y0w9dkj7ox1mzz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-31 22:50:04 +08:00
|
|
|
etc) to show. Globbing is supported, e.g.: "epoll_*", "*msg*", etc.
|
2017-01-10 04:26:26 +08:00
|
|
|
See 'perf list' for a complete list of events.
|
2013-08-21 23:56:21 +08:00
|
|
|
Prefixing with ! shows all syscalls but the ones specified. You may
|
|
|
|
need to escape it.
|
2013-08-09 23:28:31 +08:00
|
|
|
|
perf trace: Implement --delay
In the perf wiki todo-list[1], there is an entry regarding initial-delay
and 'perf trace'; the following small patch tries to fulfill this point.
It has been generated against the branch tip/perf/core.
It has only been implemented in the "trace__run" case.
Ex.:
$ sudo strace -- ./perf trace --delay 5 sleep 1 2>&1
...
fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
ioctl(7, PERF_EVENT_IOC_ID, 0x7ffc8fd35718) = 0
ioctl(11, PERF_EVENT_IOC_SET_OUTPUT, 0x7) = 0
fcntl(11, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
ioctl(11, PERF_EVENT_IOC_ID, 0x7ffc8fd35718) = 0
write(6, "\0", 1) = 1
close(6) = 0
nanosleep({0, 5000000}, NULL) = 0 # DELAY OF 5 MS BEFORE ENABLING THE EVENTS
ioctl(3, PERF_EVENT_IOC_ENABLE, 0) = 0
ioctl(4, PERF_EVENT_IOC_ENABLE, 0) = 0
ioctl(5, PERF_EVENT_IOC_ENABLE, 0) = 0
ioctl(7, PERF_EVENT_IOC_ENABLE, 0) = 0
...
[1]: https://perf.wiki.kernel.org/index.php/Todo
Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161010054328.4028-2-alexis.berlemont@gmail.com
[ Add entry to the manpage, cut'n'pasted from stat's and record's ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-10 13:43:28 +08:00
|
|
|
-D msecs::
|
|
|
|
--delay msecs::
|
|
|
|
After starting the program, wait msecs before measuring. This is useful to
|
|
|
|
filter out the startup phase of the program, which is often very different.
|
|
|
|
|
2013-08-19 23:01:10 +08:00
|
|
|
-o::
|
|
|
|
--output=::
|
|
|
|
Output file name.
|
|
|
|
|
2012-09-27 07:05:56 +08:00
|
|
|
-p::
|
|
|
|
--pid=::
|
|
|
|
Record events on existing process ID (comma separated list).
|
|
|
|
|
2013-08-21 01:15:45 +08:00
|
|
|
-t::
|
2012-09-27 07:05:56 +08:00
|
|
|
--tid=::
|
|
|
|
Record events on existing thread ID (comma separated list).
|
|
|
|
|
2013-08-21 01:15:45 +08:00
|
|
|
-u::
|
2012-09-27 07:05:56 +08:00
|
|
|
--uid=::
|
|
|
|
Record events in threads owned by uid. Name or number.
|
|
|
|
|
perf trace: Support setting cgroups as targets
One can set a cgroup as a default cgroup to be used by all events or
set cgroups with the 'perf stat' and 'perf record' behaviour, i.e.
'-G A' will be the cgroup for events defined so far in the command line.
Here in my main machine, with a kvm instance running a rhel6 guinea pig
I have:
# ls -la /sys/fs/cgroup/perf_event/ | grep drw
drwxr-xr-x. 14 root root 360 Mar 6 12:04 ..
drwxr-xr-x. 3 root root 0 Mar 6 15:05 machine.slice
#
So I can go ahead and use that cgroup hierarchy, say lets see what
syscalls are being emitted by threads in that 'machine.slice' hierarchy
that are taking more than 100ms:
# perf trace --duration 100 -G machine.slice
0.188 (249.850 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
250.274 (249.743 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
500.224 (249.755 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
750.097 (249.934 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
1000.244 (249.780 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
1250.197 (249.796 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
1500.124 (249.859 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
1750.076 (172.900 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
902.570 (1021.116 ms): qemu-system-x8/23667 ppoll(ufds: 0x558151e03180, nfds: 74, tsp: 0x7ffc00cd0900, sigsetsize: 8) = 1
1923.825 (305.133 ms): qemu-system-x8/23667 ppoll(ufds: 0x558151e03180, nfds: 74, tsp: 0x7ffc00cd0900, sigsetsize: 8) = 1
2000.172 (229.002 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
^C #
If we look inside that cgroup hierarchy we get:
# ls -la /sys/fs/cgroup/perf_event/machine.slice/ | grep drw
drwxr-xr-x. 3 root root 0 Mar 6 15:05 .
drwxr-xr-x. 2 root root 0 Mar 6 16:16 machine-qemu\x2d2\x2drhel6.sandy.scope
#
There is just one, but lets say there were more and we would want to see
5 seconds worth of syscall summary for the threads in that cgroup:
# perf trace --summary -G machine.slice/machine-qemu\\x2d2\\x2drhel6.sandy.scope/ -a sleep 5
Summary of events:
qemu-system-x86 (23667), 143858 events, 24.2%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
ppoll 28492 4348.631 0.000 0.153 11.616 1.05%
futex 19661 140.801 0.001 0.007 2.993 3.20%
read 18440 68.084 0.001 0.004 1.653 4.33%
ioctl 5387 24.768 0.002 0.005 0.134 1.62%
CPU 0/KVM (23744), 449455 events, 75.8%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
ioctl 148364 3401.812 0.000 0.023 11.801 1.15%
futex 36131 404.127 0.001 0.011 7.377 2.63%
writev 29452 339.688 0.003 0.012 1.740 1.36%
write 11315 45.992 0.001 0.004 0.105 1.10%
#
See the documentation about how to set more than one cgroup for
different events in the same command line.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-t126jh4occqvu0xdqlcjygex@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07 03:30:51 +08:00
|
|
|
-G::
|
|
|
|
--cgroup::
|
|
|
|
Record events in threads in a cgroup.
|
|
|
|
|
|
|
|
Look for cgroups to set at the /sys/fs/cgroup/perf_event directory, then
|
|
|
|
remove the /sys/fs/cgroup/perf_event/ part and try:
|
|
|
|
|
|
|
|
perf trace -G A -e sched:*switch
|
|
|
|
|
|
|
|
Will set all raw_syscalls:sys_{enter,exit}, pgfault, vfs_getname, etc
|
|
|
|
_and_ sched:sched_switch to the 'A' cgroup, while:
|
|
|
|
|
|
|
|
perf trace -e sched:*switch -G A
|
|
|
|
|
|
|
|
will only set the sched:sched_switch event to the 'A' cgroup, all the
|
|
|
|
other events (raw_syscalls:sys_{enter,exit}, etc are left "without"
|
|
|
|
a cgroup (on the root cgroup, sys wide, etc).
|
|
|
|
|
|
|
|
Multiple cgroups:
|
|
|
|
|
|
|
|
perf trace -G A -e sched:*switch -G B
|
|
|
|
|
|
|
|
the syscall ones go to the 'A' cgroup, the sched:sched_switch goes
|
|
|
|
to the 'B' cgroup.
|
|
|
|
|
perf trace: Introduce --filter-pids
When tracing in X we get event loops due to the tracing activity, i.e.
updates to a gnome-terminal that generate syscalls for X.org, etc.
To get a more useful view of what is happening, syscall wise, system
wide, we need to filter those, like in:
# ps ax|egrep '981|2296|1519' | grep -v egrep
981 tty1 Ss+ 5:40 /usr/bin/Xorg :0 -background none ...
1519 ? Sl 2:22 /usr/bin/gnome-shell
2296 ? Sl 4:16 /usr/libexec/gnome-terminal-server
#
# trace -e write --filter-pids 981,2296,1519
0.385 ( 0.021 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 136) = 136
0.922 ( 0.014 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 140) = 140
5006.525 ( 0.029 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 136) = 136
5007.235 ( 0.023 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 140) = 140
5177.646 ( 0.018 ms): rtkit-daemon/782 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7f7eea70be88, count: 8) = 8
8314.497 ( 0.004 ms): gsd-locate-poi/2084 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7fffe96af7b0, count: 8) = 8
8314.518 ( 0.002 ms): gsd-locate-poi/2084 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7fffe96af0e0, count: 8) = 8
^C#
When this option is used the tracer pid is also filtered.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-f5qmiyy7c0uxdm21ncatpeek@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-22 03:36:52 +08:00
|
|
|
--filter-pids=::
|
|
|
|
Filter out events for these pids and for 'trace' itself (comma separated list).
|
|
|
|
|
2013-08-23 03:49:54 +08:00
|
|
|
-v::
|
|
|
|
--verbose=::
|
|
|
|
Verbosity level.
|
|
|
|
|
2012-09-27 07:05:56 +08:00
|
|
|
--no-inherit::
|
|
|
|
Child tasks do not inherit counters.
|
|
|
|
|
2013-08-21 01:15:45 +08:00
|
|
|
-m::
|
2012-09-27 07:05:56 +08:00
|
|
|
--mmap-pages=::
|
2013-09-01 18:36:13 +08:00
|
|
|
Number of mmap data pages (must be a power of two) or size
|
|
|
|
specification with appended unit character - B/K/M/G. The
|
|
|
|
size is rounded up to have nearest pages power of two value.
|
2012-09-27 07:05:56 +08:00
|
|
|
|
2013-08-21 01:15:45 +08:00
|
|
|
-C::
|
2012-09-27 07:05:56 +08:00
|
|
|
--cpu::
|
|
|
|
Collect samples only on the list of CPUs provided. Multiple CPUs can be provided as a
|
|
|
|
comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
|
|
|
|
In per-thread mode with inheritance mode on (default), Events are captured only when
|
|
|
|
the thread executes on the designated CPUs. Default is to monitor all CPUs.
|
|
|
|
|
2017-11-16 22:26:03 +08:00
|
|
|
--duration::
|
perf trace: Add duration filter
Example:
[acme@sandy linux]$ perf trace --duration 0.025 usleep 1
2.221 ( 0.958 ms): 6724 execve(arg0: 140733557168278, arg1: 140733557178768, arg2: 16134304, arg3: 140733557167840, arg4: 7955998171588342573, arg5: 6723) = -2
3.690 ( 1.443 ms): 6724 execve(arg0: 140733557168295, arg1: 140733557178768, arg2: 16134304, arg3: 140733557167840, arg4: 7955998171588342573, arg5: 6723) = 0
3.979 ( 0.048 ms): 6724 open(filename: 208733843841, flags: 0, mode: 1 ) = 3
4.071 ( 0.075 ms): 6724 open(filename: 139744419925673, flags: 0, mode: 0 ) = 3
4.318 ( 0.056 ms): 6724 nanosleep(rqtp: 140734030404608, rmtp: 0 ) = 0
[acme@sandy linux]$ perf trace --duration 0.100 usleep 1
1.143 ( 1.021 ms): 6726 execve(arg0: 140736323962279, arg1: 140736323972752, arg2: 34926752, arg3: 140736323961824, arg4: 7955998171588342573, arg5: 6725) = 0
[acme@sandy linux]$
Cherry picked from tmp.perf/trace2 branch.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/n/tip-oslw2j2958we9qf0ctra4whd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-08 20:56:00 +08:00
|
|
|
Show only events that had a duration greater than N.M ms.
|
|
|
|
|
2017-11-16 22:26:03 +08:00
|
|
|
--sched::
|
2012-10-18 04:13:12 +08:00
|
|
|
Accrue thread runtime and provide a summary at the end of the session.
|
|
|
|
|
perf trace: Show only failing syscalls
For instance:
# perf probe "vfs_getname=getname_flags:72 pathname=result->name:string"
Added new event:
probe:vfs_getname (on getname_flags:72 with pathname=result->name:string)
You can now use it in all perf tools, such as:
perf record -e probe:vfs_getname -aR sleep 1
# perf trace --failure sleep 1
0.043 ( 0.010 ms): sleep/10978 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
For reference, here are all the syscalls in this case:
# perf trace sleep 1
? ( ): sleep/10976 ... [continued]: execve()) = 0
0.027 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000
0.044 ( 0.010 ms): sleep/10976 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
0.057 ( 0.006 ms): sleep/10976 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
0.064 ( 0.002 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b370) = 0
0.067 ( 0.003 ms): sleep/10976 mmap(len: 111457, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec8615000
0.071 ( 0.001 ms): sleep/10976 close(fd: 3) = 0
0.080 ( 0.007 ms): sleep/10976 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
0.088 ( 0.002 ms): sleep/10976 read(fd: 3, buf: 0x7fffac22b538, count: 832) = 832
0.092 ( 0.001 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b3d0) = 0
0.094 ( 0.002 ms): sleep/10976 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7feec8613000
0.099 ( 0.004 ms): sleep/10976 mmap(len: 3889792, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3) = 0x7feec8057000
0.104 ( 0.007 ms): sleep/10976 mprotect(start: 0x7feec8203000, len: 2097152) = 0
0.112 ( 0.005 ms): sleep/10976 mmap(addr: 0x7feec8403000, len: 24576, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 1753088) = 0x7feec8403000
0.120 ( 0.003 ms): sleep/10976 mmap(addr: 0x7feec8409000, len: 14976, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS|FIXED) = 0x7feec8409000
0.128 ( 0.001 ms): sleep/10976 close(fd: 3) = 0
0.139 ( 0.001 ms): sleep/10976 arch_prctl(option: 4098, arg2: 140663540761856) = 0
0.186 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8403000, len: 16384, prot: READ) = 0
0.204 ( 0.003 ms): sleep/10976 mprotect(start: 0x55bdc0ec3000, len: 4096, prot: READ) = 0
0.209 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8631000, len: 4096, prot: READ) = 0
0.214 ( 0.010 ms): sleep/10976 munmap(addr: 0x7feec8615000, len: 111457) = 0
0.269 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000
0.271 ( 0.002 ms): sleep/10976 brk(brk: 0x55bdc2d25000) = 0x55bdc2d25000
0.274 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d25000
0.278 ( 0.007 ms): sleep/10976 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
0.288 ( 0.001 ms): sleep/10976 fstat(fd: 3</usr/lib/locale/locale-archive>, statbuf: 0x7feec8408aa0) = 0
0.290 ( 0.003 ms): sleep/10976 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec1488000
0.297 ( 0.001 ms): sleep/10976 close(fd: 3</usr/lib/locale/locale-archive>) = 0
0.325 (1000.193 ms): sleep/10976 nanosleep(rqtp: 0x7fffac22c0b0) = 0
1000.560 ( 0.006 ms): sleep/10976 close(fd: 1) = 0
1000.573 ( 0.005 ms): sleep/10976 close(fd: 2) = 0
1000.596 ( ): sleep/10976 exit_group()
#
And can be done systemwide, etc, with backtraces:
# perf trace --max-stack=16 --failure sleep 1
0.048 ( 0.015 ms): sleep/11092 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
__access (inlined)
dl_main (/usr/lib64/ld-2.26.so)
#
Or for some specific syscalls:
# perf trace --max-stack=16 -e openat --failure cat /tmp/rien
cat: /tmp/rien: No such file or directory
0.251 ( 0.012 ms): cat/11106 openat(dfd: CWD, filename: /tmp/rien) = -1 ENOENT No such file or directory
__libc_open64 (inlined)
main (/usr/bin/cat)
__libc_start_main (/usr/lib64/libc-2.26.so)
_start (/usr/bin/cat)
#
Look for inotify* syscalls that fail, system wide, for 2 seconds, with backtraces:
# perf trace -a --max-stack=16 --failure -e inotify* sleep 2
819.165 ( 0.058 ms): gmain/1724 inotify_add_watch(fd: 8<anon_inode:inotify>, pathname: /home/acme/~, mask: 16789454) = -1 ENOENT No such file or directory
__GI_inotify_add_watch (inlined)
_ik_watch (/usr/lib64/libgio-2.0.so.0.5400.3)
_ip_start_watching (/usr/lib64/libgio-2.0.so.0.5400.3)
im_scan_missing (/usr/lib64/libgio-2.0.so.0.5400.3)
g_timeout_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3)
g_main_context_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3)
g_main_context_iterate.isra.23 (/usr/lib64/libglib-2.0.so.0.5400.3)
g_main_context_iteration (/usr/lib64/libglib-2.0.so.0.5400.3)
glib_worker_main (/usr/lib64/libglib-2.0.so.0.5400.3)
g_thread_proxy (/usr/lib64/libglib-2.0.so.0.5400.3)
start_thread (/usr/lib64/libpthread-2.26.so)
__GI___clone (inlined)
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-8f7d3mngaxvi7tlzloz3n7cs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-29 23:22:59 +08:00
|
|
|
--failure::
|
|
|
|
Show only syscalls that failed, i.e. that returned < 0.
|
|
|
|
|
2017-11-16 22:26:03 +08:00
|
|
|
-i::
|
|
|
|
--input::
|
2013-08-29 12:29:52 +08:00
|
|
|
Process events from a given perf data file.
|
|
|
|
|
2017-11-16 22:26:03 +08:00
|
|
|
-T::
|
|
|
|
--time::
|
2013-09-05 02:37:43 +08:00
|
|
|
Print full timestamp rather time relative to first sample.
|
|
|
|
|
perf trace: Add option to show process COMM
Enabled by default, disable with --no-comm, e.g.:
181.821 (0.001 ms): deja-dup-monit/10784 recvmsg(fd: 8, msg: 0x7fff4342baf0, flags: PEEK|TRUNC|CMSG_CLOEXEC ) = 20
181.824 (0.001 ms): deja-dup-monit/10784 geteuid( ) = 1000
181.825 (0.001 ms): deja-dup-monit/10784 getegid( ) = 1000
181.834 (0.002 ms): deja-dup-monit/10784 recvmsg(fd: 8, msg: 0x7fff4342baf0, flags: CMSG_CLOEXEC ) = 20
181.836 (0.001 ms): deja-dup-monit/10784 geteuid( ) = 1000
181.838 (0.001 ms): deja-dup-monit/10784 getegid( ) = 1000
181.705 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: PEEK|TRUNC|CMSG_CLOEXEC) = 1256
181.710 (0.002 ms): evolution-addr/10924 geteuid( ) = 1000
181.712 (0.001 ms): evolution-addr/10924 getegid( ) = 1000
181.727 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: CMSG_CLOEXEC ) = 1256
181.731 (0.001 ms): evolution-addr/10924 geteuid( ) = 1000
181.734 (0.001 ms): evolution-addr/10924 getegid( ) = 1000
181.908 (0.002 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: PEEK|TRUNC|CMSG_CLOEXEC) = 20
181.913 (0.001 ms): evolution-addr/10924 geteuid( ) = 1000
181.915 (0.001 ms): evolution-addr/10924 getegid( ) = 1000
181.930 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: CMSG_CLOEXEC ) = 20
181.934 (0.001 ms): evolution-addr/10924 geteuid( ) = 1000
181.937 (0.001 ms): evolution-addr/10924 getegid( ) = 1000
220.718 (0.010 ms): at-spi2-regist/10715 sendmsg(fd: 3, msg: 0x7fffdb8756c0, flags: NOSIGNAL ) = 200
220.741 (0.000 ms): dbus-daemon/10711 ... [continued]: epoll_wait()) = 1
220.759 (0.004 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC ) = 200
220.780 (0.002 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC ) = 200
220.788 (0.001 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC ) = -1 EAGAIN Resource temporarily unavailable
220.760 (0.004 ms): at-spi2-regist/10715 sendmsg(fd: 3, msg: 0x7fffdb8756c0, flags: NOSIGNAL ) = 200
220.771 (0.023 ms): perf/26347 open(filename: 0xf2e780, mode: 15918976 ) = 19
220.850 (0.002 ms): perf/26347 close(fd: 19 ) = 0
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6be5jvnkdzjptdrebfn5263n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-09-12 23:35:21 +08:00
|
|
|
--comm::
|
|
|
|
Show process COMM right beside its ID, on by default, disable with --no-comm.
|
|
|
|
|
2013-11-13 00:31:15 +08:00
|
|
|
-s::
|
2013-10-09 11:26:53 +08:00
|
|
|
--summary::
|
2013-11-13 00:31:15 +08:00
|
|
|
Show only a summary of syscalls by thread with min, max, and average times
|
|
|
|
(in msec) and relative stddev.
|
|
|
|
|
|
|
|
-S::
|
|
|
|
--with-summary::
|
|
|
|
Show all syscalls followed by a summary by thread with min, max, and
|
|
|
|
average times (in msec) and relative stddev.
|
2013-10-09 11:26:53 +08:00
|
|
|
|
2013-09-28 05:06:19 +08:00
|
|
|
--tool_stats::
|
|
|
|
Show tool stats such as number of times fd->pathname was discovered thru
|
|
|
|
hooking the open syscall return + vfs_getname or via reading /proc/pid/fd, etc.
|
|
|
|
|
2017-11-16 22:26:03 +08:00
|
|
|
-f::
|
|
|
|
--force::
|
|
|
|
Don't complain, do it.
|
|
|
|
|
2014-06-27 00:14:25 +08:00
|
|
|
-F=[all|min|maj]::
|
|
|
|
--pf=[all|min|maj]::
|
|
|
|
Trace pagefaults. Optionally, you can specify whether you want minor,
|
|
|
|
major or all pagefaults. Default value is maj.
|
|
|
|
|
2014-06-27 00:14:28 +08:00
|
|
|
--syscalls::
|
2017-04-13 14:02:12 +08:00
|
|
|
Trace system calls. This options is enabled by default, disable with
|
|
|
|
--no-syscalls.
|
2014-06-27 00:14:28 +08:00
|
|
|
|
2016-04-08 19:34:15 +08:00
|
|
|
--call-graph [mode,type,min[,limit],order[,key][,branch]]::
|
|
|
|
Setup and enable call-graph (stack chain/backtrace) recording.
|
|
|
|
See `--call-graph` section in perf-record and perf-report
|
|
|
|
man pages for details. The ones that are most useful in 'perf trace'
|
|
|
|
are 'dwarf' and 'lbr', where available, try: 'perf trace --call-graph dwarf'.
|
|
|
|
|
2016-04-16 04:52:34 +08:00
|
|
|
Using this will, for the root user, bump the value of --mmap-pages to 4
|
|
|
|
times the maximum for non-root users, based on the kernel.perf_event_mlock_kb
|
|
|
|
sysctl. This is done only if the user doesn't specify a --mmap-pages value.
|
|
|
|
|
2016-04-12 02:49:11 +08:00
|
|
|
--kernel-syscall-graph::
|
|
|
|
Show the kernel callchains on the syscall exit path.
|
|
|
|
|
perf trace: Introduce --max-events
Allow stopping tracing after a number of events take place, considering
strace-like syscalls formatting as one event per enter/exit pair or when
in a multi-process tracing session a syscall is interrupted and printed
ending with '...'.
Examples included in the documentation:
Trace the first 4 open, openat or open_by_handle_at syscalls (in the future more syscalls may match here):
$ perf trace -e open* --max-events 4
[root@jouet perf]# trace -e open* --max-events 4
2272.992 ( 0.037 ms): gnome-shell/1370 openat(dfd: CWD, filename: /proc/self/stat) = 31
2277.481 ( 0.139 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65
3026.398 ( 0.076 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65
4294.665 ( 0.015 ms): sed/15879 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
$
Trace the first minor page fault when running a workload:
# perf trace -F min --max-stack=7 --max-events 1 sleep 1
0.000 ( 0.000 ms): sleep/18006 minfault [__clear_user+0x1a] => 0x5626efa56080 (?k)
__clear_user ([kernel.kallsyms])
load_elf_binary ([kernel.kallsyms])
search_binary_handler ([kernel.kallsyms])
__do_execve_file.isra.33 ([kernel.kallsyms])
__x64_sys_execve ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64 ([kernel.kallsyms])
#
Trace the next min page page fault to take place on the first CPU:
# perf trace -F min --call-graph=dwarf --max-events 1 --cpu 0
0.000 ( 0.000 ms): Web Content/17136 minfault [js::gc::Chunk::fetchNextDecommittedArena+0x4b] => 0x7fbe6181b000 (?.)
js::gc::FreeSpan::initAsEmpty (inlined)
js::gc::Arena::setAsNotAllocated (inlined)
js::gc::Chunk::fetchNextDecommittedArena (/usr/lib64/firefox/libxul.so)
js::gc::Chunk::allocateArena (/usr/lib64/firefox/libxul.so)
js::gc::GCRuntime::allocateArena (/usr/lib64/firefox/libxul.so)
js::gc::ArenaLists::allocateFromArena (/usr/lib64/firefox/libxul.so)
js::gc::GCRuntime::tryNewTenuredThing<JSString, (js::AllowGC)1> (inlined)
js::AllocateString<JSString, (js::AllowGC)1> (/usr/lib64/firefox/libxul.so)
js::Allocate<JSThinInlineString, (js::AllowGC)1> (inlined)
JSThinInlineString::new_<(js::AllowGC)1> (inlined)
AllocateInlineString<(js::AllowGC)1, unsigned char> (inlined)
js::ConcatStrings<(js::AllowGC)1> (/usr/lib64/firefox/libxul.so)
[0x18b26e6bc2bd] (/tmp/perf-17136.map)
Tracing the next four ext4 operations on a specific CPU:
# perf trace -e ext4:*/call-graph=fp/ --max-events 4 --cpu 3
0.000 mutt/3849 ext4:ext4_es_lookup_extent_enter:dev 253,2 ino 57277 lblk 0
ext4_es_lookup_extent ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
0.097 mutt/3849 ext4:ext4_es_lookup_extent_exit:dev 253,2 ino 57277 found 0 [0/0) 0
ext4_es_lookup_extent ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
0.141 mutt/3849 ext4:ext4_ext_map_blocks_enter:dev 253,2 ino 57277 lblk 0 len 1 flags
ext4_ext_map_blocks ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
0.184 mutt/3849 ext4:ext4_ext_load_extent:dev 253,2 ino 57277 lblk 1516511 pblk 18446744071750013657
__read_extent_tree_block ([kernel.kallsyms])
__read_extent_tree_block ([kernel.kallsyms])
ext4_find_extent ([kernel.kallsyms])
ext4_ext_map_blocks ([kernel.kallsyms])
ext4_map_blocks ([kernel.kallsyms])
ext4_mpage_readpages ([kernel.kallsyms])
read_pages ([kernel.kallsyms])
__do_page_cache_readahead ([kernel.kallsyms])
ondemand_readahead ([kernel.kallsyms])
generic_file_read_iter ([kernel.kallsyms])
__vfs_read ([kernel.kallsyms])
vfs_read ([kernel.kallsyms])
ksys_read ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64 ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rudá Moura <ruda.moura@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-sweh107bs7ol5bzls0m4tqdz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-19 03:38:27 +08:00
|
|
|
--max-events=N::
|
|
|
|
Stop after processing N events. Note that strace-like events are considered
|
|
|
|
only at exit time or when a syscall is interrupted, i.e. in those cases this
|
|
|
|
option is equivalent to the number of lines printed.
|
|
|
|
|
2016-04-15 05:29:08 +08:00
|
|
|
--max-stack::
|
|
|
|
Set the stack depth limit when parsing the callchain, anything
|
|
|
|
beyond the specified depth will be ignored. Note that at this point
|
|
|
|
this is just about the presentation part, i.e. the kernel is still
|
|
|
|
not limiting, the overhead of callchains needs to be set via the
|
|
|
|
knobs in --call-graph dwarf.
|
|
|
|
|
2016-04-16 03:41:19 +08:00
|
|
|
Implies '--call-graph dwarf' when --call-graph not present on the
|
|
|
|
command line, on systems where DWARF unwinding was built in.
|
|
|
|
|
2016-05-19 22:34:06 +08:00
|
|
|
Default: /proc/sys/kernel/perf_event_max_stack when present for
|
|
|
|
live sessions (without --input/-i), 127 otherwise.
|
2016-04-15 05:29:08 +08:00
|
|
|
|
perf trace: Introduce --min-stack filter
Counterpart to --max-stack, to help focusing on deeply nested calls. Can
be combined with --duration, etc.
E.g.:
System wide syscall tracing looking for call stacks longer than 66:
# trace --mmap-pages 32768 --filter-pid 2711 --call-graph dwarf,16384 --min-stack 66
Or more compactly:
# trace -m 32768 --filt 2711 --call dwarf,16384 --min-st 66
363.027 ( 0.002 ms): gnome-shell/2287 poll(ufds: 0x7ffc5ea24230, nfds: 1, timeout_msecs: 4294967295 ) = 1
[0xf6fdd] (/usr/lib64/libc-2.22.so)
_xcb_conn_wait+0x92 (/usr/lib64/libxcb.so.1.1.0)
_xcb_out_send+0x4d (/usr/lib64/libxcb.so.1.1.0)
xcb_writev+0x45 (/usr/lib64/libxcb.so.1.1.0)
_XSend+0x19e (/usr/lib64/libX11.so.6.3.0)
_XReply+0x82 (/usr/lib64/libX11.so.6.3.0)
XSync+0x4d (/usr/lib64/libX11.so.6.3.0)
dri3_bind_tex_image+0x42 (/usr/lib64/libGL.so.1.2.0)
_cogl_winsys_texture_pixmap_x11_update+0x117 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_update+0x67 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_pre_paint+0x13 (/usr/lib64/libcogl.so.20.4.1)
_cogl_pipeline_layer_pre_paint+0x5e (/usr/lib64/libcogl.so.20.4.1)
_cogl_rectangles_validate_layer_cb+0x1b (/usr/lib64/libcogl.so.20.4.1)
cogl_pipeline_foreach_layer+0xbe (/usr/lib64/libcogl.so.20.4.1)
_cogl_framebuffer_draw_multitextured_rectangles+0x77 (/usr/lib64/libcogl.so.20.4.1)
cogl_framebuffer_draw_multitextured_rectangle+0x51 (/usr/lib64/libcogl.so.20.4.1)
paint_clipped_rectangle+0xb6 (/usr/lib64/libmutter.so.0.0.0)
meta_shaped_texture_paint+0x3e3 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_actor_paint+0x14b (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_group_paint+0x19f (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
[0x3d970] (/usr/lib64/gnome-shell/libgnome-shell.so)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_paint+0x3a (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_stage_paint+0x45 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0x164 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_paint+0x17b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_cogl_redraw+0x496 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_update+0x117 (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_clock_dispatch+0x169 (/usr/lib64/libclutter-1.0.so.0.2400.2)
g_main_context_dispatch+0x15a (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_context_iterate.isra.29+0x1e0 (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2)
meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0)
main+0x3f7 (/usr/bin/gnome-shell)
__libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
[0x2909] (/usr/bin/gnome-shell)
363.038 ( 0.006 ms): gnome-shell/2287 writev(fd: 5<socket:[32540]>, vec: 0x7ffc5ea243a0, vlen: 3 ) = 4
__GI___writev+0x2d (/usr/lib64/libc-2.22.so)
_xcb_conn_wait+0x359 (/usr/lib64/libxcb.so.1.1.0)
_xcb_out_send+0x4d (/usr/lib64/libxcb.so.1.1.0)
xcb_writev+0x45 (/usr/lib64/libxcb.so.1.1.0)
_XSend+0x19e (/usr/lib64/libX11.so.6.3.0)
_XReply+0x82 (/usr/lib64/libX11.so.6.3.0)
XSync+0x4d (/usr/lib64/libX11.so.6.3.0)
dri3_bind_tex_image+0x42 (/usr/lib64/libGL.so.1.2.0)
_cogl_winsys_texture_pixmap_x11_update+0x117 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_update+0x67 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_pre_paint+0x13 (/usr/lib64/libcogl.so.20.4.1)
_cogl_pipeline_layer_pre_paint+0x5e (/usr/lib64/libcogl.so.20.4.1)
_cogl_rectangles_validate_layer_cb+0x1b (/usr/lib64/libcogl.so.20.4.1)
cogl_pipeline_foreach_layer+0xbe (/usr/lib64/libcogl.so.20.4.1)
_cogl_framebuffer_draw_multitextured_rectangles+0x77 (/usr/lib64/libcogl.so.20.4.1)
cogl_framebuffer_draw_multitextured_rectangle+0x51 (/usr/lib64/libcogl.so.20.4.1)
paint_clipped_rectangle+0xb6 (/usr/lib64/libmutter.so.0.0.0)
meta_shaped_texture_paint+0x3e3 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_actor_paint+0x14b (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_group_paint+0x19f (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
[0x3d970] (/usr/lib64/gnome-shell/libgnome-shell.so)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_paint+0x3a (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_stage_paint+0x45 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0x164 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_paint+0x17b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_cogl_redraw+0x496 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_update+0x117 (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_clock_dispatch+0x169 (/usr/lib64/libclutter-1.0.so.0.2400.2)
g_main_context_dispatch+0x15a (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_context_iterate.isra.29+0x1e0 (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2)
meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0)
main+0x3f7 (/usr/bin/gnome-shell)
__libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
[0x2909] (/usr/bin/gnome-shell)
363.086 ( 0.042 ms): gnome-shell/2287 poll(ufds: 0x7ffc5ea24250, nfds: 1, timeout_msecs: 4294967295 ) = 1
[0xf6fdd] (/usr/lib64/libc-2.22.so)
_xcb_conn_wait+0x92 (/usr/lib64/libxcb.so.1.1.0)
wait_for_reply+0xb7 (/usr/lib64/libxcb.so.1.1.0)
xcb_wait_for_reply+0x61 (/usr/lib64/libxcb.so.1.1.0)
_XReply+0x127 (/usr/lib64/libX11.so.6.3.0)
XSync+0x4d (/usr/lib64/libX11.so.6.3.0)
dri3_bind_tex_image+0x42 (/usr/lib64/libGL.so.1.2.0)
_cogl_winsys_texture_pixmap_x11_update+0x117 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_update+0x67 (/usr/lib64/libcogl.so.20.4.1)
_cogl_texture_pixmap_x11_pre_paint+0x13 (/usr/lib64/libcogl.so.20.4.1)
_cogl_pipeline_layer_pre_paint+0x5e (/usr/lib64/libcogl.so.20.4.1)
_cogl_rectangles_validate_layer_cb+0x1b (/usr/lib64/libcogl.so.20.4.1)
cogl_pipeline_foreach_layer+0xbe (/usr/lib64/libcogl.so.20.4.1)
_cogl_framebuffer_draw_multitextured_rectangles+0x77 (/usr/lib64/libcogl.so.20.4.1)
cogl_framebuffer_draw_multitextured_rectangle+0x51 (/usr/lib64/libcogl.so.20.4.1)
paint_clipped_rectangle+0xb6 (/usr/lib64/libmutter.so.0.0.0)
meta_shaped_texture_paint+0x3e3 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_actor_paint+0x14b (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_window_group_paint+0x19f (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
[0x3d970] (/usr/lib64/gnome-shell/libgnome-shell.so)
_g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_paint+0x3a (/usr/lib64/libclutter-1.0.so.0.2400.2)
meta_stage_paint+0x45 (/usr/lib64/libmutter.so.0.0.0)
_g_closure_invoke_va+0x164 (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2)
g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2)
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_paint+0x17b (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_stage_cogl_redraw+0x496 (/usr/lib64/libclutter-1.0.so.0.2400.2)
_clutter_stage_do_update+0x117 (/usr/lib64/libclutter-1.0.so.0.2400.2)
clutter_clock_dispatch+0x169 (/usr/lib64/libclutter-1.0.so.0.2400.2)
g_main_context_dispatch+0x15a (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_context_iterate.isra.29+0x1e0 (/usr/lib64/libglib-2.0.so.0.4600.2)
g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2)
meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0)
main+0x3f7 (/usr/bin/gnome-shell)
__libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
[0x2909] (/usr/bin/gnome-shell)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jncuxju9fibq2rl6olhqwjw6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-15 22:10:31 +08:00
|
|
|
--min-stack::
|
|
|
|
Set the stack depth limit when parsing the callchain, anything
|
|
|
|
below the specified depth will be ignored. Disabled by default.
|
|
|
|
|
2016-04-16 03:41:19 +08:00
|
|
|
Implies '--call-graph dwarf' when --call-graph not present on the
|
|
|
|
command line, on systems where DWARF unwinding was built in.
|
|
|
|
|
2018-01-22 22:38:54 +08:00
|
|
|
--print-sample::
|
|
|
|
Print the PERF_RECORD_SAMPLE PERF_SAMPLE_ info for the
|
|
|
|
raw_syscalls:sys_{enter,exit} tracepoints, for debugging.
|
|
|
|
|
2015-06-17 21:51:11 +08:00
|
|
|
--proc-map-timeout::
|
|
|
|
When processing pre-existing threads /proc/XXX/mmap, it may take a long time,
|
|
|
|
because the file may be huge. A time out is needed in such cases.
|
|
|
|
This option sets the time out limit. The default value is 500 ms.
|
|
|
|
|
2014-06-27 00:14:25 +08:00
|
|
|
PAGEFAULTS
|
|
|
|
----------
|
|
|
|
|
|
|
|
When tracing pagefaults, the format of the trace is as follows:
|
|
|
|
|
|
|
|
<min|maj>fault [<ip.symbol>+<ip.offset>] => <addr.dso@addr.offset> (<map type><addr level>).
|
|
|
|
|
|
|
|
- min/maj indicates whether fault event is minor or major;
|
|
|
|
- ip.symbol shows symbol for instruction pointer (the code that generated the
|
|
|
|
fault); if no debug symbols available, perf trace will print raw IP;
|
|
|
|
- addr.dso shows DSO for the faulted address;
|
|
|
|
- map type is either 'd' for non-executable maps or 'x' for executable maps;
|
|
|
|
- addr level is either 'k' for kernel dso or '.' for user dso.
|
|
|
|
|
|
|
|
For symbols resolution you may need to install debugging symbols.
|
|
|
|
|
|
|
|
Please be aware that duration is currently always 0 and doesn't reflect actual
|
|
|
|
time it took for fault to be handled!
|
|
|
|
|
|
|
|
When --verbose specified, perf trace tries to print all available information
|
|
|
|
for both IP and fault address in the form of dso@symbol+offset.
|
|
|
|
|
|
|
|
EXAMPLES
|
|
|
|
--------
|
|
|
|
|
2014-06-27 00:14:28 +08:00
|
|
|
Trace only major pagefaults:
|
|
|
|
|
|
|
|
$ perf trace --no-syscalls -F
|
|
|
|
|
2014-06-27 00:14:25 +08:00
|
|
|
Trace syscalls, major and minor pagefaults:
|
|
|
|
|
|
|
|
$ perf trace -F all
|
|
|
|
|
|
|
|
1416.547 ( 0.000 ms): python/20235 majfault [CRYPTO_push_info_+0x0] => /lib/x86_64-linux-gnu/libcrypto.so.1.0.0@0x61be0 (x.)
|
|
|
|
|
|
|
|
As you can see, there was major pagefault in python process, from
|
|
|
|
CRYPTO_push_info_ routine which faulted somewhere in libcrypto.so.
|
|
|
|
|
perf trace: Introduce --max-events
Allow stopping tracing after a number of events take place, considering
strace-like syscalls formatting as one event per enter/exit pair or when
in a multi-process tracing session a syscall is interrupted and printed
ending with '...'.
Examples included in the documentation:
Trace the first 4 open, openat or open_by_handle_at syscalls (in the future more syscalls may match here):
$ perf trace -e open* --max-events 4
[root@jouet perf]# trace -e open* --max-events 4
2272.992 ( 0.037 ms): gnome-shell/1370 openat(dfd: CWD, filename: /proc/self/stat) = 31
2277.481 ( 0.139 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65
3026.398 ( 0.076 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65
4294.665 ( 0.015 ms): sed/15879 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
$
Trace the first minor page fault when running a workload:
# perf trace -F min --max-stack=7 --max-events 1 sleep 1
0.000 ( 0.000 ms): sleep/18006 minfault [__clear_user+0x1a] => 0x5626efa56080 (?k)
__clear_user ([kernel.kallsyms])
load_elf_binary ([kernel.kallsyms])
search_binary_handler ([kernel.kallsyms])
__do_execve_file.isra.33 ([kernel.kallsyms])
__x64_sys_execve ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64 ([kernel.kallsyms])
#
Trace the next min page page fault to take place on the first CPU:
# perf trace -F min --call-graph=dwarf --max-events 1 --cpu 0
0.000 ( 0.000 ms): Web Content/17136 minfault [js::gc::Chunk::fetchNextDecommittedArena+0x4b] => 0x7fbe6181b000 (?.)
js::gc::FreeSpan::initAsEmpty (inlined)
js::gc::Arena::setAsNotAllocated (inlined)
js::gc::Chunk::fetchNextDecommittedArena (/usr/lib64/firefox/libxul.so)
js::gc::Chunk::allocateArena (/usr/lib64/firefox/libxul.so)
js::gc::GCRuntime::allocateArena (/usr/lib64/firefox/libxul.so)
js::gc::ArenaLists::allocateFromArena (/usr/lib64/firefox/libxul.so)
js::gc::GCRuntime::tryNewTenuredThing<JSString, (js::AllowGC)1> (inlined)
js::AllocateString<JSString, (js::AllowGC)1> (/usr/lib64/firefox/libxul.so)
js::Allocate<JSThinInlineString, (js::AllowGC)1> (inlined)
JSThinInlineString::new_<(js::AllowGC)1> (inlined)
AllocateInlineString<(js::AllowGC)1, unsigned char> (inlined)
js::ConcatStrings<(js::AllowGC)1> (/usr/lib64/firefox/libxul.so)
[0x18b26e6bc2bd] (/tmp/perf-17136.map)
Tracing the next four ext4 operations on a specific CPU:
# perf trace -e ext4:*/call-graph=fp/ --max-events 4 --cpu 3
0.000 mutt/3849 ext4:ext4_es_lookup_extent_enter:dev 253,2 ino 57277 lblk 0
ext4_es_lookup_extent ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
0.097 mutt/3849 ext4:ext4_es_lookup_extent_exit:dev 253,2 ino 57277 found 0 [0/0) 0
ext4_es_lookup_extent ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
0.141 mutt/3849 ext4:ext4_ext_map_blocks_enter:dev 253,2 ino 57277 lblk 0 len 1 flags
ext4_ext_map_blocks ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
0.184 mutt/3849 ext4:ext4_ext_load_extent:dev 253,2 ino 57277 lblk 1516511 pblk 18446744071750013657
__read_extent_tree_block ([kernel.kallsyms])
__read_extent_tree_block ([kernel.kallsyms])
ext4_find_extent ([kernel.kallsyms])
ext4_ext_map_blocks ([kernel.kallsyms])
ext4_map_blocks ([kernel.kallsyms])
ext4_mpage_readpages ([kernel.kallsyms])
read_pages ([kernel.kallsyms])
__do_page_cache_readahead ([kernel.kallsyms])
ondemand_readahead ([kernel.kallsyms])
generic_file_read_iter ([kernel.kallsyms])
__vfs_read ([kernel.kallsyms])
vfs_read ([kernel.kallsyms])
ksys_read ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64 ([kernel.kallsyms])
read (/usr/lib64/libc-2.26.so)
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rudá Moura <ruda.moura@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-sweh107bs7ol5bzls0m4tqdz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-19 03:38:27 +08:00
|
|
|
Trace the first 4 open, openat or open_by_handle_at syscalls (in the future more syscalls may match here):
|
|
|
|
|
|
|
|
$ perf trace -e open* --max-events 4
|
|
|
|
[root@jouet perf]# trace -e open* --max-events 4
|
|
|
|
2272.992 ( 0.037 ms): gnome-shell/1370 openat(dfd: CWD, filename: /proc/self/stat) = 31
|
|
|
|
2277.481 ( 0.139 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65
|
|
|
|
3026.398 ( 0.076 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65
|
|
|
|
4294.665 ( 0.015 ms): sed/15879 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
|
|
|
|
$
|
|
|
|
|
|
|
|
Trace the first minor page fault when running a workload:
|
|
|
|
|
|
|
|
# perf trace -F min --max-stack=7 --max-events 1 sleep 1
|
|
|
|
0.000 ( 0.000 ms): sleep/18006 minfault [__clear_user+0x1a] => 0x5626efa56080 (?k)
|
|
|
|
__clear_user ([kernel.kallsyms])
|
|
|
|
load_elf_binary ([kernel.kallsyms])
|
|
|
|
search_binary_handler ([kernel.kallsyms])
|
|
|
|
__do_execve_file.isra.33 ([kernel.kallsyms])
|
|
|
|
__x64_sys_execve ([kernel.kallsyms])
|
|
|
|
do_syscall_64 ([kernel.kallsyms])
|
|
|
|
entry_SYSCALL_64 ([kernel.kallsyms])
|
|
|
|
#
|
|
|
|
|
|
|
|
Trace the next min page page fault to take place on the first CPU:
|
|
|
|
|
|
|
|
# perf trace -F min --call-graph=dwarf --max-events 1 --cpu 0
|
|
|
|
0.000 ( 0.000 ms): Web Content/17136 minfault [js::gc::Chunk::fetchNextDecommittedArena+0x4b] => 0x7fbe6181b000 (?.)
|
|
|
|
js::gc::FreeSpan::initAsEmpty (inlined)
|
|
|
|
js::gc::Arena::setAsNotAllocated (inlined)
|
|
|
|
js::gc::Chunk::fetchNextDecommittedArena (/usr/lib64/firefox/libxul.so)
|
|
|
|
js::gc::Chunk::allocateArena (/usr/lib64/firefox/libxul.so)
|
|
|
|
js::gc::GCRuntime::allocateArena (/usr/lib64/firefox/libxul.so)
|
|
|
|
js::gc::ArenaLists::allocateFromArena (/usr/lib64/firefox/libxul.so)
|
|
|
|
js::gc::GCRuntime::tryNewTenuredThing<JSString, (js::AllowGC)1> (inlined)
|
|
|
|
js::AllocateString<JSString, (js::AllowGC)1> (/usr/lib64/firefox/libxul.so)
|
|
|
|
js::Allocate<JSThinInlineString, (js::AllowGC)1> (inlined)
|
|
|
|
JSThinInlineString::new_<(js::AllowGC)1> (inlined)
|
|
|
|
AllocateInlineString<(js::AllowGC)1, unsigned char> (inlined)
|
|
|
|
js::ConcatStrings<(js::AllowGC)1> (/usr/lib64/firefox/libxul.so)
|
|
|
|
[0x18b26e6bc2bd] (/tmp/perf-17136.map)
|
|
|
|
#
|
|
|
|
|
perf trace: Introduce per-event maximum number of events property
Call it 'nr', as in this context it should be expressive enough, i.e.:
# perf trace -e sched:*waking/nr=8,call-graph=fp/
0.000 :0/0 sched:sched_waking:comm=rcu_sched pid=10 prio=120 target_cpu=001
try_to_wake_up ([kernel.kallsyms])
sched_clock ([kernel.kallsyms])
3.933 :0/0 sched:sched_waking:comm=rcu_sched pid=10 prio=120 target_cpu=001
try_to_wake_up ([kernel.kallsyms])
sched_clock ([kernel.kallsyms])
3.970 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
try_to_wake_up ([kernel.kallsyms])
__libc_write (/usr/lib64/libpthread-2.26.so)
20.069 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
try_to_wake_up ([kernel.kallsyms])
__libc_write (/usr/lib64/libpthread-2.26.so)
37.170 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
try_to_wake_up ([kernel.kallsyms])
__libc_write (/usr/lib64/libpthread-2.26.so)
53.267 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
try_to_wake_up ([kernel.kallsyms])
__libc_write (/usr/lib64/libpthread-2.26.so)
70.365 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
try_to_wake_up ([kernel.kallsyms])
__libc_write (/usr/lib64/libpthread-2.26.so)
75.781 Web Content/3649 sched:sched_waking:comm=JS Helper pid=3670 prio=120 target_cpu=000
try_to_wake_up ([kernel.kallsyms])
try_to_wake_up ([kernel.kallsyms])
wake_up_q ([kernel.kallsyms])
futex_wake ([kernel.kallsyms])
do_futex ([kernel.kallsyms])
__x64_sys_futex ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
pthread_cond_signal@@GLIBC_2.3.2 (/usr/lib64/libpthread-2.26.so)
#
# perf trace -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/
0.000 :0/0 sched:sched_switch:swapper/0:0 [120] S ==> trace:3367 [120]
0.046 :0/0 sched:sched_switch:swapper/1:0 [120] S ==> kworker/u16:58:2722 [120]
570.670 irq/50-iwlwifi/680 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051ef00 len=66
__dev_queue_xmit ([kernel.kallsyms])
1106.141 jbd2/dm-0-8/476 block:block_plug:[jbd2/dm-0-8]
1106.175 jbd2/dm-0-8/476 block:block_unplug:[jbd2/dm-0-8] 1
1618.088 kworker/u16:30/2694 block:block_plug:[kworker/u16:30]
1810.000 :0/0 net:net_dev_queue:dev=vnet0 skbaddr=0xffff93498051ef00 len=52
__dev_queue_xmit ([kernel.kallsyms])
3857.974 :0/0 net:net_dev_queue:dev=vnet0 skbaddr=0xffff93498051f900 len=52
__dev_queue_xmit ([kernel.kallsyms])
4790.277 jbd2/dm-2-8/748 block:block_plug:[jbd2/dm-2-8]
4790.448 jbd2/dm-2-8/748 block:block_plug:[jbd2/dm-2-8]
#
The global --max-events has precendence:
# trace --max-events 3 -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/
0.000 :0/0 sched:sched_switch:swapper/0:0 [120] S ==> qemu-system-x86:2252 [120]
0.029 qemu-system-x8/2252 sched:sched_switch:qemu-system-x86:2252 [120] D ==> swapper/0:0 [120]
58.047 DNS Res~er #14/31661 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff9346966af100 len=84
__dev_queue_xmit ([kernel.kallsyms])
__libc_send (/usr/lib64/libpthread-2.26.so)
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-s4jswltvh660ughvg9nwngah@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-23 01:14:16 +08:00
|
|
|
Trace the next two sched:sched_switch events, four block:*_plug events, the
|
|
|
|
next block:*_unplug and the next three net:*dev_queue events, this last one
|
|
|
|
with a backtrace of at most 16 entries, system wide:
|
|
|
|
|
|
|
|
# perf trace -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/
|
|
|
|
0.000 :0/0 sched:sched_switch:swapper/2:0 [120] S ==> rcu_sched:10 [120]
|
|
|
|
0.015 rcu_sched/10 sched:sched_switch:rcu_sched:10 [120] R ==> swapper/2:0 [120]
|
|
|
|
254.198 irq/50-iwlwifi/680 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051f600 len=66
|
|
|
|
__dev_queue_xmit ([kernel.kallsyms])
|
|
|
|
273.977 :0/0 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051f600 len=78
|
|
|
|
__dev_queue_xmit ([kernel.kallsyms])
|
|
|
|
274.007 :0/0 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051ff00 len=78
|
|
|
|
__dev_queue_xmit ([kernel.kallsyms])
|
|
|
|
2930.140 kworker/u16:58/2722 block:block_plug:[kworker/u16:58]
|
|
|
|
2930.162 kworker/u16:58/2722 block:block_unplug:[kworker/u16:58] 1
|
|
|
|
4466.094 jbd2/dm-2-8/748 block:block_plug:[jbd2/dm-2-8]
|
|
|
|
8050.123 kworker/u16:30/2694 block:block_plug:[kworker/u16:30]
|
|
|
|
8050.271 kworker/u16:30/2694 block:block_plug:[kworker/u16:30]
|
|
|
|
#
|
|
|
|
|
2012-09-27 07:05:56 +08:00
|
|
|
SEE ALSO
|
|
|
|
--------
|
|
|
|
linkperf:perf-record[1], linkperf:perf-script[1]
|