OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Arnaldo Carvalho de Melo	e4b00e930b	perf trace: Add "sendfile64" alias to the "sendfile" syscall We were looking in tracefs for: /sys/kernel/debug/tracing/events/syscalls/sys_enter_sendfile/format when what is there is just /sys/kernel/debug/tracing/events/syscalls/sys_enter_sendfile/format Its the same id, 40 in x86_64, so just add an alias and let the existing logic take care of that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-km2hmg7hru6u4pawi5fi903q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	ad4153f964	perf trace: Reuse BPF augmenters from syscalls with similar args signature We have an augmenter for the "open" syscall, which has just one pointer, in the first argument, a "const char *", so any other syscall that has just one pointer and that is the first can reuse the "open" BPF augmenter program. Even more, syscalls that get two pointers with the first being a string can reuse "open"'s BPF augmenter till we have an augmenter that better matches that syscall with two pointers. With this the few augmenters we have, for open (first arg is a string), openat (2nd arg is a string), renameat (2nd and 4th are strings) can be reused by a lot of syscalls, ditto for "bind" reusing "connect" because both have the 2nd argument as a sockaddr and the 3rd as its len. Lets see how this makes the "bind" syscall reuse the "connect" BPF prog augmenter found in tools/perf/examples/bpf/augmented_raw_syscalls.c: # perf trace -e bind,connect systemctl restart sshd connect(3, { .family: PF_LOCAL, path: /run/systemd/private }, 23) = 0 # Oh, it just connects to some daemon, so we better do it system wide and then stop/start sshd: # perf trace -e bind,connect systemctl/10124 connect(3, { .family: PF_LOCAL, path: /run/systemd/private }, 23) = 0 sshd/10102 connect(7, { .family: PF_LOCAL, path: /dev/log }, 110) = 0 systemctl/10126 connect(3, { .family: PF_LOCAL, path: /run/systemd/private }, 23) = 0 systemd/10128 ... [continued]: connect()) = 0 (sshd)/10128 connect(3, { .family: PF_LOCAL, path: /run/systemd/journal/stdout }, 30) ... sshd/10128 bind(3, { .family: PF_NETLINK }, 12) = 0 sshd/10128 connect(4, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) sshd/10128 connect(3, { .family: PF_INET6, port: 22, addr: :: }, 28) = 0 sshd/10128 connect(3, { .family: PF_UNSPEC }, 16) = 0 sshd/10128 connect(3, { .family: PF_INET, port: 22, addr: 0.0.0.0 }, 16) = 0 sshd/10128 connect(3, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) sshd/10128 connect(3, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) sshd/10128 connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) sshd/10128 connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) sshd/10128 bind(4, { .family: PF_INET, port: 22, addr: 0.0.0.0 }, 16) = 0 sshd/10128 connect(6, { .family: PF_LOCAL, path: /dev/log }, 110) = 0 sshd/10128 bind(6, { .family: PF_INET6, port: 22, addr: :: }, 28) = 0 sshd/10128 connect(7, { .family: PF_LOCAL, path: /dev/log }, 110) = 0 ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-zfley2ghs4nim1uq4nu6ed3l@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	30a910d7d3	perf trace: Preallocate the syscall table We'll continue reading its details from tracefs as we need it, but preallocate the whole thing otherwise we may realloc and end up with pointers to the previous buffer. I.e. in an upcoming algorithm we'll look for syscalls that have function signatures that are similar to a given syscall to see if we can reuse its BPF augmenter, so we may be at syscall 42, having a 'struct syscall' pointing to that slot in trace->syscalls.table[] and try to read the slot for an yet unread syscall, which would realloc that table to read the info for syscall 43, say, which would trigger a realoc of trace->syscalls.table[], and then the pointer we had for syscall 42 would be pointing to the previous block of memory. b00m. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-m3cjzzifibs13imafhkk77a0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	b8b1033fca	perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages There are holes in syscall tables with IDs not associated with any syscall, mark those when trying to read information for syscalls, which could happen when iterating thru all syscalls from 0 to the highest numbered syscall id. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-cku9mpcrcsqaiq0jepu86r68@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	5d2bd88975	perf trace: Forward error codes when trying to read syscall info We iterate thru the syscall table produced from the kernel syscall tables reading info, propagate the error and add to the debug message. This helps in fixing further bugs, such as failing to read the "sendfile" syscall info when it really should try the aliasm "sendfile64". Problems reading syscall 40: 2 (No such file or directory)(sendfile) information # grep sendfile /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c [40] = "sendfile", # I.e. in the tracefs format file for the syscall tracepoints we have it as sendfile64: # find /sys -type f -name format \| grep sendfile /sys/kernel/debug/tracing/events/syscalls/sys_enter_sendfile64/format /sys/kernel/debug/tracing/events/syscalls/sys_exit_sendfile64/format # But as "sendfile" in the file used to build the syscall table used in perf: $ grep sendfile arch/x86/entry/syscalls/syscall_64.tbl 40 common sendfile __x64_sys_sendfile64 $ So we need to add, in followup patches, aliases in 'perf trace' syscall data structures to cope with thie. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-w3eluap63x9je0bb8o3t79tz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	cfa9ac73d6	perf trace beauty: Add BPF augmenter for the 'rename' syscall I.e. two strings: # perf trace -e rename systemd/1 rename("/run/systemd/units/.#invocation:dnf-makecache.service970761b7f2840dcc", "/run/systemd/units/invocation:dnf-makecache.service") = 0 systemd-journa/715 rename("/run/systemd/journal/streams/.#9:17539785BJDblc", "/run/systemd/journal/streams/9:17539785") = 0 mv/1936 rename("/tmp/build/perf/fd/.array.o.tmp", "/tmp/build/perf/fd/.array.o.cmd") = 0 sh/1949 rename("/tmp/build/perf/.cpu.o.tmp", "/tmp/build/perf/.cpu.o.cmd") = 0 mv/1954 rename("/tmp/build/perf/fs/.tracing_path.o.tmp", "/tmp/build/perf/fs/.tracing_path.o.cmd") = 0 mv/1963 rename("/tmp/build/perf/common-cmds.h+", "/tmp/build/perf/common-cmds.h") = 0 :1975/1975 rename("/tmp/build/perf/.exec-cmd.o.tmp", "/tmp/build/perf/.exec-cmd.o.cmd") = 0 mv/1979 rename("/tmp/build/perf/fs/.fs.o.tmp", "/tmp/build/perf/fs/.fs.o.cmd") = 0 mv/2005 rename("/tmp/build/perf/.debug.o.tmp", "/tmp/build/perf/.debug.o.cmd") = 0 mv/2012 rename("/tmp/build/perf/.str_error_r.o.tmp", "/tmp/build/perf/.str_error_r.o.cmd") = 0 mv/2019 rename("/tmp/build/perf/.help.o.tmp", "/tmp/build/perf/.help.o.cmd") = 0 mv/2031 rename("/tmp/build/perf/.trace-seq.o.tmp", "/tmp/build/perf/.trace-seq.o.cmd") = 0 make/2038 ... [continued]: rename()) = 0 :2038/2038 rename("/tmp/build/perf/.event-plugin.o.tmp", "/tmp/build/perf/.event-plugin.o.cmd") ... ar/2035 rename("/tmp/build/perf/stzwBX3a", "/tmp/build/perf/libapi.a") = 0 mv/2051 rename("/tmp/build/perf/.parse-utils.o.tmp", "/tmp/build/perf/.parse-utils.o.cmd") = 0 mv/2069 rename("/tmp/build/perf/.subcmd-config.o.tmp", "/tmp/build/perf/.subcmd-config.o.cmd") = 0 make/2080 rename("/tmp/build/perf/.parse-filter.o.tmp", "/tmp/build/perf/.parse-filter.o.cmd") = 0 mv/2099 rename("/tmp/build/perf/.pager.o.tmp", "/tmp/build/perf/.pager.o.cmd") = 0 :2124/2124 rename("/tmp/build/perf/.sigchain.o.tmp", "/tmp/build/perf/.sigchain.o.cmd") = 0 make/2140 rename("/tmp/build/perf/.event-parse.o.tmp", "/tmp/build/perf/.event-parse.o.cmd") = 0 mv/2164 rename("/tmp/build/perf/.kbuffer-parse.o.tmp", "/tmp/build/perf/.kbuffer-parse.o.cmd") = 0 sh/2174 rename("/tmp/build/perf/.run-command.o.tmp", "/tmp/build/perf/.run-command.o.cmd") = 0 mv/2190 rename("/tmp/build/perf/.tep_strerror.o.tmp", "/tmp/build/perf/.tep_strerror.o.cmd") = 0 :2261/2261 rename("/tmp/build/perf/.event-parse-api.o.tmp", "/tmp/build/perf/.event-parse-api.o.cmd") = 0 :2480/2480 rename("/tmp/build/perf/stLv3kG2", "/tmp/build/perf/libtraceevent.a") = 0 ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-6hh2rl27uri6gsxhmk6q3hx5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	247dd65b90	perf trace beauty: Beautify bind's sockaddr arg By reusing the "connect" BPF collector. Testing it system wide and stopping/starting sshd: # perf trace -e bind LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o DNS Res~er #18/15132 bind(243, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #19/4833 bind(247, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #19/4833 bind(238, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #18/15132 bind(243, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #18/10327 bind(258, { .family: PF_NETLINK }, 12) = 0 :6507/6507 bind(24, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #19/4833 bind(238, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #18/15132 bind(242, { .family: PF_NETLINK }, 12) = 0 sshd/6514 bind(3, { .family: PF_NETLINK }, 12) = 0 sshd/6514 bind(5, { .family: PF_INET, port: 22, addr: 0.0.0.0 }, 16) = 0 sshd/6514 bind(7, { .family: PF_INET6, port: 22, addr: :: }, 28) = 0 DNS Res~er #18/10327 bind(229, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #18/15132 bind(231, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #19/4833 bind(229, { .family: PF_NETLINK }, 12) = 0 ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-m2hmxqrckxxw2ciki0tu889u@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	3c475bc021	perf trace beauty: Beautify 'sendto's sockaddr arg By just writing the collector in the augmented_raw_syscalls.c BPF program: # perf trace -e sendto <SNIP> ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64 ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64 ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64 ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64 Socket Thread/3573 sendto(247, 0x7fb32d49c000, 120, NONE, { .family: PF_UNSPEC }, NULL) = 120 DNS Res~er #18/11374 sendto(242, 0x7fb342cfe420, 20, NONE, { .family: PF_NETLINK }, 0xc) = 20 DNS Res~er #18/11374 sendto(242, 0x7fb342cfcca0, 42, MSG_NOSIGNAL, { .family: PF_UNSPEC }, NULL) = 42 DNS Res~er #18/11374 sendto(242, 0x7fb342cfcccc, 42, MSG_NOSIGNAL, { .family: PF_UNSPEC }, NULL) = 42 ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64 Socket Thread/3573 sendto(242, 0x7fb308bb1c08, 296, NONE, { .family: PF_UNSPEC }, NULL) = 296 ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64 ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64 ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64 ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64 ^C # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-p0l0rlvq19v5zf8qc2x2itow@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	ef969ca64d	perf trace beauty: Do not try to use the fd->pathname beautifier for bind/connect fd arg Doesn't make sense and also we now beautify the sockaddr, which provides enough info: # trace -e close,socket,connec* ssh www.bla.com <SNIP> close(5) = 0 socket(PF_INET, SOCK_DGRAM\|CLOEXEC\|NONBLOCK, IPPROTO_IP) = 5 connect(5, { .family: PF_INET, port: 53, addr: 192.168.44.1 }, 16) = 0 close(5) = 0 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 5 ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-h9drpb7ail808d2mh4n7tla4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	79d725cdf2	perf trace beauty: Disable fd->pathname when close() not enabled As we invalidate the fd->pathname table in the SCA_CLOSE_FD beautifier, if we don't have it we may end up keeping an fd->pathname association that then gets misprinted. The previous behaviour continues when the close() syscall is enabled, which may still be a a problem if we lose records (i.e. we may lose a 'close' record and then get that fd reused by socket()) but then the tool will notify that records are being lost and the user will be warned that some of the heuristics will fall apart. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-b7t6h8sq9lebemvfy2zh3qq1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	1d86275225	perf trace beauty: Make connect's addrlen be printed as an int, not hex # perf trace -e connec* ssh www.bla.com connect(3</var/lib/sss/mc/passwd>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(3</var/lib/sss/mc/passwd>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(4<socket:[16610959]>, { .family: PF_LOCAL, path: /var/lib/sss/pipes/nss }, 110) = 0 connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET, port: 53, addr: 192.168.44.1 }, 16) = 0 connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET, port: 22, addr: 146.112.61.108 }, 16) = 0 connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET6, port: 22, addr: ::ffff:146.112.61.108 }, 28) = 0 ^Cconnect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET, port: 22, addr: 146.112.61.108 }, 16) = -1 (unknown) (INTERNAL ERROR: strerror_r(512, [buf], 128)=22) # Argh, the SCA_FD needs to invalidate its cache when close is done... It works if the 'close' syscall is not filtered out ;-\ # perf trace -e close,connec* ssh www.bla.com close(3) = 0 close(3</usr/lib64/libpcre2-8.so.0.8.0>) = 0 close(3) = 0 close(3</usr/lib64/libkrb5.so.3.3>) = 0 close(3</usr/lib64/libkrb5.so.3.3>) = 0 close(3) = 0 close(3</usr/lib64/libk5crypto.so.3.1>) = 0 close(3</usr/lib64/libk5crypto.so.3.1>) = 0 close(3</usr/lib64/libcom_err.so.2.1>) = 0 close(3</usr/lib64/libcom_err.so.2.1>) = 0 close(3) = 0 close(3</usr/lib64/libkrb5support.so.0.1>) = 0 close(3</usr/lib64/libkrb5support.so.0.1>) = 0 close(3</usr/lib64/libkeyutils.so.1.8>) = 0 close(3</usr/lib64/libkeyutils.so.1.8>) = 0 close(3) = 0 close(3) = 0 close(3) = 0 close(3) = 0 close(4) = 0 close(3) = 0 close(3) = 0 connect(3</etc/nsswitch.conf>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) close(3</etc/nsswitch.conf>) = 0 connect(3</usr/lib64/libnss_sss.so.2>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) close(3</usr/lib64/libnss_sss.so.2>) = 0 close(3</usr/lib64/libnss_sss.so.2>) = 0 close(3) = 0 close(3) = 0 connect(4<socket:[16616519]>, { .family: PF_LOCAL, path: /var/lib/sss/pipes/nss }, 110) = 0 ^C # Will disable this beautifier when 'close' is filtered out... Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-ekuiciyx4znchvy95c8p1yyi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	212b9ab677	perf augmented_raw_syscalls: Augment sockaddr arg in 'connect' We already had a beautifier for an augmented sockaddr payload, but that was when we were hooking on each syscalls:sys_enter_foo tracepoints, since now we're almost doing that by doing a tail call from raw_syscalls:sys_enter, its almost the same, we can reuse it straight away. # perf trace -e connec* ssh www.bla.com connect(3</var/lib/sss/mc/passwd>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory) connect(3</var/lib/sss/mc/passwd>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory) connect(4<socket:[16604782]>, { .family: PF_LOCAL, path: /var/lib/sss/pipes/nss }, 0x6e) = 0 connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory) connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory) connect(5</etc/hosts>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory) connect(5</etc/hosts>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory) connect(5</etc/hosts>, { .family: PF_INET, port: 53, addr: 192.168.44.1 }, 0x10) = 0 connect(5</etc/hosts>, { .family: PF_INET, port: 22, addr: 146.112.61.108 }, 0x10) = 0 connect(5</etc/hosts>, { .family: PF_INET6, port: 22, addr: ::ffff:146.112.61.108 }, 0x1c) = 0 ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-5xkrbcpjsgnr3zt1aqdd7nvc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	6f56367493	perf augmented_raw_syscalls: Rename augmented_args_filename to augmented_args_payload It'll get other stuff in there than just filenames, starting with sockaddr for 'connect' and 'bind'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-bsexidtsn91ehdpzcd6n5fm9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	8b8044e5c9	perf trace: Look for default name for entries in the syscalls prog array I.e. just look for "!syscalls:sys_enter_" or "exit_" plus the syscall name, that way we need just to add entries to the augmented_raw_syscalls.c BPF source to add handlers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-6xavwddruokp6ohs7tf4qilb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	8d5da2649d	perf augmented_raw_syscalls: Support copying two string syscall args Starting with the renameat and renameat2 syscall, that both receive as second and fourth parameters a pathname: # perf trace -e rename* mv one ANOTHER LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o mv: cannot stat 'one': No such file or directory renameat2(AT_FDCWD, "one", AT_FDCWD, "ANOTHER", RENAME_NOREPLACE) = -1 ENOENT (No such file or directory) # Since the per CPU scratch buffer map has space for two maximum sized pathnames, the verifier is satisfied that there will be no overrun. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-x2uboyg5kx2wqeru288209b6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	bf134ca6c8	perf augmented_raw_syscalls: Switch to using BPF_MAP_TYPE_PROG_ARRAY Trying to control what arguments to copy, which ones were strings, etc all from userspace via maps went nowhere, lots of difficulties to get the verifier satisfied, so use what the fine BPF guys designed for such a syscall handling mechanism: bpf_tail_call + BPF_MAP_TYPE_PROG_ARRAY. The series leading to this should have explained it thoroughly, but the end result, explained via gdb should help understand this: Breakpoint 1, syscall_arg__scnprintf_filename (bf=0xc002b1 "", size=2031, arg=0x7fffffff7970) at builtin-trace.c:1268 1268 { (gdb) n 1269 unsigned long ptr = arg->val; (gdb) n 1271 if (arg->augmented.args) (gdb) n 1272 return syscall_arg__scnprintf_augmented_string(arg, bf, size); (gdb) s syscall_arg__scnprintf_augmented_string (arg=0x7fffffff7970, bf=0xc002b1 "", size=2031) at builtin-trace.c:1251 1251 { (gdb) n 1252 struct augmented_arg augmented_arg = arg->augmented.args; (gdb) n 1253 size_t printed = scnprintf(bf, size, "\"%.s\"", augmented_arg->size, augmented_arg->value); (gdb) n 1258 int consumed = sizeof(augmented_arg) + augmented_arg->size; (gdb) p bf $1 = 0xc002b1 "\"/etc/ld.so.cache\"" (gdb) bt #0 syscall_arg__scnprintf_augmented_string (arg=0x7fffffff7970, bf=0xc002b1 "\"/etc/ld.so.cache\"", size=2031) at builtin-trace.c:1258 #1 0x0000000000492634 in syscall_arg__scnprintf_filename (bf=0xc002b1 "\"/etc/ld.so.cache\"", size=2031, arg=0x7fffffff7970) at builtin-trace.c:1272 #2 0x0000000000493cd7 in syscall__scnprintf_val (sc=0xc0de68, bf=0xc002b1 "\"/etc/ld.so.cache\"", size=2031, arg=0x7fffffff7970, val=140737354091036) at builtin-trace.c:1689 #3 0x000000000049404f in syscall__scnprintf_args (sc=0xc0de68, bf=0xc002a7 "AT_FDCWD, \"/etc/ld.so.cache\"", size=2041, args=0x7ffff6cbf1ec "\234\377\377\377", augmented_args=0x7ffff6cbf21c, augmented_args_size=28, trace=0x7fffffffa170, thread=0xbff940) at builtin-trace.c:1756 #4 0x0000000000494a97 in trace__sys_enter (trace=0x7fffffffa170, evsel=0xbe1900, event=0x7ffff6cbf1a0, sample=0x7fffffff7b00) at builtin-trace.c:1975 #5 0x0000000000496ff1 in trace__handle_event (trace=0x7fffffffa170, event=0x7ffff6cbf1a0, sample=0x7fffffff7b00) at builtin-trace.c:2685 #6 0x0000000000497edb in __trace__deliver_event (trace=0x7fffffffa170, event=0x7ffff6cbf1a0) at builtin-trace.c:3029 #7 0x000000000049801e in trace__deliver_event (trace=0x7fffffffa170, event=0x7ffff6cbf1a0) at builtin-trace.c:3056 #8 0x00000000004988de in trace__run (trace=0x7fffffffa170, argc=2, argv=0x7fffffffd660) at builtin-trace.c:3258 #9 0x000000000049c2d3 in cmd_trace (argc=2, argv=0x7fffffffd660) at builtin-trace.c:4220 #10 0x00000000004dcb6c in run_builtin (p=0xa18e00 <commands+576>, argc=5, argv=0x7fffffffd660) at perf.c:304 #11 0x00000000004dcdd9 in handle_internal_command (argc=5, argv=0x7fffffffd660) at perf.c:356 #12 0x00000000004dcf20 in run_argv (argcp=0x7fffffffd4bc, argv=0x7fffffffd4b0) at perf.c:400 #13 0x00000000004dd28c in main (argc=5, argv=0x7fffffffd660) at perf.c:522 (gdb) (gdb) continue Continuing. openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY\|O_CLOEXEC) = 3 Now its a matter of automagically assigning the BPF programs copying syscall arg pointers to functions that are "open"-like (i.e. that need only the first syscall arg copied as a string), or "openat"-like (2nd arg, etc). End result in tool output: # perf trace -e open ls /tmp/notthere LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY\|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib64/libselinux.so.1", O_RDONLY\|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib64/libcap.so.2", O_RDONLY\|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY\|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib64/libpcre2-8.so.0", O_RDONLY\|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib64/libdl.so.2", O_RDONLY\|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib64/libpthread.so.0", O_RDONLY\|O_CLOEXEC) = 3 openat(AT_FDCWD, "", O_RDONLY\|O_CLOEXEC) = 3 openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY\|O_CLOEXEC) = 3 openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = ls: cannot access '/tmp/notthere'-1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/coreutils.mo", O_RDONLY: No such file or directory) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-snc7ry99cl6r0pqaspjim98x@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	236dd58388	perf augmented_raw_syscalls: Add handler for "openat" I.e. for a syscall that has its second argument being a string, its difficult these days to find 'open' being used in the wild :-) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-yf3kbzirqrukd3fb2sp5qx4p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	b119970aa5	perf trace: Handle raw_syscalls:sys_enter just like the BPF_OUTPUT augmented event So, we use a PERF_COUNT_SW_BPF_OUTPUT to output the augmented sys_enter payload, i.e. to output more than just the raw syscall args, and if something goes wrong when handling an unfiltered syscall, we bail out and just return 1 in the bpf program associated with raw_syscalls:sys_enter, meaning, don't filter that tracepoint, in which case what will appear in the perf ring buffer isn't the BPF_OUTPUT event, but the original raw_syscalls:sys_enter event with its normal payload. Now that we're switching to using a bpf_tail_call + BPF_MAP_TYPE_PROG_ARRAY we're going to use this in the common case, so a bug where raw_syscalls:sys_enter wasn't being handled by trace__sys_enter() surfaced and for that case, instead of using the strace-like augmenter (trace__sys_enter()), we continued to use the normal generic tracepoint handler: (gdb) p evsel $2 = (struct perf_evsel ) 0xc03e40 (gdb) p evsel->name $3 = 0xbc56c0 "raw_syscalls:sys_enter" (gdb) p ((struct perf_evsel ) 0xc03e40)->name $4 = 0xbc56c0 "raw_syscalls:sys_enter" (gdb) p ((struct perf_evsel ) 0xc03e40)->handler $5 = (void ) 0x495eb3 <trace__event_handler> This resulted in this: 0.027 raw_syscalls:sys_enter:NR 12 (0, 7fcfcac64c9b, 4d, 7fcfcac64c9b, 7fcfcac6ce00, 19) ... [continued]: brk()) = 0x563b88677000 I.e. only the sys_exit tracepoint was being properly handled, but since the sys_enter went to the generic trace__event_handler() we printed it using libtraceevent's formatter instead of 'perf trace's strace-like one. Fix it by setting trace__sys_enter() as the handler for raw_syscalls:sys_enter and setup the tp_field tracepoint field accessors. Now, to test it we just make raw_syscalls:sys_enter return 1 right after checking if the pid is filtered, making it not use bpf_perf_output_event() but rather ask for the tracepoint not to be filtered and the result is the expected one: brk(NULL) = 0x556f42d6e000 I.e. raw_syscalls:sys_enter returns 1, gets handled by trace__sys_enter() and gets it combined with the raw_syscalls:sys_exit in a strace-like way. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-0mkocgk31nmy0odknegcby4z@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	3803a22931	perf trace: Put the per-syscall entry/exit prog_array BPF map infrastructure in place I.e. look for "syscalls_sys_enter" and "syscalls_sys_exit" BPF maps of type PROG_ARRAY and populate it with the handlers as specified per syscall, for now only 'open' is wiring it to something, in time all syscalls that need to copy arguments entering a syscall or returning from one will set these to the right handlers, reusing when possible pre-existing ones. Next step is to use bpf_tail_call() into that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-t0p4u43i9vbpzs1xtowna3gb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	6ff8fff456	perf trace: Allow specifying the bpf prog to augment specific syscalls This is a step in the direction of being able to use a BPF_MAP_TYPE_PROG_ARRAY to handle syscalls that need to copy pointer payloads in addition to the raw tracepoint syscall args. There is a first example in tools/perf/examples/bpf/augmented_raw_syscalls.c for the 'open' syscall. Next step is to introduce the prog array map and use this 'open' augmenter, then use that augmenter in other syscalls that also only copy the first arg as a string, and then show how to use with a syscall that reads more than one filename, like 'rename', etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-pys4v57x5qqrybb4cery2mc8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	5834da7f10	perf trace: Add BPF handler for unaugmented syscalls Will be used to assign to syscalls that don't need augmentation, i.e. those with just integer args. All syscalls will be in a BPF_MAP_TYPE_PROG_ARRAY, and the bpf_tail_call() keyed by the syscall id will either find nothing in place, which means the syscall is being filtered, or a function that will either add things like filenames to the ring buffer, right after the raw syscall args, or be this unaugmented handler that will just return 1, meaning don't filter the original raw_syscalls:sys_{enter,exit} tracepoint. For now it is not really being used, this is just leg work to break the patch into smaller pieces. It introduces a trace__find_bpf_program_by_title() helper that in turn uses libbpf's bpf_object__find_program_by_title() on the BPF object with the __augmented_syscalls__ map. "title" is how libbpf calls the SEC() argument for functions, i.e. the ELF section that follows a convention to specify what BPF program (a function with this SEC() marking) should be connected to which tracepoint, kprobes, etc. In perf anything that is of the form SEC("sys:event_name") will be connected to that tracepoint by perf's BPF loader. In this case its something that will be bpf_tail_call()ed from either the "raw_syscalls:sys_enter" or "raw_syscall:sys_exit" tracepoints, so its named "!raw_syscalls:unaugmented" to convey that idea, i.e. its not going to be directly attached to a tracepoint, thus it starts with a "!". Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-meucpjx2u0slpkayx56lxqq6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	83e69b92b1	perf trace: Order -e syscalls table The ev_qualifier is an array with the syscall ids passed via -e on the command line, sort it as we'll search it when setting up the BPF_MAP_TYPE_PROG_ARRAY. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-c8hprylp3ai6e0z9burn2r3s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	5ca0b7f500	perf trace: Look up maps just on the __augmented_syscalls__ BPF object We can conceivably have multiple BPF object files for other purposes, so better look just on the BPF object containing the __augmented_syscalls__ map for all things augmented_syscalls related. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-3jt8knkuae9lt705r1lns202@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	c8c805707e	perf trace: Add pointer to BPF object containing __augmented_syscalls__ So that we can use it when looking for other components of that object file, such as other programs to add to the BPF_MAP_TYPE_PROG_ARRAY and use with bpf_tail_call(). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-1ibmz7ouv6llqxajy7m8igtd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	af4a0991f4	perf evsel: Store backpointer to attached bpf_object We may want to get to this bpf_object, to search for other BPF programs, etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-3y8hrb6lszjfi23vjlic3cib@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	2620b7e369	perf bpf: Do not attach a BPF prog to a tracepoint if its name starts with ! With BPF_MAP_TYPE_PROG_ARRAY + bpf_tail_call() we want to have BPF programs, i.e. functions in a object file that perf's BPF loader shouldn't try to attach to anything, i.e. "!syscalls:sys_enter_open" should just stay there, not be attached to a tracepoint with that name, it'll be used by, for instance, 'perf trace' to associate with syscalls that copy, in addition to the syscall raw args, a filename pointed by the first arg, i.e. multiple syscalls that need copying the same pointer arg in the same way, as a filename, for instance, will share the same BPF program/function. Right now when perf's BPF loader sees a function with a name "sys:name" it'll look for a tracepoint and will associate that BPF program with it, say: SEC("raw_syscalls:sys_enter") int sys_enter(struct syscall_enter_args *args) { //SNIP } Will crate a perf_evsel tracepoint event and then associate with it that BPF program. This convention at some point will switch to the one used by the BPF loader in libbpf, but to experiment with BPF_MAP_TYPE_PROG_ARRAY in 'perf trace' lets do this, that will not require changing too much stuff. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-lk6dasjr1yf9rtvl292b2hpc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:40 -03:00
Arnaldo Carvalho de Melo	941a7658e0	perf include bpf: Add bpf_tail_call() prototype Will be used together with BPF_MAP_TYPE_PROG_ARRAY in tools/perf/examples/bpf/augmented_raw_syscalls.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-pd1bpy8i31nta6jqwdex871g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:40 -03:00
Vince Weaver	2e9a06dda1	perf tools: Fix perf.data documentation units for memory size The perf.data-file-format documentation incorrectly says the HEADER_TOTAL_MEM results are in bytes. The results are in kilobytes (perf reads the value from /proc/meminfo) Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1907251155500.22624@macbook-air Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 09:03:43 -03:00
Numfor Mbiziwo-Tiapo	20f9781f49	perf header: Fix use of unitialized value warning When building our local version of perf with MSAN (Memory Sanitizer) and running the perf record command, MSAN throws a use of uninitialized value warning in "tools/perf/util/util.c:333:6". This warning stems from the "buf" variable being passed into "write". It originated as the variable "ev" with the type union perf_event* defined in the "perf_event__synthesize_attr" function in "tools/perf/util/header.c". In the "perf_event__synthesize_attr" function they allocate space with a malloc call using ev, then go on to only assign some of the member variables before passing "ev" on as a parameter to the "process" function therefore "ev" contains uninitialized memory. Changing the malloc call to zalloc to initialize all the members of "ev" which gets rid of the warning. To reproduce this warning, build perf by running: make -C tools/perf CLANG=1 CC=clang EXTRA_CFLAGS="-fsanitize=memory\ -fsanitize-memory-track-origins" (Additionally, llvm might have to be installed and clang might have to be specified as the compiler - export CC=/usr/bin/clang) then running: tools/perf/perf record -o - ls / \| tools/perf/perf --no-pager annotate\ -i - --stdio Please see the cover letter for why false positive warnings may be generated. Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Drayton <mbd@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190724234500.253358-2-nums@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 09:03:43 -03:00
Vince Weaver	7622236ceb	perf header: Fix divide by zero error if f_header.attr_size==0 So I have been having lots of trouble with hand-crafted perf.data files causing segfaults and the like, so I have started fuzzing the perf tool. First issue found: If f_header.attr_size is 0 in the perf.data file, then perf will crash with a divide-by-zero error. Committer note: Added a pr_err() to tell the user why the command failed. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1907231100440.14532@macbook-air Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 09:03:43 -03:00
Arnaldo Carvalho de Melo	7ee526152d	tools perf beauty: Fix usbdevfs_ioctl table generator to handle _IOC() In addition to _IOW() and _IOR(), to handle this case: #define USBDEVFS_CONNINFO_EX(len) _IOC(_IOC_READ, 'U', 32, len) That will happen in the next sync of this header file. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-3br5e4t64e4lp0goo84che3s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 09:03:42 -03:00
Arnaldo Carvalho de Melo	820571af72	tools include UAPI: Sync x86's syscalls_64.tbl and generic unistd.h to pick up clone3 and pidfd_open `05a70a8ec2` ("unistd: protect clone3 via __ARCH_WANT_SYS_CLONE3") `8f3220a806` ("arch: wire-up clone3() syscall") `7615d9e178` ("arch: wire-up pidfd_open()") Silencing the following tools/perf build warnings Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h' diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl' diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl Now 'perf trace -e pidfd,clone' will trace those syscalls as well as the others with those prefixes. $ diff -u /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c --- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before 2019-07-26 12:24:55.020944201 -0300 +++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c 2019-07-26 12:25:03.919047217 -0300 @@ -344,5 +344,7 @@ [431] = "fsconfig", [432] = "fsmount", [433] = "fspick", + [434] = "pidfd_open", + [435] = "clone3", }; -#define SYSCALLTBL_x86_64_MAX_ID 433 +#define SYSCALLTBL_x86_64_MAX_ID 435 $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Christian Brauner <christian@brauner.io> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-0isnnqxtr1ihz6p8wzjiy47d@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-26 12:31:28 -03:00
Ingo Molnar	49902052fc	perf/urgent fixes: perf.data: Alexey Budankov: - Fix loading of compressed data split across adjacent records Jiri Olsa: - Fix buffer size setting for processing CPU topology perf.data header. perf stat: Jiri Olsa: - Fix segfault for event group in repeat mode Cong Wang: - Always separate "stalled cycles per insn" line, it was being appended to the "instructions" line. perf script: Andi Kleen: - Fix --max-blocks man page description. - Improve man page description of metrics. - Fix off by one in brstackinsn IPC computation. perf probe: Arnaldo Carvalho de Melo: - Avoid calling freeing routine multiple times for same pointer. perf build: - Do not use -Wshadow on gcc < 4.8, avoiding too strict warnings treated as errors, breaking the build. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXTcpSQAKCRCyPKLppCJ+ J0s1AQCY4uEiw7ZDUMPkztqG/9nder8M4ncd2FYwsQObmjxhBQEA+u/jvJ9UcUKk X9BpjDE+1Pi3LrMaFjDQMKgpSutzXgg= =rAom -----END PGP SIGNATURE----- Merge tag 'perf-urgent-for-mingo-5.3-20190723' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: perf.data: Alexey Budankov: - Fix loading of compressed data split across adjacent records Jiri Olsa: - Fix buffer size setting for processing CPU topology perf.data header. perf stat: Jiri Olsa: - Fix segfault for event group in repeat mode Cong Wang: - Always separate "stalled cycles per insn" line, it was being appended to the "instructions" line. perf script: Andi Kleen: - Fix --max-blocks man page description. - Improve man page description of metrics. - Fix off by one in brstackinsn IPC computation. perf probe: Arnaldo Carvalho de Melo: - Avoid calling freeing routine multiple times for same pointer. perf build: - Do not use -Wshadow on gcc < 4.8, avoiding too strict warnings treated as errors, breaking the build. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-07-23 23:41:33 +02:00
Arnaldo Carvalho de Melo	d95daf5acc	perf probe: Avoid calling freeing routine multiple times for same pointer When perf_add_probe_events() we call cleanup_perf_probe_events() for the pev pointer it receives, then, as part of handling this failure the main 'perf probe' goes on and calls cleanup_params() and that will again call cleanup_perf_probe_events()for the same pointer, so just set nevents to zero when handling the failure of perf_add_probe_events() to avoid the double free. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-x8qgma4g813z96dvtw9w219q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-23 09:04:41 -03:00
Arnaldo Carvalho de Melo	df8350ed56	perf probe: Set pev->nargs to zero after freeing pev->args entries So that, when perf_add_probe_events() fails, like in: # perf probe icmp_rcv:64 "type=icmph->type" Failed to find 'icmph' in this function. Error: Failed to add events. Segmentation fault (core dumped) # We don't segfault. clear_perf_probe_event() was zeroing the whole pev, and since the switch to zfree() for the members in the pev, that memset() was removed, which left nargs with its original value, in the above case 1. With the memset the same pev could be passed to clear_perf_probe_event() multiple times, since all it would have would be zeroes, and free() accepts zero, the loop would not happen and we would just memset it again to zeroes. Without it we got that segfault, so zero nargs to keep it like it was, next cset will avoid calling clear_perf_probe_event() for the same pevs in case of failure. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: `d8f9da2404` ("perf tools: Use zfree() where applicable") Link: https://lkml.kernel.org/n/tip-802f2jypnwqsvyavvivs8464@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-23 09:04:25 -03:00
Alexey Budankov	872c8ee8f0	perf session: Fix loading of compressed data split across adjacent records Fix decompression failure found during the loading of compressed trace collected on larger scale systems (>48 cores). The error happened due to lack of decompression space for a mmaped buffer data chunk split across adjacent PERF_RECORD_COMPRESSED records. $ perf report -i bt.16384.data --stats failed to decompress (B): 63869 -> 0 : Destination buffer is too small user stack dump failure Can't parse sample, err = -14 0x2637e436 [0x4080]: failed to process type: 9 Error: failed to process sample $ perf test 71 71: Zstd perf.data compression/decompression : Ok Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/4d839e1b-9c48-89c4-9702-a12217420611@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-23 09:04:03 -03:00
Cong Wang	146540fb54	perf stat: Always separate stalled cycles per insn The "stalled cycles per insn" is appended to "instructions" when the CPU has this hardware counter directly. We should always make it a separate line, which also aligns to the output when we hit the "if (total && avg)" branch. Before: $ sudo perf stat --all-cpus --field-separator , --log-fd 1 -einstructions,cycles -- sleep 1 4565048704,,instructions,64114578096,100.00,1.34,insn per cycle,, 3396325133,,cycles,64146628546,100.00,, After: $ sudo ./tools/perf/perf stat --all-cpus --field-separator , --log-fd 1 -einstructions,cycles -- sleep 1 6721924,,instructions,24026790339,100.00,0.22,insn per cycle ,,,,,0.00,stalled cycles per insn 30939953,,cycles,24025512526,100.00,, Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20190517221039.8975-1-xiyou.wangcong@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-23 09:03:46 -03:00
Jiri Olsa	08ef3af157	perf stat: Fix segfault for event group in repeat mode Numfor Mbiziwo-Tiapo reported segfault on stat of event group in repeat mode: # perf stat -e '{cycles,instructions}' -r 10 ls It's caused by memory corruption due to not cleaned evsel's id array and index, which needs to be rebuilt in every stat iteration. Currently the ids index grows, while the array (which is also not freed) has the same size. Fixing this by releasing id array and zeroing ids index in perf_evsel__close function. We also need to keep the evsel_list alive for stat record (which is disabled in repeat mode). Reported-by: Numfor Mbiziwo-Tiapo <nums@google.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Drayton <mbd@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190715142121.GC6032@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-23 09:00:05 -03:00
Jiri Olsa	79b2fe5e75	perf tools: Fix proper buffer size for feature processing After Song Liu's segfault fix for pipe mode, Arnaldo reported following error: # perf record -o - \| perf script 0x514 [0x1ac]: failed to process type: 80 It's caused by wrong buffer size setup in feature processing, which makes cpu topology feature fail, because it's using buffer size to recognize its header version. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Fixes: `e9def1b2e7` ("perf tools: Add feature header record to pipe-mode") Link: http://lkml.kernel.org/r/20190715140426.32509-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-23 08:59:49 -03:00
Andi Kleen	dde4e732a5	perf script: Fix off by one in brstackinsn IPC computation When we hit the end of a program block, need to count the last instruction too for the IPC computation. This caused large errors for small blocks. % perf script -b ls / > /dev/null Before: % perf script -F +brstackinsn --xed ... 00007f94c9ac70d8 jz 0x7f94c9ac70e3 # PRED 3 cycles [36] 4.33 IPC 00007f94c9ac70e3 testb $0x20, 0x31d(%rbx) 00007f94c9ac70ea jnz 0x7f94c9ac70b0 00007f94c9ac70ec testb $0x8, 0x205ad(%rip) 00007f94c9ac70f3 jz 0x7f94c9ac6ff0 # PRED 1 cycles [37] 3.00 IPC After: % perf script -F +brstackinsn --xed ... 00007f94c9ac70d8 jz 0x7f94c9ac70e3 # PRED 3 cycles [15] 4.67 IPC 00007f94c9ac70e3 testb $0x20, 0x31d(%rbx) 00007f94c9ac70ea jnz 0x7f94c9ac70b0 00007f94c9ac70ec testb $0x8, 0x205ad(%rip) 00007f94c9ac70f3 jz 0x7f94c9ac6ff0 # PRED 1 cycles [16] 4.00 IPC Suggested-by: Denis Bakhvalov <denis.bakhvalov@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190711181922.18765-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-23 08:59:37 -03:00
Andi Kleen	7db7218a7e	perf script: Improve man page description of metrics Clarify that a metric is based on events, not referring to itself. Also some improvements with the sentences. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190711181922.18765-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-23 08:58:11 -03:00
Andi Kleen	5f8eec3225	perf script: Fix --max-blocks man page description The --max-blocks description was using the old name brstackasm. Use brstackinsn instead. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190711181922.18765-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-23 08:57:54 -03:00
Linus Torvalds	46f5c0cc3a	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tooling updates from Thomas Gleixner: "A set of perf improvements and fixes: perf db-export: - Improvements in how COMM details are exported to databases for post processing and use in the sql-viewer.py UI. - Export switch events to the database. BPF: - Bump rlimit(MEMLOCK) for 'perf test bpf' and 'perf trace', just like selftests/bpf/bpf_rlimit.h do, which makes errors due to exhaustion of this limit, which are kinda cryptic (EPERM sometimes) less frequent. perf version: - Fix segfault due to missing OPT_END(), noticed on PowerPC. perf vendor events: - Add JSON files for IBM s/390 machine type 8561. perf cs-etm (ARM): - Fix two cases of error returns not bing done properly: Invalid ERR_PTR() use and loss of propagation error codes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits) perf version: Fix segfault due to missing OPT_END() perf vendor events s390: Add JSON files for machine type 8561 perf cs-etm: Return errcode in cs_etm__process_auxtrace_info() perf cs-etm: Remove errnoeous ERR_PTR() usage in cs_etm__process_auxtrace_info perf scripts python: export-to-postgresql.py: Export switch events perf scripts python: export-to-sqlite.py: Export switch events perf db-export: Export switch events perf db-export: Factor out db_export__threads() perf script: Add scripting operation process_switch() perf scripts python: exported-sql-viewer.py: Use new 'has_calls' column perf scripts python: exported-sql-viewer.py: Remove redundant semi-colons perf scripts python: export-to-postgresql.py: Add has_calls column to comms table perf scripts python: export-to-sqlite.py: Add has_calls column to comms table perf db-export: Also export thread's current comm perf db-export: Factor out db_export__comm() perf scripts python: export-to-postgresql.py: Export comm details perf scripts python: export-to-sqlite.py: Export comm details perf db-export: Export comm details perf db-export: Fix a white space issue in db_export__sample() perf db-export: Move export__comm_thread into db_export__sample() ...	2019-07-20 11:06:12 -07:00
Linus Torvalds	818e95c768	The main changes in this release include: - Add user space specific memory reading for kprobes - Allow kprobes to be executed earlier in boot The rest are mostly just various clean ups and small fixes. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXS88txQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qhaPAQDHaAmu6wXtZjZE6GU4ZP61UNgDECmZ 4wlGrNc1AAlqAQD/QC8339p37aDCp9n27VY1wmJwF3nca+jAHfQLqWkkYgw= =n/tz -----END PGP SIGNATURE----- Merge tag 'trace-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The main changes in this release include: - Add user space specific memory reading for kprobes - Allow kprobes to be executed earlier in boot The rest are mostly just various clean ups and small fixes" * tag 'trace-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits) tracing: Make trace_get_fields() global tracing: Let filter_assign_type() detect FILTER_PTR_STRING tracing: Pass type into tracing_generic_entry_update() ftrace/selftest: Test if set_event/ftrace_pid exists before writing ftrace/selftests: Return the skip code when tracing directory not configured in kernel tracing/kprobe: Check registered state using kprobe tracing/probe: Add trace_event_call accesses APIs tracing/probe: Add probe event name and group name accesses APIs tracing/probe: Add trace flag access APIs for trace_probe tracing/probe: Add trace_event_file access APIs for trace_probe tracing/probe: Add trace_event_call register API for trace_probe tracing/probe: Add trace_probe init and free functions tracing/uprobe: Set print format when parsing command tracing/kprobe: Set print format right after parsed command kprobes: Fix to init kprobes in subsys_initcall tracepoint: Use struct_size() in kmalloc() ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS ftrace: Enable trampoline when rec count returns back to one tracing/kprobe: Do not run kprobe boot tests if kprobe_event is on cmdline tracing: Make a separate config for trace event self tests ...	2019-07-18 11:51:00 -07:00
Ravi Bangoria	916c31fff9	perf version: Fix segfault due to missing OPT_END() 'perf version' on powerpc segfaults when used with non-supported option: # perf version -a Segmentation fault (core dumped) Fix this. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Tested-by: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20190611030109.20228-1-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-15 07:59:05 -03:00
Thomas Richter	6e67d77d67	perf vendor events s390: Add JSON files for machine type 8561 Add CPU measurement counter facility event description files (JSON) for IBM machine types 8561 and 8562. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Link: http://lkml.kernel.org/r/20190712123113.100882-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-12 13:51:59 -03:00
YueHaibing	6285bd151b	perf cs-etm: Return errcode in cs_etm__process_auxtrace_info() The 'err' variable is set in the error path, but it's not returned to callers. Don't always return -EINVAL, return err. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Fixes: `cd8bfd8c97` ("perf tools: Add processing of coresight metadata") Link: http://lkml.kernel.org/r/20190321023122.21332-3-yuehaibing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-11 12:45:02 -03:00
YueHaibing	edc82a9943	perf cs-etm: Remove errnoeous ERR_PTR() usage in cs_etm__process_auxtrace_info intlist__findnew() doesn't uses ERR_PTR() as a return mechanism so its callers shouldn't try to extract the error using PTR_ERR( ret) from intlist__findnew(), make cs_etm__process_auxtrace_info return -ENOMEM instead. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Fixes: `cd8bfd8c97` ("perf tools: Add processing of coresight metadata") Link: http://lkml.kernel.org/r/20190321023122.21332-2-yuehaibing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-11 12:42:46 -03:00
Adrian Hunter	56789f3dc1	perf scripts python: export-to-postgresql.py: Export switch events Export switch events to a new table 'context_switches' and create a view 'context_switches_view'. The table and view will show automatically in the exported-sql-viewer.py script. If the table ends up empty, then it and the view are dropped. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-22-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 13:05:12 -03:00
Adrian Hunter	37c1f991b1	perf scripts python: export-to-sqlite.py: Export switch events Export switch events to a new table 'context_switches' and create a view 'context_switches_view'. The table and view will show automatically in the exported-sql-viewer.py script. If the table ends up empty, then it and the view are dropped. Committer testing: Use the exported-sql-viewer.py and look at "Tables" -> "context_switches": id machine_id time cpu thread_out_id comm_out_id thread_in_id comm_in_id flags 1 1 187836111885918 7 1 1 2 2 3 2 1 187836111889369 7 1 1 2 2 0 3 1 187836112464618 7 2 3 1 1 1 4 1 187836112465511 7 2 3 1 1 0 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-21-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:37:35 -03:00
Adrian Hunter	abde8722d9	perf db-export: Export switch events Export details of switch events including the threads and their current comms. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-20-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:35:38 -03:00
Adrian Hunter	b3694e6c0a	perf db-export: Factor out db_export__threads() In preparation for exporting switch events, factor out db_export__threads(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-19-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:35:18 -03:00
Adrian Hunter	5bf83c29a0	perf script: Add scripting operation process_switch() Add scripting operation process_switch() to process switch events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-18-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:34:09 -03:00
Adrian Hunter	26c11206f4	perf scripts python: exported-sql-viewer.py: Use new 'has_calls' column If the new 'has_calls' column is present, use it with the call graph and call tree to select only comms that have calls. Committer testing: Just started the exported-sql-view.py and accessed all the reports, no backtraces. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-17-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:31:25 -03:00
Adrian Hunter	266887291c	perf scripts python: exported-sql-viewer.py: Remove redundant semi-colons Remove redundant semi-colons added inadvertently. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-16-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:31:01 -03:00
Adrian Hunter	d9efc1d252	perf scripts python: export-to-postgresql.py: Add has_calls column to comms table Now that a thread's current comm is exported, it shows up in the call graph and call tree even if it has no calls. That can happen because the calls are recorded against the main thread's initial comm. Add a table column to make it easy for the exported-sql-viewer.py script to select only comms with calls. Committer testing: $ rm -f simple-retpoline.db $ sudo ~acme/bin/perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls 2019-07-10 12:25:33.200529 Creating database ... 2019-07-10 12:25:33.211548 Writing records... 2019-07-10 12:25:33.549630 Adding indexes 2019-07-10 12:25:33.560715 Dropping unused tables 2019-07-10 12:25:33.580201 Done $ sha256sum tools/perf/scripts/python/export-to-sqlite.py ~/libexec/perf-core/scripts/python/export-to-sqlite.py 2922b642c392004dffa1d8789296478c85904623f5895bcb9b6cbf33e3ca999f tools/perf/scripts/python/export-to-sqlite.py 2922b642c392004dffa1d8789296478c85904623f5895bcb9b6cbf33e3ca999f /home/acme/libexec/perf-core/scripts/python/export-to-sqlite.py $ $ sqlite3 simple-retpoline.db SQLite version 3.26.0 2018-12-01 12:34:55 Enter ".help" for usage hints. sqlite> .schema comms CREATE TABLE comms (id integer NOT NULL PRIMARY KEY,comm varchar(16),c_thread_id bigint,c_time bigint,exec_flag boolean, has_calls boolean); sqlite> select id,has_calls from comms; 0\|1 1\|1 sqlite> select distinct comm_id from calls; 0 1 sqlite> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-15-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:25:05 -03:00
Adrian Hunter	ecc8c9984d	perf scripts python: export-to-sqlite.py: Add has_calls column to comms table Now that a thread's current comm is exported, it shows up in the call graph and call tree even if it has no calls. That can happen because the calls are recorded against the main thread's initial comm. Add a table column to make it easy for the exported-sql-viewer.py script to select only comms with calls. Committer notes: Running the export-to-sqlite.py worked without warnings and using the exported-sql-viewer.py worked as before. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-14-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:23:55 -03:00
Adrian Hunter	4650c7bed7	perf db-export: Also export thread's current comm Currently, the initial comm of the main thread is exported. Export also a thread's current comm. That better supports the tracing of multi-threaded applications that set different comms for different threads to make it easier to distinguish them. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-13-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:14:07 -03:00
Adrian Hunter	80859c947a	perf db-export: Factor out db_export__comm() In preparation for exporting the current comm for a thread, factor out db_export__comm(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:13:51 -03:00
Adrian Hunter	8534b5de81	perf scripts python: export-to-postgresql.py: Export comm details Add table columns for thread id, comm start time and exec flag. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:13:36 -03:00
Adrian Hunter	41085f2bdd	perf scripts python: export-to-sqlite.py: Export comm details Add table columns for thread id, comm start time and exec flag. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:13:26 -03:00
Adrian Hunter	8ebf5cc0f6	perf db-export: Export comm details In preparation for exporting the current comm for a thread, export comm thread id, start time and exec flag. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:13:08 -03:00
Adrian Hunter	a5defb2f39	perf db-export: Fix a white space issue in db_export__sample() Fix a white space issue in db_export__sample() Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:12:56 -03:00
Adrian Hunter	1ed1195898	perf db-export: Move export__comm_thread into db_export__sample() Move call to db_export__comm_thread() from db_export__thread() into db_export__sample() because it makes the code easier to understand, and add explanatory comments. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:12:45 -03:00
Adrian Hunter	6319790bcf	perf db-export: Export comm before exporting thread Export comm before exporting the non-main thread because db_export__thread() also exports the comm_thread. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:12:25 -03:00
Adrian Hunter	19207d8694	perf db-export: Export main_thread in db_export__sample() Export main_thread in db_export__sample() because it makes the code easier to understand, and prepares db_export__thread() for further simplification. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:12:05 -03:00
Adrian Hunter	ed5c0a16fe	perf db-export: Pass main_thread to db_export__thread() Calls to db_export__thread() already have main_thread so there is no reason to get it again, instead pass it as a parameter. Note that one difference in this approach is that the main thread is not created if it does not exist. It is better if it is not created because: - If main_thread is being traced it will have been created already. - If it is not being traced, there will be no other information about it, and it will never get deleted because there will be no EXIT event. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:11:04 -03:00
Adrian Hunter	208032fef1	perf db-export: Rename db_export__comm() to db_export__exec_comm() Rename db_export__comm() to db_export__exec_comm() to better reflect what it does and add explanatory comments. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190710085810.1650-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:10:27 -03:00
Adrian Hunter	fead24e523	perf db-export: Get rid of db_export__deferred() db_export__deferred() deferred the export of comms if the comm string had not been "set" (changed from :<pid>) however that problem was fixed a long time ago by commit `e803cf97a4` ("perf record: Synthesize COMM event for a command line workload"), so get rid of db_export__deferred(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190710085810.1650-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-10 12:07:40 -03:00
Arnaldo Carvalho de Melo	c3e78a3403	perf trace: Auto bump rlimit(MEMLOCK) for eBPF maps sake Circa v5.2 this started to fail: # perf trace -e /wb/augmented_raw_syscalls.o event syntax error: '/wb/augmented_raw_syscalls.o' \___ Operation not permitted (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf trace [<options>] [<command>] or: perf trace [<options>] -- <command> [<options>] or: perf trace record [<options>] [<command>] or: perf trace record [<options>] -- <command> [<options>] -e, --event <event> event/syscall selector. use 'perf list' to list available events # In verbose mode we some -EPERM when creating a BPF map: # perf trace -v -e /wb/augmented_raw_syscalls.o <SNIP> libbpf: failed to create map (name: '__augmented_syscalls__'): Operation not permitted libbpf: failed to load object '/wb/augmented_raw_syscalls.o' bpf: load objects failed: err=-1: (Operation not permitted) event syntax error: '/wb/augmented_raw_syscalls.o' \___ Operation not permitted (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf trace [<options>] [<command>] or: perf trace [<options>] -- <command> [<options>] or: perf trace record [<options>] [<command>] or: perf trace record [<options>] -- <command> [<options>] -e, --event <event> event/syscall selector. use 'perf list' to list available events # If we bumped 'ulimit -l 128' to get it from the 64k default to double that, it worked, so use the recently added rlimit__bump_memlock() helper: # perf trace -e /wb/augmented_raw_syscalls.o -e open,sleep sleep 1 0.000 ( 0.007 ms): sleep/28042 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY\|CLOEXEC) = 3 0.022 ( 0.004 ms): sleep/28042 openat(dfd: CWD, filename: "/lib64/libc.so.6", flags: RDONLY\|CLOEXEC) = 3 0.201 ( 0.007 ms): sleep/28042 openat(dfd: CWD, filename: "", flags: RDONLY\|CLOEXEC) = 3 0.241 (1000.421 ms): sleep/28042 nanosleep(rqtp: 0x7ffd6c3e6ed0) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-j6f2ioa6hj9dinzpjvlhcjoc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 16:36:45 -03:00
Arnaldo Carvalho de Melo	d3280ce01e	perf test: Auto bump rlimit(MEMLOCK) for BPF test sake I noticed that the 'perf test bpf' was failing: # perf test bpf 41: BPF filter : 41.1: Basic BPF filtering : Skip 41.2: BPF pinning : Skip 41.3: BPF prologue generation : Skip 41.4: BPF relocation checker : Skip # ulimit -l 64 # Using verbose mode we get just a line bout -EPERF being returned from libbpf's bpf_load_program_xattr(), that ends up being used in 'perf test bpf' initial program loading capability query: Missing basic BPF support, skip this test: Operation not permitted Not that informative, but on a separate problem when creating BPF maps bumping rlimit(MEMLOCK) helped, so I tried it here as well, works: # ulimit -l 128 # perf test bpf 41: BPF filter : 41.1: Basic BPF filtering : Ok 41.2: BPF pinning : Ok 41.3: BPF prologue generation : Ok 41.4: BPF relocation checker : Ok # So use the recently added rlimit__bump_memlock() helper: # ulimit -l 64 # perf test bpf 41: BPF filter : 41.1: Basic BPF filtering : Ok 41.2: BPF pinning : Ok 41.3: BPF prologue generation : Ok 41.4: BPF relocation checker : Ok # ulimit -l 64 # I.e. the bumping of memlock is restricted to the 'perf test' instance, not changing the global value. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-b9fubkhr4jm192lu7y8hgjvo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 16:27:01 -03:00
Arnaldo Carvalho de Melo	4975223b81	perf tools: Introduce rlimit__bump_memlock() helper Just like the BPF guys did when faced with failures with map creation, etc, i.e. their solution is: tools/testing/selftests/bpf/bpf_rlimit.h For perf use this function in 'perf test' and in 'perf trace'. Make it bump to 4 times the current value, if it fails twice the current value and if it still fails, warn that things like BPF map creation may fail, to help in diagnosing the problem. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-muvqef2i7n6pzqbmu7tn2d2y@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 14:59:11 -03:00
Leo Yan	323fd74982	perf intel-pt: Fix potential NULL pointer dereference found by the smatch tool Based on the following report from Smatch, fix the potential NULL pointer dereference check. tools/perf/util/intel-pt.c:3200 intel_pt_process_auxtrace_info() error: we previously assumed 'session->itrace_synth_opts' could be null (see line 3196) tools/perf/util/intel-pt.c:3206 intel_pt_process_auxtrace_info() warn: variable dereferenced before check 'session->itrace_synth_opts' (see line 3200) tools/perf/util/intel-pt.c 3196 if (session->itrace_synth_opts && session->itrace_synth_opts->set) { 3197 pt->synth_opts = *session->itrace_synth_opts; 3198 } else { 3199 itrace_synth_opts__set_default(&pt->synth_opts, 3200 session->itrace_synth_opts->default_no_sample); ^^^^^^^^^^^^^^^^^^^^^^^^^^ 3201 if (!session->itrace_synth_opts->default_no_sample && 3202 !session->itrace_synth_opts->inject) { 3203 pt->synth_opts.branches = false; 3204 pt->synth_opts.callchain = true; 3205 } 3206 if (session->itrace_synth_opts) ^^^^^^^^^^^^^^^^^^^^^^^^^^ 3207 pt->synth_opts.thread_stack = 3208 session->itrace_synth_opts->thread_stack; 3209 } 'session->itrace_synth_opts' is impossible to be a NULL pointer in intel_pt_process_auxtrace_info(), thus this patch removes the NULL test for 'session->itrace_synth_opts'. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190708143937.7722-4-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:28 -03:00
Leo Yan	1d48145881	perf intel-bts: Fix potential NULL pointer dereference found by the smatch tool Based on the following report from Smatch, fix the potential NULL pointer dereference check. tools/perf/util/intel-bts.c:898 intel_bts_process_auxtrace_info() error: we previously assumed 'session->itrace_synth_opts' could be null (see line 894) tools/perf/util/intel-bts.c:899 intel_bts_process_auxtrace_info() warn: variable dereferenced before check 'session->itrace_synth_opts' (see line 898) tools/perf/util/intel-bts.c 894 if (session->itrace_synth_opts && session->itrace_synth_opts->set) { 895 bts->synth_opts = *session->itrace_synth_opts; 896 } else { 897 itrace_synth_opts__set_default(&bts->synth_opts, 898 session->itrace_synth_opts->default_no_sample); ^^^^^^^^^^^^^^^^^^^^^^^^^^ 899 if (session->itrace_synth_opts) ^^^^^^^^^^^^^^^^^^^^^^^^^^ 900 bts->synth_opts.thread_stack = 901 session->itrace_synth_opts->thread_stack; 902 } 'session->itrace_synth_opts' is impossible to be a NULL pointer in intel_bts_process_auxtrace_info(), thus this patch removes the NULL test for 'session->itrace_synth_opts'. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190708143937.7722-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:28 -03:00
Song Liu	9d49169c59	perf script: Assume native_arch for pipe mode In pipe mode, session->header.env.arch is not populated until the events are processed. Therefore, the following command crashes: perf record -o - \| perf script (gdb) bt It fails when we try to compare env.arch against uts.machine: if (!strcmp(uts.machine, session->header.env.arch) \|\| (!strcmp(uts.machine, "x86_64") && !strcmp(session->header.env.arch, "i386"))) native_arch = true; In pipe mode, it is tricky to find env.arch at this stage. To keep it simple, let's just assume native_arch is always true for pipe mode. Reported-by: David Carrillo Cisneros <davidca@fb.com> Signed-off-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kernel-team@fb.com Cc: stable@vger.kernel.org #v5.1+ Fixes: `3ab481a1cf` ("perf script: Support insn output for normal samples") Link: http://lkml.kernel.org/r/20190621014438.810342-1-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:28 -03:00
Adrian Hunter	1334bb94cd	perf scripts python: export-to-sqlite.py: Fix DROP VIEW power_events_view Drop power_events_view before its dependent tables. SQLite does not seem to mind but the fix was needed for PostgreSQL (export-to-postgresql.py script), so do the same fix for the SQLite. It is more logical and keeps the 2 scripts following the same approach. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Fixes: `5130c6e555` ("perf scripts python: export-to-sqlite.py: Export Intel PT power and ptwrite events") Link: http://lkml.kernel.org/r/20190708055232.5032-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:28 -03:00
Adrian Hunter	d8d051df9f	perf scripts python: export-to-postgresql.py: Fix DROP VIEW power_events_view PostgreSQL can error if power_events_view is not dropped before its dependent tables e.g. Exception: Query failed: ERROR: cannot drop table mwait because other objects depend on it DETAIL: view power_events_view depends on table mwait Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Fixes: `aba44287a2` ("perf scripts python: export-to-postgresql.py: Export Intel PT power and ptwrite events") Link: http://lkml.kernel.org/r/20190708055232.5032-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:27 -03:00
Leo Yan	ceb75476db	perf hists browser: Fix potential NULL pointer dereference found by the smatch tool Based on the following report from Smatch, fix the potential NULL pointer dereference check. tools/perf/ui/browsers/hists.c:641 hist_browser__run() error: we previously assumed 'hbt' could be null (see line 625) tools/perf/ui/browsers/hists.c:3088 perf_evsel__hists_browse() error: we previously assumed 'browser->he_selection' could be null (see line 2902) tools/perf/ui/browsers/hists.c:3272 perf_evsel_menu__run() error: we previously assumed 'hbt' could be null (see line 3260) This patch firstly validating the pointers before access them, so can fix potential NULL pointer dereference. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190708143937.7722-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:27 -03:00
Leo Yan	0702f23c98	perf cs-etm: Fix potential NULL pointer dereference found by the smatch tool Based on the following report from Smatch, fix the potential NULL pointer dereference check. tools/perf/util/cs-etm.c:2545 cs_etm__process_auxtrace_info() error: we previously assumed 'session->itrace_synth_opts' could be null (see line 2541) tools/perf/util/cs-etm.c 2541 if (session->itrace_synth_opts && session->itrace_synth_opts->set) { 2542 etm->synth_opts = *session->itrace_synth_opts; 2543 } else { 2544 itrace_synth_opts__set_default(&etm->synth_opts, 2545 session->itrace_synth_opts->default_no_sample); ^^^^^^^^^^^^^^^^^^^^^^^^^^ 2546 etm->synth_opts.callchain = false; 2547 } 'session->itrace_synth_opts' is impossible to be a NULL pointer in cs_etm__process_auxtrace_info(), thus this patch removes the NULL test for 'session->itrace_synth_opts'. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190708143937.7722-5-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:27 -03:00
Luke Mujica	72de3fd97f	perf parse-events: Remove unused variable: error Remove the 'error' variable because it is declared but not used in parse-events.y or in the generated parse-events.c. Signed-off-by: Luke Mujica <lukemujica@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190703222509.109616-2-lukemujica@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:27 -03:00
Luke Mujica	34c9af571e	perf parse-events: Remove unused variable 'i' Remove the 'int i' because it is declared but not used in parse-events.y or in the generated parse-events.c. Signed-off-by: Luke Mujica <lukemujica@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190703222509.109616-1-lukemujica@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:27 -03:00
Arnaldo Carvalho de Melo	acc7bfb3db	perf metricgroup: Add missing list_del_init() when flushing egroups list So that at the end each of the entries have its list node struct cleared and the egroup list head ends emptied. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-dxzj1ah350fy9ec0xbhb15b6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:27 -03:00
Arnaldo Carvalho de Melo	e56fbc9dc7	perf tools: Use list_del_init() more thorougly To allow for destructors to check if they're operating on a object still in a list, and to avoid going from use after free list entries into still valid, or even also other already removed from list entries. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-deh17ub44atyox3j90e6rksu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:27 -03:00
Arnaldo Carvalho de Melo	d8f9da2404	perf tools: Use zfree() where applicable In places where the equivalent was already being done, i.e.: free(a); a = NULL; And in placs where struct members are being freed so that if we have some erroneous reference to its struct, then accesses to freed members will result in segfaults, which we can detect faster than use after free to areas that may still have something seemingly valid. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-jatyoofo5boc1bsvoig6bb6i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:27 -03:00
Arnaldo Carvalho de Melo	7f7c536f23	tools lib: Adopt zalloc()/zfree() from tools/perf Eroding a bit more the tools/perf/util/util.h hodpodge header. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:26 -03:00
Arnaldo Carvalho de Melo	e5653eb82d	perf tools: Move get_current_dir_name() cond prototype out of util.h And in a separate header, so that we erode util.h a bit more. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-xpzvuu9d0gei9jl9bkzgobln@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:26 -03:00
Arnaldo Carvalho de Melo	245aec7f7f	perf namespaces: Move the conditional setns() prototype to namespaces.h Out of util.h, to reduce its scope, and since we have a namespaces.h header, much better to have it there, where it is related to. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-zlu81bbtccuzygh7m8nmgybc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:26 -03:00
Arnaldo Carvalho de Melo	215a0d305c	perf tools: Add missing headers, mostly stdlib.h Part of the erosion of util/util.h, that will lose its include stdlib.h, we need to add it to places where it is needed but was getting it indirectly. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-1imnqezw99ahc07fjeb51qby@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 10:13:22 -03:00
Arnaldo Carvalho de Melo	fc50e0ba9b	perf evsel: perf_evsel__name(NULL) is valid, no need to check evsel It'll return "unknown", no need to open code it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-4okvjmm18arjrcyfhuahgfxm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 09:33:55 -03:00
Leo Yan	f3c8d90757	perf session: Fix potential NULL pointer dereference found by the smatch tool Based on the following report from Smatch, fix the potential NULL pointer dereference check. tools/perf/util/session.c:1252 dump_read() error: we previously assumed 'evsel' could be null (see line 1249) tools/perf/util/session.c 1240 static void dump_read(struct perf_evsel evsel, union perf_event event) 1241 { 1242 struct read_event *read_event = &event->read; 1243 u64 read_format; 1244 1245 if (!dump_trace) 1246 return; 1247 1248 printf(": %d %d %s %" PRIu64 "\n", event->read.pid, event->read.tid, 1249 evsel ? perf_evsel__name(evsel) : "FAIL", 1250 event->read.value); 1251 1252 read_format = evsel->attr.read_format; ^^^^^^^ 'evsel' could be NULL pointer, for this case this patch directly bails out without dumping read_event. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Song Liu <songliubraving@fb.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190702103420.27540-9-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 09:33:55 -03:00
Arnaldo Carvalho de Melo	40978e9bf2	perf inject: The tool->read() call may pass a NULL evsel, handle it Check first, as machines__deliver_event() may have perf_evlist__id2evsel() returning NULL. This was found while checking a report from Leo Yan that used the smatch tool to find places where a pointer is checked before use and then, later in the same function gets used without checking. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-muvb8xqyh0gysgfjfq35w642@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 09:33:55 -03:00
Leo Yan	363bbaef63	perf map: Fix potential NULL pointer dereference found by smatch tool Based on the following report from Smatch, fix the potential NULL pointer dereference check. tools/perf/util/map.c:479 map__fprintf_srccode() error: we previously assumed 'state' could be null (see line 466) tools/perf/util/map.c 465 /* Avoid redundant printing / 466 if (state && 467 state->srcfile && 468 !strcmp(state->srcfile, srcfile) && 469 state->line == line) { 470 free(srcfile); 471 return 0; 472 } 473 474 srccode = find_sourceline(srcfile, line, &len); 475 if (!srccode) 476 goto out_free_line; 477 478 ret = fprintf(fp, "\|%-8d %.s", line, len, srccode); 479 state->srcfile = srcfile; ^^^^^^^ 480 state->line = line; ^^^^^^^ This patch validates 'state' pointer before access its elements. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Song Liu <songliubraving@fb.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Fixes: `dd2e18e9ac` ("perf tools: Support 'srccode' output") Link: http://lkml.kernel.org/r/20190702103420.27540-8-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 09:33:55 -03:00
Leo Yan	7a6d49dc8c	perf trace: Fix potential NULL pointer dereference found by the smatch tool Based on the following report from Smatch, fix the potential NULL pointer dereference check. tools/perf/builtin-trace.c:1044 thread_trace__new() error: we previously assumed 'ttrace' could be null (see line 1041). tools/perf/builtin-trace.c 1037 static struct thread_trace thread_trace__new(void) 1038 { 1039 struct thread_trace ttrace = zalloc(sizeof(struct thread_trace)); 1040 1041 if (ttrace) 1042 ttrace->files.max = -1; 1043 1044 ttrace->syscall_stats = intlist__new(NULL); ^^^^^^^^ 1045 1046 return ttrace; 1047 } Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Song Liu <songliubraving@fb.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190702103420.27540-6-leo.yan@linaro.org [ Just made it look like other tools/perf constructors, same end result ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 09:33:55 -03:00
Leo Yan	600c787dbf	perf annotate: Fix dereferencing freed memory found by the smatch tool Based on the following report from Smatch, fix the potential dereferencing freed memory check. tools/perf/util/annotate.c:1125 disasm_line__parse() error: dereferencing freed memory 'namep' tools/perf/util/annotate.c 1100 static int disasm_line__parse(char line, const char namep, char rawp) 1101 { 1102 char tmp, name = ltrim(line); [...] 1114 namep = strdup(name); 1115 1116 if (namep == NULL) 1117 goto out_free_name; [...] 1124 out_free_name: 1125 free((void )namep); ^^^^^ 1126 namep = NULL; ^^^^^^ 1127 return -1; 1128 } If strdup() fails to allocate memory space for namep, we don't need to free memory with pointer 'namep', which is resident in data structure disasm_line::ins::name; and namep is NULL pointer for this failure, so it's pointless to assign NULL to *namep again. Committer note: Freeing namep, which is the address of the first entry of the 'struct ins' that is the first member of struct disasm_line would in fact free that disasm_line instance, if it was allocated via malloc/calloc, which, later, would a dereference of freed memory. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Song Liu <songliubraving@fb.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190702103420.27540-5-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 09:33:55 -03:00
Leo Yan	111442cfc8	perf top: Fix potential NULL pointer dereference detected by the smatch tool Based on the following report from Smatch, fix the potential NULL pointer dereference check. tools/perf/builtin-top.c:109 perf_top__parse_source() warn: variable dereferenced before check 'he' (see line 103) tools/perf/builtin-top.c:233 perf_top__show_details() warn: variable dereferenced before check 'he' (see line 228) tools/perf/builtin-top.c 101 static int perf_top__parse_source(struct perf_top top, struct hist_entry he) 102 { 103 struct perf_evsel evsel = hists_to_evsel(he->hists); ^^^^ 104 struct symbol sym; 105 struct annotation notes; 106 struct map map; 107 int err = -1; 108 109 if (!he \|\| !he->ms.sym) 110 return -1; This patch moves the values assignment after validating pointer 'he'. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Song Liu <songliubraving@fb.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190702103420.27540-4-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 09:33:54 -03:00
Leo Yan	c74b05030e	perf stat: Fix use-after-freed pointer detected by the smatch tool Based on the following report from Smatch, fix the use-after-freed pointer. tools/perf/builtin-stat.c:1353 add_default_attributes() warn: passing freed memory 'str'. The pointer 'str' has been freed but later it is still passed into the function parse_events_print_error(). This patch fixes this use-after-freed issue. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: linux-arm-kernel@lists.infradead.org Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Song Liu <songliubraving@fb.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lkml.kernel.org/r/20190702103420.27540-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 09:33:54 -03:00
Numfor Mbiziwo-Tiapo	4e4cf62b37	perf test mmap-thread-lookup: Initialize variable to suppress memory sanitizer warning Running the 'perf test' command after building perf with a memory sanitizer causes a warning that says: WARNING: MemorySanitizer: use-of-uninitialized-value... in mmap-thread-lookup.c Initializing the go variable to 0 silences this harmless warning. Committer warning: This was harmless, just a simple test writing whatever was at that sizeof(int) memory area just to signal another thread blocked reading that file created with pipe(). Initialize it tho so that we don't get this warning. Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Drayton <mbd@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190702173716.181223-1-nums@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-09 09:33:54 -03:00
Arnaldo Carvalho de Melo	e3b22a6534	Merge remote-tracking branch 'tip/perf/core' into perf/urgent To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-08 13:06:57 -03:00
Ingo Molnar	552a031ba1	Linux 5.2 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0idTweHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGZesIAJDKicw2Voyx8K8m 3pXSK+71RuO/d3Y9M51mdfTMKRP4PHR9/4wVZ9wHPwC4dV6wxgsmIYCF69a1Wety LD1MpDCP1DK5wVfPNKVX2xmj7ua6iutPtSsJHzdzM2TlscgsrFKjmUccqJ5JLwL5 c34nqwXWnzzRyI5Ga9cQSlwzAXq0vDHXyML3AnCosSsLX0lKFrHlK1zttdOPNkfj dXRN62g3q+9kVQozzhDXb8atZZ7IkBk8Q0lujpNXW83Ci1VjaVNv3SB8GZTXIlLj U15VdyuwfJDfpBgFBN6/unzVaAB6FFrEKy0jT1aeTyKarMKDKgOnJjn10aKjDNno /bXsKKc= =TVqV -----END PGP SIGNATURE----- Merge tag 'v5.2' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-07-08 18:04:41 +02:00
Arnaldo Carvalho de Melo	05c78468a6	tools build: Check if gettid() is available before providing helper Laura reported that the perf build failed in fedora when we got a glibc that provides gettid(), which I reproduced using fedora rawhide with the glibc-devel-2.29.9000-26.fc31.x86_64 package. Add a feature check to avoid providing a gettid() helper in such systems. On a fedora rawhide system with this patch applied we now get: [root@7a5f55352234 perf]# grep gettid /tmp/build/perf/FEATURE-DUMP feature-gettid=1 [root@7a5f55352234 perf]# cat /tmp/build/perf/feature/test-gettid.make.output [root@7a5f55352234 perf]# ldd /tmp/build/perf/feature/test-gettid.bin linux-vdso.so.1 (0x00007ffc6b1f6000) libc.so.6 => /lib64/libc.so.6 (0x00007f04e0a74000) /lib64/ld-linux-x86-64.so.2 (0x00007f04e0c47000) [root@7a5f55352234 perf]# nm /tmp/build/perf/feature/test-gettid.bin \| grep -w gettid U gettid@@GLIBC_2.30 [root@7a5f55352234 perf]# While on a fedora:29 system: [acme@quaco perf]$ grep gettid /tmp/build/perf/FEATURE-DUMP feature-gettid=0 [acme@quaco perf]$ cat /tmp/build/perf/feature/test-gettid.make.output test-gettid.c: In function ‘main’: test-gettid.c:8:9: error: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Werror=implicit-function-declaration] return gettid(); ^~~~~~ getgid cc1: all warnings being treated as errors [acme@quaco perf]$ Reported-by: Laura Abbott <labbott@redhat.com> Tested-by: Laura Abbott <labbott@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lkml.kernel.org/n/tip-yfy3ch53agmklwu9o7rlgf9c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-07 17:53:09 -03:00
Jiri Olsa	dab0f4ebb2	perf jvmti: Address gcc string overflow warning for strncpy() We are getting false positive gcc warning when we compile with gcc9 (9.1.1): CC jvmti/libjvmti.o In file included from /usr/include/string.h:494, from jvmti/libjvmti.c:5: In function ‘strncpy’, inlined from ‘copy_class_filename.constprop’ at jvmti/libjvmti.c:166:3: /usr/include/bits/string_fortified.h:106:10: error: ‘__builtin_strncpy’ specified bound depends on the length of the source argument [-Werror=stringop-overflow=] 106 \| return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest)); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ jvmti/libjvmti.c: In function ‘copy_class_filename.constprop’: jvmti/libjvmti.c:165:26: note: length computed here 165 \| size_t file_name_len = strlen(file_name); \| ^~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors As per Arnaldo's suggestion use strlcpy(), which does the same thing and keeps gcc silent. Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190531131321.GB1281@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-07 12:33:32 -03:00
Arnaldo Carvalho de Melo	c18ae6327a	perf python: Remove -fstack-protector-strong if clang doesn't have it Some distros put -fstack-protector-strong in the compiler flags to be used to build python extensions, but then, the clang version in that distro doesn't know about that, only gcc does. Check if that is the case and remove it from the set of options used to build the python binding with clang. Case at hand: oraclelinux:7 $ head -2 /etc/os-release NAME="Oracle Linux Server" VERSION="7.6" $ grep stack-protector /usr/lib64/python2.7/_sysconfigdata.py \| head -1 \| cut -c-120 'CFLAGS': '-fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --para $ gcc version 4.8.5 20150623 (Red Hat 4.8.5-36.0.1) (GCC) clang version 3.4.2 (tags/RELEASE_34/dot2-final) clang: error: unknown argument: '-fstack-protector-strong' clang: error: unknown argument: '-fstack-protector-strong' error: command 'clang' failed with exit status 1 cp: cannot stat '/tmp/build/perf/python_ext_build/lib/perf.so': No such file or directory make[2]: ** [/tmp/build/perf/python/perf.so] Error 1 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-brmp2415zxpbhz45etkgjoma@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-07 12:32:46 -03:00
Arnaldo Carvalho de Melo	d5b2179d6a	perf annotate TUI browser: Do not use member from variable within its own initialization Some compilers will complain when using a member of a struct to initialize another member, in the same struct initialization. For instance: debian:8 Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0) oraclelinux:7 clang version 3.4.2 (tags/RELEASE_34/dot2-final) Produce: ui/browsers/annotate.c:104:12: error: variable 'ops' is uninitialized when used within its own initialization [-Werror,-Wuninitialized] (!ops.current_entry \|\| ^~~ 1 error generated. So use an extra variable, initialized just before that struct, to have the value used in the expressions used to init two of the struct members. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: `c298304bd7` ("perf annotate: Use a ops table for annotation_line__write()") Link: https://lkml.kernel.org/n/tip-f9nexro58q62l3o9hez8hr0i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-06 16:59:11 -03:00
Seeteena Thoufeek	bff5a556c1	perf tests: Fix record+probe_libc_inet_pton.sh for powerpc64 'probe libc's inet_pton & backtrace it with ping' testcase sometimes fails on powerpc because distro ping binary does not have symbol information and thus it prints "[unknown]" function name in the backtrace. Accept "[unknown]" as valid function name for powerpc as well. # perf test -v "probe libc's inet_pton & backtrace it with ping" Before: 59: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 79695 ping 79718 [077] 96483.787025: probe_libc:inet_pton: (7fff83a754c8) 7fff83a754c8 __GI___inet_pton+0x8 (/usr/lib64/power9/libc-2.28.so) 7fff83a2b7a0 gaih_inet.constprop.7+0x1020 (/usr/lib64/power9/libc-2.28.so) 7fff83a2c170 getaddrinfo+0x160 (/usr/lib64/power9/libc-2.28.so) 1171830f4 [unknown] (/usr/bin/ping) FAIL: expected backtrace entry ".\+0x[[:xdigit:]]+[[:space:]]$./bin/ping.*$$" got "1171830f4 [unknown] (/usr/bin/ping)" test child finished with -1 ---- end ---- probe libc's inet_pton & backtrace it with ping: FAILED! After: 59: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 79085 ping 79108 [045] 96400.214177: probe_libc:inet_pton: (7fffbb9654c8) 7fffbb9654c8 __GI___inet_pton+0x8 (/usr/lib64/power9/libc-2.28.so) 7fffbb91b7a0 gaih_inet.constprop.7+0x1020 (/usr/lib64/power9/libc-2.28.so) 7fffbb91c170 getaddrinfo+0x160 (/usr/lib64/power9/libc-2.28.so) 132e830f4 [unknown] (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com> Reviewed-by: Kim Phillips <kim.phillips@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan@linux.ibm.com> Fixes: `1632936480` ("perf tests: Fix record+probe_libc_inet_pton.sh without ping's debuginfo") Link: http://lkml.kernel.org/r/1561630614-3216-1-git-send-email-s1seetee@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-06 14:31:01 -03:00
Jiri Olsa	cd13618937	perf evsel: Do not rely on errno values for precise_ip fallback Konstantin reported problem with default perf record command, which fails on some AMD servers, because of the default maximum precise config. The current fallback mechanism counts on getting ENOTSUP errno for precise_ip fails, but that's not the case on some AMD servers. We can fix this by removing the errno check completely, because the precise_ip fallback is separated. We can just try (if requested by evsel->precise_max) all possible precise_ip, and if one succeeds we win, if not, we continue with standard fallback. Reported-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin.monnet@netronome.com> Cc: Kim Phillips <kim.phillips@amd.com> Link: http://lkml.kernel.org/r/20190703080949.10356-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-06 14:30:30 -03:00
Arnaldo Carvalho de Melo	4c00af0e94	perf thread: Allow references to thread objects after machine__exit() Threads are created when we either synthesize PERF_RECORD_FORK events for pre-existing threads or when we receive PERF_RECORD_FORK events from the kernel as new threads get created. We then keep them in machine->threads[].entries rb trees till when we receive a PERF_RECORD_EXIT, i.e. that thread terminated. The thread object has a reference count that is grabbed when, for instance, we keep that thread referenced in struct hist_entry, in 'perf report' and 'perf top'. When we receive a PERF_RECORD_EXIT we remove the thread object from the rb tree and move it to the corresponding machine->threads[].dead list, then we do a thread__put(), dropping the reference we had for keeping it in the rb tree. In thread__put() we were assuming that when the reference count hit zero we should remove it from the dead list by simply doing a list_del_init(&thread->node). That works well when all the thread lifetime is during the machine that has the list heads lifetime, since we know that we can do the list_del_init() and it will update the 'dead' list_head. But in 'perf sched lat' we were doing: machine__new() (via perf_session__new) process events, grabbing refcounts to keep those thread objects in 'perf sched' local data structures. machine__exit() (via perf_session__delete) which would delete the 'dead' list heads. And then doing the final thread__put() for the refcounts 'perf sched' rightfully obtained for keeping those thread object references. b00m, since thread__put() would do the list_del_init() touching a dead dead list head. Fix it by removing all the dead threads from machine->threads[].dead at machine__exit(), since whatever is there should have refcounts taken by things like 'perf sched lat', and make thread__put() check if the thread is in a linked list before removing it from that list. Reported-by: Wei Li <liwei391@huawei.com> Link: https://lkml.kernel.org/r/20190508143648.8153-1-liwei391@huawei.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhipeng Xie <xiezhipeng1@huawei.com> Link: https://lkml.kernel.org/r/20190704194355.GI10740@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-06 14:29:32 -03:00
Song Liu	c952b35f4b	perf header: Assign proper ff->ph in perf_event__synthesize_features() bpf/btf write_* functions need ff->ph->env. With this missing, pipe-mode (perf record -o -) would crash like: Program terminated with signal SIGSEGV, Segmentation fault. This patch assign proper ph value to ff. Committer testing: (gdb) run record -o - Starting program: /root/bin/perf record -o - PERFILE2 <SNIP start of perf.data headers> Thread 1 "perf" received signal SIGSEGV, Segmentation fault. __do_write_buf (size=4, buf=0x160, ff=0x7fffffff8f80) at util/header.c:126 126 memcpy(ff->buf + ff->offset, buf, size); (gdb) bt #0 __do_write_buf (size=4, buf=0x160, ff=0x7fffffff8f80) at util/header.c:126 #1 do_write (ff=ff@entry=0x7fffffff8f80, buf=buf@entry=0x160, size=4) at util/header.c:137 #2 0x00000000004eddba in write_bpf_prog_info (ff=0x7fffffff8f80, evlist=<optimized out>) at util/header.c:912 #3 0x00000000004f69d7 in perf_event__synthesize_features (tool=tool@entry=0x97cc00 <record>, session=session@entry=0x7fffe9c6d010, evlist=0x7fffe9cae010, process=process@entry=0x4435d0 <process_synthesized_event>) at util/header.c:3695 #4 0x0000000000443c79 in record__synthesize (tail=tail@entry=false, rec=0x97cc00 <record>) at builtin-record.c:1214 #5 0x0000000000444ec9 in __cmd_record (rec=0x97cc00 <record>, argv=<optimized out>, argc=0) at builtin-record.c:1435 #6 cmd_record (argc=0, argv=<optimized out>) at builtin-record.c:2450 #7 0x00000000004ae3e9 in run_builtin (p=p@entry=0x98e058 <commands+216>, argc=argc@entry=3, argv=0x7fffffffd670) at perf.c:304 #8 0x000000000042eded in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:356 #9 run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:400 #10 main (argc=3, argv=<optimized out>) at perf.c:522 (gdb) After the patch the SEGSEGV is gone. Reported-by: David Carrillo Cisneros <davidca@fb.com> Signed-off-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kernel-team@fb.com Cc: stable@vger.kernel.org # v5.1+ Fixes: `606f972b13` ("perf bpf: Save bpf_prog_info information as headers to perf.data") Link: http://lkml.kernel.org/r/20190620010453.4118689-1-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-06 14:29:04 -03:00
Arnaldo Carvalho de Melo	15a108af1a	perf script: Allow specifying the files to process guest samples The 'perf kvm' command set up things so that we can record, report, top, etc, but not 'script', so make 'perf script' be able to process samples by allowing to pass guest kallsyms, vmlinux, modules, etc, and if at least one of those is provided, set perf_guest to true so that guest samples get properly resolved. Testing it: # perf kvm --guest --guestkallsyms /wb/rhel6.kallsyms --guestmodules /wb/rhel6.modules record -e cycles:Gk ^C[ perf record: Woken up 7 times to write data ] [ perf record: Captured and wrote 3.602 MB perf.data.guest (10492 samples) ] # # perf evlist -i perf.data.guest cycles:Gk # perf evlist -v -i perf.data.guest cycles:Gk: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP\|TID\|TIME\|CPU\|PERIOD, read_format: ID, disabled: 1, inherit: 1, exclude_user: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_host: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # # perf kvm --guestkallsyms /wb/rhel6.kallsyms --guestmodules /wb/rhel6.modules report --stdio -s sym \| head -30 # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 10K of event 'cycles:Gk' # Event count (approx.): 2434201408 # # Overhead Symbol # ........ .............................................. # 11.93% [g] avtab_search_node 3.95% [g] sidtab_context_to_sid 2.41% [g] n_tty_write 2.20% [g] _spin_unlock_irqrestore 1.37% [g] _aesni_dec4 1.33% [g] kmem_cache_alloc 1.07% [g] native_write_cr0 0.99% [g] kfree 0.95% [g] _spin_lock 0.91% [g] __memset 0.87% [g] schedule 0.83% [g] _spin_lock_irqsave 0.76% [g] __kmalloc 0.67% [g] avc_has_perm_noaudit 0.66% [g] kmem_cache_free 0.65% [g] glue_xts_crypt_128bit 0.59% [g] __d_lookup 0.59% [g] __audit_syscall_exit 0.56% [g] __memcpy # Then, when trying to use perf script to generate a python script and then process the events after adding a python hook for non-tracepoint events: # perf script -i perf.data.guest -g python generated Python script: perf-script.py # vim perf-script.py # tail -2 perf-script.py def process_event(param_dict): print(param_dict["symbol"]) # # perf script -i perf.data.guest -s perf-script.py \| head in trace_begin vmx_vmexit vmx_vmexit vmx_vmexit vmx_vmexit vmx_vmexit vmx_vmexit vmx_vmexit vmx_vmexit vmx_vmexit 231 # We'd see just the vmx_vmexit, i.e. the samples from the guest don't show up. After this patch: # perf script --guestkallsyms /wb/rhel6.kallsyms --guestmodules /wb/rhel6.modules -i perf.data.guest -s perf-script.py 2> /dev/null \| head -30 in trace_begin apic_timer_interrupt apic_timer_interrupt apic_timer_interrupt apic_timer_interrupt apic_timer_interrupt save_args do_timer drain_array inode_permission avc_has_perm_noaudit run_timer_softirq apic_timer_interrupt apic_timer_interrupt apic_timer_interrupt apic_timer_interrupt apic_timer_interrupt kvm_guest_apic_eoi_write run_posix_cpu_timers _spin_lock handle_pte_fault rcu_irq_enter delay_tsc delay_tsc native_read_tsc apic_timer_interrupt sys_open internal_add_timer list_del rcu_exit_nohz # Jiri Olsa noticed we need to set 'perf_guest' to true if we want to process guest samples and I made it be set if one of the guest files settings get set via the command line options added in this patch, that match those present in the 'perf kvm' command. We probably want to have 'perf record', 'perf report' etc to notice that there are guest samples and do the right thing, which is to look for files with some suffix that make it be associated with the guest used to collect the samples, i.e. if a vmlinux file is passed, we can get the build-id from it, if not some other identifier or simply looking for "kallsyms.guest", for instance, in the current directory. Reported-by: Mariano Pache <npache@redhat.com> Tested-by: Mariano Pache <npache@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Cc: Ali Raza <alirazabhutta.10@gmail.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Mario <jmario@redhat.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Orran Krieger <okrieger@redhat.com> Cc: Ramkumar Ramachandra <artagnon@gmail.com> Cc: Yunlong Song <yunlong.song@huawei.com> Link: https://lkml.kernel.org/n/tip-d54gj64rerlxcqsrod05biwn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-03 00:13:25 -03:00
Andi Kleen	488c3bf7ec	perf tools metric: Don't include duration_time in group The Memory_BW metric generates groups including duration_time, which maps to a software event. For some reason this makes the group always not count. Always put duration_time outside a group when generating metrics. It's always the same time, so no need to group it. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190628220737.13259-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 16:08:16 -03:00
Andi Kleen	9c344d15f5	perf list: Avoid extra : for --raw metrics When printing the metrics raw, don't print : after the metricgroups. This helps the command line completion to complete those too. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190628220737.13259-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 16:08:16 -03:00
Andi Kleen	4df79ba3eb	perf vendor events intel: Metric fixes for SKX/CLX - Add a missing filter for the DRAM_Latency / DRAM_Parallel_Reads metrics - Remove the useless PMM_* metrics from Skylake Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190628220737.13259-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 16:08:16 -03:00
Andi Kleen	734ac47e23	perf tools: Fix typos / broken sentences - Fix a typo in the man page - Fix a tip that doesn't make any sense. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190628220900.13741-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 16:08:16 -03:00
John Garry	edd93a4076	perf jevents: Add support for Hisi hip08 L3C PMU aliasing Add support for Hisi hip08 L3C PMU aliasing. The kernel driver is in drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1561732552-143038-5-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 16:08:16 -03:00
John Garry	8f5b703add	perf jevents: Add support for Hisi hip08 HHA PMU aliasing Add support for Hisi hip08 HHA PMU aliasing. The kernel driver is in drivers/perf/hisilicon/hisi_uncore_hha_pmu.c Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1561732552-143038-4-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 16:08:15 -03:00
John Garry	57cc732479	perf jevents: Add support for Hisi hip08 DDRC PMU aliasing Add support for Hisi hip08 DDRC PMU aliasing. We can now do something like this: $perf list [snip] uncore ddrc: uncore_hisi_ddrc.act_cmd [DDRC active commands. Unit: hisi_sccl,ddrc] uncore_hisi_ddrc.flux_rcmd [DDRC read commands. Unit: hisi_sccl,ddrc] uncore_hisi_ddrc.flux_wcmd [DDRC write commands. Unit: hisi_sccl,ddrc] uncore_hisi_ddrc.flux_wr [DDRC precharge commands. Unit: hisi_sccl,ddrc] uncore_hisi_ddrc.rnk_chg [DDRC rank commands. Unit: hisi_sccl,ddrc] uncore_hisi_ddrc.rw_chg [DDRC read and write changes. Unit: hisi_sccl,ddrc] Performance counter stats for 'system wide': 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl1_ddrc0] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl3_ddrc1] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl5_ddrc2] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl7_ddrc3] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl5_ddrc0] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl7_ddrc1] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl1_ddrc3] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl1_ddrc1] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl3_ddrc2] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl5_ddrc3] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl3_ddrc0] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl5_ddrc1] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl7_ddrc2] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl7_ddrc0] 20,421 uncore_hisi_ddrc.flux_rcmd [hisi_sccl1_ddrc2] 0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl3_ddrc3] 1.001559011 seconds time elapsed The kernel driver is in drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1561732552-143038-3-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 16:08:15 -03:00
John Garry	730670b1d1	perf pmu: Support more complex PMU event aliasing The jevent "Unit" field is used for uncore PMU alias definition. The form uncore_pmu_example_X is supported, where "X" is a wildcard, to support multiple instances of the same PMU in a system. Unfortunately this format not suitable for all uncore PMUs; take the Hisi DDRC uncore PMU for example, where the name is in the form hisi_scclX_ddrcY. For for current jevent parsing, we would be required to hardcode an uncore alias translation for each possible value of X. This is not scalable. Instead, add support for "Unit" field in the form "hisi_sccl,ddrc", where we can match by hisi_scclX and ddrcY. Tokens in Unit field are delimited by ','. Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1561732552-143038-2-git-send-email-john.garry@huawei.com [ Shut up older gcc complianing about the last arg to strtok_r() being uninitialized, set that tmp to NULL ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 16:07:36 -03:00
Jin Yao	c8f7bc1a08	perf diff: Documentation -c cycles option Documentation the new computation selection 'cycles'. v4: --- Change the column 'Block cycles diff [start:end]' to '[Program Block Range] Cycles Diff' Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-8-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 13:20:51 -03:00
Jin Yao	b10c78c509	perf diff: Print the basic block cycles diff $ perf record -b ./div $ perf record -b ./div Following is the default perf diff output $ perf diff # Event 'cycles' # # Baseline Delta Abs Shared Object Symbol # ........ ......... ................ .................................. # 48.75% +0.33% div [.] main 8.21% -0.20% div [.] compute_flag 19.02% -0.12% libc-2.23.so [.] __random_r 16.17% -0.09% libc-2.23.so [.] __random 2.27% -0.03% div [.] rand@plt +0.02% [i915] [k] gen8_irq_handler 5.52% +0.02% libc-2.23.so [.] rand This patch creates a new computation selection 'cycles'. $ perf diff -c cycles # Event 'cycles' # # Baseline [Program Block Range] Cycles Diff Shared Object Symbol # ........ ....................................... ......................................... # 48.75% [div.c:42 -> div.c:45] 147 div [.] main 48.75% [div.c:31 -> div.c:40] 4 div [.] main 48.75% [div.c:40 -> div.c:40] 0 div [.] main 48.75% [div.c:42 -> div.c:42] 0 div [.] main 48.75% [div.c:42 -> div.c:44] 0 div [.] main 19.02% [random_r.c:357 -> random_r.c:360] 0 libc-2.23.so [.] __random_r 19.02% [random_r.c:357 -> random_r.c:373] 0 libc-2.23.so [.] __random_r 19.02% [random_r.c:357 -> random_r.c:376] 0 libc-2.23.so [.] __random_r 19.02% [random_r.c:357 -> random_r.c:380] 0 libc-2.23.so [.] __random_r 19.02% [random_r.c:357 -> random_r.c:392] 0 libc-2.23.so [.] __random_r 16.17% [random.c:288 -> random.c:291] 0 libc-2.23.so [.] __random 16.17% [random.c:288 -> random.c:291] 0 libc-2.23.so [.] __random 16.17% [random.c:288 -> random.c:295] 0 libc-2.23.so [.] __random 16.17% [random.c:288 -> random.c:297] 0 libc-2.23.so [.] __random 16.17% [random.c:291 -> random.c:291] 0 libc-2.23.so [.] __random 16.17% [random.c:293 -> random.c:293] 0 libc-2.23.so [.] __random 8.21% [div.c:22 -> div.c:22] 148 div [.] compute_flag 8.21% [div.c:22 -> div.c:25] 0 div [.] compute_flag 8.21% [div.c:27 -> div.c:28] 0 div [.] compute_flag 5.52% [rand.c:26 -> rand.c:27] 0 libc-2.23.so [.] rand 5.52% [rand.c:26 -> rand.c:28] 0 libc-2.23.so [.] rand 2.27% [rand@plt+0 -> rand@plt+0] 0 div [.] rand@plt 0.01% [entry_64.S:694 -> entry_64.S:694] 16 [vmlinux] [k] native_irq_return_iret 0.00% [fair.c:7676 -> fair.c:7665] 162 [vmlinux] [k] update_blocked_averages "[Program Block Range]" indicates the range of program basic block (start -> end). If we can find the source line it prints the source line otherwise it prints the symbol+offset instead. v4: --- Use source lines or symbol+offset to indicate the basic block. It should be easier to understand. v3: --- Cast 'struct hist_entry' to 'struct block_hist' in hist_entry__block_fprintf. Use symbol_conf.report_block to check if executing hist_entry__block_fprintf. v2: --- Keep standard perf diff format and display the 'Baseline' and 'Shared Object'. The output is sorted by "Baseline" and the basic blocks in the same function are sorted by cycles diff. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-7-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 13:20:51 -03:00
Jin Yao	f3810817b2	perf diff: Link same basic blocks among different data The target is to compare the performance difference (cycles diff) for the same basic blocks in different data files. The same basic block means same function, same start address and same end address. This patch finds the same basic blocks from different data files and link them together and resort by the cycles diff. v3: --- The block stuffs are maintained by new structure 'block_hist', so this patch is update accordingly. v2: --- Since now the basic block hists is changed to per symbol, the patch only links the basic block hists for the same symbol in different data files. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-6-git-send-email-yao.jin@linux.intel.com [ sym->name is an array, not a pointer, so no need to check it for NULL, fixes de build in some distros ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 13:20:15 -03:00
Jin Yao	99150a1faa	perf diff: Use hists to manage basic blocks per symbol The hist__account_cycles() can account cycles per basic block. The basic block information is saved in cycles_hist structure. This patch processes each symbol, get basic blocks from cycles_hist and add the basic block entries to a new hists (in 'struct block_hist'). Using a hists is because we need to compare, sort and print the basic blocks later. v6: --- Since 'ops' argument is removed from hists__add_entry_block, update the code accordingly. No functional change. v5: --- Since now we still carry block_info in 'struct hist_entry' we don't need to use our own new/free ops for hist entries. And the block_info is released in hist_entry__delete. v3: --- 1. In v2, we put block stuffs in 'struct hist_entry', but it's not a good design. In v3, we create a new 'struct block_hist' and cast the 'struct hist_entry' to 'struct block_hist' in some places, which can avoid adding new stuffs in 'struct hist_entry'. 2. abs() -> labs(), in block_cycles_diff_cmp(). v2: --- v1 adds the basic block entries to per data-file hists but v2 adds the basic block entries to per symbol hists. That is to keep current perf-diff format. Will show the result in next patches. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-5-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 12:47:07 -03:00
Jin Yao	30d815534e	perf diff: Check if all data files with branch stacks We will expand perf diff to support diff cycles of individual programs blocks, so it requires all data files having branch stacks. This patch checks HEADER_BRANCH_STACK in header, and only set the flag has_br_stack when HEADER_BRANCH_STACK are set in all data files. v2: --- Move check_file_brstack() from __cmd_diff() to cmd_diff(). Because later patch will check flag 'has_br_stack' before ui_init(). Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 12:46:11 -03:00
Jin Yao	fe96245c7f	perf hists: Add block_info in hist_entry The block_info contains the program basic block information, i.e, contains the start address and the end address of this basic block and how much cycles it takes. We need to compare, sort and even print out the basic block by some orders, i.e. sort by cycles. For this purpose, we add block_info field to hist_entry. In order not to impact current interface, we creates a new function hists__add_entry_block. v6: --- Remove the 'ops' argument in hists__add_entry_block Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 12:45:23 -03:00
Jin Yao	0cec2447e7	perf symbol: Create block_info structure 'perf diff' currently can only diff symbols(functions). We should expand it to diff cycles of individual programs blocks as reported by timed LBR. This would allow to identify changes in specific code accurately. We need a new structure to maintain the basic block information, such as, symbol(function), start/end address of this block, cycles. This patch creates this structure and with some ops. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-02 12:44:19 -03:00
Luke Mujica	06c642c0e9	perf jevents: Use nonlocal include statements in pmu-events.c Change pmu-events.c to not use local include statements. The code that creates the include statements for pmu-events.c is in jevents.c. pmu-events.c is a generated file, and for build systems that put generated files in a separate directory, include statements with local pathing cannot find non-generated files. Signed-off-by: Luke Mujica <lukemujica@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Numfor Mbiziwo-Tiapo <nums@google.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lkml.kernel.org/n/tip-prgnwmaoo1pv9zz4vnv1bjaj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-01 22:50:42 -03:00
Mao Han	aa23aa5516	perf annotate: Add csky support This patch add basic arch initialization and instruction associate support for the csky CPU architecture. E.g.: $ perf annotate --stdio2 Samples: 161 of event 'cpu-clock:pppH', 4000 Hz, Event count (approx.): 40250000, [percent: local period] test_4() /usr/lib/perf-test/callchain_test Percent Disassembly of section .text: 00008420 <test_4>: test_4(): subi sp, sp, 4 st.w r8, (sp, 0x0) mov r8, sp subi sp, sp, 8 subi r3, r8, 4 movi r2, 0 st.w r2, (r3, 0x0) ↓ br 2e 100.00 14: subi r3, r8, 4 ld.w r2, (r3, 0x0) subi r3, r8, 8 st.w r2, (r3, 0x0) subi r3, r8, 4 ld.w r3, (r3, 0x0) addi r2, r3, 1 subi r3, r8, 4 st.w r2, (r3, 0x0) 2e: subi r3, r8, 4 ld.w r2, (r3, 0x0) lrw r3, 0x98967f // 8598 <main+0x28> cmplt r3, r2 ↑ bf 14 mov r0, r0 mov r0, r0 mov sp, r8 ld.w r8, (sp, 0x0) addi sp, sp, 4 ← rts Signed-off-by: Mao Han <han_mao@c-sky.com> Acked-by: Guo Ren <guoren@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-csky@vger.kernel.org Link: http://lkml.kernel.org/r/d874d7782d9acdad5d98f2f5c4a6fb26fbe41c5d.1561531557.git.han_mao@c-sky.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-01 22:50:41 -03:00
Andi Kleen	e3a9427323	perf stat: Fix metrics with --no-merge Since Fixes: `8c5421c016` ("perf pmu: Display pmu name when printing unmerged events in stat") using --no-merge adds the PMU name to the evsel name. This breaks the metric value lookup because the parser doesn't know about this. Remove the extra postfixes for the metric evaluation. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Agustin Vega-Frias <agustinv@codeaurora.org> Cc: Kan Liang <kan.liang@linux.intel.com> Fixes: `8c5421c016` ("perf pmu: Display pmu name when printing unmerged events in stat") Link: http://lkml.kernel.org/r/20190624193711.35241-5-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-01 22:50:41 -03:00
Andi Kleen	2f87f33f42	perf stat: Fix group lookup for metric group The metric group code tries to find a group it added earlier in the evlist. Fix the lookup to handle groups with partially overlaps correctly. When a sub string match fails and we reset the match, we have to compare the first element again. I also renamed the find_evsel function to find_evsel_group to make its purpose clearer. With the earlier changes this fixes: Before: % perf stat -M UPI,IPC sleep 1 ... 1,032,922 uops_retired.retire_slots # 1.1 UPI 1,896,096 inst_retired.any 1,896,096 inst_retired.any 1,177,254 cpu_clk_unhalted.thread After: % perf stat -M UPI,IPC sleep 1 ... 1,013,193 uops_retired.retire_slots # 1.1 UPI 932,033 inst_retired.any 932,033 inst_retired.any # 0.9 IPC 1,091,245 cpu_clk_unhalted.thread Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Fixes: `b18f3e3650` ("perf stat: Support JSON metrics in perf stat") Link: http://lkml.kernel.org/r/20190624193711.35241-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-01 22:50:41 -03:00
Andi Kleen	6c5f4e5cb3	perf stat: Don't merge events in the same PMU Event merging is mainly to collapse similar events in lots of different duplicated PMUs. It can break metric displaying. It's possible for two metrics to have the same event, and when the two events happen in a row the second wouldn't be displayed. This would also not show the second metric. To avoid this don't merge events in the same PMU. This makes sense, if we have multiple events in the same PMU there is likely some reason for it (e.g. using multiple groups) and we better not merge them. While in theory it would be possible to construct metrics that have events with the same name in different PMU no current metrics have this problem. This is the fix for perf stat -M UPI,IPC (needs also another bug fix to completely work) Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Fixes: `430daf2dc7` ("perf stat: Collapse identically named events") Link: http://lkml.kernel.org/r/20190624193711.35241-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-01 22:50:41 -03:00
Andi Kleen	145c407c80	perf stat: Make metric event lookup more robust After setting up metric groups through the event parser, the metricgroup code looks them up again in the event list. Make sure we only look up events that haven't been used by some other metric. The data structures currently cannot handle more than one metric per event. This avoids problems with multiple events partially overlapping. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: http://lkml.kernel.org/r/20190624193711.35241-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-01 22:50:41 -03:00
Arnaldo Carvalho de Melo	9c10548c42	tools lib: Move argv_{split,free} from tools/perf/util/ This came from the kernel lib/argv_split.c, so move it to tools/lib/argv_split.c, to get it closer to the kernel structure. We need to audit the usage of argv_split() to figure out if it is really necessary to do have one allocation per argv[] entry, looking at one of its users I guess that is not the case and we probably are even leaking those allocations by not using argv_free() judiciously, for later. With this we further remove stuff from tools/perf/util/, reducing the perf specific codebase and encouraging other tools/ code to use these routines so as to keep the style and constructs used with the kernel. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-j479s1ive9h75w5lfg16jroz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-01 22:50:40 -03:00
Arnaldo Carvalho de Melo	af0de0c5f0	perf tools: Drop strxfrchar(), use strreplace() equivalent from kernel No change in behaviour intended, just reducing the codebase and using something available in tools/lib/. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-oyi6zif3810nwi4uu85odnhv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-01 22:50:40 -03:00
Arnaldo Carvalho de Melo	13c230ab6e	perf tools: Ditch rtrim(), use strim() from tools/lib Cleaning up a bit more tools/perf/util/ by using things we got from the kernel and have in tools/lib/ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-7hluuoveryoicvkclshzjf1k@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-01 22:50:33 -03:00
Linus Torvalds	57103eb7c6	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Various fixes, most of them related to bugs perf fuzzing found in the x86 code" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/regs: Use PERF_REG_EXTENDED_MASK perf/x86: Remove pmu->pebs_no_xmm_regs perf/x86: Clean up PEBS_XMM_REGS perf/x86/regs: Check reserved bits perf/x86: Disable extended registers for non-supported PMUs perf/ioctl: Add check for the sample_period value perf/core: Fix perf_sample_regs_user() mm check	2019-06-29 19:39:17 +08:00
Arnaldo Carvalho de Melo	3ca43b6053	perf tools: Remove trim() implementation, use tools/lib's strim() Moving more stuff out of tools/perf/util/ and using the kernel idiom. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-wpj8rktj62yse5dq6ckny6de@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-26 12:06:20 -03:00
Arnaldo Carvalho de Melo	328584804e	perf tools: Ditch rtrim(), use skip_spaces() to get closer to the kernel No change in behaviour, just using the same kernel idiom for such operation. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: André Goddard Rosa <andre.goddard@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-a85lkptkt0ru40irpga8yf54@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-26 11:42:03 -03:00
Arnaldo Carvalho de Melo	526bbbdd44	perf report: Use skip_spaces() No change in behaviour intended. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-lcywlfqbi37nhegmhl1ar6wg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-26 11:31:43 -03:00
Arnaldo Carvalho de Melo	80e9073f1f	perf metricgroup: Use strsep() No change in behaviour intended, trivial optimization done by avoiding looking for spaces in 'g' right after setting it to "No_group". Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-f2siadtp3hb5o0l1w7bvd8bk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-26 11:31:43 -03:00
Arnaldo Carvalho de Melo	c1fc14cbdc	perf strfilter: Use skip_spaces() No change in behaviour. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-p9rtamq7lvre9zhti70azfwe@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-26 11:31:43 -03:00
Arnaldo Carvalho de Melo	ee44b5b51f	perf probe: Use skip_spaces() for argv handling The skip_sep() routine has the same implementation as skip_spaces(), recently adopted from the kernel, sources, switch to it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-0ix211a81z2016dl5nmtdci4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-26 11:31:37 -03:00
Arnaldo Carvalho de Melo	9bb5a27ac7	perf time-utils: Use skip_spaces() No change in behaviour intended. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-cpugv7qd5vzhbtvnlydo90jv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 21:39:18 -03:00
Arnaldo Carvalho de Melo	fc6a172600	perf header: Use skip_spaces() in __write_cpudesc() No change in behaviour. Cc: Stephane Eranian <eranian@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-0dbfpi70aa66s6mtd8z6p391@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 21:34:31 -03:00
Arnaldo Carvalho de Melo	810826acd1	perf stat: Use recently introduced skip_spaces() No change in behaviour. Cc: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-ncpvp4eelf8fqhuy29uv56z9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 21:28:49 -03:00
Arnaldo Carvalho de Melo	bd9860bf05	perf tools: Use linux/ctype.h in more places There were a few places where we still were using the libc version of ctype.h, switch to the one in tools/lib/ctype.c that the rest of perf uses. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-wa4nz4kt61eze88eprk20tfd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 21:13:51 -03:00
Arnaldo Carvalho de Melo	3052ba56bc	tools perf: Move from sane_ctype.h obtained from git to the Linux's original We got the sane_ctype.h headers from git and kept using it so far, but since that code originally came from the kernel sources to the git sources, perhaps its better to just use the one in the kernel, so that we can leverage tools/perf/check_headers.sh to be notified when our copy gets out of sync, i.e. when fixes or goodies are added to the code we've copied. This will help with things like tools/lib/string.c where we want to have more things in common with the kernel, such as strim(), skip_spaces(), etc so as to go on removing the things that we have in tools/perf/util/ and instead using the code in the kernel, indirectly and removing things like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements are made to the original code. Hopefully this also should help with reducing the difference of code hosted in tools/ to the one in the kernel proper. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 21:02:47 -03:00
Arnaldo Carvalho de Melo	1b2fc358dd	perf tools: Add missing util.h to pick up 'page_size' variable Not to depend of getting it indirectly. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-tirjsmvu4ektw0k7lm8k9lhu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 18:35:34 -03:00
Arnaldo Carvalho de Melo	9f3926e08c	perf tools: Remove old baggage that is util/include/linux/ctype.h It was just including a ../util.h that wasn't even there: $ cat tools/perf/util/include/linux/../util.h cat: tools/perf/util/include/linux/../util.h: No such file or directory $ This would make kallsyms.h get util.h somehow and then files including it would get util.h defined stuff, a mess, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-wlzwken4psiat4zvfbvaoqiw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 18:31:12 -03:00
Arnaldo Carvalho de Melo	cf8b6970f4	perf symbols: We need util.h in symbol-elf.c for zfree() Continuing to untangle the headers, we're about to remove the old odd baggage that is tools/perf/util/include/linux/ctype.h. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-gapezcq3p8bzrsi96vdtq0o0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 18:31:06 -03:00
Arnaldo Carvalho de Melo	155681fcd7	perf kallsyms: Adopt hex2u64 from tools/perf/util/util.h Just removing more stuff from tools/perf/, this is mostly used in the kallsyms parsing and in places in perf where kallsyms is involved, so we get it for free there. With this we reduce a bit more util.h. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-5mc1zg0jqdwgkn8c358kaba6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 18:13:17 -03:00
Arnaldo Carvalho de Melo	af41949d9e	tools x86 machine: Add missing util.h to pick up 'page_size' We're getting it by sheer luck, add that util.h to get the 'page_size' definition. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-347078mgj3d2jfygtxs4ntti@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 18:01:47 -03:00
Arnaldo Carvalho de Melo	6a9fa4e3bd	perf string: Move 'dots' and 'graph_dotted_line' out of sane_ctype.h Those are not in that file in the git repo, lets move it from there so that we get that sane ctype code fully isolated to allow getting it in sync either with the git sources or better with the kernel sources (include/linux/ctype.h + lib/ctype.h), that way we can use check_headers.h to get notified when changes are made in the original code so that we can cherry-pick. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-ioh5sghn3943j0rxg6lb2dgs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 17:31:26 -03:00
Arnaldo Carvalho de Melo	93d50edc80	perf ctype: Remove now unused 'spaces' variable We can left justify just fine using the 'field width' modifier in %s printf, ditch this variable. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-2td8u86mia7143lbr5ttl0kf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 16:28:40 -03:00
Arnaldo Carvalho de Melo	b598c34ffc	perf ui stdio: No need to use 'spaces' to left align We can just use the 'field width' for the %s used to print the alignment, this way we'll get the same result without requiring having a variable with just lots of space chars. No way to do that for the dots tho, we still need that variable filled with dot chars. # perf report --stdio --hierarchy > before # perf report --stdio --hierarchy > after # diff before after # I.e. it continues as: # perf report --stdio --hierarchy \| head -15 # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 107 of event 'cycles' # Event count (approx.): 31378313 # # Overhead Command / Shared Object / Symbol # .............. ............................................ # 80.13% swapper 72.29% [kernel.vmlinux] 49.85% [k] intel_idle 9.05% [k] tick_nohz_next_event # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-9s1dxik37waveor7c84hqti2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 16:24:20 -03:00
Arnaldo Carvalho de Melo	828e27a899	perf ctype: Remove unused 'graph_line' variable Not being used at all anywhere. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-1e567f8tn8m4ii7dy1w9dp39@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 16:04:17 -03:00
Adrian Hunter	aba44287a2	perf scripts python: export-to-postgresql.py: Export Intel PT power and ptwrite events The format of synthesized events is determined by the attribute config. For the formats for Intel PT power and ptwrite events, create tables and populate them when the synth_data handler is called. If the tables remain empty, drop them at the end. The tables and views, including a combined power_events_view, will display automatically from the tables menu of the exported exported-sql-viewer.py script. Note, currently only Atoms since Gemini Lake have support for ptwrite and mwait, pwre, exstop and pwrx, but all Intel PT implementations support cbr. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190622093248.581-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 08:47:10 -03:00
Adrian Hunter	5130c6e555	perf scripts python: export-to-sqlite.py: Export Intel PT power and ptwrite events The format of synthesized events is determined by the attribute config. For the formats for Intel PT power and ptwrite events, create tables and populate them when the synth_data handler is called. If the tables remain empty, drop them at the end. The tables and views, including a combined power_events_view, will display automatically from the tables menu of the exported exported-sql-viewer.py script. Note, currently only Atoms since Gemini Lake have support for ptwrite and mwait, pwre, exstop and pwrx, but all Intel PT implementations support cbr. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190622093248.581-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 08:47:10 -03:00
Adrian Hunter	b9322cab17	perf db-export: Export synth events Synthesized events are samples but with architecture-specific data stored in sample->raw_data. They are identified by attribute type PERF_TYPE_SYNTH. Add a function to export them. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190622093248.581-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 08:47:10 -03:00
Adrian Hunter	5fe2cf7d19	perf intel-pt: Synthesize CBR events when last seen value changes The first core-to-bus ratio (CBR) event will not be shown if --itrace 's' option (skip initial number of events) is used, nor if time intervals are specified that do not include the start of tracing. Change the logic to record the last CBR value seen by the user, and synthesize CBR events whenever that changes. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190622093248.581-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 08:47:10 -03:00
Adrian Hunter	51b0918618	perf intel-pt: Add CBR value to decoder state For convenience, add the core-to-bus ratio (CBR) value to the decoder state. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190622093248.581-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 08:47:10 -03:00
Adrian Hunter	91de8684f1	perf intel-pt: Cater for CBR change in PSB+ PSB+ provides status information only so the core-to-bus ratio (CBR) in PSB+ will not have changed from its previous value. However, cater for the possibility of a another CBR change that gets caught up in the PSB+ anyway. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190622093248.581-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 08:47:10 -03:00
Adrian Hunter	abe5a1d3e4	perf intel-pt: Decoder to output CBR changes immediately The core-to-bus ratio (CBR) provides the CPU frequency. With branches enabled, the decoder was outputting CBR changes only when there was a branch. That loses the correct time of the change if the trace is not in context (e.g. not tracing kernel space). Change to output the CBR change immediately. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190622093248.581-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 08:47:10 -03:00
Kyle Meyer	9f94c7f947	perf tools: Increase MAX_NR_CPUS and MAX_CACHES Attempting to profile 1024 or more CPUs with perf causes two errors: perf record -a [ perf record: Woken up X times to write data ] way too many cpu caches.. [ perf record: Captured and wrote X MB perf.data (X samples) ] perf report -C 1024 Error: failed to set cpu bitmap Requested CPU 1024 too large. Consider raising MAX_NR_CPUS Increasing MAX_NR_CPUS from 1024 to 2048 and redefining MAX_CACHES as MAX_NR_CPUS * 4 returns normal functionality to perf: perf record -a [ perf record: Woken up X times to write data ] [ perf record: Captured and wrote X MB perf.data (X samples) ] perf report -C 1024 ... Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190620193630.154025-1-meyerk@stormcage.eag.rdlabs.hpecorp.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 08:47:10 -03:00
Adrian Hunter	eb5d854456	perf thread-stack: Eliminate code duplicating thread_stack__pop_ks() Use new function thread_stack__pop_ks() in place of equivalent code. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190619064429.14940-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 08:47:10 -03:00
Adrian Hunter	97860b483c	perf thread-stack: Fix thread stack return from kernel for kernel-only case Commit `f08046cb30` ("perf thread-stack: Represent jmps to the start of a different symbol") had the side-effect of introducing more stack entries before return from kernel space. When user space is also traced, those entries are popped before entry to user space, but when user space is not traced, they get stuck at the bottom of the stack, making the stack grow progressively larger. Fix by detecting a return-from-kernel branch type, and popping kernel addresses from the stack then. Note, the problem and fix affect the exported Call Graph / Tree but not the callindent option used by "perf script --call-trace". Example: perf-with-kcore record example -e intel_pt//k -- ls perf-with-kcore script example --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py example.db branches calls ~/libexec/perf-core/scripts/python/exported-sql-viewer.py example.db Menu option: Reports -> Context-Sensitive Call Graph Before: (showing Call Path column only) Call Path ▶ perf ▼ ls ▼ 12111:12111 ▶ setup_new_exec ▶ __task_pid_nr_ns ▶ perf_event_pid_type ▶ perf_event_comm_output ▶ perf_iterate_ctx ▶ perf_iterate_sb ▶ perf_event_comm ▶ __set_task_comm ▶ load_elf_binary ▶ search_binary_handler ▶ __do_execve_file.isra.41 ▶ __x64_sys_execve ▶ do_syscall_64 ▼ entry_SYSCALL_64_after_hwframe ▼ swapgs_restore_regs_and_return_to_usermode ▼ native_iret ▶ error_entry ▶ do_page_fault ▼ error_exit ▼ retint_user ▶ prepare_exit_to_usermode ▼ native_iret ▶ error_entry ▶ do_page_fault ▼ error_exit ▼ retint_user ▶ prepare_exit_to_usermode ▼ native_iret ▶ error_entry ▶ do_page_fault ▼ error_exit ▼ retint_user ▶ prepare_exit_to_usermode ▶ native_iret After: (showing Call Path column only) Call Path ▶ perf ▼ ls ▼ 12111:12111 ▶ setup_new_exec ▶ __task_pid_nr_ns ▶ perf_event_pid_type ▶ perf_event_comm_output ▶ perf_iterate_ctx ▶ perf_iterate_sb ▶ perf_event_comm ▶ __set_task_comm ▶ load_elf_binary ▶ search_binary_handler ▶ __do_execve_file.isra.41 ▶ __x64_sys_execve ▶ do_syscall_64 ▶ entry_SYSCALL_64_after_hwframe ▶ page_fault ▼ entry_SYSCALL_64 ▼ do_syscall_64 ▶ __x64_sys_brk ▶ __x64_sys_access ▶ __x64_sys_openat ▶ __x64_sys_newfstat ▶ __x64_sys_mmap ▶ __x64_sys_close ▶ __x64_sys_read ▶ __x64_sys_mprotect ▶ __x64_sys_arch_prctl ▶ __x64_sys_munmap ▶ exit_to_usermode_loop ▶ __x64_sys_set_tid_address ▶ __x64_sys_set_robust_list ▶ __x64_sys_rt_sigaction ▶ __x64_sys_rt_sigprocmask ▶ __x64_sys_prlimit64 ▶ __x64_sys_statfs ▶ __x64_sys_ioctl ▶ __x64_sys_getdents64 ▶ __x64_sys_write ▶ __x64_sys_exit_group Committer notes: The first arg to the perf-with-kcore needs to be the same for the 'record' and 'script' lines, otherwise we'll record the perf.data file and kcore_dir/ files in one directory ('example') to then try to use it from the 'bep' directory, fix the instructions above it so that both use 'example'. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: `f08046cb30` ("perf thread-stack: Represent jmps to the start of a different symbol") Link: http://lkml.kernel.org/r/20190619064429.14940-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 08:47:10 -03:00
Numfor Mbiziwo-Tiapo	2d7102a045	perf tools: Fix cache.h include directive Change the include path so that progress.c can find cache.h since it was previously searching in the wrong directory. Committer notes: $ ls -la tools/perf/ui/../cache.h ls: cannot access 'tools/perf/ui/../cache.h': No such file or directory So it really should include ../../util/cache.h, or plain cache.h, since we have -Iutil in INC_FLAGS in tools/perf/Makefile.config Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com> Cc: Jiri Olsa <jolsa@redhat.com>, Cc: Luke Mujica <lukemujica@google.com>, Cc: Stephane Eranian <eranian@google.com> To: Ian Rogers <irogers@google.com> Link: https://lkml.kernel.org/n/tip-pud8usyutvd2npg2vpsygncz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-25 08:47:09 -03:00
Ingo Molnar	b9271f0c65	Linux 5.2-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Os1seHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGtx4H/j6i482XzcGFKTBm A7mBoQpy+kLtoUov4EtBAR62OuwI8rsahW9di37QKndPoQrczWaKBmr3De6LCdPe v3pl3O6wBbvH5ru+qBPFX9PdNbDvimEChh7LHxmMxNQq3M+AjZAZVJyfpoiFnx35 Fbge+LZaH/k8HMwZmkMr5t9Mpkip715qKg2o9Bua6dkH0AqlcpLlC8d9a+HIVw/z aAsyGSU8jRwhoAOJsE9bJf0acQ/pZSqmFp0rDKqeFTSDMsbDRKLGq/dgv4nW0RiW s7xqsjb/rdcvirRj3rv9+lcTVkOtEqwk0PVdL9WOf7g4iYrb3SOIZh8ZyViaDSeH VTS5zps= =huBY -----END PGP SIGNATURE----- Merge tag 'v5.2-rc6' into perf/core, to refresh branch Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-06-24 19:25:52 +02:00
Kan Liang	8b12b812f5	perf/x86/regs: Use PERF_REG_EXTENDED_MASK Use the macro defined in kernel ABI header to replace the local name. No functional change. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: https://lkml.kernel.org/r/1559081314-9714-5-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-06-24 19:19:26 +02:00
Thomas Gleixner	d2912cb15b	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-19 17:09:55 +02:00
Thomas Gleixner	b15f321b9f	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 480 Based on 1 normalized pattern(s): adapted from oprofile gplv2 support extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to add the SPDX license identifier to 1 file(s) Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081204.397687630@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-19 17:09:51 +02:00
Thomas Gleixner	6d8a639ade	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 479 Based on 1 normalized pattern(s): released under the gpl v2 based on gplv2 source code extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081204.281377867@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-19 17:09:51 +02:00
Arnaldo Carvalho de Melo	78d6ccce03	perf build: Handle slang being in /usr/include and in /usr/include/slang/ In some distros slang.h may be in a /usr/include 'slang' subdir, so use the if slang is not explicitely disabled (by using NO_SLANG=1) and its feature test for the common case (having /usr/include/slang.h) failed, use the results for the test that checks if it is in slang/slang.h. Change the only file in perf that includes slang.h to use HAVE_SLANG_INCLUDE_SUBDIR and forget about this for good. On a rhel6 system now we have: $ /tmp/build/perf/perf -vv \| grep slang libslang: [ on ] # HAVE_SLANG_SUPPORT $ ldd /tmp/build/perf/perf \| grep libslang libslang.so.2 => /usr/lib64/libslang.so.2 (0x00007fa2d5a8d000) $ grep slang /tmp/build/perf/FEATURE-DUMP feature-libslang=0 feature-libslang-include-subdir=1 $ cat /etc/redhat-release CentOS release 6.10 (Final) $ While on fedora:29: $ /tmp/build/perf/perf -vv \| grep slang libslang: [ on ] # HAVE_SLANG_SUPPORT $ ldd /tmp/build/perf/perf \| grep slang libslang.so.2 => /lib64/libslang.so.2 (0x00007f8eb11a7000) $ grep slang /tmp/build/perf/FEATURE-DUMP feature-libslang=1 feature-libslang-include-subdir=1 $ $ cat /etc/fedora-release Fedora release 29 (Twenty Nine) $ The feature-libslang-include-subdir=1 line is because the 'gettid()' test was added to test-all.c as the new glibc has an implementation for that, so we soon should have it not failing, i.e. should be the common case soon. Perhaps I should move it out till it becomes the norm... Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: `1955c8cf5e` ("perf tools: Don't hardcode host include path for libslang") Link: https://lkml.kernel.org/n/tip-bkgtpsu3uit821fuwsdhj9gd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-18 17:48:12 -03:00
Florian Fainelli	1955c8cf5e	perf tools: Don't hardcode host include path for libslang Hardcoding /usr/include/slang is fundamentally incompatible with cross compilation and will lead to the inability for a cross-compiled environment to properly detect whether slang is available or not. If /usr/include/slang is necessary that is a distribution specific knowledge that could be solved with either a standard pkg-config .pc file (which slang has) or simply overriding CFLAGS accordingly, but the default perf Makefile should be clean of all of that. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: bcm-kernel-feedback-list@broadcom.com Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin.monnet@netronome.com> Cc: Stanislav Fomichev <sdf@google.com> Fixes: `ef7b93a119` ("perf report: Librarize the annotation code and use it in the newt browser") Link: http://lkml.kernel.org/r/20190614183949.5588-1-f.fainelli@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:20 -03:00
Arnaldo Carvalho de Melo	fdbdd7e858	perf evsel: Make perf_evsel__name() accept a NULL argument In which case it simply returns "unknown", like when it can't figure out the evsel->name value. This makes this code more robust and fixes a problem in 'perf trace' where a NULL evsel was being passed to a routine that only used the evsel for printing its name when a invalid syscall id was passed. Reported-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-f30ztaasku3z935cn3ak3h53@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:20 -03:00
Arnaldo Carvalho de Melo	016f327ce4	perf trace: Fixup pointer arithmetic when consuming augmented syscall args We can't just add the consumed bytes to the arg->augmented.args member, as it is not void , so it will access (consumed sizeof(struct augmented_arg)) in the next augmented arg, totally wrong, cast the member to void pointe before adding the number of bytes consumed, duh. With this and hardcoding handling the 'renameat' and 'renameat2' syscalls in the tools/perf/examples/bpf/augmented_raw_syscalls.c eBPF proggie, we get: mv/24388 renameat2(AT_FDCWD, "/tmp/build/perf/util/.bpf-event.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.bpf-event.o.cmd", RENAME_NOREPLACE) = 0 mv/24394 renameat2(AT_FDCWD, "/tmp/build/perf/util/.perf-hooks.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.perf-hooks.o.cmd", RENAME_NOREPLACE) = 0 mv/24398 renameat2(AT_FDCWD, "/tmp/build/perf/util/.pmu-bison.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.pmu-bison.o.cmd", RENAME_NOREPLACE) = 0 mv/24401 renameat2(AT_FDCWD, "/tmp/build/perf/util/.expr-bison.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.expr-bison.o.cmd", RENAME_NOREPLACE) = 0 mv/24406 renameat2(AT_FDCWD, "/tmp/build/perf/util/.pmu.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.pmu.o.cmd", RENAME_NOREPLACE) = 0 mv/24407 renameat2(AT_FDCWD, "/tmp/build/perf/util/.pmu-flex.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.pmu-flex.o.cmd", RENAME_NOREPLACE) = 0 mv/24416 renameat2(AT_FDCWD, "/tmp/build/perf/util/.parse-events-flex.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.parse-events-flex.o.cmd", RENAME_NOREPLACE) = 0 I.e. it works with two string args in the same syscall. Now back to taming the verifier... Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: `8195168e87` ("perf trace: Consume the augmented_raw_syscalls payload") Link: https://lkml.kernel.org/n/tip-n1w59lpxks6m1le7fpo6rmyw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:19 -03:00
John Garry	599ee18f07	perf pmu: Fix uncore PMU alias list for ARM64 In commit `292c34c102` ("perf pmu: Fix core PMU alias list for X86 platform"), we fixed the issue of CPU events being aliased to uncore events. Fix this same issue for ARM64, since the said commit left the (broken) behaviour untouched for ARM64. Signed-off-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Cc: stable@vger.kernel.org Fixes: `292c34c102` ("perf pmu: Fix core PMU alias list for X86 platform") Link: http://lkml.kernel.org/r/1560521283-73314-2-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:19 -03:00
Arnaldo Carvalho de Melo	5875cf4cd3	perf tests: Add missing SPDX headers Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-p0kg493z2m8qizjbdefzip1i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:19 -03:00
Arnaldo Carvalho de Melo	99f26f8548	perf trace: Streamline validation of select syscall names list Rename the 'i' variable to 'nr_used' and use set 'nr_allocated' since the start of this function, leaving the final assignment of the longer named trace->ev_qualifier_ids.nr state to 'nr_used' at the end of the function. No change in behaviour intended. Cc: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lkml.kernel.org/n/tip-kpgyn8xjdjgt0timrrnniquv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:19 -03:00
Arnaldo Carvalho de Melo	a4066d64d9	perf trace: Fix exclusion of not available syscall names from selector list We were just skipping the syscalls not available in a particular architecture without reflecting this in the number of entries in the ev_qualifier_ids.nr variable, fix it. This was done with the most minimalistic way, reusing the index variable 'i', a followup patch will further clean this by making 'i' renamed to 'nr_used' and using 'nr_allocated' in a few more places. Reported-by: Leo Yan <leo.yan@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Fixes: `04c41bcb86` ("perf trace: Skip unknown syscalls when expanding strace like syscall groups") Link: https://lkml.kernel.org/r/20190613181514.GC1402@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:19 -03:00
Arnaldo Carvalho de Melo	4541a8bb13	tools build: Check if gettid() is available before providing helper Laura reported that the perf build failed in fedora when we got a glibc that provides gettid(), which I reproduced using fedora rawhide with the glibc-devel-2.29.9000-26.fc31.x86_64 package. Add a feature check to avoid providing a gettid() helper in such systems. On a fedora rawhide system with this patch applied we now get: [root@7a5f55352234 perf]# grep gettid /tmp/build/perf/FEATURE-DUMP feature-gettid=1 [root@7a5f55352234 perf]# cat /tmp/build/perf/feature/test-gettid.make.output [root@7a5f55352234 perf]# ldd /tmp/build/perf/feature/test-gettid.bin linux-vdso.so.1 (0x00007ffc6b1f6000) libc.so.6 => /lib64/libc.so.6 (0x00007f04e0a74000) /lib64/ld-linux-x86-64.so.2 (0x00007f04e0c47000) [root@7a5f55352234 perf]# nm /tmp/build/perf/feature/test-gettid.bin \| grep -w gettid U gettid@@GLIBC_2.30 [root@7a5f55352234 perf]# While on a fedora:29 system: [acme@quaco perf]$ grep gettid /tmp/build/perf/FEATURE-DUMP feature-gettid=0 [acme@quaco perf]$ cat /tmp/build/perf/feature/test-gettid.make.output test-gettid.c: In function ‘main’: test-gettid.c:8:9: error: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Werror=implicit-function-declaration] return gettid(); ^~~~~~ getgid cc1: all warnings being treated as errors [acme@quaco perf]$ Reported-by: Laura Abbott <labbott@redhat.com> Tested-by: Laura Abbott <labbott@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lkml.kernel.org/n/tip-yfy3ch53agmklwu9o7rlgf9c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:19 -03:00
Adrian Hunter	e01f0ef509	perf intel-pt: Add callchain to synthesized PEBS sample Like other synthesized events, if there is also an Intel PT branch trace, then a call stack can also be synthesized. Add that. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:18 -03:00
Adrian Hunter	975846eddf	perf intel-pt: Add memory information to synthesized PEBS sample Add memory information from PEBS data in the Intel PT trace to the synthesized PEBS sample. This provides sample types PERF_SAMPLE_ADDR, PERF_SAMPLE_WEIGHT, and PERF_SAMPLE_TRANSACTION, but not PERF_SAMPLE_DATA_SRC. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:18 -03:00
Adrian Hunter	aa62afd7da	perf intel-pt: Add LBR information to synthesized PEBS sample Add LBR information from PEBS data in the Intel PT trace to the synthesized PEBS sample. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:18 -03:00
Adrian Hunter	143d34a6b3	perf intel-pt: Add XMM registers to synthesized PEBS sample Add XMM register information from PEBS data in the Intel PT trace to the synthesized PEBS sample. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:18 -03:00
Adrian Hunter	9e9a618afc	perf intel-pt: Add gp registers to synthesized PEBS sample Add general purpose register information from PEBS data in the Intel PT trace to the synthesized PEBS sample. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:18 -03:00
Adrian Hunter	9d0bc53e35	perf intel-pt: Synthesize PEBS sample basic information Synthesize a PEBS sample using basic information (ip, timestamp) only. Other PEBS information will be added in later patches. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:18 -03:00
Adrian Hunter	0dfded34a2	perf intel-pt: Factor out common sample preparation for re-use Factor out common sample preparation for re-use when synthesizing PEBS samples. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:18 -03:00
Adrian Hunter	e62ca655ee	perf intel-pt: Prepare to synthesize PEBS samples Add infrastructure to prepare for synthesizing PEBS samples but leave the actual synthesis to later patches. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:17 -03:00
Adrian Hunter	4c35595e1e	perf intel-pt: Add decoder support for PEBS via PT PEBS data is encoded in Block Item Packets (BIP). Populate a new structure intel_pt_blk_items with the values and, upon a Block End Packet (BEP), report them as a new Intel PT sample type INTEL_PT_BLK_ITEMS. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:17 -03:00
Adrian Hunter	a0db77bf88	perf intel-pt: Add Intel PT packet decoder test Add Intel PT packet decoder test. This test feeds byte sequences to the Intel PT packet decoder and checks the results. Changes to the packet context are also checked. Committer testing: # perf test "Intel PT" 65: Intel PT packet decoder : Ok # perf test -v "Intel PT" 65: Intel PT packet decoder : --- start --- test child forked, pid 6360 Decoded ok: 00 PAD Decoded ok: 04 TNT N (1) Decoded ok: 06 TNT T (1) Decoded ok: 80 TNT NNNNNN (6) Decoded ok: fe TNT TTTTTT (6) Decoded ok: 02 a3 02 00 00 00 00 00 TNT N (1) Decoded ok: 02 a3 03 00 00 00 00 00 TNT T (1) Decoded ok: 02 a3 00 00 00 00 00 80 TNT NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN (47) Decoded ok: 02 a3 ff ff ff ff ff ff TNT TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT (47) Decoded ok: 0d TIP no ip Decoded ok: 2d 01 02 TIP 0x201 Decoded ok: 4d 01 02 03 04 TIP 0x4030201 Decoded ok: 6d 01 02 03 04 05 06 TIP 0x60504030201 Decoded ok: 8d 01 02 03 04 05 06 TIP 0x60504030201 Decoded ok: cd 01 02 03 04 05 06 07 08 TIP 0x807060504030201 Decoded ok: 11 TIP.PGE no ip Decoded ok: 31 01 02 TIP.PGE 0x201 Decoded ok: 51 01 02 03 04 TIP.PGE 0x4030201 Decoded ok: 71 01 02 03 04 05 06 TIP.PGE 0x60504030201 Decoded ok: 91 01 02 03 04 05 06 TIP.PGE 0x60504030201 Decoded ok: d1 01 02 03 04 05 06 07 08 TIP.PGE 0x807060504030201 Decoded ok: 01 TIP.PGD no ip Decoded ok: 21 01 02 TIP.PGD 0x201 Decoded ok: 41 01 02 03 04 TIP.PGD 0x4030201 Decoded ok: 61 01 02 03 04 05 06 TIP.PGD 0x60504030201 Decoded ok: 81 01 02 03 04 05 06 TIP.PGD 0x60504030201 Decoded ok: c1 01 02 03 04 05 06 07 08 TIP.PGD 0x807060504030201 Decoded ok: 1d FUP no ip Decoded ok: 3d 01 02 FUP 0x201 Decoded ok: 5d 01 02 03 04 FUP 0x4030201 Decoded ok: 7d 01 02 03 04 05 06 FUP 0x60504030201 Decoded ok: 9d 01 02 03 04 05 06 FUP 0x60504030201 Decoded ok: dd 01 02 03 04 05 06 07 08 FUP 0x807060504030201 Decoded ok: 02 43 02 04 06 08 0a 0c PIP 0x60504030201 (NR=0) Decoded ok: 02 43 03 04 06 08 0a 0c PIP 0x60504030201 (NR=1) Decoded ok: 99 00 MODE.Exec 16 Decoded ok: 99 01 MODE.Exec 64 Decoded ok: 99 02 MODE.Exec 32 Decoded ok: 99 20 MODE.TSX TXAbort:0 InTX:0 Decoded ok: 99 21 MODE.TSX TXAbort:0 InTX:1 Decoded ok: 99 22 MODE.TSX TXAbort:1 InTX:0 Decoded ok: 02 83 TraceSTOP Decoded ok: 02 03 12 00 CBR 0x12 Decoded ok: 19 01 02 03 04 05 06 07 TSC 0x7060504030201 Decoded ok: 59 12 MTC 0x12 Decoded ok: 02 73 00 00 00 00 00 TMA CTC 0x0 FC 0x0 Decoded ok: 02 73 01 02 00 00 00 TMA CTC 0x201 FC 0x0 Decoded ok: 02 73 00 00 00 ff 01 TMA CTC 0x0 FC 0x1ff Decoded ok: 02 73 80 c0 00 ff 01 TMA CTC 0xc080 FC 0x1ff Decoded ok: 03 CYC 0x0 Decoded ok: 0b CYC 0x1 Decoded ok: fb CYC 0x1f Decoded ok: 07 02 CYC 0x20 Decoded ok: ff fe CYC 0xfff Decoded ok: 07 01 02 CYC 0x1000 Decoded ok: ff ff fe CYC 0x7ffff Decoded ok: 07 01 01 02 CYC 0x80000 Decoded ok: ff ff ff fe CYC 0x3ffffff Decoded ok: 07 01 01 01 02 CYC 0x4000000 Decoded ok: ff ff ff ff fe CYC 0x1ffffffff Decoded ok: 07 01 01 01 01 02 CYC 0x200000000 Decoded ok: ff ff ff ff ff fe CYC 0xffffffffff Decoded ok: 07 01 01 01 01 01 02 CYC 0x10000000000 Decoded ok: ff ff ff ff ff ff fe CYC 0x7fffffffffff Decoded ok: 07 01 01 01 01 01 01 02 CYC 0x800000000000 Decoded ok: ff ff ff ff ff ff ff fe CYC 0x3fffffffffffff Decoded ok: 07 01 01 01 01 01 01 01 02 CYC 0x40000000000000 Decoded ok: ff ff ff ff ff ff ff ff fe CYC 0x1fffffffffffffff Decoded ok: 07 01 01 01 01 01 01 01 01 02 CYC 0x2000000000000000 Decoded ok: ff ff ff ff ff ff ff ff ff 0e CYC 0xffffffffffffffff Decoded ok: 02 c8 01 02 03 04 05 VMCS 0x504030201 Decoded ok: 02 f3 OVF Decoded ok: 02 f3 OVF Decoded ok: 02 f3 OVF Decoded ok: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB Decoded ok: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB Decoded ok: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB Decoded ok: 02 23 PSBEND Decoded ok: 02 c3 88 01 02 03 04 05 06 07 00 MNT 0x7060504030201 Decoded ok: 02 12 01 02 03 04 PTWRITE 0x4030201 IP:0 Decoded ok: 02 32 01 02 03 04 05 06 07 08 PTWRITE 0x807060504030201 IP:0 Decoded ok: 02 92 01 02 03 04 PTWRITE 0x4030201 IP:1 Decoded ok: 02 b2 01 02 03 04 05 06 07 08 PTWRITE 0x807060504030201 IP:1 Decoded ok: 02 62 EXSTOP IP:0 Decoded ok: 02 e2 EXSTOP IP:1 Decoded ok: 02 c2 00 00 00 00 00 00 00 00 MWAIT 0x0 Hints 0x0 Extensions 0x0 Decoded ok: 02 c2 01 02 03 04 05 06 07 08 MWAIT 0x807060504030201 Hints 0x1 Extensions 0x1 Decoded ok: 02 c2 ff 02 03 04 07 06 07 08 MWAIT 0x8070607040302ff Hints 0xff Extensions 0x3 Decoded ok: 02 22 00 00 PWRE 0x0 HW:0 CState:0 Sub-CState:0 Decoded ok: 02 22 01 02 PWRE 0x201 HW:0 CState:0 Sub-CState:2 Decoded ok: 02 22 80 34 PWRE 0x3480 HW:1 CState:3 Sub-CState:4 Decoded ok: 02 22 00 56 PWRE 0x5600 HW:0 CState:5 Sub-CState:6 Decoded ok: 02 a2 00 00 00 00 00 PWRX 0x0 Last CState:0 Deepest CState:0 Wake Reason 0x0 Decoded ok: 02 a2 01 02 03 04 05 PWRX 0x504030201 Last CState:0 Deepest CState:1 Wake Reason 0x2 Decoded ok: 02 a2 ff ff ff ff ff PWRX 0xffffffffff Last CState:15 Deepest CState:15 Wake Reason 0xf Decoded ok: 02 63 00 BBP SZ 8-byte Type 0x0 Decoded ok: 02 63 80 BBP SZ 4-byte Type 0x0 Decoded ok: 02 63 1f BBP SZ 8-byte Type 0x1f Decoded ok: 02 63 9f BBP SZ 4-byte Type 0x1f Decoded ok: 04 00 00 00 00 BIP ID 0x00 Value 0x0 Decoded ok: fc 00 00 00 00 BIP ID 0x1f Value 0x0 Decoded ok: 04 01 02 03 04 BIP ID 0x00 Value 0x4030201 Decoded ok: fc 01 02 03 04 BIP ID 0x1f Value 0x4030201 Decoded ok: 04 00 00 00 00 00 00 00 00 BIP ID 0x00 Value 0x0 Decoded ok: fc 00 00 00 00 00 00 00 00 BIP ID 0x1f Value 0x0 Decoded ok: 04 01 02 03 04 05 06 07 08 BIP ID 0x00 Value 0x807060504030201 Decoded ok: fc 01 02 03 04 05 06 07 08 BIP ID 0x1f Value 0x807060504030201 Decoded ok: 02 33 BEP IP:0 Decoded ok: 02 b3 BEP IP:1 Decoded ok: 02 33 BEP IP:0 Decoded ok: 02 b3 BEP IP:1 test child finished with 0 ---- end ---- Intel PT packet decoder: Ok # Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:17 -03:00
Adrian Hunter	edff7809c8	perf intel-pt: Add new packets for PEBS via PT Add 3 new packets to supports PEBS via PT, namely Block Begin Packet (BBP), Block Item Packet (BIP) and Block End Packet (BEP). PEBS data is encoded into multiple BIP packets that come between BBP and BEP. The BEP packet might be associated with a FUP packet. That is indicated by using a separate packet type (INTEL_PT_BEP_IP) similar to other packets types with the _IP suffix. Refer to the Intel SDM for more information about PEBS via PT: https://software.intel.com/en-us/articles/intel-sdm May 2019 version: Vol. 3B 18.5.5.2 PEBS output to Intel® Processor Trace Decoding of BIP packets conflicts with single-byte TNT packets. Since BIP packets only occur in the context of a block (i.e. between BBP and BEP), that context must be recorded and passed to the packet decoder. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:17 -03:00
Mathieu Poirier	374d910f87	perf: cs-etm: Optimize option setup for CPU-wide sessions Call function cs_etm_set_option() once with all relevant options set rather than multiple times to avoid going through the list of CPU more than once. Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190611204528.20093-1-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:16 -03:00
Raphael Gault	010e3e8fc1	perf tests arm64: Compile tests unconditionally In order to subsequently add more tests for the arm64 architecture we compile the tests target for arm64 systematically. Further explanation provided by Mark Rutland: Given prior questions regarding this commit, it's probably worth spelling things out more explicitly, e.g. Currently we only build the arm64/tests directory if CONFIG_DWARF_UNWIND is selected, which is fine as the only test we have is arm64/tests/dwarf-unwind.o. So that we can add more tests to the test directory, let's unconditionally build the directory, but conditionally build dwarf-unwind.o depending on CONFIG_DWARF_UNWIND. There should be no functional change as a result of this patch. Signed-off-by: Raphael Gault <raphael.gault@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190611125315.18736-2-raphael.gault@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-17 15:57:16 -03:00
Ingo Molnar	3ce5aceb5d	perf/core improvements and fixes: perf record: Alexey Budankov: - Allow mixing --user-regs with --call-graph=dwarf, making sure that the minimal set of registers for DWARF unwinding is present in the set of user registers requested to be present in each sample, while warning the user that this may make callchains unreliable if more that the minimal set of registers is needed to unwind. yuzhoujian: - Add support to collect callchains from kernel or user space only, IOW allow setting the perf_event_attr.exclude_callchain_{kernel,user} bits from the command line. perf trace: Arnaldo Carvalho de Melo: - Remove x86_64 specific syscall numbers from the augmented_raw_syscalls BPF in-kernel collector of augmented raw_syscalls:sys_{enter,exit} payloads, use instead the syscall numbers obtainer either by the arch specific syscalltbl generators or from audit-libs. - Allow 'perf trace' to ask for the number of bytes to collect for string arguments, for now ask for PATH_MAX, i.e. the whole pathnames, which ends up being just a way to speficy which syscall args are pathnames and thus should be read using bpf_probe_read_str(). - Skip unknown syscalls when expanding strace like syscall groups. This helps using the 'string' group of syscalls to work in arm64, where some of the syscalls present in x86_64 that deal with strings, for instance 'access', are deprecated and this should not be asked for tracing. Leo Yan: - Exit when failing to build eBPF program. perf config: Arnaldo Carvalho de Melo: - Bail out when a handler returns failure for a key-value pair. This helps with cases where processing a key-value pair is not just a matter of setting some tool specific knob, involving, for instance building a BPF program to then attach to the list of events 'perf trace' will use, e.g. augmented_raw_syscalls.c. perf.data: Kan Liang: - Read and store die ID information available in new Intel processors in CPUID.1F in the CPU topology written in the perf.data header. perf stat: Kan Liang: - Support per-die aggregation. Documentation: Arnaldo Carvalho de Melo: - Update perf.data documentation about the CPU_TOPOLOGY, MEM_TOPOLOGY, CLOCKID and DIR_FORMAT headers. Song Liu: - Add description of headers HEADER_BPF_PROG_INFO and HEADER_BPF_BTF. Leo Yan: - Update default value for llvm.clang-bpf-cmd-template in 'man perf-config'. JVMTI: Jiri Olsa: - Address gcc string overflow warning for strncpy() core: - Remove superfluous nthreads system_wide setup in perf_evsel__alloc_fd(). Intel PT: Adrian Hunter: - Add support for samples to contain IPC ratio, collecting cycles information from CYC packets, showing the IPC info periodically, because Intel PT does not update the cycle count on every branch or instruction, the incremental values will often be zero. When there are values, they will be the number of instructions and number of cycles since the last update, and thus represent the average IPC since the last IPC value. E.g.: # perf record --cpu 1 -m200000 -a -e intel_pt/cyc/u sleep 0.0001 rounding mmap pages size to 1024M (262144 pages) [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 2.208 MB perf.data ] # perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid # <SNIP + add line numbering to make sense of IPC counts e.g.: (18/3)> 1 cc1 63501.650479626: 7f5219ac27bf _int_free+0x3f jnz 0x7f5219ac2af0 IPC: 0.81 (36/44) 2 cc1 63501.650479626: 7f5219ac27c5 _int_free+0x45 cmp $0x1f, %rbp 3 cc1 63501.650479626: 7f5219ac27c9 _int_free+0x49 jbe 0x7f5219ac2b00 4 cc1 63501.650479626: 7f5219ac27cf _int_free+0x4f test $0x8, %al 5 cc1 63501.650479626: 7f5219ac27d1 _int_free+0x51 jnz 0x7f5219ac2b00 6 cc1 63501.650479626: 7f5219ac27d7 _int_free+0x57 movq 0x13c58a(%rip), %rcx 7 cc1 63501.650479626: 7f5219ac27de _int_free+0x5e mov %rdi, %r12 8 cc1 63501.650479626: 7f5219ac27e1 _int_free+0x61 movq %fs:(%rcx), %rax 9 cc1 63501.650479626: 7f5219ac27e5 _int_free+0x65 test %rax, %rax 10 cc1 63501.650479626: 7f5219ac27e8 _int_free+0x68 jz 0x7f5219ac2821 11 cc1 63501.650479626: 7f5219ac27ea _int_free+0x6a leaq -0x11(%rbp), %rdi 12 cc1 63501.650479626: 7f5219ac27ee _int_free+0x6e mov %rdi, %rsi 13 cc1 63501.650479626: 7f5219ac27f1 _int_free+0x71 shr $0x4, %rsi 14 cc1 63501.650479626: 7f5219ac27f5 _int_free+0x75 cmpq %rsi, 0x13caf4(%rip) 15 cc1 63501.650479626: 7f5219ac27fc _int_free+0x7c jbe 0x7f5219ac2821 16 cc1 63501.650479626: 7f5219ac2821 _int_free+0xa1 cmpq 0x13f138(%rip), %rbp 17 cc1 63501.650479626: 7f5219ac2828 _int_free+0xa8 jnbe 0x7f5219ac28d8 18 cc1 63501.650479626: 7f5219ac28d8 _int_free+0x158 testb $0x2, 0x8(%rbx) 19 cc1 63501.650479628: 7f5219ac28dc _int_free+0x15c jnz 0x7f5219ac2ab0 IPC: 6.00 (18/3) <SNIP> - Allow using time ranges with Intel PT, i.e. these features, already present but not optimially usable with Intel PT, should be now: Select the second 10% time slice: $ perf script --time 10%/2 Select from 0% to 10% time slice: $ perf script --time 0%-10% Select the first and second 10% time slices: $ perf script --time 10%/1,10%/2 Select from 0% to 10% and 30% to 40% slices: $ perf script --time 0%-10%,30%-40% cs-etm (ARM): Mathieu Poirier: - Add support for CPU-wide trace scenarios. s390: Thomas Richter: - Fix missing kvm module load for s390. - Fix OOM error in TUI mode on s390 - Support s390 diag event display when doing analysis on !s390 architectures. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXP/1xQAKCRCyPKLppCJ+ J9xcAQCwOITAshE7op7HbKUPtkqiMNu+hpNa3skhxEpGHvKO0AEArpBXtuvEP8EU PZsp+8vcVrlZ+dZutttgvkRz25mScg8= =kfFb -----END PGP SIGNATURE----- Merge tag 'perf-core-for-mingo-5.3-20190611' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: perf record: Alexey Budankov: - Allow mixing --user-regs with --call-graph=dwarf, making sure that the minimal set of registers for DWARF unwinding is present in the set of user registers requested to be present in each sample, while warning the user that this may make callchains unreliable if more that the minimal set of registers is needed to unwind. yuzhoujian: - Add support to collect callchains from kernel or user space only, IOW allow setting the perf_event_attr.exclude_callchain_{kernel,user} bits from the command line. perf trace: Arnaldo Carvalho de Melo: - Remove x86_64 specific syscall numbers from the augmented_raw_syscalls BPF in-kernel collector of augmented raw_syscalls:sys_{enter,exit} payloads, use instead the syscall numbers obtainer either by the arch specific syscalltbl generators or from audit-libs. - Allow 'perf trace' to ask for the number of bytes to collect for string arguments, for now ask for PATH_MAX, i.e. the whole pathnames, which ends up being just a way to speficy which syscall args are pathnames and thus should be read using bpf_probe_read_str(). - Skip unknown syscalls when expanding strace like syscall groups. This helps using the 'string' group of syscalls to work in arm64, where some of the syscalls present in x86_64 that deal with strings, for instance 'access', are deprecated and this should not be asked for tracing. Leo Yan: - Exit when failing to build eBPF program. perf config: Arnaldo Carvalho de Melo: - Bail out when a handler returns failure for a key-value pair. This helps with cases where processing a key-value pair is not just a matter of setting some tool specific knob, involving, for instance building a BPF program to then attach to the list of events 'perf trace' will use, e.g. augmented_raw_syscalls.c. perf.data: Kan Liang: - Read and store die ID information available in new Intel processors in CPUID.1F in the CPU topology written in the perf.data header. perf stat: Kan Liang: - Support per-die aggregation. Documentation: Arnaldo Carvalho de Melo: - Update perf.data documentation about the CPU_TOPOLOGY, MEM_TOPOLOGY, CLOCKID and DIR_FORMAT headers. Song Liu: - Add description of headers HEADER_BPF_PROG_INFO and HEADER_BPF_BTF. Leo Yan: - Update default value for llvm.clang-bpf-cmd-template in 'man perf-config'. JVMTI: Jiri Olsa: - Address gcc string overflow warning for strncpy() core: - Remove superfluous nthreads system_wide setup in perf_evsel__alloc_fd(). Intel PT: Adrian Hunter: - Add support for samples to contain IPC ratio, collecting cycles information from CYC packets, showing the IPC info periodically, because Intel PT does not update the cycle count on every branch or instruction, the incremental values will often be zero. When there are values, they will be the number of instructions and number of cycles since the last update, and thus represent the average IPC since the last IPC value. E.g.: # perf record --cpu 1 -m200000 -a -e intel_pt/cyc/u sleep 0.0001 rounding mmap pages size to 1024M (262144 pages) [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 2.208 MB perf.data ] # perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid # <SNIP + add line numbering to make sense of IPC counts e.g.: (18/3)> 1 cc1 63501.650479626: 7f5219ac27bf _int_free+0x3f jnz 0x7f5219ac2af0 IPC: 0.81 (36/44) 2 cc1 63501.650479626: 7f5219ac27c5 _int_free+0x45 cmp $0x1f, %rbp 3 cc1 63501.650479626: 7f5219ac27c9 _int_free+0x49 jbe 0x7f5219ac2b00 4 cc1 63501.650479626: 7f5219ac27cf _int_free+0x4f test $0x8, %al 5 cc1 63501.650479626: 7f5219ac27d1 _int_free+0x51 jnz 0x7f5219ac2b00 6 cc1 63501.650479626: 7f5219ac27d7 _int_free+0x57 movq 0x13c58a(%rip), %rcx 7 cc1 63501.650479626: 7f5219ac27de _int_free+0x5e mov %rdi, %r12 8 cc1 63501.650479626: 7f5219ac27e1 _int_free+0x61 movq %fs:(%rcx), %rax 9 cc1 63501.650479626: 7f5219ac27e5 _int_free+0x65 test %rax, %rax 10 cc1 63501.650479626: 7f5219ac27e8 _int_free+0x68 jz 0x7f5219ac2821 11 cc1 63501.650479626: 7f5219ac27ea _int_free+0x6a leaq -0x11(%rbp), %rdi 12 cc1 63501.650479626: 7f5219ac27ee _int_free+0x6e mov %rdi, %rsi 13 cc1 63501.650479626: 7f5219ac27f1 _int_free+0x71 shr $0x4, %rsi 14 cc1 63501.650479626: 7f5219ac27f5 _int_free+0x75 cmpq %rsi, 0x13caf4(%rip) 15 cc1 63501.650479626: 7f5219ac27fc _int_free+0x7c jbe 0x7f5219ac2821 16 cc1 63501.650479626: 7f5219ac2821 _int_free+0xa1 cmpq 0x13f138(%rip), %rbp 17 cc1 63501.650479626: 7f5219ac2828 _int_free+0xa8 jnbe 0x7f5219ac28d8 18 cc1 63501.650479626: 7f5219ac28d8 _int_free+0x158 testb $0x2, 0x8(%rbx) 19 cc1 63501.650479628: 7f5219ac28dc _int_free+0x15c jnz 0x7f5219ac2ab0 IPC: 6.00 (18/3) <SNIP> - Allow using time ranges with Intel PT, i.e. these features, already present but not optimially usable with Intel PT, should be now: Select the second 10% time slice: $ perf script --time 10%/2 Select from 0% to 10% time slice: $ perf script --time 0%-10% Select the first and second 10% time slices: $ perf script --time 10%/1,10%/2 Select from 0% to 10% and 30% to 40% slices: $ perf script --time 0%-10%,30%-40% cs-etm (ARM): Mathieu Poirier: - Add support for CPU-wide trace scenarios. s390: Thomas Richter: - Fix missing kvm module load for s390. - Fix OOM error in TUI mode on s390 - Support s390 diag event display when doing analysis on !s390 architectures. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-06-17 20:48:14 +02:00
Ingo Molnar	bddb363673	Merge branch 'x86/cpu' into perf/core, to pick up dependent changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-06-17 12:29:16 +02:00
Arnaldo Carvalho de Melo	04c41bcb86	perf trace: Skip unknown syscalls when expanding strace like syscall groups We have $INSTALL_DIR/share/perf-core/strace/groups/string files with syscalls that should be selected when 'string' is used, meaning, in this case, syscalls that receive as one of its arguments a string, like a pathname. But those were first selected and tested on x86_64, and end up failing in architectures where some of those syscalls are not available, like the 'access' syscall on arm64, which makes using 'perf trace -e string' in such archs to fail. Since this the routine doing the validation is used only when reading such files, do not fail when some syscall is not found in the syscalltbl, instead just use pr_debug() to register that in case people are suspicious of problems. Now using 'perf trace -e string' should work on arm64, selecting only the syscalls that have a string and are available on that architecture. Reported-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/r/20190610184754.GU21245@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 17:50:04 -03:00
Thomas Richter	180ca71cf1	perf report: Support s390 diag event display on x86 Perf report fails to display s390 specific event numbered bd000 on an x86 platform. For example on s390 this works without error: [root@m35lp76 perf]# uname -m s390x [root@m35lp76 perf]# ./perf record -e rbd000 -- find / >/dev/null [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.549 MB perf.data ] [root@m35lp76 perf]# ./perf report -D --stdio > /dev/null [root@m35lp76 perf]# Transfering this perf.data file to an x86 platform and executing the same report command produces: [root@f29 perf]# uname -m x86_64 [root@f29 perf]# ./perf report -i ~/perf.data.m35lp76 --stdio interpreting bpf_prog_info from systems with endianity is not yet supported interpreting btf from systems with endianity is not yet supported 0x8c890 [0x8]: failed to process type: 68 Error: failed to process sample Event bd000 generates auxiliary data which is stored in big endian format in the perf data file. This error is caused by missing endianess handling on the x86 platform when the data is displayed. Fix this by handling s390 auxiliary event data depending on the local platform endianness. Output after on x86: [root@f29 perf]# ./perf report -D -i ~/perf.data.m35lp76 --stdio > /dev/null interpreting bpf_prog_info from systems with endianity is not yet supported interpreting btf from systems with endianity is not yet supported [root@f29 perf]# Committer notes: Fix build breakage on older systems, such as CentOS:6 where using nesting calls to the endian.h macros end up redefining local variables: util/s390-cpumsf.c: In function 's390_cpumsf_trailer_show': util/s390-cpumsf.c:333: error: declaration of '__v' shadows a previous local util/s390-cpumsf.c:333: error: shadowed declaration is here util/s390-cpumsf.c:333: error: declaration of '__x' shadows a previous local util/s390-cpumsf.c:333: error: shadowed declaration is here util/s390-cpumsf.c:334: error: declaration of '__v' shadows a previous local util/s390-cpumsf.c:334: error: shadowed declaration is here util/s390-cpumsf.c:334: error: declaration of '__x' shadows a previous local util/s390-cpumsf.c:334: error: shadowed declaration is here [perfbuilder@455a63ef60dc perf]$ gcc -v \|& tail -1 gcc version 4.4.7 20120313 (Red Hat 4.4.7-23) (GCC) [perfbuilder@455a63ef60dc perf]$ Since there are several uses of be64toh(te->flags) Introduce a variable to hold that and then use it, avoiding this case that causes the above problems: - local.bsdes = be16toh((be64toh(te->flags) >> 16 & 0xffff)); + local.bsdes = be16toh((flags >> 16 & 0xffff)); Its the same construct used in s390_cpumsf_diag_show() where we have a 'word' variable that is used just once, s390_cpumsf_basic_show() has lots of uses and also uses a variable to hold the result of be16toh(). Some of those temp variables needed to be converted from 'unsigned long' to 'unsigned long long' so as to build on 32-bit arches such as debian:experimental-x-mipsel, the android NDK ones and fedora:24-x-ARC-uClibc. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20190522064325.25596-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 17:48:30 -03:00
Thomas Richter	8a07aa4e9b	perf report: Fix OOM error in TUI mode on s390 Debugging a OOM error using the TUI interface revealed this issue on s390: [tmricht@m83lp54 perf]$ cat /proc/kallsyms \|sort .... 00000001119b7158 B radix_tree_node_cachep 00000001119b8000 B __bss_stop 00000001119b8000 B _end 000003ff80002850 t autofs_mount [autofs4] 000003ff80002868 t autofs_show_options [autofs4] 000003ff80002a98 t autofs_evict_inode [autofs4] .... There is a huge gap between the last kernel symbol __bss_stop/_end and the first kernel module symbol autofs_mount (from autofs4 module). After reading the kernel symbol table via functions: dso__load() +--> dso__load_kernel_sym() +--> dso__load_kallsyms() +--> __dso_load_kallsyms() +--> symbols__fixup_end() the symbol __bss_stop has a start address of 1119b8000 and an end address of 3ff80002850, as can be seen by this debug statement: symbols__fixup_end __bss_stop start:0x1119b8000 end:0x3ff80002850 The size of symbol __bss_stop is 0x3fe6e64a850 bytes! It is the last kernel symbol and fills up the space until the first kernel module symbol. This size kills the TUI interface when executing the following code: process_sample_event() hist_entry_iter__add() hist_iter__report_callback() hist_entry__inc_addr_samples() symbol__inc_addr_samples(symbol = __bss_stop) symbol__cycles_hist() annotated_source__alloc_histograms(..., symbol__size(sym), ...) This function allocates memory to save sample histograms. The symbol_size() marco is defined as sym->end - sym->start, which results in above value of 0x3fe6e64a850 bytes and the call to calloc() in annotated_source__alloc_histograms() fails. The histgram memory allocation might fail, make this failure no-fatal and continue processing. Output before: [tmricht@m83lp54 perf]$ ./perf --debug stderr=1 report -vvvvv \ -i ~/slow.data 2>/tmp/2 [tmricht@m83lp54 perf]$ tail -5 /tmp/2 __symbol__inc_addr_samples(875): ENOMEM! sym->name=__bss_stop, start=0x1119b8000, addr=0x2aa0005eb08, end=0x3ff80002850, func: 0 problem adding hist entry, skipping event 0x938b8 [0x8]: failed to process type: 68 [Cannot allocate memory] [tmricht@m83lp54 perf]$ Output after: [tmricht@m83lp54 perf]$ ./perf --debug stderr=1 report -vvvvv \ -i ~/slow.data 2>/tmp/2 [tmricht@m83lp54 perf]$ tail -5 /tmp/2 symbol__inc_addr_samples map:0x1597830 start:0x110730000 end:0x3ff80002850 symbol__hists notes->src:0x2aa2a70 nr_hists:1 symbol__inc_addr_samples sym:unlink_anon_vmas src:0x2aa2a70 __symbol__inc_addr_samples: addr=0x11094c69e 0x11094c670 unlink_anon_vmas: period++ [addr: 0x11094c69e, 0x2e, evidx=0] => nr_samples: 1, period: 526008 [tmricht@m83lp54 perf]$ There is no error about failed memory allocation and the TUI interface shows all entries. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/90cb5607-3e12-5167-682d-978eba7dafa8@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:13 -03:00
Thomas Richter	53fe307dfd	perf test 6: Fix missing kvm module load for s390 Command # perf test -Fv 6 fails with error running test 100 'kvm-s390:kvm_s390_create_vm' failed to parse event 'kvm-s390:kvm_s390_create_vm', err -1, str 'unknown tracepoint' event syntax error: 'kvm-s390:kvm_s390_create_vm' \___ unknown tracepoint when the kvm module is not loaded or not built in. Fix this by adding a valid function which tests if the module is loaded. Loaded modules (or builtin KVM support) have a directory named /sys/kernel/debug/tracing/events/kvm-s390 for this tracepoint. Check for existence of this directory. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20190604053504.43073-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:13 -03:00
Adrian Hunter	a77a05e233	perf time-utils: Add support for multiple explicit time intervals Currently only a single explicit time range is accepted. Add support for multiple ranges separated by spaces, which requires the string to be quoted. Update the time utils test accordingly. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-20-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:13 -03:00
Adrian Hunter	e39a12cbd2	perf tests: Add a test for time-utils Test time ranges work as expected. Committer testing: $ perf test "time utils" 59: time utils : Ok $ perf test -v "time utils" 59: time utils : --- start --- test child forked, pid 31711 parse_nsec_time("0") 0 parse_nsec_time("1") 1000000000 parse_nsec_time("0.000000001") 1 parse_nsec_time("1.000000001") 1000000001 parse_nsec_time("123456.123456") 123456123456000 parse_nsec_time("1234567.123456789") 1234567123456789 parse_nsec_time("18446744073.709551615") 18446744073709551615 perf_time__parse_str("1234567.123456789,1234567.123456789") start time 1234567123456789, end time 1234567123456789 perf_time__parse_str("1234567.123456789,1234567.123456790") start time 1234567123456789, end time 1234567123456790 perf_time__parse_str("1234567.123456789,") start time 1234567123456789, end time 0 perf_time__parse_str(",1234567.123456789") start time 0, end time 1234567123456789 perf_time__parse_str("0,1234567.123456789") start time 0, end time 1234567123456789 perf_time__parse_for_ranges("1234567.123456789,1234567.123456790") start time 1234567123456789, end time 1234567123456790 perf_time__parse_for_ranges("10%/1") first_sample_time 7654321000000000 last_sample_time 7654321000000100 start time 0: 7654321000000000, end time 0: 7654321000000009 perf_time__parse_for_ranges("10%/2") first_sample_time 7654321000000000 last_sample_time 7654321000000100 start time 0: 7654321000000010, end time 0: 7654321000000019 perf_time__parse_for_ranges("10%/1,10%/2") first_sample_time 11223344000000000 last_sample_time 11223344000000100 start time 0: 11223344000000000, end time 0: 11223344000000009 start time 1: 11223344000000010, end time 1: 11223344000000019 perf_time__parse_for_ranges("10%/1,10%/3,10%/10") first_sample_time 11223344000000000 last_sample_time 11223344000000100 start time 0: 11223344000000000, end time 0: 11223344000000009 start time 1: 11223344000000020, end time 1: 11223344000000029 start time 2: 11223344000000090, end time 2: 11223344000000100 test child finished with 0 ---- end ---- time utils: Ok $ Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-19-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	929afa0092	perf time-utils: Make perf_time__parse_for_ranges() more logical Explicit time ranges never contain a percent sign whereas percentage ranges always do, so it is possible to call the correct parser. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-18-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	2a8afddc08	perf time-utils: Simplify perf_time__parse_for_ranges() error paths slightly Simplify perf_time__parse_for_ranges() error paths slightly. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-17-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	0ccc69ba0a	perf time-utils: Fix --time documentation Correct some punctuation and spelling and correct the format to show that the time resolution is nanoseconds not microseconds. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-16-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	b16bfeb3db	perf time-utils: Prevent percentage time range overlap Prevent percentage time range overlap. This is only a 1 nanosecond change but makes the results more logical e.g. a sample cannot be in both the first 10% and the second 20%. Note, there is a later patch that adds a test for time-utils. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-15-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	c763242a5e	perf time-utils: Factor out set_percent_time() Factor out set_percent_time() so it can be reused. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-14-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	f79a7689d9	perf time-utils: Treat time ranges consistently Currently, options allow only 1 explicit (non-percentage) time range. In preparation for adding support for multiple explicit time ranges, treat time ranges consistently. Instead of treating some time ranges as inclusive and some as excluding the end time, treat all time ranges as inclusive. This is only a 1 nanosecond change but is necessary to treat multiple explicit time ranges in a consistent manner. Note, there is a later patch that adds a test for time-utils. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-13-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	2c47db90ed	perf intel-pt: Add support for efficient time interval filtering Set up time ranges for efficient time interval filtering using the new "fast forward" facility. Because decoding is done in time order, intel_pt_time_filter() needs to look only at the next start or end timestamp - refer intel_pt_next_time(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	da9000ae35	perf intel-pt: Add support for lookahead Implement the lookahead callback to let the decoder access subsequent buffers. intel_pt_lookahead() manages the buffer lifetime and calls the decoder for each buffer until the decoder returns a non-zero value. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	e96f7df880	perf intel-pt: Factor out intel_pt_get_buffer() Factor out intel_pt_get_buffer() so it can be reused. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	a7fa19f5a2	perf intel-pt: Add intel_pt_fast_forward() Intel PT decoding is done in time order. In order to support efficient time interval filtering, add a facility to "fast forward" towards a particular timestamp. That involves finding the right buffer, stepping to that buffer, and then stepping forward PSBs. Because decoding must begin at a PSB, "fast forward" stops at the last PSB that has a timestamp before the target timestamp. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	6c1f0b18ac	perf intel-pt: Add reposition parameter to intel_pt_get_data() When the decoder gets the next trace buffer, some state is reset if the buffer is not consecutive to the previous buffer. Add a parameter 'reposition' so that can be done also to support a "fast forward" facility. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	6492e5f013	perf intel-pt: Factor out intel_pt_reposition() Factor out intel_pt_reposition() so it can be reused. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	e72b52a2cf	perf intel-pt: Factor out intel_pt_8b_tsc() Factor out intel_pt_8b_tsc() so it can be reused. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	4d678e9039	perf intel-pt: Add lookahead callback Add a callback function to enable the decoder to lookahead at subsequent trace buffers. This will be used to implement a "fast forward" facility which will be needed to support efficient time interval filtering. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	4885c90c5e	perf report: Set perf time interval in itrace_synth_ops Instruction trace decoders can optimize output based on what time intervals will be filtered, so pass that information in itrace_synth_ops. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:12 -03:00
Adrian Hunter	400ae9818f	perf script: Set perf time interval in itrace_synth_ops Instruction trace decoders can optimize output based on what time intervals will be filtered, so pass that information in itrace_synth_ops. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:11 -03:00
Adrian Hunter	33526f362b	perf auxtrace: Add perf time interval to itrace_synth_ops Instruction trace decoders can optimize output based on what time intervals will be filtered, so pass that information in itrace_synth_ops. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:11 -03:00
Leo Yan	87407fa58b	perf config: Update default value for llvm.clang-bpf-cmd-template The clang bpf cmdline template has defined default value in the file tools/perf/util/llvm-utils.c, which has been changed for several times. This patch updates the documentation to reflect the latest default value for the configuration llvm.clang-bpf-cmd-template. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Drayton <mbd@fb.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Fixes: `d35b168c3d` ("perf bpf: Give precedence to bpf header dir") Fixes: `cb76371441` ("perf llvm: Allow passing options to llc in addition to clang") Fixes: `1b16fffa38` ("perf llvm-utils: Add bpf include path to clang command line") Link: http://lkml.kernel.org/r/20190607143508.18141-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:11 -03:00
Arnaldo Carvalho de Melo	965e176f3c	perf cs-etm: Remove duplicate GENMASK() define, use linux/bits.h instead Suzuki noticed that this should be more useful in a generic header, and after looking I noticed we have it already in our copy of include/linux/bits.h in tools/include, so just use it, test built on x86-64 and ubuntu 19.04 with: perfbuilder@46646c9e848e:/$ aarch64-linux-gnu-gcc --version \|& head -1 aarch64-linux-gnu-gcc (Ubuntu/Linaro 8.3.0-6ubuntu1) 8.3.0 perfbuilder@46646c9e848e:/$ Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lkml.kernel.org/r/68c1c548-33cd-31e8-100d-7ffad008c7b2@arm.com Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org, Link: https://lkml.kernel.org/n/tip-69pd3mqvxdlh2shddsc7yhyv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:11 -03:00
Mathieu Poirier	e45c48a9a4	perf cs-etm: Properly set the value of 'old' and 'head' in snapshot mode This patch adds the necessary intelligence to properly compute the value of 'old' and 'head' when operating in snapshot mode. That way we can get the latest information in the AUX buffer and be compatible with the generic AUX ring buffer mechanic. Tester notes: > Leo, have you had the chance to test/review this one? Suzuki? Sure. I applied this patch on the perf/core branch (with latest commit 3e4fbf36c1e3 'perf augmented_raw_syscalls: Move reading filename to the loop') and passed testing with below steps: # perf record -e cs_etm/@tmc_etr0/ -S -m,64 --per-thread ./sort & [1] 19097 Bubble sorting array of 30000 elements # kill -USR2 19097 # kill -USR2 19097 # kill -USR2 19097 [ perf record: Woken up 4 times to write data ] [ perf record: Captured and wrote 0.753 MB perf.data ] Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190605161633.12245-1-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:11 -03:00
Arnaldo Carvalho de Melo	36edfb9401	perf data: Fix perf.data documentation for HEADER_CPU_TOPOLOGY The 'die' info isn't in the same array as core and socket ids, and we missed the 'dies' string list, that comes right after the 'core' + 'socket' id variable length array, followed by the VLA for the dies. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: c9cb12c5ba08 ("perf header: Add die information in CPU topology") Link: https://lkml.kernel.org/n/tip-nubi6mxp2n8ofvlx7ph6k3h6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:11 -03:00
Kan Liang	0ccdb8407a	perf tools: Apply new CPU topology sysfs attributes The existing "thread_siblings" and "thread_siblings_list" attribute will be deprecated. Use the new CPU topology sysfs attributes, "core_cpus" and "core_cpus_list", which are synonymous with the deprecated attributes. Check the new name first. If not available, use the deprecated name to be compatible with old kernel. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1559688644-106558-5-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:11 -03:00
Kan Liang	e05a899718	perf header: Rename "sibling cores" to "sibling sockets" The "sibling cores" actually shows the sibling CPUs of a socket. The name "sibling cores" is very misleading. Rename "sibling cores" to "sibling sockets" Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1559688644-106558-4-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:20:11 -03:00
Kan Liang	db5742b684	perf stat: Support per-die aggregation It is useful to aggregate counts per die. E.g. Uncore becomes die-scope on Xeon Cascade Lake-AP. Introduce a new option "--per-die" to support per-die aggregation. The global id for each core has been changed to socket + die id + core id. The global id for each die is socket + die id. Add die information for per-core aggregation. The output of per-core aggregation will be changed from "S0-C0" to "S0-D0-C0". Any scripts which rely on the output format of per-core aggregation probably be broken. For 'perf stat record/report', there is no die information when processing the old perf.data. The per-die result will be the same as per-socket. Committer notes: Renamed 'die' variable to 'die_id' to fix the build in some systems: CC /tmp/build/perf/builtin-script.o cc1: warnings being treated as errors builtin-stat.c: In function 'perf_env__get_die': builtin-stat.c:963: error: declaration of 'die' shadows a global declaration util/util.h:19: error: shadowed declaration is here mv: cannot stat `/tmp/build/perf/.builtin-stat.o.tmp': No such file or directory Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-bsnhx7vgsuu6ei307mw60mbj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 16:19:59 -03:00
Kan Liang	acae8b36cd	perf header: Add die information in CPU topology With the new CPUID.1F, a new level type of CPU topology, 'die', is introduced. The 'die' information in CPU topology should be added in perf header. To be compatible with old perf.data, the patch checks the section size before reading the die information. The new info is added at the end of the cpu_topology section, the old perf tool ignores the extra data. It never reads data crossing the section boundary. The new perf tool with the patch can be used on legacy kernel. Add a new function has_die_topology() to check if die topology information is supported by kernel. The function only check X86 and CPU 0. Assuming other CPUs have same topology. Use similar method for core and socket to support die id and sibling dies string. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1559688644-106558-2-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Kan Liang	b74d8686a1	perf cpumap: Retrieve die id information There is no function to retrieve die id information of a given CPU. Add cpu_map__get_die_id() to retrieve die id information. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1559688644-106558-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Mathieu Poirier	21fe8dc119	perf cs-etm: Add support for CPU-wide trace scenarios Add support for CPU-wide trace scenarios by correlating range packets with timestamp packets. That way range packets received on different ETMQ/traceID channels can be processed and synthesized in chronological order. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-18-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Mathieu Poirier	675f302fc2	perf cs-etm: Add notion of time to decoding code This patch deals with timestamp packets received from the decoding library in order to give the front end packet processing loop a handle on the time instruction conveyed by range packets have been executed at. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-17-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Mathieu Poirier	0a6be300eb	perf cs-etm: Linking PE contextID with perf thread mechanic Link contextID packets received from the decoder with the perf tool thread mechanic so that we know the specifics of the process currently executing. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-16-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Mathieu Poirier	c152d4d49a	perf cs-etm: Add support for multiple traceID queues When operating in CPU-wide trace mode with a source/sink topology of N:1 packets with multiple traceID will end up in the same cs_etm_queue. In order to properly decode packets they need to be split in different queues, i.e one queue per traceID. As such add support for multiple traceID per cs_etm_queue by adding a new cs_etm_traceid_queue every time a new traceID is discovered in the trace stream. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-15-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Mathieu Poirier	af21577c05	perf cs-etm: Use traceID aware memory callback API When working with CPU-wide traces different traceID may be found in the same stream. As such we need to use the decoder callback that provides the traceID in order to know the thread context being decoded. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-14-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Mathieu Poirier	0abb868bbc	perf cs-etm: Move tid/pid to traceid_queue The tid/pid fields of structure cs_etm_queue are CPU dependent and as such need to be part of the cs_etm_traceid_queue in order to support CPU-wide trace scenarios. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-13-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Mathieu Poirier	3c21d7d813	perf cs-etm: Move thread to traceid_queue The thread field of structure cs_etm_queue is CPU dependent and as such need to be part of the cs_etm_traceid_queue in order to support CPU-wide trace scenarios. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-12-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Mathieu Poirier	6672559307	perf cs-etm: Get rid of unused cpu in struct cs_etm_queue Nowadays the synthesize code is using the packet's cpu information, making cs_etm_queue::cpu useless. As such simply remove it. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-11-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Mathieu Poirier	c7bfa2fd0d	perf cs-etm: Introduce the concept of trace ID queues In an ideal world there is one CPU per cs_etm_queue and as such, one trace ID per cs_etm_queue. In the real world CoreSight topologies allow multiple CPUs to use the same sink, which translates to multiple trace IDs per cs_etm_queue. To deal with this a new cs_etm_traceid_queue structure is introduced to enclose all the information related to a single trace ID, allowing a cs_etm_queue to handle traces generated by any number of CPUs. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-10-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Mathieu Poirier	882f4874ad	perf cs-etm: Fix indentation in function cs_etm__process_decoder_queue() Fixing wrong indentation of the while() loop - no change of functionality. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Fixes: `3fa0e83e29` ("perf cs-etm: Modularize main packet processing loop") Link: http://lkml.kernel.org/r/20190524173508.29044-9-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:02 -03:00
Mathieu Poirier	5f7cb03555	perf cs-etm: Move packet queue out of decoder structure The decoder needs to work with more than one traceID queue if we want to support CPU-wide scenarios with N:1 source/sink topologies. As such move the packet buffer and related fields out of the decoder structure and into the cs_etm_queue structure. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-8-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:01 -03:00
Mathieu Poirier	3470d48a4e	perf cs-etm: Refactor error path in cs_etm_decoder__new() There is no point in having two different error goto statement since the openCSD API to free a decoder handles NULL pointers. As such function cs_etm_decoder__free() can be called to deal with all aspect of freeing decoder memory. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-7-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:01 -03:00
Mathieu Poirier	e0d170fa9a	perf cs-etm: Add handling of switch-CPU-wide events Add handling of SWITCH-CPU-WIDE events in order to add the tid/pid of the incoming process to the perf tools machine infrastructure. This information is later retrieved when a contextID packet is found in the trace stream. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-6-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:01 -03:00
Mathieu Poirier	a465f3c3e3	perf cs-etm: Add handling of itrace start events Add handling of ITRACE events in order to add the tid/pid of the executing process to the perf tools machine infrastructure. This information is later retrieved when a contextID packet is found in the trace stream. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-5-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:01 -03:00
Mathieu Poirier	e5993c42e8	perf cs-etm: Configure SWITCH_EVENTS in CPU-wide mode Ask the perf core to generate an event when processes are swapped in/out of context. That way proper action can be taken by the decoding code when faced with such event. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-4-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:01 -03:00
Mathieu Poirier	1c839a5a40	perf cs-etm: Configure timestamp generation in CPU-wide mode When operating in CPU-wide mode tracers need to generate timestamps in order to correlate the code being traced on one CPU with what is executed on other CPUs. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-3-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:01 -03:00
Mathieu Poirier	3399ad9ac2	perf cs-etm: Configure contextID tracing in CPU-wide mode When operating in CPU-wide mode being notified of contextID changes is required so that the decoding mechanic is aware of the process context switch. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: Suzuki Poulouse <suzuki.poulose@arm.com> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190524173508.29044-2-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:01 -03:00
Jiri Olsa	10981c8012	perf evsel: Remove superfluous nthreads system_wide setup in alloc_fd() It's already setup in the only caller of this method in perf_evsel__open(), right before calling perf_evsel__alloc_fd(), no need to do it again. Also it's better to have it out of the function before we move it to libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-1k8lhyjxfk7o8v4g3r7eyjc9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:01 -03:00
yuzhoujian	53651b28cf	perf record: Add support to collect callchains from kernel or user space only One can just record callchains in the kernel or user space with this new options. We can use it together with "--all-kernel" options. This two options is used just like print_stack(sys) or print_ustack(usr) for systemtap. Shown below is the usage of this new option combined with "--all-kernel" options: 1. Configure all used events to run in kernel space and just collect kernel callchains. $ perf record -a -g --all-kernel --kernel-callchains 2. Configure all used events to run in kernel space and just collect user callchains. $ perf record -a -g --all-kernel --user-callchains Committer notes: Improved documentation to state that asking for kernel callchains really is asking for excluding user callchains, and vice versa. Further mentioned that using both won't get both, but nothing, as both will be excluded. Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1559222962-22891-1-git-send-email-ufo19890607@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:01 -03:00
Arnaldo Carvalho de Melo	22d4621987	perf config: Bail out when a handler returns failure for a key-value pair So perf_config() uses: int ret = 0; perf_config_set__for_each_entry(config_set, section, item) { ... ret = fn(); if (ret < 0) break; } return ret; Expecting that that break will imediatelly go to function exit to return that error value (ret). The problem is that perf_config_set__for_each_entry() expands into two nested for() loops, one traversing the sections in a config and the second the items in each of those sections, so we have to change that 'break' to a goto label right before that final 'return ret'. With that, for instance 'perf trace' now correctly bails out when a event that is requested to be added via its 'trace.add_events' ~/.perfconfig entry gets rejected by the kernel BPF verifier: # perf trace ls event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o' \___ Kernel verifier blocks program loading (add -v to see detail) Run 'perf list' for a list of valid events Error: wrong config key-value pair trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o # While before it would continue and explode later, when trying to find maps that would have been in place had that augmented_raw_syscalls.o precompiled BPF proggie been accepted by the, humm, bast... rigorous kernel BPF verifier 8-) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Yonghong Song <yhs@fb.com> Fixes: `8a0a9c7e91` ("perf config: Introduce new init() and exit()") Link: https://lkml.kernel.org/n/tip-qvqxfk9d0rn1l7lcntwiezrr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:50:01 -03:00
Leo Yan	012749caf9	perf trace: Exit when failing to build eBPF program On my Juno board with ARM64 CPUs, perf trace command reports the eBPF program building failure but the command will not exit and continue to run. If we define an eBPF event in config file, the event will be parsed with below flow: perf_config() `> trace__config() `> parse_events_option() `> parse_events__scanner() `-> parse_events_parse() `> parse_events_load_bpf() `> llvm__compile_bpf() Though the low level functions return back error values when detect eBPF building failure, but parse_events_option() returns 1 for this case and trace__config() passes 1 to perf_config(); perf_config() doesn't treat the returned value 1 as failure and it continues to parse other configurations. Thus the perf command continues to run even without enabling eBPF event successfully. This patch changes error handling in trace__config(), when it detects failure it will return -1 rather than directly pass error value (1); finally, perf_config() will directly bail out and perf will exit for this case. Committer notes: Simplified the patch to just check directly the return of parse_events_option() and it it is non-zero, change err from its initial zero value to -1. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Yonghong Song <yhs@fb.com> Fixes: `ac96287cae` ("perf trace: Allow specifying a set of events to add in perfconfig") Link: https://lkml.kernel.org/n/tip-x4i63f5kscykfok0hqim3zma@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-10 15:49:43 -03:00
Thomas Gleixner	b886d83c5b	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation version 2 of the license extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 315 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:17 +02:00
Thomas Gleixner	1c6bec5b3d	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 433 Based on 1 normalized pattern(s): released under the gpl v2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 2 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531190114.749096322@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:16 +02:00
Thomas Gleixner	1514e85117	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 407 Based on 1 normalized pattern(s): this application is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation version 2 this application is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531190112.401137591@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:14 +02:00
Thomas Gleixner	5a8e0ff9b3	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 393 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation version 2 of the license not later! this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation inc 59 temple place suite 330 boston ma 02111 1307 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 3 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531081038.198919026@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:11 +02:00
Thomas Gleixner	5576656858	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 376 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 only as published by the free software foundation extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531081036.798138318@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:10 +02:00
Thomas Gleixner	5efdfe759a	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 305 Based on 1 normalized pattern(s): licensed under the gplv2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 6 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190530000433.961827334@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:04 +02:00
Thomas Gleixner	2025cf9e19	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 263 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:36:37 +02:00
Thomas Gleixner	910070454e	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 251 Based on 1 normalized pattern(s): released under the gpl v2 and only v2 not any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 12 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141332.526460839@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:30:26 +02:00
Arnaldo Carvalho de Melo	dea87bfb7b	perf trace: Associate more argument names with the filename beautifier For instance, the rename* family uses "oldname", "newname", so check if "name" is at the end and treat it as a filename. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-wjy7j4bk06g7atzwoz1mid24@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 10:53:06 -03:00
Arnaldo Carvalho de Melo	8195168e87	perf trace: Consume the augmented_raw_syscalls payload To support the SCA_FILENAME beautifier in more than one syscall arg, as needed for syscalls such as the rename* family, we need to, after processing one such arg, bump the augmented pointers so that the next augmented arg don't reuse data for the previous augmented arguments. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-4e4cmzyjxb3wkonfo1x9a27y@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 10:52:19 -03:00
Jiri Olsa	279ab04dbe	perf jvmti: Address gcc string overflow warning for strncpy() We are getting false positive gcc warning when we compile with gcc9 (9.1.1): CC jvmti/libjvmti.o In file included from /usr/include/string.h:494, from jvmti/libjvmti.c:5: In function ‘strncpy’, inlined from ‘copy_class_filename.constprop’ at jvmti/libjvmti.c:166:3: /usr/include/bits/string_fortified.h:106:10: error: ‘__builtin_strncpy’ specified bound depends on the length of the source argument [-Werror=stringop-overflow=] 106 \| return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest)); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ jvmti/libjvmti.c: In function ‘copy_class_filename.constprop’: jvmti/libjvmti.c:165:26: note: length computed here 165 \| size_t file_name_len = strlen(file_name); \| ^~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors As per Arnaldo's suggestion use strlcpy(), which does the same thing and keeps gcc silent. Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190531131321.GB1281@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:51:26 -03:00
Arnaldo Carvalho de Melo	602bce09fb	perf augmented_raw_syscalls: Move reading filename to the loop Almost there, next step is to copy more than one filename payload. Probably to read syscall arg structs, etc we'll need just a variation of this that will decide what to use, if probe_read_str() or plain probe_read for structs, i.e. fixed size. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-uf6u0pld6xe4xuo16f04owlz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:48:55 -03:00
Arnaldo Carvalho de Melo	deaf4da48a	perf augmented_raw_syscalls: Change helper to consider just the augmented_filename part So that we can use it for multiple args, baby steps not to step into the verifier toes. In the process make sure we handle -EFAULT from bpf_prog_read_str(), as this really is needed now that we'll handle more than one augmented argument, i.e. if there is failure, then we have the argument that fails have: (size = 0, err = -EFAULT, value = [] ) followed by the next, lets say that worked for a second pathname: (size = 4, err = 0, value = "/tmp" ) So we can skip the first while telling the user about the problem and then process the second. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-deyvqi39um6gp6hux6jovos8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:48:54 -03:00
Arnaldo Carvalho de Melo	0c95a7ff76	perf augmented_raw_syscalls: Move the probe_read_str to a separate function One more step into copying multiple filenames to support syscalls like rename*. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-xdqtjexdyp81oomm1rkzeifl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:58 -03:00
Arnaldo Carvalho de Melo	4cae8675ea	perf augmented_raw_syscalls: Tell which args are filenames and how many bytes to copy Since we know what args are strings from reading the syscall descriptions in tracefs and also already mark such args to be beautified using the syscall_arg__scnprintf_filename() helper, all we need is to fill in this info in the 'syscalls' BPF map we were using to state which syscalls the user is interested in, i.e. the syscall filter. Right now just set that with PATH_MAX and unroll the syscall arg in the BPF program, as the verifier isn't liking something clang generates when unrolling the loop. This also makes the augmented_raw_syscalls.c program support all arches, since we removed that set of defines with the hard coded syscall numbers, all should be automatically set for all arches, with the syscall id mapping done correcly. Doing baby steps here, i.e. just the first string arg for a syscall is printed, syscalls with more than one, say, the various rename* syscalls, need further work, but lets get first something that the BPF verifier accepts before increasing the complexity To test it, something like: # perf trace -e string -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c With: # cat ~/.perfconfig [llvm] dump-obj = true clang-opt = -g [trace] #add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c show_zeros = yes show_duration = no no_inherit = yes show_timestamp = no show_arg_names = no args_alignment = 40 show_prefix = yes # That commented add_events line is needed for developing this augmented_raw_syscalls.c BPF program, as if we add it via the 'add_events' mechanism so as to shorten the 'perf trace' command lines, then we end up not setting up the -v option which precludes us having access to the bpf verifier log :-\ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Andrii Nakryiko <andriin@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/n/tip-dn863ya0cbsqycxuy0olvbt1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:58 -03:00
Adrian Hunter	80b3fb64a5	perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated The user probably wants to replace the find text, so select the find text when the find bar is activated. That is fairly standard behaviour for search text entry. Entering text will replace the current text, but using edit keys (arrows, home, end etc) cancels the selection and enables editing. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-23-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:58 -03:00
Adrian Hunter	b3b660792e	perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree Enhance the call tree to display IPC information if it is available. Committer testing: [acme@quaco adrian.hunter]$ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Reports -> Call Tree, then expand a few trees, then select with the mouse and press control+C (copy): Call Path Object Call Time Time Time(%) Insn Insn Cyc Cyc IPC Branch Branch ▼ simple-retpolin (ns) Cnt Cnt(%) Cnt Cnt(%) Count Count(%) ▼ 23003:23003 ▼ _start ld-2.28.so 112195670 218295 100.0 127746 100.0 207320 100.0 0.62 13046 100.0 ▶ unknown unknown 112195987 3202 1.5 0 0.0 0 0.0 0 1 0.0 ▶ _dl_start ld-2.28.so 112199189 188471 86.3 123394 96.6 180007 86.8 0.69 12529 96.0 ▼ _dl_init ld-2.28.so 112387660 13406 6.1 3207 2.5 14868 7.2 0.22 327 2.5 ▶ call_init.part.0 ld-2.28.so 112387773 117 0.9 70 2.2 639 4.3 0.11 3 0.9 ▶ call_init.part.0 ld-2.28.so 112387890 13129 97.9 3103 96.8 14100 94.8 0.22 315 96.3 ▶ call_init.part.0 ld-2.28.so 112401020 0 0.0 0 0.0 0 0.0 0 2 0.6 ▼ _start simple-retpol 112401066 12899 5.9 1142 0.9 11561 5.6 0.10 184 1.4 ▶ unknown unknown 112401388 846 6.6 0 0.0 0 0.0 0 1 0.5 ▼ __libc_start_main libc-2.28.so 112402344 11621 90.1 1129 98.9 10350 89.5 0.11 181 98.4 ▶ __cxa_atexit libc-2.28.so 112402360 2302 19.8 101 8.9 1817 17.6 0.06 13 7.2 ▶ __libc_csu_init simple-retpol 112404673 121 1.0 43 3.8 340 3.3 0.13 8 4.4 ▶ _setjmp libc-2.28.so 112404794 74 0.6 46 4.1 206 2.0 0.22 4 2.2 ▼ main simple-retpol 112404892 44 0.4 23 2.0 126 1.2 0.18 12 6.6 ▼ foo simple-retpol 112404892 19 43.2 12 52.2 55 43.7 0.22 5 41.7 bar simple-retpol 112404896 12 63.2 3 25.0 34 61.8 0.09 1 20.0 ▼ foo simple-retpol 112404911 25 56.8 11 47.8 71 56.3 0.15 5 41.7 ▶ bar simple-retpol 112404924 10 40.0 3 27.3 27 38.0 0.11 1 20.0 ▶ exit libc-2.28.so 112404936 9029 77.7 878 77.8 7765 75.0 0.11 139 76.8 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-22-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:57 -03:00
Adrian Hunter	38a846d47f	perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph Enhance the call graph to display IPC information if it is available. Committer testing: [acme@quaco adrian.hunter]$ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Reports -> Context Sensitive Callgraph, then expand a few trees, then select with the mouse and press control+C: Call Path Object Count Time(ns) Time(%) Insn Insn Cyc Cyc IPC Branch Branch ▼ simple-retpolin Cnt Cnt(%) Cnt Cnt(%) Cnt Cnt(%) ▼ 23003:23003 ▼ _start ld-2.28.so 1 218295 100.0 127746 100.0 207320 100.0 0.62 13046 100.0 ▶ unknown unknown 1 3202 1.5 0 0.0 0 0.0 0 1 0.0 ▶ _dl_start ld-2.28.so 1 188471 86.3 123394 96.6 180007 86.8 0.69 12529 96.0 ▶ _dl_init ld-2.28.so 1 13406 6.1 3207 2.5 14868 7.2 0.22 327 2.5 ▼ _start simple-retpoline 1 12899 5.9 1142 0.9 11561 5.6 0.10 184 1.4 ▶ unknown unknown 1 846 6.6 0 0.0 0 0.0 0 1 0.5 ▼ __libc_start_main libc-2.28.so 1 11621 90.1 1129 98.9 10350 89.5 0.11 181 98.4 ▶ __cxa_atexit libc-2.28.so 1 2302 19.8 101 8.9 1817 17.6 0.06 13 7.2 ▶ __libc_csu_init simple-retpoline 1 121 1.0 43 3.8 340 3.3 0.13 8 4.4 ▼ _setjmp libc-2.28.so 1 74 0.6 46 4.1 206 2.0 0.22 4 2.2 ▼ __sigsetjmp libc-2.28.so 1 74 100.0 46 100.0 206 100.0 0.22 3 75.0 ▶ __sigjmp_save libc-2.28.so 1 0 0.0 0 0.0 0 0.0 0 1 33.3 ▼ main simple-retpoline 1 44 0.4 23 2.0 126 1.2 0.18 12 6.6 ▼ foo simple-retpoline 2 44 100.0 23 100.0 126 100.0 0.18 10 83.3 bar simple-retpoline 2 22 50.0 6 26.1 61 48.4 0.10 2 20.0 ▶ exit libc-2.28.so 1 9029 77.7 878 77.8 7765 75.0 0.11 139 76.8 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-21-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:57 -03:00
Adrian Hunter	4a0979d4b4	perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams Add a parameter to call graph and call tree, to determine whether IPC information is available. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-20-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:57 -03:00
Adrian Hunter	530e22fd5c	perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports Enhance the "All branches" and "Selected branches" reports to display IPC information if it is available. Committer testing: So, testing this I noticed that it all starts with the left arrow in every line, that should mean there is some tree there, i.e. look at all those ▶ symbols: Reports -> All Branches: Time CPU Command PID TID Branch Type In Tx Insn Cnt Cyc Cnt IPC Branch ▶ 187836112195670 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4f110 +_start (ld-2.28.so) ▶ 187836112195987 7 simple-retpolin 23003 23003 trace end No 0 883 0 7f6f33d4f110 _start (ld-2.28.so) -> 0 unknown +(unknown) ▶ 187836112199189 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4f110 +_start (ld-2.28.so) ▶ 187836112199189 7 simple-retpolin 23003 23003 call No 0 0 0 7f6f33d4f113 _start+0x3 (ld-2.28.so) -> 7f6f33d4ff50 +_dl_start (ld-2.28.so) ▶ 187836112199544 7 simple-retpolin 23003 23003 trace end No 17 996 0.02 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so) -> 0 +unknown (unknown) ▶ 187836112200939 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4ff73 +_dl_start+0x23 (ld-2.28.so) ▶ 187836112201229 7 simple-retpolin 23003 23003 trace end No 1 816 0.00 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so) -> 0 +unknown (unknown) ▶ 187836112203500 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4ff7a +_dl_start+0x2a (ld-2.28.so) But if you click on it, that ▶ disappears and a new click doesn't make it reappear, looks buggy, minor oddity, reported to Adrian. Reports -> Selected Branches, then ask for branches in the ld-2.28.so DSO: Time CPU Command PID TID Branch Type In Tx Insn Cnt Cyc Cnt IPC Branch ▶ 187836112195987 7 simple-retpolin 23003 23003 trace end No 0 883 0 7f6f33d4f110 _start (ld-2.28.so) -> 0 unknown (unknown) ▶ 187836112199189 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4f110 _start (ld-2.28.so) ▶ 187836112199189 7 simple-retpolin 23003 23003 call No 0 0 0 7f6f33d4f113 _start+0x3 (ld-2.28.so) -> 7f6f33d4ff50 _dl_start (ld-2.28.so) ▶ 187836112199544 7 simple-retpolin 23003 23003 trace end No 17 996 0.02 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so) -> 0 unknown (unknown) ▶ 187836112200939 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so) ▶ 187836112201229 7 simple-retpolin 23003 23003 trace end No 1 816 0.00 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so) -> 0 unknown (unknown) ▶ 187836112203500 7 simple-retpolin 23003 23003 trace begin No 0 0 0 0 unknown (unknown) -> 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so) ▶ 187836112203528 7 simple-retpolin 23003 23003 unconditional jump No 0 0 0 7f6f33d4ffe7 _dl_start+0x97 (ld-2.28.so) -> 7f6f33d5000b _dl_start+0xbb (ld-2.28.so) ▶ 187836112203528 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so) ▶ 187836112203528 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so) ▶ 187836112203539 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d50025 _dl_start+0xd5 (ld-2.28.so) -> 7f6f33d50210 _dl_start+0x2c0 (ld-2.28.so) ▶ 187836112203539 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d5021a _dl_start+0x2ca (ld-2.28.so) -> 7f6f33d50360 _dl_start+0x410 (ld-2.28.so) ▶ 187836112203539 7 simple-retpolin 23003 23003 unconditional jump No 0 0 0 7f6f33d50377 _dl_start+0x427 (ld-2.28.so) -> 7f6f33d4ffff _dl_start+0xaf (ld-2.28.so) ▶ 187836112203539 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so) ▶ 187836112203562 7 simple-retpolin 23003 23003 conditional jump No 0 0 0 7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so) Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-19-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:57 -03:00
Adrian Hunter	ec7f448e2b	perf scripts python: export-to-postgresql.py: Export IPC information Export cycle and instruction counts on samples and calls tables. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-18-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:57 -03:00
Adrian Hunter	64adadb3f9	perf scripts python: export-to-sqlite.py: Export IPC information Export cycle and instruction counts on samples and calls tables. Committer testing: First runs some workload collecting intel_pt with the 'cyc' ter just for userspace: [root@quaco adrian.hunter]# perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.035 MB simple-retpoline.perf.data ] [root@quaco adrian.hunter]# Then use the export-to-sqlite.py script to see if the changes in this cset don't make it to break and if the changes in the db schema are the ones expected: [root@quaco adrian.hunter]# perf script -i simple-retpoline.perf.data --itrace=be -s ~acme/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls 2019-05-31 11:50:46.942710 Creating database ... 2019-05-31 11:50:46.949663 Writing records... 2019-05-31 11:50:47.224033 Adding indexes 2019-05-31 11:50:47.231599 Done [root@quaco adrian.hunter]# Now lets use the db: [root@quaco adrian.hunter]# sqlite3 simple-retpoline.db SQLite version 3.26.0 2018-12-01 12:34:55 Enter ".help" for usage hints. sqlite> .schema samples CREATE TABLE samples (id integer NOT NULL PRIMARY KEY,evsel_id bigint,machine_id bigint,thread_id bigint,comm_id bigint,dso_id bigint,symbol_id bigint,sym_offset bigint,ip bigint,time bigint,cpuinteger,to_dso_id bigint,to_symbol_id bigint,to_sym_offset bigint,to_ip bigint,branch_type integer,in_tx boolean,call_path_id bigint,insn_count bigint,cyc_count bigint); sqlite> Cool, the 'insn_count' and 'cyc_count' are there, now lets see if we can use them in a query: sqlite> select insn_count,cyc_count from samples where cyc_count > 1500 and insn_count < 10; 6\|1507 sqlite> select insn_count,cyc_count from samples where cyc_count > 1500; 118\|2210 140\|1516 3783\|1861 132\|1521 6\|1507 sqlite> Seems to work :-) Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-17-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:57 -03:00
Adrian Hunter	52a2ab6fa9	perf db-export: Export IPC information Export cycle and instruction counts on samples and call-returns. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-16-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:57 -03:00
Adrian Hunter	1159facee9	perf db-export: Add brief documentation Add brief documentation to explain how the database export maintains backward and forward compatibility. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-15-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:57 -03:00
Adrian Hunter	003ccdc716	perf thread-stack: Accumulate IPC information Cycle and instruction counts are added to the stack. The IPC of a function and all functions it calls, is also recorded. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-14-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:57 -03:00
Adrian Hunter	5db47f43cc	perf intel-pt: Document IPC usage Add brief documentation about instructions-per-cycle (IPC) information derived from Intel PT. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-13-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:57 -03:00
Adrian Hunter	3f05516758	perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets When CYC packets are not available, it is still possible to count cycles using TSC/TMA/MTC timestamps. As the timestamp increments in TSC ticks, convert to CPU cycles using the current core-to-bus ratio. Do not accumulate cycles when control flow packet generation is not enabled, nor when time has been "lost", typically due to mwait, which is indicated by a TSC/TMA packet that is not part of PSB+. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:56 -03:00
Adrian Hunter	f3c98c4b5a	perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip To make it easier to add new code for different TIP cases, separate each case. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:56 -03:00
Adrian Hunter	9bc668e3bc	perf intel-pt: Record when decoding PSB+ packets In preparation for using MTC packets to count cycles, record whether decoding is between a PSB and PSBEND packets. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:56 -03:00
Adrian Hunter	68fb45bf17	perf script: Add output of IPC ratio Add field 'ipc' to display instructions-per-cycle. Example: perf record -e intel_pt/cyc/u ls perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid ls 2670177.697113434: 7f0dfdbcd090 _start+0x0 mov %rsp, %rdi IPC: 0.00 (1/877) ls 2670177.697113434: 7f0dfdbcd093 _start+0x3 callq 0x7f0dfdbce030 ls 2670177.697113434: 7f0dfdbce030 _dl_start+0x0 pushq %rbp ls 2670177.697113434: 7f0dfdbce031 _dl_start+0x1 mov %rsp, %rbp ls 2670177.697113434: 7f0dfdbce034 _dl_start+0x4 pushq %r15 ls 2670177.697113434: 7f0dfdbce036 _dl_start+0x6 pushq %r14 ls 2670177.697113434: 7f0dfdbce038 _dl_start+0x8 pushq %r13 ls 2670177.697113434: 7f0dfdbce03a _dl_start+0xa pushq %r12 ls 2670177.697113434: 7f0dfdbce03c _dl_start+0xc mov %rdi, %r12 ls 2670177.697113434: 7f0dfdbce03f _dl_start+0xf pushq %rbx ls 2670177.697113434: 7f0dfdbce040 _dl_start+0x10 sub $0x38, %rsp ls 2670177.697113434: 7f0dfdbce044 _dl_start+0x14 rdtsc ls 2670177.697113434: 7f0dfdbce046 _dl_start+0x16 mov %eax, %eax ls 2670177.697113434: 7f0dfdbce048 _dl_start+0x18 shl $0x20, %rdx ls 2670177.697113434: 7f0dfdbce04c _dl_start+0x1c or %rax, %rdx ls 2670177.697114471: 7f0dfdbce04f _dl_start+0x1f movq 0x27e22(%rip), %rax IPC: 0.00 (15/1685) ls 2670177.697116177: 7f0dfdbce056 _dl_start+0x26 movq %rdx, 0x27683(%rip) IPC: 0.00 (1/881) Note, the IPC values are low due to page faults at the beginning of execution. The additional cycles are due to the time to enter the kernel, not the actual kernel page fault handler. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:56 -03:00
Adrian Hunter	5b1dc0fd1d	perf intel-pt: Add support for samples to contain IPC ratio Copy the incremental instruction count and cycle count onto 'instructions' and 'branches' samples. Because Intel PT does not update the cycle count on every branch or instruction, the incremental values will often be zero. When there are values, they will be the number of instructions and number of cycles since the last update, and thus represent the average IPC since the last IPC value. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:56 -03:00
Adrian Hunter	61d276f428	perf tools: Add IPC information to perf_sample Add counts of instructions and cycles, in order to represent instructions-per-cycle (IPC). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:55 -03:00
Adrian Hunter	7b4b4f8388	perf intel-pt: Accumulate cycle count from CYC packets In preparation for providing instructions-per-cycle (IPC) information, accumulate cycle count from CYC packets. Although CYC packets are optional (requires config term 'cyc' to enable cycle-accurate mode when recording), the simplest way to count cycles is with CYC packets. The first complication is that cycles must be counted only when also counting instructions. That means when control flow packet generation is enabled i.e. between TIP.PGE and TIP.PGD packets. Also, sampling the cycle count follows the same rules as sampling the timestamp, that is, not before the instruction to which the decoder is walking is reached. In addition, the cycle count is not accurate for any but the first branch of a TNT packet. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:54 -03:00
Adrian Hunter	948e9dc8bb	perf intel-pt: Factor out intel_pt_update_sample_time To eliminate some duplication and make the code more understandable, factor out intel_pt_update_sample_time. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:54 -03:00
Alexey Budankov	d194d8fccf	perf record: Allow mixing --user-regs with --call-graph=dwarf When DWARF stacks were requested and at the same time that the user specifies a register set using the --user-regs option the full register context was being captured on samples: $ perf record -g --call-graph dwarf,1024 --user-regs=IP,SP,BP -- stack_test2.g.O3 188143843893585 0x6b48 [0x4f8]: PERF_RECORD_SAMPLE(IP, 0x4002): 23828/23828: 0x401236 period: 1363819 addr: 0x7ffedbdd51ac ... FP chain: nr:0 ... user regs: mask 0xff0fff ABI 64-bit .... AX 0x53b .... BX 0x7ffedbdd3cc0 .... CX 0xffffffff .... DX 0x33d3a .... SI 0x7f09b74c38d0 .... DI 0x0 .... BP 0x401260 .... SP 0x7ffedbdd3cc0 .... IP 0x401236 .... FLAGS 0x20a .... CS 0x33 .... SS 0x2b .... R8 0x7f09b74c3800 .... R9 0x7f09b74c2da0 .... R10 0xfffffffffffff3ce .... R11 0x246 .... R12 0x401070 .... R13 0x7ffedbdd5db0 .... R14 0x0 .... R15 0x0 ... ustack: size 1024, offset 0xe0 . data_src: 0x5080021 ... thread: stack_test2.g.O:23828 ...... dso: /root/abudanko/stacks/stack_test2.g.O3 I.e. the --user-regs=IP,SP,BP was being ignored, being overridden by the needs of --call-graph=dwarf. After applying the change in this patch the sample data contains the user specified register, but making sure that at least the minimal set of register needed for DWARF unwinding (DWARF_MINIMAL_REGS) is requested. The user is warned that DWARF unwinding may not work if extra registers end up being needed. -g call-graph dwarf,K full_regs --user-regs=user_regs user_regs -g call-graph dwarf,K --user-regs=user_regs user_regs + DWARF_MINIMAL_REGS $ perf record -g --call-graph dwarf,1024 --user-regs=BP -- ls WARNING: The use of --call-graph=dwarf may require all the user registers, specifying a subset with --user-regs may render DWARF unwinding unreliable, so the minimal registers set (IP, SP) is explicitly forced. arch COPYING Documentation include Kbuild lbuild MAINTAINERS modules.builtin Module.symvers perf.data.old scripts System.map virt block CREDITS drivers init Kconfig lib Makefile modules.builtin.modinfo net README security tools vmlinux certs crypto fs ipc kernel LICENSES mm modules.order perf.data samples sound usr vmlinux.o [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ] 188368474305373 0x5e40 [0x470]: PERF_RECORD_SAMPLE(IP, 0x4002): 23839/23839: 0x401236 period: 1260507 addr: 0x7ffd3d85e96c ... FP chain: nr:0 ... user regs: mask 0x1c0 ABI 64-bit .... BP 0x401260 .... SP 0x7ffd3d85cc20 .... IP 0x401236 ... ustack: size 1024, offset 0x58 . data_src: 0x5080021 Committer notes: Detected build failures on arches where PERF_REGS_ is not available, such as debian:experimental-x-{mips,mips64,mipsel}, fedora 24 and 30 for ARC uClibc and glibc, reported to Alexey that provided a patch moving the DWARF_MINIMAL_REGS from evsel.c to util/perf_regs.h, where it is guarded by an HAVE_PERF_REGS_SUPPORT ifdef. Committer testing: # perf record --user-regs=bp,ax -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.955 MB perf.data (1773 samples) ] # perf script -F+uregs \| grep AX: \| head -5 perf 1719 [000] 181.272398: 1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00 perf 1719 [000] 181.272402: 1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00 perf 1719 [000] 181.272403: 8 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00 perf 1719 [000] 181.272405: 181 cycles: ffffffffba06a7c6 native_write_msr+0x6 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00 perf 1719 [000] 181.272406: 4405 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00 # perf record --call-graph=dwarf --user-regs=bp,ax -a sleep 1 WARNING: The use of --call-graph=dwarf may require all the user registers, specifying a subset with --user-regs may render DWARF unwinding unreliable, so the minimal registers set (IP, SP) is explicitly forced. [ perf record: Woken up 55 times to write data ] [ perf record: Captured and wrote 24.184 MB perf.data (2841 samples) ] [root@quaco ~]# perf script --hide-call-graph -F+uregs \| grep AX: \| head -5 perf 1729 [000] 211.268006: 1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db perf 1729 [000] 211.268014: 1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db perf 1729 [000] 211.268017: 5 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db perf 1729 [000] 211.268020: 48 cycles: ffffffffba06a7c6 native_write_msr+0x6 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db perf 1729 [000] 211.268024: 490 cycles: ffffffffba00e471 intel_bts_enable_local+0x21 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db # Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/e7fd37b1-af22-0d94-a0dc-5895e803bbfe@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:54 -03:00
Leo Yan	e5f177a578	perf symbols: Remove unused variable 'err' Variable 'err' is defined but never used in function symsrc__init(), remove it and directly return -1 at the end of the function. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190530093801.20510-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:54 -03:00
Arnaldo Carvalho de Melo	0da6ae94e4	perf data: Document directory format header: HEADER_DIR_FORMAT We forgot to update the perf.data file format document for the HEADER_DIR_FORMAT header, do it now from comments in the patch introducing it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Chong Jiang <chongjiang@chromium.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Fixes: `258031c017` ("perf header: Add DIR_FORMAT feature to describe directory data") Link: https://lkml.kernel.org/n/tip-jbrzb7ijb5al33gi8br6f9rr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:53 -03:00
Arnaldo Carvalho de Melo	a9de7cfc76	perf data: Document clockid header: HEADER_CLOCKID We forgot to update the perf.data file format document for the HEADER_CLOCKID header, do it now from comments in the patch introducing it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Chong Jiang <chongjiang@chromium.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Fixes: `cf7905165f` ("perf record: Encode -k clockid frequency into Perf trace") Link: https://lkml.kernel.org/n/tip-slhnjp06027j3ae17qqetzxj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:53 -03:00
Arnaldo Carvalho de Melo	835fbf126c	perf data: Document memory topology header: HEADER_MEM_TOPOLOGY We forgot to update the perf.data file format document for the HEADER_MEM_TOPOLOGY header, do it now from comments in the patch introducing it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Chong Jiang <chongjiang@chromium.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Simon Que <sque@chromium.org> Fixes: `e2091cedd5` ("perf tools: Add MEM_TOPOLOGY feature to perf data file") Link: https://lkml.kernel.org/n/tip-h5lcm1nbe9ztxwm61gmadd56@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:53 -03:00
Song Liu	8e21be4f81	perf data: Add description of header HEADER_BPF_PROG_INFO and HEADER_BPF_BTF This patch addes description of HEADER_BPF_PROG_INFO and HEADER_BPF_BTF to perf.data-file-format.txt. Requested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: `606f972b13` ("perf bpf: Save bpf_prog_info information as headers to perf.data") Link: http://lkml.kernel.org/r/20190521064406.2498925-1-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-06-05 09:47:52 -03:00
Ingo Molnar	f7b6a8b30c	Linux 5.2-rc3 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlz0N88eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG3kIH/2uP/+A3STjoURBh nCZVThVUXryD+9eughto97PfkBsVs6Wfylx/WX4Qhi4zi8PnIM8DnY9MuCdfhT5+ 7WN76MQrCxagHOtHfGf2yXYtYP4wfNmbttWPxsxtEsWVNMzboCMILTGeSpZlwD04 bb5qdRVeAcULO3A0xAJXS/sSAvX9mFDLDfOV24G2ksRbmrzDs8KPRVJBoSicem+Z Rz0wktu+G3GAb8j3mBu2DcDe66pLGLCbQ3VxwpbCN0+ZyEXUkiY7khGCFEX0SxLH 1+SICNVbdJWMvhQf4p0eEUX/5NhIhtZyUFMiXX/vHnglECTRk4AQ9LQaVuYXDey9 wsnlA9o= =KXpG -----END PGP SIGNATURE----- Merge tag 'v5.2-rc3' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-06-03 11:56:35 +02:00
Linus Torvalds	6751b8d91a	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "On the kernel side there's a bunch of ring-buffer ordering fixes for a reproducible bug, plus a PEBS constraints regression fix. Plus tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools headers UAPI: Sync kvm.h headers with the kernel sources perf record: Fix s390 missing module symbol and warning for non-root users perf machine: Read also the end of the kernel perf test vmlinux-kallsyms: Ignore aliases to _etext when searching on kallsyms perf session: Add missing swap ops for namespace events perf namespace: Protect reading thread's namespace tools headers UAPI: Sync drm/drm.h with the kernel tools headers UAPI: Sync drm/i915_drm.h with the kernel tools headers UAPI: Sync linux/fs.h with the kernel tools headers UAPI: Sync linux/sched.h with the kernel tools arch x86: Sync asm/cpufeatures.h with the with the kernel tools include UAPI: Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls perf arm64: Fix mksyscalltbl when system kernel headers are ahead of the kernel perf data: Fix 'strncat may truncate' build failure with recent gcc perf/ring-buffer: Use regular variables for nesting perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb->user_page data perf/ring_buffer: Add ordering to rb->nest increment perf/ring_buffer: Fix exposing a temporarily decreased data_head perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints	2019-06-02 11:08:12 -07:00
Thomas Gleixner	4f19048fd0	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 166 Based on 1 normalized pattern(s): licensed under the terms of the gnu gpl license version 2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 62 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070033.929121379@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:26:39 -07:00
Thomas Gleixner	c942fddf87	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 Based on 3 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version [author] [kishon] [vijay] [abraham] [i] [kishon]@[ti] [com] this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version [author] [graeme] [gregory] [gg]@[slimlogic] [co] [uk] [author] [kishon] [vijay] [abraham] [i] [kishon]@[ti] [com] [based] [on] [twl6030]_[usb] [c] [author] [hema] [hk] [hemahk]@[ti] [com] this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1105 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070033.202006027@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:26:37 -07:00
Thomas Gleixner	1a59d1b8e0	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation inc 59 temple place suite 330 boston ma 02111 1307 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1334 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:26:35 -07:00
Thomas Gleixner	2874c5fd28	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:26:32 -07:00
Adrian Hunter	14f1cfd4f7	perf intel-pt: Rationalize intel_pt_sync_switch()'s use of next_tid Returning 1 from intel_pt_sync_switch() causes the current tid to be set. That negates the need to keep next_tid anymore. Rationalize the code to that effect. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190412113830.4126-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:45 -03:00
Adrian Hunter	c7b4f15ff7	perf intel-pt: Improve sync_switch by processing PERF_RECORD_SWITCH* in events sync_switch is a facility to synchronize decoding more closely with the point in the kernel when the context actually switched. Improve it by processing "context switch in" events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190412113830.4126-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:45 -03:00
Adrian Hunter	3cd3216dbb	perf scripts python: export-to-postgresql.py: Add support for pyside2 pyside2 is the future for pyside support. Note pyside use Qt4 whereas pyside2 uses Qt5. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190412113830.4126-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:45 -03:00
Adrian Hunter	bfb3170e24	perf scripts python: export-to-sqlite.py: Add support for pyside2 pyside2 is the future for pyside support. Note pyside use Qt4 whereas pyside2 uses Qt5. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190412113830.4126-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:45 -03:00
Adrian Hunter	df8ea22a8f	perf scripts python: exported-sql-viewer.py: Add support for pyside2 pyside2 is the future for pyside support. Note pyside use Qt4 whereas pyside2 uses Qt5. Committer testing: On a system with just: # rpm -qa\| grep -i pyside python2-pyside-1.2.4-7.fc29.x86_64 # Running: $ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db & [1] 7438 Makes it use the pyside 1 files: $ grep -i pyside /proc/7438/maps \| cut -d ' ' -f 6- \| sort -u /usr/lib64/libpyside-python2.7.so.1.2.4 /usr/lib64/python2.7/site-packages/PySide/QtCore.so /usr/lib64/python2.7/site-packages/PySide/QtGui.so /usr/lib64/python2.7/site-packages/PySide/QtSql.so $ rpm -qf /usr/lib64/libpyside-python2.7.so.1.2.4 python2-pyside-1.2.4-7.fc29.x86_64 $ To get PySide2 I guess one needs to do: $ pip install PySide2 But thats a 142MiB download I can't do right now, perhaps before pushing upstream... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190412113830.4126-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:45 -03:00
Adrian Hunter	1ed7f47fd3	perf scripts python: exported-sql-viewer.py: Use argparse module for argument parsing The argparse module makes it easier to add new arguments. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190412113830.4126-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:45 -03:00
Adrian Hunter	c6aba1bf25	perf scripts python: exported-sql-viewer.py: Change python2 to python Now that there is also support for python3, there is no need to specify python2 explicitly. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190412113830.4126-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:45 -03:00
Arnaldo Carvalho de Melo	2d45ef7033	perf top: Lower message level for failure on synthesizing events for pre-existing BPF programs Move it from being a pr_warning() to a pr_debug(). Also capitalize BPF and explain what gets missing when we're not able to synthesize these events: we'll not be able to resolve symbols, etc. Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/n/tip-whpnfnw6xtd939odgt9bw9as@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:45 -03:00
Arnaldo Carvalho de Melo	7952fa3b54	perf python: Remove -fstack-protector-strong if clang doesn't have it Some distros put -fstack-protector-strong in the compiler flags to be used to build python extensions, but then, the clang version in that distro doesn't know about that, only gcc does. Check if that is the case and remove it from the set of options used to build the python binding with clang. Case at hand: oraclelinux:7 $ head -2 /etc/os-release NAME="Oracle Linux Server" VERSION="7.6" $ grep stack-protector /usr/lib64/python2.7/_sysconfigdata.py \| head -1 \| cut -c-120 'CFLAGS': '-fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --para $ gcc version 4.8.5 20150623 (Red Hat 4.8.5-36.0.1) (GCC) clang version 3.4.2 (tags/RELEASE_34/dot2-final) clang: error: unknown argument: '-fstack-protector-strong' clang: error: unknown argument: '-fstack-protector-strong' error: command 'clang' failed with exit status 1 cp: cannot stat '/tmp/build/perf/python_ext_build/lib/perf.so': No such file or directory make[2]: ** [/tmp/build/perf/python/perf.so] Error 1 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-brmp2415zxpbhz45etkgjoma@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:44 -03:00
Arnaldo Carvalho de Melo	da2019633f	perf annotate TUI browser: Do not use member from variable within its own initialization Some compilers will complain when using a member of a struct to initialize another member, in the same struct initialization. For instance: debian:8 Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0) oraclelinux:7 clang version 3.4.2 (tags/RELEASE_34/dot2-final) Produce: ui/browsers/annotate.c:104:12: error: variable 'ops' is uninitialized when used within its own initialization [-Werror,-Wuninitialized] (!ops.current_entry \|\| ^~~ 1 error generated. So use an extra variable, initialized just before that struct, to have the value used in the expressions used to init two of the struct members. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: `c298304bd7` ("perf annotate: Use a ops table for annotation_line__write()") Link: https://lkml.kernel.org/n/tip-f9nexro58q62l3o9hez8hr0i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:44 -03:00
Donald Yandt	34b65affe1	perf machine: Return NULL instead of null-terminating /proc/version array Return NULL instead of null-terminating version char array when fgets fails due to end-of-file or error. Signed-off-by: Donald Yandt <donald.yandt@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Fixes: `30ba5b0e66` ("perf machine: Null-terminate version char array upon fgets(/proc/version) error") Link: http://lkml.kernel.org/r/20190528134128.30841-1-donald.yandt@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:44 -03:00
Arnaldo Carvalho de Melo	80ec26d110	perf version: Append 12 git SHA chars to the version string Bumping it from just 4: Before: $ perf -v perf version 5.2.rc1.g80978f $ After: $ perf -v perf version 5.2.rc1.g80978fc864c5 $ Requested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-p4yun2nxlo7eeeohyx5v4kw7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:44 -03:00
Jiri Olsa	8201787cbb	perf script: Remove superfluous BPF event titles There's no need to display "ksymbol event with" text for the PERF_RECORD_KSYMBOL event and "bpf event with" test for the PERF_RECORD_BPF_EVENT event. Remove it so it also goes along with other side-band events display. Before: # perf script --show-bpf-events ... swapper 0 [000] 0.000000: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0ef971d len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174 swapper 0 [000] 0.000000: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 36 After: # perf script --show-bpf-events ... swapper 0 [000] 0.000000: PERF_RECORD_KSYMBOL addr ffffffffc0ef971d len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174 swapper 0 [000] 0.000000: PERF_RECORD_BPF_EVENT type 1, flags 0, id 36 Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190508132010.14512-12-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:44 -03:00
Jiri Olsa	490c8cc949	perf script: Add --show-bpf-events to show eBPF related events Add the --show-bpf-events command line option to show the eBPF related events: PERF_RECORD_KSYMBOL PERF_RECORD_BPF_EVENT Usage: # perf record -a ... # perf script --show-bpf-events ... swapper 0 [000] 0.000000: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0ef971d len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174 swapper 0 [000] 0.000000: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 36 ... Committer testing: # perf script --show-bpf-events \| egrep -i 'PERF_RECORD_(BPF\|KSY)' 0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc029a6c3 len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba 0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 47 0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc029c1ae len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174 0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 48 0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc02ddd1c len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba 0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 49 0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc02dfc11 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174 0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 50 0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc045da0a len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba 0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 51 0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc04ef4b4 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174 0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 52 0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc09e15da len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba 0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 53 0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0d2b1a3 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174 0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 54 0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0fd9850 len 381 type 1 flags 0x0 name bpf_prog_819967866022f1e1_sys_enter 0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 179 0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0feb1ec len 191 type 1 flags 0x0 name bpf_prog_c1bd85c092d6e4aa_sys_exit 0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 180 ^C[root@quaco pt]# perf evlist intel_pt//ku dummy:u # Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190508132010.14512-11-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:44 -03:00
Jiri Olsa	4f600bcf65	perf tests: Add map_groups__merge_in test Add map_groups__merge_in test to test the map_groups__merge_in function usage - merging kcore maps into existing eBPF maps. Committer testing: # perf test merge 59: map_groups__merge_in : Ok # perf test -v merge 59: map_groups__merge_in : --- start --- test child forked, pid 8349 test child finished with 0 ---- end ---- map_groups__merge_in: Ok # Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190508132010.14512-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:44 -03:00
Jiri Olsa	1c4924220c	perf script: Pad DSO name for --call-trace Pad the DSO name in --call-trace so we don't have the indent screwed by different DSO name lengths, as now for kernel there's also BPF code displayed. # perf-with-kcore record pt -e intel_pt//ku -- sleep 1 # perf-core/perf-with-kcore script pt --call-trace Before: sleep 3660 [16] 57036.806464404: ([kernel.kallsyms]) kretprobe_perf_func sleep 3660 [16] 57036.806464404: ([kernel.kallsyms]) trace_call_bpf sleep 3660 [16] 57036.806464404: ([kernel.kallsyms]) __x86_indirect_thunk_rax sleep 3660 [16] 57036.806464404: ([kernel.kallsyms]) __x86_indirect_thunk_rax sleep 3660 [16] 57036.806464725: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_get_current_pid_tgid sleep 3660 [16] 57036.806464725: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_ktime_get_ns sleep 3660 [16] 57036.806464725: ([kernel.kallsyms]) __x86_indirect_thunk_rax sleep 3660 [16] 57036.806464725: ([kernel.kallsyms]) __x86_indirect_thunk_rax sleep 3660 [16] 57036.806465045: (bpf_prog_da4fe6b3d2c29b25_trace_return) __htab_map_lookup_elem sleep 3660 [16] 57036.806465366: ([kernel.kallsyms]) memcmp sleep 3660 [16] 57036.806465687: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_probe_read sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) probe_kernel_read sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) __check_object_size sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) check_stack_object sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) copy_user_enhanced_fast_string sleep 3660 [16] 57036.806465687: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_probe_read sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) probe_kernel_read sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) __check_object_size sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) check_stack_object sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) copy_user_enhanced_fast_string sleep 3660 [16] 57036.806466008: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_get_current_uid_gid sleep 3660 [16] 57036.806466008: ([kernel.kallsyms]) from_kgid sleep 3660 [16] 57036.806466008: ([kernel.kallsyms]) from_kuid sleep 3660 [16] 57036.806466008: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_perf_event_output sleep 3660 [16] 57036.806466328: ([kernel.kallsyms]) perf_event_output sleep 3660 [16] 57036.806466328: ([kernel.kallsyms]) perf_prepare_sample sleep 3660 [16] 57036.806466328: ([kernel.kallsyms]) perf_misc_flags sleep 3660 [16] 57036.806466328: ([kernel.kallsyms]) __x86_indirect_thunk_rax sleep 3660 [16] 57036.806466328: ([kernel.kallsyms]) __x86_indirect_thunk_rax sleep 3660 [16] 57036.806466328: ([kvm]) kvm_is_in_guest sleep 3660 [16] 57036.806466649: ([kernel.kallsyms]) __perf_event_header__init_id.isra.0 sleep 3660 [16] 57036.806466649: ([kernel.kallsyms]) perf_output_begin After: sleep 3660 [16] 57036.806464404: ([kernel.kallsyms] ) kretprobe_perf_func sleep 3660 [16] 57036.806464404: ([kernel.kallsyms] ) trace_call_bpf sleep 3660 [16] 57036.806464404: ([kernel.kallsyms] ) __x86_indirect_thunk_rax sleep 3660 [16] 57036.806464404: ([kernel.kallsyms] ) __x86_indirect_thunk_rax sleep 3660 [16] 57036.806464725: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_get_current_pid_tgid sleep 3660 [16] 57036.806464725: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_ktime_get_ns sleep 3660 [16] 57036.806464725: ([kernel.kallsyms] ) __x86_indirect_thunk_rax sleep 3660 [16] 57036.806464725: ([kernel.kallsyms] ) __x86_indirect_thunk_rax sleep 3660 [16] 57036.806465045: (bpf_prog_da4fe6b3d2c29b25_trace_return ) __htab_map_lookup_elem sleep 3660 [16] 57036.806465366: ([kernel.kallsyms] ) memcmp sleep 3660 [16] 57036.806465687: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_probe_read sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) probe_kernel_read sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) __check_object_size sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) check_stack_object sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) copy_user_enhanced_fast_string sleep 3660 [16] 57036.806465687: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_probe_read sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) probe_kernel_read sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) __check_object_size sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) check_stack_object sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) copy_user_enhanced_fast_string sleep 3660 [16] 57036.806466008: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_get_current_uid_gid sleep 3660 [16] 57036.806466008: ([kernel.kallsyms] ) from_kgid sleep 3660 [16] 57036.806466008: ([kernel.kallsyms] ) from_kuid sleep 3660 [16] 57036.806466008: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_perf_event_output sleep 3660 [16] 57036.806466328: ([kernel.kallsyms] ) perf_event_output sleep 3660 [16] 57036.806466328: ([kernel.kallsyms] ) perf_prepare_sample sleep 3660 [16] 57036.806466328: ([kernel.kallsyms] ) perf_misc_flags sleep 3660 [16] 57036.806466328: ([kernel.kallsyms] ) __x86_indirect_thunk_rax sleep 3660 [16] 57036.806466328: ([kernel.kallsyms] ) __x86_indirect_thunk_rax Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190508132010.14512-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:44 -03:00
Jiri Olsa	6c398d723a	perf dso: Add BPF DSO read and size hooks Add BPF related code into DSO reading paths to return size (bpf_size) and read the BPF code (bpf_read). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190508132010.14512-5-jolsa@kernel.org [ Use uintptr_t when casting from u64 to u8 pointers ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:44 -03:00
Jiri Olsa	cacddfe7b0	perf dso: Simplify dso_cache__read function There's no need for the while loop now, also we can connect two (ret > 0) condition legs together. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190508132010.14512-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:44 -03:00
Jiri Olsa	ea5db1bd5a	perf dso: Separate generic code in dso_cache__read Move the file specific code in the dso_cache__read function to a separate file_read function. I'll add BPF specific code in the following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190508132010.14512-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:44 -03:00
Jiri Olsa	5523769ee1	perf dso: Separate generic code in dso__data_file_size() Moving file specific code in dso__data_file_size function into separate file_size function. I'll add bpf specific code in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190508132010.14512-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:43 -03:00
Namhyung Kim	7cb10a08df	perf tools: Remove const from thread read accessors The namespaces and comm fields of a thread are protected by rwsem and require write access for it. So it ended up using a cast to remove the const qualifier. Let's get rid of the const then. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Krister Johansen <kjlx@templeofstupid.com> Link: http://lkml.kernel.org/r/20190527061149.168640-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:43 -03:00
Namhyung Kim	a0c0a4ac02	perf top: Add --namespaces option Since 'perf record' already have this option, let's have it for 'perf top' as well. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Krister Johansen <kjlx@templeofstupid.com> Link: http://lkml.kernel.org/r/20190522053250.207156-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:43 -03:00
Arnaldo Carvalho de Melo	a9a187a749	perf trace: Beautify 'sync_file_range' arguments Use existing beautifiers for the first arg, fd, assigned using the heuristic that looks for syscall arg names and associates SCA_FD with 'fd' named argumes, and wire up the recently introduced sync_file_range flags table generator. Now it should be possible to just use: perf trace -e sync_file_range As root and see all sync_file_range syscalls with its args beautified. Doing a syscall strace like session looking for this syscall, then run postgresql's initdb command: # perf trace -e sync_file_range <SNIP> initdb/1332 sync_file_range(6</var/lib/pgsql/data/global/1260_fsm>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(6</var/lib/pgsql/data/global/1260_fsm>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(7</var/lib/pgsql/data/base/1/2682>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(6</var/lib/pgsql/data/global/1260_fsm>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(7</var/lib/pgsql/data/base/1/2682>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(6</var/lib/pgsql/data/global/1260_fsm>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(5</var/lib/pgsql/data/global>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(4</var/lib/pgsql/data>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 initdb/1332 sync_file_range(4</var/lib/pgsql/data>, 0, 0, SYNC_FILE_RANGE_WRITE) = 0 ^C # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-8tqy34xhpg8gwnaiv74xy93w@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:43 -03:00
Arnaldo Carvalho de Melo	8ef6d74e1d	perf beauty: Add generator for sync_file_range's 'flags' arg values $ tools/perf/trace/beauty/sync_file_range.sh static const char *sync_file_range_flags[] = { [ilog2(1) + 1] = "WAIT_BEFORE", [ilog2(2) + 1] = "WRITE", [ilog2(4) + 1] = "WAIT_AFTER", }; $ When all are the above are present, then we have something called SYNC_FILE_RANGE_WRITE_AND_WAIT, that will be special cased in the upcoming scnprintf beautifier for this flags arg. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-uf2vd7bc8fkz65j7yit8dh84@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:43 -03:00
Arnaldo Carvalho de Melo	ee364dcdcd	perf trace beauty clone: Handle CLONE_PIDFD In addition to the older flags. This will allow something like this to be implemented in 'perf trace" perf trace -e clone/PIDFD in flags/ I.e. ask for strace like tracing, system wide, looking for 'clone' syscalls that have the CLONE_PIDFD bit set in the 'flags' arg. For now we'll just see PIDFD if it is set in the 'flags' arg. Cc: Christian Brauner <christian@brauner.io> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-drq9h7s8gcv8b87064fp6lb0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:43 -03:00
Arnaldo Carvalho de Melo	f6af095668	perf trace: Beautify 'fsmount' arguments Use existing beautifiers for the first arg, fd, assigned using the heuristic that looks for syscall arg names and associates SCA_FD with 'fd' named argumes, and wire up the recently introduced fsmount attr_flags table generator. Now it should be possible to just use: perf trace -e fsmount As root and see all fsmount syscalls with its args beautified. # cat sys_fsmount.c #define _GNU_SOURCE /* See feature_test_macros(7) / #include <unistd.h> #include <sys/syscall.h> / For SYS_xxx definitions / #define __NR_fsmount 432 #define MOUNT_ATTR_RDONLY 0x00000001 / Mount read-only / #define MOUNT_ATTR_NOSUID 0x00000002 / Ignore suid and sgid bits / #define MOUNT_ATTR_NODEV 0x00000004 / Disallow access to device special files / #define MOUNT_ATTR_NOEXEC 0x00000008 / Disallow program execution / #define MOUNT_ATTR__ATIME 0x00000070 / Setting on how atime should be updated / #define MOUNT_ATTR_RELATIME 0x00000000 / - Update atime relative to mtime/ctime. / #define MOUNT_ATTR_NOATIME 0x00000010 / - Do not update access times. / #define MOUNT_ATTR_STRICTATIME 0x00000020 / - Always perform atime updates / #define MOUNT_ATTR_NODIRATIME 0x00000080 / Do not update directory access times / static inline int sys_fsmount(int fs_fd, int flags, int attr_flags) { syscall(__NR_fsmount, fs_fd, flags, attr_flags); } int main(int argc, char argv[]) { int attr_flags = 0, fs_fd = 0; sys_fsmount(fs_fd++, 0, attr_flags); attr_flags \|= MOUNT_ATTR_RDONLY; sys_fsmount(fs_fd++, 1, attr_flags); attr_flags \|= MOUNT_ATTR_NOSUID; sys_fsmount(fs_fd++, 0, attr_flags); attr_flags \|= MOUNT_ATTR_NODEV; sys_fsmount(fs_fd++, 1, attr_flags); attr_flags \|= MOUNT_ATTR_NOEXEC; sys_fsmount(fs_fd++, 0, attr_flags); attr_flags \|= MOUNT_ATTR_NOATIME; sys_fsmount(fs_fd++, 1, attr_flags); attr_flags \|= MOUNT_ATTR_STRICTATIME; sys_fsmount(fs_fd++, 0, attr_flags); attr_flags \|= MOUNT_ATTR_NODIRATIME; sys_fsmount(fs_fd++, 0, attr_flags); return 0; } # # perf trace -e fsmount ./sys_fsmount fsmount(0, 0, MOUNT_ATTR_RELATIME) = -1 EINVAL (Invalid argument) fsmount(1, FSMOUNT_CLOEXEC, MOUNT_ATTR_RDONLY\|MOUNT_ATTR_RELATIME) = -1 EINVAL (Invalid argument) fsmount(2, 0, MOUNT_ATTR_RDONLY\|MOUNT_ATTR_NOSUID\|MOUNT_ATTR_RELATIME) = -1 EINVAL (Invalid argument) fsmount(3, FSMOUNT_CLOEXEC, MOUNT_ATTR_RDONLY\|MOUNT_ATTR_NOSUID\|MOUNT_ATTR_NODEV\|MOUNT_ATTR_RELATIME) = -1 EBADF (Bad file descriptor) fsmount(4, 0, MOUNT_ATTR_RDONLY\|MOUNT_ATTR_NOSUID\|MOUNT_ATTR_NODEV\|MOUNT_ATTR_NOEXEC\|MOUNT_ATTR_RELATIME) = -1 EBADF (Bad file descriptor) fsmount(5, FSMOUNT_CLOEXEC, MOUNT_ATTR_RDONLY\|MOUNT_ATTR_NOSUID\|MOUNT_ATTR_NODEV\|MOUNT_ATTR_NOEXEC\|MOUNT_ATTR_NOATIME) = -1 EBADF (Bad file descriptor) fsmount(6, 0, MOUNT_ATTR_RDONLY\|MOUNT_ATTR_NOSUID\|MOUNT_ATTR_NODEV\|MOUNT_ATTR_NOEXEC\|MOUNT_ATTR_NOATIME\|MOUNT_ATTR_STRICTATIME) = -1 EINVAL (Invalid argument) fsmount(7, 0, MOUNT_ATTR_RDONLY\|MOUNT_ATTR_NOSUID\|MOUNT_ATTR_NODEV\|MOUNT_ATTR_NOEXEC\|MOUNT_ATTR_NOATIME\|MOUNT_ATTR_STRICTATIME\|MOUNT_ATTR_NODIRATIME) = -1 EINVAL (Invalid argument) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-w71uge0sfo6ns9uclhwtthca@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:43 -03:00
Arnaldo Carvalho de Melo	f5b91dbba1	perf trace: Introduce syscall_arg__scnprintf_strarray_flags So that one can just define a strarray and process it as a set of flags, similar to syscall_arg__scnprintf_strarray() with plain arrays. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-nnt25wkpkow2w0yefhi6sb7q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:43 -03:00
Arnaldo Carvalho de Melo	3637c64731	perf beauty: Add generator for fsmount's 'attr_flags' arg values $ tools/perf/trace/beauty/fsmount.sh static const char *fsmount_attr_flags[] = { [ilog2(0x00000001) + 1] = "RDONLY", [ilog2(0x00000002) + 1] = "NOSUID", [ilog2(0x00000004) + 1] = "NODEV", [ilog2(0x00000008) + 1] = "NOEXEC", [ilog2(0x00000010) + 1] = "NOATIME", [ilog2(0x00000020) + 1] = "STRICTATIME", [ilog2(0x00000080) + 1] = "NODIRATIME", } MOUNT_ATTR__ATIME and MOUNT_ATTR_RELATIME will be special cased in the fsmount__scnprintf_flags() beautifier. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-sl24d7m2ge82mfmrbaf1mb0s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:43 -03:00
Arnaldo Carvalho de Melo	dcc6fd64f2	perf trace: Beautify 'fsconfig' arguments Use existing beautifiers for the first arg, fd, assigned using the heuristic that looks for syscall arg names and associates SCA_FD with 'fd' named argumes, and wire up the recently introduced fsconfig cmd table generator. Now it should be possible to just use: perf trace -e fsconfig As root and see all fsconfig syscalls with its args beautified, more work needed to look at the command and according to it handle the 'key', 'value' and 'aux' args, using the 'fcntl' and 'futex' beautifiers as a starting point to see how to suppress sets of these last three args that may not be used by the 'cmd' arg, etc. # cat sys_fsconfig.c #define _GNU_SOURCE /* See feature_test_macros(7) / #include <unistd.h> #include <sys/syscall.h> / For SYS_xxx definitions / #include <fcntl.h> #define __NR_fsconfig 431 enum fsconfig_command { FSCONFIG_SET_FLAG = 0, / Set parameter, supplying no value / FSCONFIG_SET_STRING = 1, / Set parameter, supplying a string value / FSCONFIG_SET_BINARY = 2, / Set parameter, supplying a binary blob value / FSCONFIG_SET_PATH = 3, / Set parameter, supplying an object by path / FSCONFIG_SET_PATH_EMPTY = 4, / Set parameter, supplying an object by (empty) path / FSCONFIG_SET_FD = 5, / Set parameter, supplying an object by fd / FSCONFIG_CMD_CREATE = 6, / Invoke superblock creation / FSCONFIG_CMD_RECONFIGURE = 7, / Invoke superblock reconfiguration / }; static inline int sys_fsconfig(int fd, int cmd, const char key, const void value, int aux) { syscall(__NR_fsconfig, fd, cmd, key, value, aux); } int main(int argc, char argv[]) { int fd = 0, aux = 0; open("/foo", 0); sys_fsconfig(fd++, FSCONFIG_SET_FLAG, "/foo1", "/bar1", aux++); sys_fsconfig(fd++, FSCONFIG_SET_STRING, "/foo2", "/bar2", aux++); sys_fsconfig(fd++, FSCONFIG_SET_BINARY, "/foo3", "/bar3", aux++); sys_fsconfig(fd++, FSCONFIG_SET_PATH, "/foo4", "/bar4", aux++); sys_fsconfig(fd++, FSCONFIG_SET_PATH_EMPTY, "/foo5", "/bar5", aux++); sys_fsconfig(fd++, FSCONFIG_SET_FD, "/foo6", "/bar6", aux++); sys_fsconfig(fd++, FSCONFIG_CMD_CREATE, "/foo7", "/bar7", aux++); sys_fsconfig(fd++, FSCONFIG_CMD_RECONFIGURE, "/foo8", "/bar8", aux++); return 0; } # trace -e fsconfig ./sys_fsconfig fsconfig(0, FSCONFIG_SET_FLAG, 0x40201b, 0x402015, 0) = -1 EINVAL (Invalid argument) fsconfig(1, FSCONFIG_SET_STRING, 0x402027, 0x402021, 1) = -1 EINVAL (Invalid argument) fsconfig(2, FSCONFIG_SET_BINARY, 0x402033, 0x40202d, 2) = -1 EINVAL (Invalid argument) fsconfig(3, FSCONFIG_SET_PATH, 0x40203f, 0x402039, 3) = -1 EBADF (Bad file descriptor) fsconfig(4, FSCONFIG_SET_PATH_EMPTY, 0x40204b, 0x402045, 4) = -1 EBADF (Bad file descriptor) fsconfig(5, FSCONFIG_SET_FD, 0x402057, 0x402051, 5) = -1 EINVAL (Invalid argument) fsconfig(6, FSCONFIG_CMD_CREATE, 0x402063, 0x40205d, 6) = -1 EINVAL (Invalid argument) fsconfig(7, FSCONFIG_CMD_RECONFIGURE, 0x40206f, 0x402069, 7) = -1 EINVAL (Invalid argument) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-fb04b76cm59zfuv1wzu40uxy@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:43 -03:00
Arnaldo Carvalho de Melo	d35293004a	perf beauty: Add generator for fsconfig's 'cmd' arg values $ tools/perf/trace/beauty/fsconfig.sh static const char *fsconfig_cmds[] = { [0] = "SET_FLAG", [1] = "SET_STRING", [2] = "SET_BINARY", [3] = "SET_PATH", [4] = "SET_PATH_EMPTY", [5] = "SET_FD", [6] = "CMD_CREATE", [7] = "CMD_RECONFIGURE", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-u721396rkqmawmt91dwwsntu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:43 -03:00
Arnaldo Carvalho de Melo	693bd3949b	perf trace: Beautify 'fspick' arguments Use existing beautifiers for the first 2 args (dfd, path) and wire up the recently introduced fspick flags table generator. Now it should be possible to just use: perf trace -e fspick As root and see all move_mount syscalls with its args beautified, either using the vfs_getname perf probe method or using the augmented_raw_syscalls.c eBPF helper to get the pathnames, the other args should work in all cases, i.e. all that is needed can be obtained directly from the raw_syscalls:sys_enter tracepoint args. # cat sys_fspick.c #define _GNU_SOURCE /* See feature_test_macros(7) / #include <unistd.h> #include <sys/syscall.h> / For SYS_xxx definitions / #include <fcntl.h> #define __NR_fspick 433 #define FSPICK_CLOEXEC 0x00000001 #define FSPICK_SYMLINK_NOFOLLOW 0x00000002 #define FSPICK_NO_AUTOMOUNT 0x00000004 #define FSPICK_EMPTY_PATH 0x00000008 static inline int sys_fspick(int fd, const char path, int flags) { syscall(__NR_fspick, fd, path, flags); } int main(int argc, char *argv[]) { int flags = 0, fd = 0; open("/foo", 0); sys_fspick(fd++, "/foo1", flags); flags \|= FSPICK_CLOEXEC; sys_fspick(fd++, "/foo2", flags); flags \|= FSPICK_SYMLINK_NOFOLLOW; sys_fspick(fd++, "/foo3", flags); flags \|= FSPICK_NO_AUTOMOUNT; sys_fspick(fd++, "/foo4", flags); flags \|= FSPICK_EMPTY_PATH; return sys_fspick(fd++, "/foo5", flags); } # perf trace -e fspick ./sys_fspick LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o fspick(0, "/foo1", 0) = -1 ENOENT (No such file or directory) fspick(1, "/foo2", FSPICK_CLOEXEC) = -1 ENOENT (No such file or directory) fspick(2, "/foo3", FSPICK_CLOEXEC\|FSPICK_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory) fspick(3, "/foo4", FSPICK_CLOEXEC\|FSPICK_SYMLINK_NOFOLLOW\|FSPICK_NO_AUTOMOUNT) = -1 ENOENT (No such file or directory) fspick(4, "/foo5", FSPICK_CLOEXEC\|FSPICK_SYMLINK_NOFOLLOW\|FSPICK_NO_AUTOMOUNT\|FSPICK_EMPTY_PATH) = -1 ENOENT (No such file or directory) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-erau5xjtt8wvgnhvdbchstuk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:42 -03:00
Arnaldo Carvalho de Melo	a1c729a5f6	perf beauty: Add generator for fspick's 'flags' arg values $ tools/perf/trace/beauty/fspick.sh static const char *fspick_flags[] = { [ilog2(0x00000001) + 1] = "CLOEXEC", [ilog2(0x00000002) + 1] = "SYMLINK_NOFOLLOW", [ilog2(0x00000004) + 1] = "NO_AUTOMOUNT", [ilog2(0x00000008) + 1] = "EMPTY_PATH", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-8i16btocq1ax2u6542ya79t5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:42 -03:00
Arnaldo Carvalho de Melo	566e30229e	perf trace: Beautify 'move_mount' arguments Use existing beautifiers for the first 4 args (to/from fds, pathnames) and wire up the recently introduced move_mount flags table generator. Now it should be possible to just use: perf trace -e move_mount As root and see all move_mount syscalls with its args beautified, except for the filenames, that need work in the augmented_raw_syscalls.c eBPF helper to pass more than one, see comment in the augmented_raw_syscalls.c source code, the other args should work in all cases, i.e. all that is needed can be obtained directly from the raw_syscalls:sys_enter tracepoint args. Running without the strace "skin" (.perfconfig setting output formatting switches to look like strace output + BPF to collect strings, as we still need to support collecting multiple string args for the same syscall, like with move_mount): # cat sys_move_mount.c #define _GNU_SOURCE /* See feature_test_macros(7) / #include <unistd.h> #include <sys/syscall.h> / For SYS_xxx definitions / #define __NR_move_mount 429 #define MOVE_MOUNT_F_SYMLINKS 0x00000001 / Follow symlinks on from path / #define MOVE_MOUNT_F_AUTOMOUNTS 0x00000002 / Follow automounts on from path / #define MOVE_MOUNT_F_EMPTY_PATH 0x00000004 / Empty from path permitted / #define MOVE_MOUNT_T_SYMLINKS 0x00000010 / Follow symlinks on to path / #define MOVE_MOUNT_T_AUTOMOUNTS 0x00000020 / Follow automounts on to path / #define MOVE_MOUNT_T_EMPTY_PATH 0x00000040 / Empty to path permitted / static inline int sys_move_mount(int from_fd, const char from_pathname, int to_fd, const char to_pathname, int flags) { syscall(__NR_move_mount, from_fd, from_pathname, to_fd, to_pathname, flags); } int main(int argc, char argv[]) { int flags = 0, from_fd = 0, to_fd = 100; sys_move_mount(from_fd++, "/foo", to_fd++, "bar", flags); flags \|= MOVE_MOUNT_F_SYMLINKS; sys_move_mount(from_fd++, "/foo1", to_fd++, "bar1", flags); flags \|= MOVE_MOUNT_F_AUTOMOUNTS; sys_move_mount(from_fd++, "/foo2", to_fd++, "bar2", flags); flags \|= MOVE_MOUNT_F_EMPTY_PATH; sys_move_mount(from_fd++, "/foo3", to_fd++, "bar3", flags); flags \|= MOVE_MOUNT_T_SYMLINKS; sys_move_mount(from_fd++, "/foo4", to_fd++, "bar4", flags); flags \|= MOVE_MOUNT_T_AUTOMOUNTS; sys_move_mount(from_fd++, "/foo5", to_fd++, "bar5", flags); flags \|= MOVE_MOUNT_T_EMPTY_PATH; return sys_move_mount(from_fd++, "/foo6", to_fd++, "bar6", flags); } # mv ~/.perfconfig ~/.perfconfig.OFF # perf trace -e move_mount ./sys_move_mount 0.000 ( 0.009 ms): sys_move_mount/28971 move_mount(from_pathname: 0x402010, to_dfd: 100, to_pathname: 0x402015) = -1 ENOENT (No such file or directory) 0.011 ( 0.003 ms): sys_move_mount/28971 move_mount(from_dfd: 1, from_pathname: 0x40201e, to_dfd: 101, to_pathname: 0x402019, flags: F_SYMLINKS) = -1 ENOENT (No such file or directory) 0.016 ( 0.002 ms): sys_move_mount/28971 move_mount(from_dfd: 2, from_pathname: 0x402029, to_dfd: 102, to_pathname: 0x402024, flags: F_SYMLINKS\|F_AUTOMOUNTS) = -1 ENOENT (No such file or directory) 0.020 ( 0.002 ms): sys_move_mount/28971 move_mount(from_dfd: 3, from_pathname: 0x402034, to_dfd: 103, to_pathname: 0x40202f, flags: F_SYMLINKS\|F_AUTOMOUNTS\|F_EMPTY_PATH) = -1 ENOENT (No such file or directory) 0.023 ( 0.002 ms): sys_move_mount/28971 move_mount(from_dfd: 4, from_pathname: 0x40203f, to_dfd: 104, to_pathname: 0x40203a, flags: F_SYMLINKS\|F_AUTOMOUNTS\|F_EMPTY_PATH\|T_SYMLINKS) = -1 ENOENT (No such file or directory) 0.027 ( 0.002 ms): sys_move_mount/28971 move_mount(from_dfd: 5, from_pathname: 0x40204a, to_dfd: 105, to_pathname: 0x402045, flags: F_SYMLINKS\|F_AUTOMOUNTS\|F_EMPTY_PATH\|T_SYMLINKS\|T_AUTOMOUNTS) = -1 ENOENT (No such file or directory) 0.031 ( 0.017 ms): sys_move_mount/28971 move_mount(from_dfd: 6, from_pathname: 0x402055, to_dfd: 106, to_pathname: 0x402050, flags: F_SYMLINKS\|F_AUTOMOUNTS\|F_EMPTY_PATH\|T_SYMLINKS\|T_AUTOMOUNTS\|T_EMPTY_PATH) = -1 ENOENT (No such file or directory) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-83rim8g4k0s4gieieh5nnlck@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:42 -03:00
Arnaldo Carvalho de Melo	eefa09b499	perf beauty: Add generator for 'move_mount' flags argument $ tools/perf/trace/beauty/move_mount_flags.sh static const char *move_mount_flags[] = { [ilog2(0x00000001) + 1] = "F_SYMLINKS", [ilog2(0x00000002) + 1] = "F_AUTOMOUNTS", [ilog2(0x00000004) + 1] = "F_EMPTY_PATH", [ilog2(0x00000010) + 1] = "T_SYMLINKS", [ilog2(0x00000020) + 1] = "T_AUTOMOUNTS", [ilog2(0x00000040) + 1] = "T_EMPTY_PATH", }; $ Will be wired up to the 'perf trace' arg in a followup patch. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-px7v33suw1k2ehst52l7bwa3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:42 -03:00
Arnaldo Carvalho de Melo	8a70c6b162	perf augmented_raw_syscalls: Fix up comment Cut'n'paste error, the second comment is about the syscalls that have as its second arg a string. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-zo5s6rloy42u41acsf6q3pvi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:42 -03:00
Jiri Olsa	fb5a88d413	perf tools: Preserve eBPF maps when loading kcore We need to preserve eBPF maps even if they are covered by kcore, because we need to access eBPF dso for source data. Add the map_groups__merge_in function to do that. It merges a map into map_groups by splitting the new map within the existing map regions. Suggested-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190508132010.14512-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:42 -03:00
Jiri Olsa	8529f2e673	perf machine: Keep zero in pgoff BPF map With pgoff set to zero, the map__map_ip function will return BPF addresses based from 0, which is what we need when we read the data from a BPF DSO. Adding BPF symbols with mapped IP addresses as well. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190508132010.14512-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:42 -03:00
Adrian Hunter	a2d8a1585e	perf intel-pt: Fix itrace defaults for perf script intel-pt documentation Fix intel-pt documentation to reflect the change of itrace defaults for perf script. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: `4eb0681571` ("perf script: Make itrace script default to all calls") Link: http://lkml.kernel.org/r/20190520113728.14389-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:42 -03:00
Adrian Hunter	355200e0f6	perf auxtrace: Fix itrace defaults for perf script Commit `4eb0681571` ("perf script: Make itrace script default to all calls") does not work for the case when '--itrace' only is used, because default_no_sample is not being passed. Example: Before: $ perf record -e intel_pt/cyc/u ls $ perf script --itrace > cmp1.txt $ perf script --itrace=cepwx > cmp2.txt $ diff -sq cmp1.txt cmp2.txt Files cmp1.txt and cmp2.txt differ After: $ perf script --itrace > cmp1.txt $ perf script --itrace=cepwx > cmp2.txt $ diff -sq cmp1.txt cmp2.txt Files cmp1.txt and cmp2.txt are identical Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: `4eb0681571` ("perf script: Make itrace script default to all calls") Link: http://lkml.kernel.org/r/20190520113728.14389-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:42 -03:00
Adrian Hunter	26f19c2eb7	perf intel-pt: Fix itrace defaults for perf script Commit `4eb0681571` ("perf script: Make itrace script default to all calls") does not work because 'use_browser' is being used to determine whether to default to periodic sampling (i.e. better for perf report). The result is that nothing but CBR events display for perf script when no --itrace option is specified. Fix by using 'default_no_sample' and 'inject' instead. Example: Before: $ perf record -e intel_pt/cyc/u ls $ perf script > cmp1.txt $ perf script --itrace=cepwx > cmp2.txt $ diff -sq cmp1.txt cmp2.txt Files cmp1.txt and cmp2.txt differ After: $ perf script > cmp1.txt $ perf script --itrace=cepwx > cmp2.txt $ diff -sq cmp1.txt cmp2.txt Files cmp1.txt and cmp2.txt are identical Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.20+ Fixes: `90e457f7be` ("perf tools: Add Intel PT support") Link: http://lkml.kernel.org/r/20190520113728.14389-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:42 -03:00
Adrian Hunter	a685c7a4a2	perf-with-kcore.sh: Always allow fix_buildid_cache_permissions The user's buildid cache may contain entries added by root even if root has its own home directory (e.g. by using perfconfig to specify the same buildid dir), so remove that validation. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190412113830.4126-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 18:37:42 -03:00
Thomas Richter	6738028dd5	perf record: Fix s390 missing module symbol and warning for non-root users Command 'perf record' and 'perf report' on a system without kernel debuginfo packages uses /proc/kallsyms and /proc/modules to find addresses for kernel and module symbols. On x86 this works for root and non-root users. On s390, when invoked as non-root user, many of the following warnings are shown and module symbols are missing: proc/{kallsyms,modules} inconsistency while looking for "[sha1_s390]" module! Command 'perf record' creates a list of module start addresses by parsing the output of /proc/modules and creates a PERF_RECORD_MMAP record for the kernel and each module. The following function call sequence is executed: machine__create_kernel_maps machine__create_module modules__parse machine__create_module --> for each line in /proc/modules arch__fix_module_text_start Function arch__fix_module_text_start() is s390 specific. It opens file /sys/module/<name>/sections/.text to extract the module's .text section start address. On s390 the module loader prepends a header before the first section, whereas on x86 the module's text section address is identical the the module's load address. However module section files are root readable only. For non-root the read operation fails and machine__create_module() returns an error. Command perf record does not generate any PERF_RECORD_MMAP record for loaded modules. Later command perf report complains about missing module maps. To fix this function arch__fix_module_text_start() always returns success. For root users there is no change, for non-root users the module's load address is used as module's text start address (the prepended header then counts as part of the text section). This enable non-root users to use module symbols and avoid the warning when perf report is executed. Output before: [tmricht@m83lp54 perf]$ ./perf report -D \| fgrep MMAP 0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text Output after: [tmricht@m83lp54 perf]$ ./perf report -D \| fgrep MMAP 0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text 0 0x1b8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../autofs4.ko.xz 0 0x250 [0xa8]: PERF_RECORD_MMAP ... x /lib/modules/.../sha_common.ko.xz 0 0x2f8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../des_generic.ko.xz Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Link: http://lkml.kernel.org/r/20190522144601.50763-4-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:23 -03:00
Jiri Olsa	ed9adb2035	perf machine: Read also the end of the kernel We mark the end of kernel based on the first module, but that could cover some bpf program maps. Reading _etext symbol if it's present to get precise kernel map end. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lkml.kernel.org/r/20190508132010.14512-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:23 -03:00
Arnaldo Carvalho de Melo	93f678b9ae	perf test vmlinux-kallsyms: Ignore aliases to _etext when searching on kallsyms No need to search for aliases for the symbol that marks the end of the kernel text segment, the following patch will make such symbols not to be found when searching in the kallsyms maps causing this test to fail. So as a prep patch to avoid breaking bisection, ignore such symbols. Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stanislav Fomichev <sdf@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lkml.kernel.org/n/tip-qfwuih8cvmk9doh7k5k244eq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:23 -03:00
Namhyung Kim	acd244b84b	perf session: Add missing swap ops for namespace events In case it's recorded in a different arch. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> <hbathini@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Krister Johansen <kjlx@templeofstupid.com> Fixes: `f3b3614a28` ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info") Link: http://lkml.kernel.org/r/20190522053250.207156-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:23 -03:00
Namhyung Kim	6584140ba9	perf namespace: Protect reading thread's namespace It seems that the current code lacks holding the namespace lock in thread__namespaces(). Otherwise it can see inconsistent results. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Krister Johansen <kjlx@templeofstupid.com> Link: http://lkml.kernel.org/r/20190522053250.207156-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:23 -03:00
Arnaldo Carvalho de Melo	fba29f1820	tools include UAPI: Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls Copy the headers changed by these csets: `d8076bdb56` ("uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]") `9c8ad7a2ff` ("uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]") `cf3cba4a42` ("vfs: syscall: Add fspick() to select a superblock for reconfiguration") `93766fbd26` ("vfs: syscall: Add fsmount() to create a mount for a superblock") `ecdab150fd` ("vfs: syscall: Add fsconfig() for configuring and managing a context") `24dcb3d90a` ("vfs: syscall: Add fsopen() to prepare for superblock creation") `2db154b3ea` ("vfs: syscall: Add move_mount(2) to move mounts around") `a07b200047` ("vfs: syscall: Add open_tree(2) to reference or clone a mount") We need to create tables for all the flags argument in the new syscalls, in followup patches. This silences these perf build warnings: Warning: Kernel ABI header at 'tools/include/uapi/linux/mount.h' differs from latest version at 'include/uapi/linux/mount.h' diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl' diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h' diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-knpqr1u2ffvz6641056z2mwu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:49:03 -03:00
Vitaly Chikunov	f95d050cdc	perf arm64: Fix mksyscalltbl when system kernel headers are ahead of the kernel When a host system has kernel headers that are newer than a compiling kernel, mksyscalltbl fails with errors such as: <stdin>: In function 'main': <stdin>:271:44: error: '__NR_kexec_file_load' undeclared (first use in this function) <stdin>:271:44: note: each undeclared identifier is reported only once for each function it appears in <stdin>:272:46: error: '__NR_pidfd_send_signal' undeclared (first use in this function) <stdin>:273:43: error: '__NR_io_uring_setup' undeclared (first use in this function) <stdin>:274:43: error: '__NR_io_uring_enter' undeclared (first use in this function) <stdin>:275:46: error: '__NR_io_uring_register' undeclared (first use in this function) tools/perf/arch/arm64/entry/syscalls//mksyscalltbl: line 48: /tmp/create-table-xvUQdD: Permission denied mksyscalltbl is compiled with default host includes, but run with compiling kernel tree includes, causing some syscall numbers to being undeclared. Committer testing: Before this patch, in my cross build environment, no build problems, but these new syscalls were not in the syscalls.c generated from the unistd.h file, which is a bug, this patch fixes it: perfbuilder@6e20056ed532:/git/perf$ tail /tmp/build/perf/arch/arm64/include/generated/asm/syscalls.c [292] = "io_pgetevents", [293] = "rseq", [294] = "kexec_file_load", [424] = "pidfd_send_signal", [425] = "io_uring_setup", [426] = "io_uring_enter", [427] = "io_uring_register", [428] = "syscalls", }; perfbuilder@6e20056ed532:/git/perf$ strings /tmp/build/perf/perf \| egrep '^(io_uring_\|pidfd_\|kexec_file)' kexec_file_load pidfd_send_signal io_uring_setup io_uring_enter io_uring_register perfbuilder@6e20056ed532:/git/perf$ $ Well, there is that last "syscalls" thing, but that looks like some other bug. Signed-off-by: Vitaly Chikunov <vt@altlinux.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Michael Petlan <mpetlan@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20190521030203.1447-1-vt@altlinux.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:49:03 -03:00
Shawn Landden	97acec7df1	perf data: Fix 'strncat may truncate' build failure with recent gcc This strncat() is safe because the buffer was allocated with zalloc(), however gcc doesn't know that. Since the string always has 4 non-null bytes, just use memcpy() here. CC /home/shawn/linux/tools/perf/util/data-convert-bt.o In file included from /usr/include/string.h:494, from /home/shawn/linux/tools/lib/traceevent/event-parse.h:27, from util/data-convert-bt.c:22: In function ‘strncat’, inlined from ‘string_set_value’ at util/data-convert-bt.c:274:4: /usr/include/powerpc64le-linux-gnu/bits/string_fortified.h:136:10: error: ‘__builtin_strncat’ output may be truncated copying 4 bytes from a string of length 4 [-Werror=stringop-truncation] 136 \| return __builtin___strncat_chk (__dest, __src, __len, __bos (__dest)); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Shawn Landden <shawn@git.icu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> LPU-Reference: 20190518183238.10954-1-shawn@git.icu Link: https://lkml.kernel.org/n/tip-289f1jice17ta7tr3tstm9jm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:49:03 -03:00
Masami Hiramatsu	1e032f7cfa	perf-probe: Add user memory access attribute support Add user memory access attribute for kprobe event arguments. If a given 'local variable' is in user-space, User can specify memory access method by '@user' suffix. This is not only for string but also for data structure. If we access a field of data structure in user memory from kernel on some arch, it will fail. e.g. perf probe -a "sched_setscheduler param->sched_priority" This will fail to access the "param->sched_priority" because the param is __user pointer. Instead, we can now specify @user suffix for such argument. perf probe -a "sched_setscheduler param->sched_priority@user" Note that kernel memory access with "@user" must always fail on any arch. Link: http://lkml.kernel.org/r/155789874562.26965.10836126971405890891.stgit@devnote2 Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2019-05-25 23:04:42 -04:00
Thomas Gleixner	ec8f24b7fa	treewide: Add SPDX license identifier - Makefile/Kconfig Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-21 10:50:46 +02:00
Jin Yao	4fc4d8dfa0	perf stat: Support 'percore' event qualifier With this patch, we can use the 'percore' event qualifier in perf-stat. root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/ -a -A -I1000 1.000773050 S0-C0 98,352,832 cpu/event=0,umask=0x3,percore=1/ (50.01%) 1.000773050 S0-C1 103,763,057 cpu/event=0,umask=0x3,percore=1/ (50.02%) 1.000773050 S0-C2 196,776,995 cpu/event=0,umask=0x3,percore=1/ (50.02%) 1.000773050 S0-C3 176,493,779 cpu/event=0,umask=0x3,percore=1/ (50.02%) 1.000773050 CPU0 47,699,641 cpu/event=0,umask=0x3/ (50.02%) 1.000773050 CPU1 49,052,451 cpu/event=0,umask=0x3/ (49.98%) 1.000773050 CPU2 102,771,422 cpu/event=0,umask=0x3/ (49.98%) 1.000773050 CPU3 100,784,662 cpu/event=0,umask=0x3/ (49.98%) 1.000773050 CPU4 43,171,342 cpu/event=0,umask=0x3/ (49.98%) 1.000773050 CPU5 54,152,158 cpu/event=0,umask=0x3/ (49.98%) 1.000773050 CPU6 93,618,410 cpu/event=0,umask=0x3/ (49.98%) 1.000773050 CPU7 74,477,589 cpu/event=0,umask=0x3/ (49.99%) In this example, we count the event 'ref-cycles' per-core and per-CPU in one perf stat command-line. From the output, we can see: S0-C0 = CPU0 + CPU4 S0-C1 = CPU1 + CPU5 S0-C2 = CPU2 + CPU6 S0-C3 = CPU3 + CPU7 So the result is expected (tiny difference is ignored). Note that, the 'percore' event qualifier needs to use with option '-A'. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1555077590-27664-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:24 -03:00
Jin Yao	40480a8136	perf stat: Factor out aggregate counts printing Move the aggregate counts printing to a new function print_counter_aggrdata, which will be used in following patches. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1555077590-27664-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:24 -03:00
Jin Yao	064b4e82aa	perf tools: Add a 'percore' event qualifier Add a 'percore' event qualifier, like cpu/event=0,umask=0x3,percore=1/, that sums up the event counts for both hardware threads in a core. We can already do this with --per-core, but it's often useful to do this together with other metrics that are collected per hardware thread. So we need to support this per-core counting on a event level. This can be implemented in only the user tool, no kernel support needed. v4: --- 1. Add Arnaldo's patch which updates the documentation for this new qualifier. 2. Rebase to latest perf/core branch v3: --- Simplify the code according to Jiri's comments. Before: "return term->val.percore ? true : false;" Now: "return term->val.percore;" v2: --- Change the qualifier name from 'coresum' to 'percore' according to comments from Jiri and Andi. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1555077590-27664-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:24 -03:00
Thomas Richter	6cf6265639	perf docs: Add description for stderr 'perf report' displays recorded data on the screen and emits warnings and debug messages in the status line (last one on screen). perf also supports the possibility to write all debug messages to stderr (instead of writing them to the status line). This is achieved with the following command: # ./perf --debug stderr=1 report -vvvvv -i ~/fast.data 2>/tmp/2 # ll /tmp/2 -rw-rw-r-- 1 tmricht tmricht `5420835` May 7 13:46 /tmp/2 # The usage of variable stderr=1 is not documented, so add it to the perf man page. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20190513080220.91966-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:24 -03:00
Adrian Hunter	1b6599a9d8	perf intel-pt: Fix sample timestamp wrt non-taken branches The sample timestamp is updated to ensure that the timestamp represents the time of the sample and not a branch that the decoder is still walking towards. The sample timestamp is updated when the decoder returns, but the decoder does not return for non-taken branches. Update the sample timestamp then also. Note that commit `3f04d98e97` ("perf intel-pt: Improve sample timestamp") was also a stable fix and appears, for example, in v4.4 stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample timestamp"). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.4+ Fixes: `3f04d98e97` ("perf intel-pt: Improve sample timestamp") Link: http://lkml.kernel.org/r/20190510124143.27054-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:24 -03:00
Adrian Hunter	61b6e08dc8	perf intel-pt: Fix improved sample timestamp The decoder uses its current timestamp in samples. Usually that is a timestamp that has already passed, but in some cases it is a timestamp for a branch that the decoder is walking towards, and consequently hasn't reached. The intel_pt_sample_time() function decides which is which, but was not handling TNT packets exactly correctly. In the case of TNT, the timestamp applies to the first branch, so the decoder must first walk to that branch. That means intel_pt_sample_time() should return true for TNT, and this patch makes that change. However, if the first branch is a non-taken branch (i.e. a 'N'), then intel_pt_sample_time() needs to return false for subsequent taken branches in the same TNT packet. To handle that, introduce a new state INTEL_PT_STATE_TNT_CONT to distinguish the cases. Note that commit `3f04d98e97` ("perf intel-pt: Improve sample timestamp") was also a stable fix and appears, for example, in v4.4 stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample timestamp"). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.4+ Fixes: `3f04d98e97` ("perf intel-pt: Improve sample timestamp") Link: http://lkml.kernel.org/r/20190510124143.27054-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:23 -03:00
Adrian Hunter	7ba8fa20e2	perf intel-pt: Fix instructions sampling rate The timestamp used to determine if an instruction sample is made, is an estimate based on the number of instructions since the last known timestamp. A consequence is that it might go backwards, which results in extra samples. Change it so that a sample is only made when the timestamp goes forwards. Note this does not affect a sampling period of 0 or sampling periods specified as a count of instructions. Example: Before: $ perf script --itrace=i10us ls 13812 [003] 2167315.222583: 3270 instructions:u: 7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 30902 instructions:u: 7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 10 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 8 instructions:u: 7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 14 instructions:u: 7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 6 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 14 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 4 instructions:u: 7fac71e2dab2 _dl_cache_libcmp+0xd2 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222728: 16423 instructions:u: 7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222734: 12731 instructions:u: 7fac71e27938 _dl_name_match_p+0x68 (/lib/x86_64-linux-gnu/ld-2.28.so) ... After: $ perf script --itrace=i10us ls 13812 [003] 2167315.222583: 3270 instructions:u: 7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 30902 instructions:u: 7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222728: 16479 instructions:u: 7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so) ... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: `f4aa081949` ("perf tools: Add Intel PT decoder") Link: http://lkml.kernel.org/r/20190510124143.27054-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:23 -03:00
Kan Liang	6466ec14aa	perf regs x86: Add X86 specific arch__intr_reg_mask() XMM registers can be collected on Icelake and later platforms. Add specific arch__intr_reg_mask(), which creating an event to check if the kernel and hardware can collect XMM registers. Test on Skylake which doesn't support XMM registers collection. There is nothing changed. #perf record -I? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use '-I?' to list register names #perf record -I [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.905 MB perf.data (2520 samples) ] #perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP\|TID\|TIME\|CPU\|PERIOD\|REGS_INTR, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, sample_regs_intr: 0xff0fff Test on Icelake which support XMM registers collection. #perf record -I? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9 XMM10 XMM11 XMM12 XMM13 XMM14 XMM15 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use '-I?' to list register names #perf record -I [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.800 MB perf.data (318 samples) ] #perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP\|TID\|TIME\|CPU\|PERIOD\|REGS_INTR, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, sample_regs_intr: 0xffffffff00ff0fff Committer notes: Don't set attr.sample_period as a named struct init, as it is part of an unnamed union in 'struct perf_event_attr', and doing so breaks the build on older gcc versions, such as: gcc version 4.1.2 20080704 (Red Hat 4.1.2-55) gcc version 4.4.7 20120313 (Red Hat 4.4.7-23) (GCC) arch/x86/util/perf_regs.c: In function 'arch__intr_reg_mask': arch/x86/util/perf_regs.c:279: error: unknown field 'sample_period' specified in initializer cc1: warnings being treated as errors arch/x86/util/perf_regs.c:279: warning: missing braces around initializer arch/x86/util/perf_regs.c:279: warning: (near initialization for 'attr.<anonymous>') Signed-off-by: Kan Liang <kan.liang@linux.intel.com> [ Only on a lenovo t480s, a skylake machine, where the XMM registers didn't show up in -I?/--user-regs=? as expected ] Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1557865174-56264-3-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:23 -03:00
Kan Liang	af785e75bf	perf parse-regs: Add generic support for arch__intr/user_reg_mask() There may be different register mask for use with intr or user on some platforms, e.g. Icelake. Add weak functions arch__intr_reg_mask() and arch__user_reg_mask() to return intr and user register mask respectively. Check mask before printing or comparing the register name. Generic code always return PERF_REGS_MASK. No functional change. Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1557865174-56264-2-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:12 -03:00
Kan Liang	aeea9062d9	perf parse-regs: Split parse_regs The available registers for --int-regs and --user-regs may be different, e.g. XMM registers. Split parse_regs into two dedicated functions for --int-regs and --user-regs respectively. Modify the warning message. "--user-regs=?" should be applied to show the available registers for --user-regs. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1557865174-56264-1-git-send-email-kan.liang@linux.intel.com [ Changed docs as suggested by Ravi and agreed by Kan ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Florian Fainelli	7025fdbea3	perf vendor events arm64: Add Cortex-A57 and Cortex-A72 events The Cortex-A57 and Cortex-A72 both support all ARMv8 recommended events up to the RC_ST_SPEC (0x91) event with the exception of: - L1D_CACHE_REFILL_INNER (0x44) - L1D_CACHE_REFILL_OUTER (0x45) - L1D_TLB_RD (0x4E) - L1D_TLB_WR (0x4F) - L2D_TLB_REFILL_RD (0x5C) - L2D_TLB_REFILL_WR (0x5D) - L2D_TLB_RD (0x5E) - L2D_TLB_WR (0x5F) - STREX_SPEC (0x6F) Create an appropriate JSON file for mapping those events and update the mapfile.csv for matching the Cortex-A57 and Cortex-A72 MIDR to that file. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean V Kelley <seanvk.dev@oregontracks.org> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org (moderated list:arm pmu profiling and debugging) Link: http://lkml.kernel.org/r/20190513202522.9050-4-f.fainelli@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Florian Fainelli	93fe8f1e11	perf vendor events arm64: Map Brahma-B53 CPUID to cortex-a53 events Broadcom's Brahma-B53 CPUs support the same type of events that the Cortex-A53 supports, recognize its CPUID and map it to the cortex-a53 events. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean V Kelley <seanvk.dev@oregontracks.org> Cc: bcm-kernel-feedback-list@broadcom.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org (moderated list Link: http://lkml.kernel.org/r/20190513202522.9050-3-f.fainelli@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Florian Fainelli	ae833a6124	perf vendor events arm64: Remove [[:xdigit:]] wildcard ARM64's implementation of get_cpuidr_str() masks out the revision bits [3:0] while reading the CPU identifier, there is no need for the [[:xdigit:]] wildcard. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean V Kelley <seanvk.dev@oregontracks.org> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org (moderated list:arm pmu profiling and debugging) Link: http://lkml.kernel.org/r/20190513202522.9050-2-f.fainelli@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Zenghui Yu	8e8f515d56	perf jevents: Remove unused variable Address gcc warning: pmu-events/jevents.c: In function ‘save_arch_std_events’: pmu-events/jevents.c:417:15: warning: unused variable ‘sb’ [-Wunused-variable] struct stat *sb = data; ^~ Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: wanghaibin.wang@huawei.com Link: http://lkml.kernel.org/r/1557919169-23972-1-git-send-email-yuzenghui@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Arnaldo Carvalho de Melo	d94cfbab6d	perf test zstd: Fixup verbose mode output The shell tests should not redirect useful output to /dev/null, as that is done automatically by 'perf test' in non verbose mode, so remove that from the zstd comp/decomp test, fixing up verbose mode. Before: $ perf test zstd 68: Zstd perf.data compression/decompression : Ok $ perf test -v zstd 68: Zstd perf.data compression/decompression : --- start --- test child forked, pid 11956 -z, --compression-level[=<n>] Collecting compressed record file: Checking compressed events stats: test child finished with 0 ---- end ---- Zstd perf.data compression/decompression: Ok $ Now: $ perf test zstd 68: Zstd perf.data compression/decompression : Ok $ perf test -v zstd 68: Zstd perf.data compression/decompression : --- start --- test child forked, pid 12695 Collecting compressed record file: 0+500 records in 72+1 records out 37361 bytes (37 kB, 36 KiB) copied, 9.83796 s, 3.8 kB/s [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB /tmp/perf.data.rzq, compressed (original 0.004 MB, ratio is 3.679) ] Checking compressed events stats: # compressed : Zstd, level = 1, ratio = 4 COMPRESSED events: 3 test child finished with 0 ---- end ---- Zstd perf.data compression/decompression: Ok $ Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-tp96618ds42zic94nlh0msz3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Alexey Budankov	bdc35cbc35	perf tests: Implement Zstd comp/decomp integration test Introduce a basic integration test for Zstd based record compression/decompression using 'perf record' and 'perf report'. Committer notes: Reduce a bit the freq (from 25 kHz to 5 kHz) and the number of /dev/null records read (from 1000 to 500), reducing the time it takes to something more in line with the time existing 'perf test' entries take to run. With that in place: $ time perf test zstd 68: Zstd perf.data compression/decompression : Ok real 0m10.376s user 0m0.105s sys 0m0.440s $ grep "model name" /proc/cpuinfo \| head -1 model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz $ Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/dc007ae4-104a-2b7c-316e-275929025f0d@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Alexey Budankov	371a3378d8	perf inject: Enable COMPRESSED record decompression Initialized decompression part of Zstd based API so COMPRESSED records would be decompressed into the resulting output data file. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c27d7500-ecdd-3569-cab5-8f70bbed5ea4@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Alexey Budankov	cb62c6f1f5	perf report: Implement perf.data record decompression zstd_init(, comp_level = 0) initializes decompression part of API only hat now consists of zstd_decompress_stream() function. The perf.data PERF_RECORD_COMPRESSED records are decompressed using zstd_decompress_stream() function into a linked list of mmaped memory regions of mmap_comp_len size (struct decomp). After decompression of one COMPRESSED record its content is iterated and fetched for usual processing. The mmaped memory regions with decompressed events are kept in the linked list till the tool process termination. When dumping raw records (e.g., perf report -D --header) file offsets of events from compressed records are printed as zero. Committer notes: Since now we have support for processing PERF_RECORD_COMPRESSED, we see none, in raw form, like we saw in the previous patch commiter notes, they were decompressed into the usual PERF_RECORD_{FORK,MMAP,COMM,etc} records, we only see the stats for those PERF_RECORD_COMPRESSED events, and since I used the file generated in the commiter notes for the previous patch, there they are, 2 compressed records: $ perf report --header-only \| grep cmdline # cmdline : /home/acme/bin/perf record -z2 sleep 1 $ perf report -D \| grep COMPRESS COMPRESSED events: 2 COMPRESSED events: 0 $ perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 15 of event 'cycles:u' # Event count (approx.): 962227 # # Overhead Command Shared Object Symbol # ........ ....... ................ ........................... # 46.99% sleep libc-2.28.so [.] _dl_addr 29.24% sleep [unknown] [k] 0xffffffffaea00a67 16.45% sleep libc-2.28.so [.] __GI__IO_un_link.part.1 5.92% sleep ld-2.28.so [.] _dl_setup_hash 1.40% sleep libc-2.28.so [.] __nanosleep 0.00% sleep [unknown] [k] 0xffffffffaea00163 # # (Tip: To see callchains in a more compact form: perf report -g folded) # $ Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/304b0a59-942c-3fe1-da02-aa749f87108b@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Alexey Budankov	504c1ad116	perf record: Implement -z,--compression_level[=<n>] option Implemented -z,--compression_level[=<n>] option that enables compression of mmaped kernel data buffers content in runtime during perf record mode collection. Default option value is 1 (fastest compression). Compression overhead has been measured for serial and AIO streaming when profiling matrix multiplication workload: ------------------------------------------------------------- \| SERIAL \| AIO-1 \| ----------------------------------------------------------------\| \|-z \| OVH(x) \| ratio(x) size(MiB) \| OVH(x) \| ratio(x) size(MiB) \| \|---------------------------------------------------------------\| \| 0 \| 1,00 \| 1,000 179,424 \| 1,00 \| 1,000 187,527 \| \| 1 \| 1,04 \| 8,427 181,148 \| 1,01 \| 8,474 188,562 \| \| 2 \| 1,07 \| 8,055 186,953 \| 1,03 \| 7,912 191,773 \| \| 3 \| 1,04 \| 8,283 181,908 \| 1,03 \| 8,220 191,078 \| \| 5 \| 1,09 \| 8,101 187,705 \| 1,05 \| 7,780 190,065 \| \| 8 \| 1,05 \| 9,217 179,191 \| 1,12 \| 6,111 193,024 \| ----------------------------------------------------------------- OVH = (Execution time with -z N) / (Execution time with -z 0) ratio - compression ratio size - number of bytes that was compressed size ~= trace size x ratio Committer notes: Testing it I noticed that it failed to disable build id processing when compression is enabled, and as we'd have to uncompress everything to look for the PERF_RECORD_{MMAP,SAMPLE,etc} to figure out which build ids to read from DSOs, we better disable build id processing when compression is enabled, logging with pr_debug() when doing so: Original patch: # perf record -z2 ^C[ perf record: Woken up 1 times to write data ] 0x1746e0 [0x76]: failed to process type: 81 [Invalid argument] [ perf record: Captured and wrote 1.568 MB perf.data, compressed (original 0.452 MB, ratio is 3.995) ] # After auto-disabling build id processing when compression is enabled: $ perf record -z2 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.292) ] $ perf record -v -z2 sleep 1 Compression enabled, disabling build id collection at the end of the session. <SNIP extra -v pr_debug() messages> [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.305) ] $ Also, with parts of the patch originally after this one moved to just before this one we get: $ perf record -z2 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.371) ] $ perf report -D \| grep COMPRESS 0 0x1b8 [0x155]: PERF_RECORD_COMPRESSED: unhandled! 0 0x30d [0x80]: PERF_RECORD_COMPRESSED: unhandled! COMPRESSED events: 2 COMPRESSED events: 0 $ I.e. when faced with PERF_RECORD_COMPRESSED that we still have no code to process, we just show it as not being handled, skip them and continue, while before we had: $ perf report -D \| grep COMPRESS 0x1b8 [0x169]: failed to process type: 81 [Invalid argument] Error: failed to process sample 0 0x1b8 [0x169]: PERF_RECORD_COMPRESSED $ Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/9ff06518-ae63-a908-e44d-5d9e56dd66d9@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Alexey Budankov	61a7773ca8	perf report: Add stub processing of compressed events for -D Committer note: Split from a larger patch, this only dumps PERF_RECORD_COMPRESSED as unhandled, so that when we introduce the record part in the next patch, we don't see unhandled events when using 'perf record -D'. Changed it so that we dump the event if the handler is just a stub, i.e. for the case where we don't have ZSTD linked but we're processing a perf.data file generated by a tool with that linked. Also when failing to decompress we can't just dump the uncompressed event and return 0, we have to propagate the error. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/304b0a59-942c-3fe1-da02-aa749f87108b@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Alexey Budankov	ef781128e4	perf record: Implement compression for AIO trace streaming Compression is implemented using the functions from zstd.c. As the memory to operate on the compression uses mmap->aio.data[] buffers. If Zstd streaming compression API fails for some reason the data to be compressed are just copied into the memory buffers using plain memcpy(). Compressed trace frame consists of an array of PERF_RECORD_COMPRESSED records. Each element of the array is not longer that PERF_SAMPLE_MAX_SIZE and consists of perf_event_header followed by the compressed chunk that is decompressed on the loading stage. perf_mmap__aio_push() is replaced by perf_mmap__push() which is now used in the both serial and AIO streaming cases. perf_mmap__push() is extended with positive return values to signify absence of data ready for processing. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/77db2b2c-5d03-dbb0-aeac-c4dd92129ab9@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Alexey Budankov	5d7f411649	perf record: Implement compression for serial trace streaming Compression is implemented using the functions from zstd.c. As the memory to operate on the compression uses mmap->data buffer. If Zstd streaming compression API fails for some reason the data to be compressed are just copied into the memory buffers using plain memcpy(). Compressed trace frame consists of an array of PERF_RECORD_COMPRESSED records. Each element of the array is not longer that PERF_SAMPLE_MAX_SIZE and consists of perf_event_header followed by the compressed chunk that is decompressed on the loading stage. Comitter notes: Undo some unnecessary line breaks, remove some unnecessary () around zstd_data to then just get its address, and fix conflicts with BPF_PROG_INFO/BPF_BTF patchkits. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/744df43f-3932-2594-ddef-1e99a3cad03a@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Alexey Budankov	f24c1d7523	perf tools: Introduce Zstd streaming based compression API Implemented functions are based on Zstd streaming compression API. The functions are used in runtime to compress data that come from mmaped kernel buffer. zstd_init(), zstd_fini() are used for initialization and finalization to allocate and deallocate internal zstd objects. zstd_compress_stream_to_records() is used to convert parts of mmaped kernel buffer into an array of PERF_RECORD_COMPRESSED records. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/18bf36f3-b85a-1fe2-dd83-10e0c6069568@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Alexey Budankov	51255a8af7	perf mmap: Implement dedicated memory buffer for data compression Implemented mmap data buffer that is used as the memory to operate on when compressing data in case of serial trace streaming. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/49b31321-0f70-392b-9a4f-649d3affe090@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Alexey Budankov	42e1fd80a5	perf record: Implement COMPRESSED event record and its attributes Implemented PERF_RECORD_COMPRESSED event, related data types, header feature and functions to write, read and print feature attributes from the trace header section. comp_mmap_len preserves the size of mmaped kernel buffer that was used during collection. comp_mmap_len size is used on loading stage as the size of decomp buffer for decompression of COMPRESSED events content. Committer notes: Fixed up conflict with BPF_PROG_INFO and BTF_BTF header features. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ebbaf031-8dda-3864-ebc6-7922d43ee515@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Alexey Budankov	d3c8c08e75	perf session: Define 'bytes_transferred' and 'bytes_compressed' metrics Define 'bytes_transferred' and 'bytes_compressed' metrics to calculate ratio in the end of the data collection: compression ratio = bytes_transferred / bytes_compressed The 'bytes_transferred' metric accumulates the amount of bytes that was extracted from the mmaped kernel buffers for compression, while 'bytes_compressed' accumulates the amount of bytes that was received after applying compression. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1d4bf499-cb03-26dc-6fc6-f14fec7622ce@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Arnaldo Carvalho de Melo	5b6f5aef10	perf build tests: Add NO_LIBZSTD=1 to make_minimal So that we can test the ifdef parts for this feature. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-7o65mfl10wlvm8v3f0ombxd1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:49 -03:00
Donald Yandt	30ba5b0e66	perf machine: Null-terminate version char array upon fgets(/proc/version) error If fgets() fails due to any other error besides end-of-file, the version char array may not even be null-terminated. Signed-off-by: Donald Yandt <donald.yandt@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Avi Kivity <avi@scylladb.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Fixes: `a1645ce12a` ("perf: 'perf kvm' tool for monitoring guest performance from host") Link: http://lkml.kernel.org/r/20190514110100.22019-1-donald.yandt@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:47 -03:00
Kan Liang	bf6d18cffa	perf vendor events intel: Add uncore_upi JSON support Perf cannot parse UPI (Intel's "Ultra Path Interconnect" [1]) events. # perf stat -e UPI_DATA_BANDWIDTH_TX event syntax error: 'UPI_DATA_BANDWIDTH_TX' \___ parser error Run 'perf list' for a list of valid events The JSON lists call the box UPI LL, while perf calls it upi. Add conversion support to JSON to convert the unit properly. Committer notes: [1] https://en.wikipedia.org/wiki/Intel_Ultra_Path_Interconnect "The Intel Ultra Path Interconnect (UPI) is a point-to-point processor interconnect developed by Intel which replaced the Intel QuickPath Interconnect (QPI) in Xeon Skylake-SP platforms starting in 2017. UPI is a low-latency coherent interconnect for scalable multiprocessor systems with a shared address space. It uses a directory-based home snoop coherency protocol with a transfer speed of up to 10.4 GT/s. Supporting processors typically have two or three UPI links." Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1557234991-130456-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:47 -03:00
Adrian Hunter	b62d18aba1	perf scripts python: exported-sql-viewer.py: Add 'About' dialog box With support for Python 2 or 3 and PySide 1 or 2 (Qt 4 or 5), it is useful to see what versions are in use. Add an 'About' dialog box that displays Python, PySide, Qt and database server (SQLite or PostgreSQL) version numbers. Committer testing: $ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Then go to 'Help', then 'About', select all the lines with the mouse press 'Control+C', then, on the same terminal press control+shift+V which shows my current environment: Python version: 2.7.16 PySide version: 1 Qt version: 4.8.7 SQLite version: 3.26.0 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:47 -03:00
Adrian Hunter	9bc4e4bfe6	perf scripts python: exported-sql-viewer.py: Add context menu Add a context menu (right-click) that provides options for copying to clipboard, including, for trees, the ability to copy only the cell under the mouse pointer. Committer testing: $ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Simply right click and pick "Copy selection", that at this point has just the first line, not expanded, then see what was copied by pressing shift+control+v on a terminal: Call Path,Object,Count,Time (ns),Time (%),Branch Count,Branch Count (%) ▶ simple-retpolin,,,,,, Ditto after expanding, i.e. the selection continues to be just one line: Call Path Object Count Time (ns) Time (%) Branch Count Branch Count (%) ▼ simple-retpolin Now select all the lines with the mouse and control+shift+v again: Call Path Object Count Time (ns) Time (%) Branch Count Branch Count (%) ▼ 14503:14503 ▼ _start ld-2.28.so 1 156267 100.0 10602 100.0 ▶ unknown unknown 1 2276 1.5 1 0.0 ▶ _dl_start ld-2.28.so 1 137047 87.7 10088 95.2 ▶ _dl_init ld-2.28.so 1 9142 5.9 326 3.1 ▼ _start simple-retpoline 1 7457 4.8 182 1.7 ▶ unknown unknown 1 805 10.8 1 0.5 ▶ __libc_start_main libc-2.28.so 1 6347 85.1 179 98.4 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:47 -03:00
Adrian Hunter	96c43b9a7a	perf scripts python: exported-sql-viewer.py: Add copy to clipboard Add support for copying to clipboard. Two menu options are added to copy the selected rows / columns with normal spacing, or as comma-separated-values. In the case of trees, only entire rows can be copied. Comitter testing: $ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Select the lines, press control+C and on the same terminal, press control+shift+V and voilà: Call Path Object Count Time (ns) Time (%) Branch Count Branch Count (%) ▼ 14503:14503 ▼ _start ld-2.28.so 1 156267 100.0 10602 100.0 unknown unknown 1 2276 1.5 1 0.0 ▼ _dl_start ld-2.28.so 1 137047 87.7 10088 95.2 ▶ unknown unknown 4 4127 3.0 4 0.0 _dl_setup_hash ld-2.28.so 1 0 0.0 1 0.0 ▶ _dl_sysdep_start ld-2.28.so 1 131342 95.8 9981 98.9 ▼ _dl_init ld-2.28.so 1 9142 5.9 326 3.1 ▼ call_init.part.0 ld-2.28.so 3 9133 99.9 319 97.9 ▶ _init libc-2.28.so 1 6877 75.3 110 34.5 ▶ check_stdfiles_vtables libc-2.28.so 1 76 0.8 2 0.6 ▶ init_cacheinfo libc-2.28.so 1 1991 21.8 197 61.8 ▶ _start simple-retpoline 1 7457 4.8 182 1.7 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:47 -03:00
Adrian Hunter	3ac641f4ac	perf scripts python: exported-sql-viewer.py: Add tree level As preparation for adding support for copying to clipboard, keep track of what level each item is in tree items. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:47 -03:00
Adrian Hunter	4b2084537e	perf scripts python: exported-sql-viewer.py: Fix error when shrinking / enlarging font Fix the following error if shrink / enlarge font is used with the help window. Traceback (most recent call last): File "tools/perf/scripts/python/exported-sql-viewer.py", line 2791, in ShrinkFont ShrinkFont(win.view) AttributeError: 'HelpWindow' object has no attribute 'view' Committer testing: Before, matches above output: $ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Traceback (most recent call last): File "/home/acme/libexec/perf-core/scripts/python/exported-sql-viewer.py", line 2780, in EnlargeFont EnlargeFont(win.view) AttributeError: 'HelpWindow' object has no attribute 'view' $ After: No more tracebacks, but the fonts don't get enlarged, which is kinda frustrating... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:47 -03:00
Adrian Hunter	be6e747136	perf scripts python: exported-sql-viewer.py: Move view creation As preparation for adding support for copying to clipboard, create view in TreeWindowBase instead of derived classes. Committer testing: Tested using an old .db used to test some older patches: $ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db Nothing breaks. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190503120828.25326-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:47 -03:00
Andi Kleen	ca138a7aab	perf tools x86: Add support for recording and printing XMM registers Icelake and later platforms support collecting XMM registers with PEBS event. Add support for 'perf script' to dump them, and support for the register parser in 'perf record -I=' ... to configure them. For now they are just printed in hex, we could potentially later add other formats too. Committer testing: Before: # perf record -IXMM0 Warning: unknown register XMM0, check man page or run 'perf record -I?' Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] # # perf record -I? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] # After: # perf record -IXMM0 Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles). /bin/dmesg \| grep -i perf may provide additional information. # # perf record -I? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9 XMM10 XMM11 XMM12 XMM13 XMM14 XMM15 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use -I ? to list register names # More work is needed to, when faced with such error, warn the user that that register is not available on the running platform. Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190506141926.13659-1-kan.liang@linux.intel.com Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:47 -03:00
Arnaldo Carvalho de Melo	4c1cf20334	perf parse-regs: Improve error output when faced with unknown register name Add quotes around the register name and suggest using 'perf record -I?' to get the list of available registers. Before: # perf record -Idi,xmm20,xmm1 Warning: unknown register xmm20, check man page Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use -I ? to list register names # # perf record -Idi,xmm20,xmm1 Warning: unknown register "xmm20", check man page or run "perf record -I?" Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use -I ? to list register names # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lkml.kernel.org/n/tip-9a9hyuum8c0oggg86xd3sxc5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:47 -03:00
Arnaldo Carvalho de Melo	8e5bc76f2c	perf record: Fix suggestion to get list of registers usable with --user-regs and --intr-regs $ perf record -h -I Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use -I ? to list register names $ m $ perf record -I ? Workload failed: No such file or directory $ After: $ perf record -h -I Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use '-I?' to list register names $ $ perf record -I? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use '-I?' to list register names $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Fixes: `bcc84ec65a` ("perf record: Add ability to name registers to record") Link: https://lkml.kernel.org/n/tip-r0xhfhy5radmkhhcbcfs5izf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:46 -03:00
Jiri Olsa	382619c07f	perf tools: Speed up report for perf compiled with linwunwind When compiled with libunwind, perf does some preparatory work when processing side-band events. This is not needed when report actually don't unwind dwarf callchains, so it's disabled with dwarf_callchain_users bool. However we could move that check to higher level and shield more unwanted code for normal report processing, giving us following speed up on kernel build profile: Before: $ perf record make -j40 ... $ ll ../../perf.data -rw-------. 1 jolsa jolsa 461783932 Apr 26 09:11 perf.data $ perf stat -e cycles:u,instructions:u perf report -i perf.data > out Performance counter stats for 'perf report -i perf.data': 78,669,920,155 cycles:u 99,076,431,951 instructions:u # 1.26 insn per cycle 55.382823668 seconds time elapsed 27.512341000 seconds user 27.712871000 seconds sys After: $ perf stat -e cycles:u,instructions:u perf report -i perf.data > out Performance counter stats for 'perf report -i perf.data': 59,626,798,904 cycles:u 88,583,575,849 instructions:u # 1.49 insn per cycle 21.296935559 seconds time elapsed 20.010191000 seconds user 1.202935000 seconds sys The speed is higher with profile having many side-band events, because these trigger libunwind preparatory code. This does not apply for perf compiled with libdw for dwarf unwind, only for build with libunwind. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190426073804.17238-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:46 -03:00
Mao Han	b399ec215b	csky: Add support for libdw This patch add support for DWARF register mappings and libdw registers initialization, which is used by perf callchain analyzing when --call-graph=dwarf is given. Here is the elfutils csky backend patch set: https://sourceware.org/ml/elfutils-devel/2019-q2/msg00007.html Signed-off-by: Mao Han <han_mao@c-sky.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: linux-arch@vger.kernel.org Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1555860794-10572-1-git-send-email-guoren@kernel.org Signed-off-by: Guo Ren <ren_guo@c-sky.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:46 -03:00
Colin Ian King	1455ea2391	perf test: Fix spelling mistake "leadking" -> "leaking" There are a couple of spelling mistakes in test assert messages. Fix them. Signed-off-by: Colin King <colin.king@canonical.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: http://lkml.kernel.org/r/20190417105539.5902-1-colin.king@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:46 -03:00
Jin Yao	bdd1666b3d	perf annotate: Remove hist__account_cycles() from callback The hist__account_cycles() function is executed when the hist_iter__branch_callback() is called. But it looks it's not necessary. In hist__account_cycles, it already walks on all branch entries. This patch moves the hist__account_cycles out of callback, now the data processing is much faster than before. Previous code has an issue that the ch[offset].num++ (in __symbol__account_cycles) is executed repeatedly since hist__account_cycles is called in each hist_iter__branch_callback, so the counting of ch[offset].num is not correct (too big). With this patch, the issue is fixed. And we don't need the code of "ch->reset >= ch->num / 2" to check if there are too many overlaps (in annotation__count_and_fill), otherwise some data would be hidden. Now, we can try, for example: perf record -b ... perf annotate or perf report -s symbol The before/after output should be no change. v3: --- Fix the crash in stdio mode. Like previous code, it needs the checking of ui__has_annotation() before hist__account_cycles() v2: --- 1. Cover the similar perf report 2. Remove the checking code "ch->reset >= ch->num / 2" Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1552684577-29041-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-15 16:36:46 -03:00
Linus Torvalds	90489a72fb	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main kernel changes were: - add support for Intel's "adaptive PEBS v4" - which embedds LBS data in PEBS records and can thus batch up and reduce the IRQ (NMI) rate significantly - reducing overhead and making call-graph profiling less intrusive. - add Intel CPU core and uncore support updates for Tremont, Icelake, - extend the x86 PMU constraints scheduler with 'constraint ranges' to better support Icelake hw constraints, - make x86 call-chain support work better with CONFIG_FRAME_POINTER=y - misc other changes Tooling changes: - updates to the main tools: 'perf record', 'perf trace', 'perf stat' - updated Intel and S/390 vendor events - libtraceevent updates - misc other updates and fixes" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits) perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER watchdog: Fix typo in comment perf/x86/intel: Add Tremont core PMU support perf/x86/intel/uncore: Add Intel Icelake uncore support perf/x86/msr: Add Icelake support perf/x86/intel/rapl: Add Icelake support perf/x86/intel/cstate: Add Icelake support perf/x86/intel: Add Icelake support perf/x86: Support constraint ranges perf/x86/lbr: Avoid reading the LBRs when adaptive PEBS handles them perf/x86/intel: Support adaptive PEBS v4 perf/x86/intel/ds: Extract code of event update in short period perf/x86/intel: Extract memory code PEBS parser for reuse perf/x86: Support outputting XMM registers perf/x86/intel: Force resched when TFA sysctl is modified perf/core: Add perf_pmu_resched() as global function perf/headers: Fix stale comment for struct perf_addr_filter perf/core: Make perf_swevent_init_cpu() static perf/x86: Add sanity checks to x86_schedule_events() perf/x86: Optimize x86_schedule_events() ...	2019-05-06 14:16:36 -07:00
Arnaldo Carvalho de Melo	7e221b811f	perf tools: Remove needless asm/unistd.h include fixing build in some places We were including sys/syscall.h and asm/unistd.h, since sys/syscall.h includes asm/unistd.h, sometimes this leads to the redefinition of defines, breaking the build. Noticed on ARC with uCLibc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Rich Felker <dalias@libc.org> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Link: https://lkml.kernel.org/n/tip-xjpf80o64i2ko74aj2jih0qg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-02 16:00:20 -04:00
Arnaldo Carvalho de Melo	c638417e1a	tools build: Add -ldl to the disassembler-four-args feature test Thomas Backlund reported that the perf build was failing on the Mageia 7 distro, that is because it uses: cat /tmp/build/perf/feature/test-disassembler-four-args.make.output /usr/bin/ld: /usr/lib64/libbfd.a(plugin.o): in function `try_load_plugin': /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:243: undefined reference to `dlopen' /usr/bin/ld: /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:271: undefined reference to `dlsym' /usr/bin/ld: /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:256: undefined reference to `dlclose' /usr/bin/ld: /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:246: undefined reference to `dlerror' as we allow dynamic linking and loading Mageia 7 uses these linker flags: $ rpm --eval %ldflags -Wl,--as-needed -Wl,--no-undefined -Wl,-z,relro -Wl,-O1 -Wl,--build-id -Wl,--enable-new-dtags So add -ldl to this feature LDFLAGS. Reported-by: Thomas Backlund <tmb@mageia.org> Tested-by: Thomas Backlund <tmb@mageia.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lkml.kernel.org/r/20190501173158.GC21436@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-02 16:00:20 -04:00
Leo Yan	35bb59c10a	perf cs-etm: Always allocate memory for cs_etm_queue::prev_packet Robert Walker reported a segmentation fault is observed when process CoreSight trace data; this issue can be easily reproduced by the command 'perf report --itrace=i1000i' for decoding tracing data. If neither the 'b' flag (synthesize branches events) nor 'l' flag (synthesize last branch entries) are specified to option '--itrace', cs_etm_queue::prev_packet will not been initialised. After merging the code to support exception packets and sample flags, there introduced a number of uses of cs_etm_queue::prev_packet without checking whether it is valid, for these cases any accessing to uninitialised prev_packet will cause crash. As cs_etm_queue::prev_packet is used more widely now and it's already hard to follow which functions have been called in a context where the validity of cs_etm_queue::prev_packet has been checked, this patch always allocates memory for cs_etm_queue::prev_packet. Reported-by: Robert Walker <robert.walker@arm.com> Suggested-by: Robert Walker <robert.walker@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Robert Walker <robert.walker@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Fixes: `7100b12cf4` ("perf cs-etm: Generate branch sample for exception packet") Fixes: `24fff5eb2b` ("perf cs-etm: Avoid stale branch samples when flush packet") Link: http://lkml.kernel.org/r/20190428083228.20246-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-02 16:00:20 -04:00
Leo Yan	cf0c37b6db	perf cs-etm: Don't check cs_etm_queue::prev_packet validity Since cs_etm_queue::prev_packet is allocated for all cases, it will never be NULL pointer; now validity checking prev_packet is pointless, remove all of them. Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Robert Walker <robert.walker@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190428083228.20246-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-02 16:00:20 -04:00
Thomas Richter	167e418fa0	perf report: Report OOM in status line in the GTK UI An -ENOMEM error is not reported in the GTK GUI. Instead this error message pops up on the screen: [root@m35lp76 perf]# ./perf report -i perf.data.error68-1 Processing events... [974K/3M] Error:failed to process sample 0xf4198 [0x8]: failed to process type: 68 However when I use the same perf.data file with --stdio it works: [root@m35lp76 perf]# ./perf report -i perf.data.error68-1 --stdio \ \| head -12 # Total Lost Samples: 0 # # Samples: 76K of event 'cycles' # Event count (approx.): 99056160000 # # Overhead Command Shared Object Symbol # ........ ............... ................. ......... # 8.81% find [kernel.kallsyms] [k] ftrace_likely_update 8.74% swapper [kernel.kallsyms] [k] ftrace_likely_update 8.34% sshd [kernel.kallsyms] [k] ftrace_likely_update 2.19% kworker/u512:1- [kernel.kallsyms] [k] ftrace_likely_update The sample precentage is a bit low..... The GUI always fails in the FINISHED_ROUND event (68) and does not indicate the reason why. When happened is the following. Perf report calls a lot of functions and down deep when a FINISHED_ROUND event is processed, these functions are called: perf_session__process_event() + perf_session__process_user_event() + process_finished_round() + ordered_events__flush() + __ordered_events__flush() + do_flush() + ordered_events__deliver_event() + perf_session__deliver_event() + machine__deliver_event() + perf_evlist__deliver_event() + process_sample_event() + hist_entry_iter_add() --> only called in GUI case!!! + hist_iter__report__callback() + symbol__inc_addr_sample() Now this functions runs out of memory and returns -ENOMEM. This is reported all the way up until function perf_session__process_event() returns to its caller, where -ENOMEM is changed to -EINVAL and processing stops: if ((skip = perf_session__process_event(session, event, head)) < 0) { pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n", head, event->header.size, event->header.type); err = -EINVAL; goto out_err; } This occurred in the FINISHED_ROUND event when it has to process some 10000 entries and ran out of memory. This patch indicates the root cause and displays it in the status line of ther perf report GUI. Output before (on GUI status line): 0xf4198 [0x8]: failed to process type: 68 Output after: 0xf4198 [0x8]: failed to process type: 68 [not enough memory] Committer notes: the 'skip' variable needs to be initialized to -EINVAL, so that when the size is less than sizeof(struct perf_event_attr) we avoid this valid compiler warning: util/session.c: In function ‘perf_session__process_events’: util/session.c:1936:7: error: ‘skip’ may be used uninitialized in this function [-Werror=maybe-uninitialized] err = skip; ~~~~^~~~~~ util/session.c:1874:6: note: ‘skip’ was declared here s64 skip; ^~~~ cc1: all warnings being treated as errors Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20190423105303.61683-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-02 16:00:20 -04:00
Arnaldo Carvalho de Melo	bf561d3c13	perf bench numa: Add define for RUSAGE_THREAD if not present While cross building perf to the ARC architecture on a fedora 30 host, we were failing with: CC /tmp/build/perf/bench/numa.o bench/numa.c: In function ‘worker_thread’: bench/numa.c:1261:12: error: ‘RUSAGE_THREAD’ undeclared (first use in this function); did you mean ‘SIGEV_THREAD’? getrusage(RUSAGE_THREAD, &rusage); ^~~~~~~~~~~~~ SIGEV_THREAD bench/numa.c:1261:12: note: each undeclared identifier is reported only once for each function it appears in [perfbuilder@60d5802468f6 perf]$ /arc_gnu_2019.03-rc1_prebuilt_uclibc_le_archs_linux_install/bin/arc-linux-gcc --version \| head -1 arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225 [perfbuilder@60d5802468f6 perf]$ Trying to reproduce a report by Vineet, I noticed that, with just cross-built zlib and numactl libraries, I ended up with the above failure. So, since RUSAGE_THREAD is available as a define, check for that and numactl libraries, I ended up with the above failure. So, since RUSAGE_THREAD is available as a define in the system headers, check if it is defined in the 'perf bench numa' sources and define it if not. Now it builds and I have to figure out if the problem reported by Vineet only takes place if we have libelf or some other library available. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Olsa <jolsa@kernel.org> Cc: linux-snps-arc@lists.infradead.org Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Link: https://lkml.kernel.org/n/tip-2wb4r1gir9xrevbpq7qp0amk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-02 16:00:20 -04:00
Thadeu Lima de Souza Cascardo	01e985e900	perf annotate: Fix build on 32 bit for BPF annotation Commit `6987561c9e` ("perf annotate: Enable annotation of BPF programs") adds support for BPF programs annotations but the new code does not build on 32-bit. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Song Liu <songliubraving@fb.com> Fixes: `6987561c9e` ("perf annotate: Enable annotation of BPF programs") Link: http://lkml.kernel.org/r/20190403194452.10845-1-cascardo@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-02 16:00:19 -04:00
Bo YU	2e712675ff	perf bpf: Return value with unlocking in perf_env__find_btf() In perf_env__find_btf(), we're returning without unlocking "env->bpf_progs.lock". There may be cause lockdep issue. Detected by CoversityScan, CID# 1444762:(program hangs(LOCK)) Signed-off-by: Bo YU <tsu.yubo@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Fixes: 2db7b1e0bd49d: (perf bpf: Return NULL when RB tree lookup fails in perf_env__find_btf()) Link: http://lkml.kernel.org/r/20190422080138.10088-1-tsu.yubo@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-02 16:00:19 -04:00
Jiri Olsa	2db7b1e0bd	perf bpf: Return NULL when RB tree lookup fails in perf_env__find_btf() We don't return NULL when we don't find the bpf_prog_info_node, fix that. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Reported-by: Song Liu <songliubraving@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: `3792cb2ff4` ("perf bpf: Save BTF in a rbtree in perf_env") Link: http://lkml.kernel.org/r/20190417145539.11669-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-04-17 14:30:23 -03:00
Jiri Olsa	b9abbdfa88	perf tools: Fix map reference counting By calling maps__insert() we assume to get 2 references on the map, which we relese within maps__remove call. However if there's already same map name, we currently don't bump the reference and can crash, like: Program received signal SIGABRT, Aborted. 0x00007ffff75e60f5 in raise () from /lib64/libc.so.6 (gdb) bt #0 0x00007ffff75e60f5 in raise () from /lib64/libc.so.6 #1 0x00007ffff75d0895 in abort () from /lib64/libc.so.6 #2 0x00007ffff75d0769 in __assert_fail_base.cold () from /lib64/libc.so.6 #3 0x00007ffff75de596 in __assert_fail () from /lib64/libc.so.6 #4 0x00000000004fc006 in refcount_sub_and_test (i=1, r=0x1224e88) at tools/include/linux/refcount.h:131 #5 refcount_dec_and_test (r=0x1224e88) at tools/include/linux/refcount.h:148 #6 map__put (map=0x1224df0) at util/map.c:299 #7 0x00000000004fdb95 in __maps__remove (map=0x1224df0, maps=0xb17d80) at util/map.c:953 #8 maps__remove (maps=0xb17d80, map=0x1224df0) at util/map.c:959 #9 0x00000000004f7d8a in map_groups__remove (map=<optimized out>, mg=<optimized out>) at util/map_groups.h:65 #10 machine__process_ksymbol_unregister (sample=<optimized out>, event=0x7ffff7279670, machine=<optimized out>) at util/machine.c:728 #11 machine__process_ksymbol (machine=<optimized out>, event=0x7ffff7279670, sample=<optimized out>) at util/machine.c:741 #12 0x00000000004fffbb in perf_session__deliver_event (session=0xb11390, event=0x7ffff7279670, tool=0x7fffffffc7b0, file_offset=13936) at util/session.c:1362 #13 0x00000000005039bb in do_flush (show_progress=false, oe=0xb17e80) at util/ordered-events.c:243 #14 __ordered_events__flush (oe=0xb17e80, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:322 #15 0x00000000005005e4 in perf_session__process_user_event (session=session@entry=0xb11390, event=event@entry=0x7ffff72a4af8, ... Add the map to the list and getting the reference event if we find the map with same name. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Fixes: `1e6285699b` ("perf symbols: Fix slowness due to -ffunction-section") Link: http://lkml.kernel.org/r/20190416160127.30203-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-04-17 14:30:11 -03:00
Jiri Olsa	adc6257c4a	perf evlist: Fix side band thread draining Current perf_evlist__poll_thread() code could finish without draining the data. Adding the logic that makes sure we won't finish before the drain. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Fixes: `657ee55319` ("perf evlist: Introduce side band thread") Link: http://lkml.kernel.org/r/20190416160127.30203-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-04-17 14:30:11 -03:00
Song Liu	a93e0b2365	perf tools: Check maps for bpf programs As reported by Jiri Olsa in: "[BUG] perf: intel_pt won't display kernel function" https://lore.kernel.org/lkml/20190403143738.GB32001@krava Recent changes to support PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT broke --kallsyms option. This is because it broke test __map__is_kmodule. This patch fixes this by adding check for bpf program, so that these maps are not mistaken as kernel modules. Signed-off-by: Song Liu <songliubraving@fb.com> Reported-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yonghong Song <yhs@fb.com> Link: http://lkml.kernel.org/r/20190416160127.30203-8-jolsa@kernel.org Fixes: `76193a9452` ("perf, bpf: Introduce PERF_RECORD_KSYMBOL") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-04-17 14:30:11 -03:00
Jiri Olsa	aa52660231	perf bpf: Return NULL when RB tree lookup fails in perf_env__find_bpf_prog_info() We currently don't return NULL in case we don't find the bpf_prog_info_node, fixing that. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: `e4378f0cb9` ("perf bpf: Save bpf_prog_info in a rbtree in perf_env") Link: http://lkml.kernel.org/r/20190416134151.15282-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-04-17 14:30:06 -03:00
Jiri Olsa	1e6db2ee86	perf top: Always sample time to satisfy needs of use of ordered queuing Bastian reported broken 'perf top -p PID' command, it won't display any data. The problem is that for -p option we monitor single thread, so we don't enable time in samples, because it's not needed. However since commit `16c66bc167` we use ordered queues to stash data plus later commits added logic for dropping samples in case there's big load and we don't keep up. All this needs timestamp for sample. Enabling it unconditionally for perf top. Reported-by: Bastian Beischer <bastian.beischer@rwth-aachen.de> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bastian beischer <bastian.beischer@rwth-aachen.de> Fixes: `16c66bc167` ("perf top: Add processing thread") Link: http://lkml.kernel.org/r/20190415125333.27160-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-04-16 12:36:20 -03:00

... 6 7 8 9 10 ...

10636 Commits