OpenCloudOS-Kernel/tools/perf/util/event.c

1753 lines
44 KiB
C

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases:
 - file had no licensing information in it,
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information.

Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5 lines of source
 - File already had some variant of a license header in it (even if <5 lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license identifiers to apply.

 - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became the concluded license(s).

 - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time.

In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related.

Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with:
 - a full scancode scan run, collecting the matched texts, detected license ids and scores
 - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
// SPDX-License-Identifier: GPL-2.0
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <inttypes.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <uapi/linux/mman.h> /* To get things like MAP_HUGETLB even on older libc headers */
#include <api/fs/fs.h>
#include <linux/perf_event.h>
#include "event.h"
#include "debug.h"
#include "hist.h"
#include "machine.h"
#include "sort.h"
#include "string2.h"
#include "strlist.h"
#include "thread.h"
#include "thread_map.h"
#include "sane_ctype.h"
#include "map.h"
#include "symbol/kallsyms.h"
#include "asm/bug.h"
#include "stat.h"
#include "session.h"
perf tools: Handle PERF_RECORD_BPF_EVENT

This patch adds basic handling of PERF_RECORD_BPF_EVENT. Tracking of PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to turn it on.

Committer notes:

Add dummy machine__process_bpf_event() variant that returns zero for systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking the build in such systems.

Remove the needless include <machine.h> from bpf-event.h, provide just forward declarations for the structs and unions in the parameters, to reduce compilation time and needless rebuilds when machine.h gets changed.

Committer testing:

When running with:

  # perf record --bpf-event

On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL are not present, we fall back to removing those two bits from perf_event_attr, so that the tool continues to work on older kernels:

  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
  ------------------------------------------------------------
  sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8
  sys_perf_event_open failed, error -22
  switching off bpf_event
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
  ------------------------------------------------------------
  sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8
  sys_perf_event_open failed, error -22
  switching off ksymbol
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
  ------------------------------------------------------------

And then proceeds to work without those two features.

As passing --bpf-event is an explicit action performed by the user, perhaps we should emit a warning telling that the kernel has no such feature, but this can be done on top of this patch.

Now with a kernel that supports these events, start the 'record --bpf-event -a' and then run 'perf trace sleep 10000' that will use the BPF augmented_raw_syscalls.o prebuilt (for another kernel version even) and thus should generate PERF_RECORD_BPF_EVENT events:

  [root@quaco ~]# perf record -e dummy -a --bpf-event
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.713 MB perf.data ]

  [root@quaco ~]# bpftool prog
  13: cgroup_skb  tag 7be49e3934a125ba  gpl
      loaded_at 2019-01-19T09:09:43-0300  uid 0
      xlated 296B  jited 229B  memlock 4096B  map_ids 13,14
  14: cgroup_skb  tag 2a142ef67aaad174  gpl
      loaded_at 2019-01-19T09:09:43-0300  uid 0
      xlated 296B  jited 229B  memlock 4096B  map_ids 13,14
  15: cgroup_skb  tag 7be49e3934a125ba  gpl
      loaded_at 2019-01-19T09:09:43-0300  uid 0
      xlated 296B  jited 229B  memlock 4096B  map_ids 15,16
  16: cgroup_skb  tag 2a142ef67aaad174  gpl
      loaded_at 2019-01-19T09:09:43-0300  uid 0
      xlated 296B  jited 229B  memlock 4096B  map_ids 15,16
  17: cgroup_skb  tag 7be49e3934a125ba  gpl
      loaded_at 2019-01-19T09:09:44-0300  uid 0
      xlated 296B  jited 229B  memlock 4096B  map_ids 17,18
  18: cgroup_skb  tag 2a142ef67aaad174  gpl
      loaded_at 2019-01-19T09:09:44-0300  uid 0
      xlated 296B  jited 229B  memlock 4096B  map_ids 17,18
  21: cgroup_skb  tag 7be49e3934a125ba  gpl
      loaded_at 2019-01-19T09:09:45-0300  uid 0
      xlated 296B  jited 229B  memlock 4096B  map_ids 21,22
  22: cgroup_skb  tag 2a142ef67aaad174  gpl
      loaded_at 2019-01-19T09:09:45-0300  uid 0
      xlated 296B  jited 229B  memlock 4096B  map_ids 21,22
  31: tracepoint  name sys_enter  tag 12504ba9402f952f  gpl
      loaded_at 2019-01-19T09:19:56-0300  uid 0
      xlated 512B  jited 374B  memlock 4096B  map_ids 30,29,28
  32: tracepoint  name sys_exit  tag c1bd85c092d6e4aa  gpl
      loaded_at 2019-01-19T09:19:56-0300  uid 0
      xlated 256B  jited 191B  memlock 4096B  map_ids 30,29

  # perf report -D | grep PERF_RECORD_BPF_EVENT | nl
     1  0 55834574849 0x4fc8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13
     2  0 60129542145 0x5118 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14
     3  0 64424509441 0x5268 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15
     4  0 68719476737 0x53b8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16
     5  0 73014444033 0x5508 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17
     6  0 77309411329 0x5658 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18
     7  0 90194313217 0x57a8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21
     8  0 94489280513 0x58f8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22
     9  7 620922484360 0xb6390 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 29
    10  7 620922486018 0xb6410 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 29
    11  7 620922579199 0xb6490 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 30
    12  7 620922580240 0xb6510 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 30
    13  7 620922765207 0xb6598 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 31
    14  7 620922874543 0xb6620 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 32
  #

There, the 31 and 32 tracepoint BPF programs put in place by 'perf trace'.

Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-7-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-18 00:15:18 +08:00
#include "bpf-event.h"
#define DEFAULT_PROC_MAP_PARSE_TIMEOUT 500
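The PERF_RECORD_BPF_EVENT commit message quoted above describes the tool retrying sys_perf_event_open on an older kernel, switching off bpf_event first and ksymbol second after each -22 (EINVAL) failure. The following is only a minimal sketch of that retry pattern, not perf's actual code: `struct fake_attr`, `open_fn`, `open_with_fallback()` and `old_kernel_open()` are hypothetical names introduced for illustration, and the real tool works on `struct perf_event_attr`.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two perf_event_attr bits discussed above. */
struct fake_attr {
	bool ksymbol;
	bool bpf_event;
};

/* Hypothetical opener: returns an fd >= 0, or a negative errno value. */
typedef int (*open_fn)(const struct fake_attr *attr);

/*
 * Retry the open, dropping one optional feature per failed attempt:
 * first bpf_event, then ksymbol, the same order as in the quoted log.
 * Any result other than -EINVAL is returned to the caller unchanged.
 */
static int open_with_fallback(struct fake_attr *attr, open_fn try_open)
{
	for (;;) {
		int fd = try_open(attr);

		if (fd != -EINVAL)
			return fd;
		if (attr->bpf_event)
			attr->bpf_event = false;
		else if (attr->ksymbol)
			attr->ksymbol = false;
		else
			return fd;	/* nothing left to switch off */
	}
}

/* Hypothetical "older kernel" that rejects either feature bit. */
static int old_kernel_open(const struct fake_attr *attr)
{
	return (attr->ksymbol || attr->bpf_event) ? -EINVAL : 3;
}
```

Calling `open_with_fallback()` with both bits set and `old_kernel_open` as the opener takes three attempts and leaves both bits cleared, which mirrors the three perf_event_attr dumps in the log.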
static const char *perf_event__names[] = {
[0] = "TOTAL",
[PERF_RECORD_MMAP] = "MMAP",
[PERF_RECORD_MMAP2] = "MMAP2",
[PERF_RECORD_LOST] = "LOST",
[PERF_RECORD_COMM] = "COMM",
[PERF_RECORD_EXIT] = "EXIT",
[PERF_RECORD_THROTTLE] = "THROTTLE",
[PERF_RECORD_UNTHROTTLE] = "UNTHROTTLE",
[PERF_RECORD_FORK] = "FORK",
[PERF_RECORD_READ] = "READ",
[PERF_RECORD_SAMPLE] = "SAMPLE",
[PERF_RECORD_AUX] = "AUX",
[PERF_RECORD_ITRACE_START] = "ITRACE_START",
perf tools: handle PERF_RECORD_LOST_SAMPLES

This patch modifies the perf tool to handle the new RECORD type, PERF_RECORD_LOST_SAMPLES.

The number of lost-sample events is stored in .nr_events[PERF_RECORD_LOST_SAMPLES]. The exact number of samples which the kernel dropped is stored in total_lost_samples.

When the percentage of dropped samples is greater than 5%, a warning is printed.

Here are some examples:

Eg 1, Recording different frequently-occurring events is safe with the patch. Only a very low drop rate is associated with such actions.

  $ perf record -e '{cycles:p,instructions:p}' -c 20003 --no-time ~/tchain ~/tchain

  $ perf report -D | tail
            SAMPLE events:     120243
             MMAP2 events:          5
      LOST_SAMPLES events:         24
    FINISHED_ROUND events:         15
  cycles:p stats:
             TOTAL events:      59348
            SAMPLE events:      59348
  instructions:p stats:
             TOTAL events:      60895
            SAMPLE events:      60895

  $ perf report --stdio --group
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 24
  #
  # Samples: 120K of event 'anon group { cycles:p, instructions:p }'
  # Event count (approx.): 24048600000
  #
  #         Overhead  Command      Shared Object     Symbol
  # ................  ...........  ................  ..................................
  #
      99.74%  99.86%  tchain_edit  tchain_edit       [.] f3
       0.09%   0.02%  tchain_edit  tchain_edit       [.] f2
       0.04%   0.00%  tchain_edit  [kernel.vmlinux]  [k] ixgbe_read_reg

Eg 2, Recording the same thing multiple times can lead to high drop rate, but it is not a useful configuration.

  $ perf record -e '{cycles:p,cycles:p}' -c 20003 --no-time ~/tchain
  Warning: Processed 600592 samples and lost 99.73% samples!

  [perf record: Woken up 148 times to write data]
  [perf record: Captured and wrote 36.922 MB perf.data (1206322 samples)]
  [perf record: Woken up 1 times to write data]
  [perf record: Captured and wrote 0.121 MB perf.data (1629 samples)]

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1431285195-14269-9-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-11 03:13:15 +08:00
[PERF_RECORD_LOST_SAMPLES] = "LOST_SAMPLES",
[PERF_RECORD_SWITCH] = "SWITCH",
[PERF_RECORD_SWITCH_CPU_WIDE] = "SWITCH_CPU_WIDE",
perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info

Introduce a new option to record PERF_RECORD_NAMESPACES events emitted by the kernel when fork, clone, setns or unshare are invoked. And update perf-record documentation with the new option to record namespace events.

Committer notes:

Combined it with a later patch to allow printing it via 'perf report -D' and be able to test the feature introduced in this patch. Had to move here also perf_ns__name(), that was introduced in another later patch.

Also used PRIu64 and PRIx64 to fix the build in some environments wrt:

  util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
     ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
                                         ^

Testing it:

  # perf record --namespaces -a
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
  #
  # perf report -D
  <SNIP>
  3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
                  [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                   4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]

  0x1151e0 [0x30]: event: 9
  .
  . ... raw event: size 48 bytes
  .  0000:  09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00  ......0..q.h....
  .  0010:  a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00  .9...9...(.c....
  .  0020:  03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00  ................
  <SNIP>
  NAMESPACES events:          1
  <SNIP>
  #

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-08 04:41:43 +08:00
[PERF_RECORD_NAMESPACES] = "NAMESPACES",
[PERF_RECORD_KSYMBOL] = "KSYMBOL",
[PERF_RECORD_BPF_EVENT] = "BPF_EVENT",
[PERF_RECORD_HEADER_ATTR] = "ATTR",
[PERF_RECORD_HEADER_EVENT_TYPE] = "EVENT_TYPE",
[PERF_RECORD_HEADER_TRACING_DATA] = "TRACING_DATA",
[PERF_RECORD_HEADER_BUILD_ID] = "BUILD_ID",
[PERF_RECORD_FINISHED_ROUND] = "FINISHED_ROUND",
perf tools: Add id index

Add an index of the event identifiers, in preparation for Intel PT. The event id (also called the sample id) is a unique number allocated by the kernel to the event created by perf_event_open(). Events can include the event id by having a sample type including PERF_SAMPLE_ID or PERF_SAMPLE_IDENTIFIER.

Currently the main use of the event id is to match an event back to the evsel to which it belongs i.e. perf_evlist__id2evsel()

The purpose of this patch is to make it possible to match an event back to the mmap from which it was read. The reason that is useful is because the mmap represents a time-ordered context (either for a cpu or for a thread). Intel PT decodes trace information on that basis. In full-trace mode, that information can be recorded when the Intel PT trace is read, but in sample-mode the Intel PT trace data is embedded in a sample and it is in that case that the "id index" is needed.

So the mmaps are numbered (idx) and the cpu and tid recorded against the id by perf_evlist__set_sid_idx() which is called by perf_evlist__mmap_per_evsel(). That information is recorded on the perf.data file in the new "id index". idx, cpu and tid are added to struct perf_sample_id (which is the node of evlist's hash table to match ids to evsels). The information can be retrieved using perf_evlist__id2sid(). Note however this all depends on having a sample type including PERF_SAMPLE_ID or PERF_SAMPLE_IDENTIFIER, otherwise ids are not recorded.

The "id index" is a synthesized event record which will be created when Intel PT sampling is used by calling perf_event__synthesize_id_index().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1414417770-18602-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-27 21:49:22 +08:00
[PERF_RECORD_ID_INDEX] = "ID_INDEX",
[PERF_RECORD_AUXTRACE_INFO] = "AUXTRACE_INFO",
[PERF_RECORD_AUXTRACE] = "AUXTRACE",
[PERF_RECORD_AUXTRACE_ERROR] = "AUXTRACE_ERROR",
[PERF_RECORD_THREAD_MAP] = "THREAD_MAP",
[PERF_RECORD_CPU_MAP] = "CPU_MAP",
[PERF_RECORD_STAT_CONFIG] = "STAT_CONFIG",
[PERF_RECORD_STAT] = "STAT",
[PERF_RECORD_STAT_ROUND] = "STAT_ROUND",
[PERF_RECORD_EVENT_UPDATE] = "EVENT_UPDATE",
[PERF_RECORD_TIME_CONV] = "TIME_CONV",
perf tools: Add feature header record to pipe-mode

Add header record types to pipe-mode, reusing the functions used in file-mode and leveraging the new struct feat_fd.

For alignment, check that synthesized events don't exceed pagesize.

Add the perf_event__synthesize_feature event call back to process the new header records.

Before this patch:

  $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
  ...

After this patch:

  $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
  # ========
  # captured on: Mon May 22 16:33:43 2017
  # ========
  #
  # hostname : my_hostname
  # os release : 4.11.0-dbx-up_perf
  # perf version : 4.11.rc6.g6277c80
  # arch : x86_64
  # nrcpus online : 72
  # nrcpus avail : 72
  # cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz
  # cpuid : GenuineIntel,6,63,2
  # total memory : 263457192 kB
  # cmdline : /root/perf record -o - -e cycles -c 100000 sleep 1
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: intel_bts = 6, uncore_imc_4 = 22, uncore_sbox_1 = 47, uncore_cbox_5 = 33, uncore_ha_0 = 16, uncore_cbox
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
  ...

Support added for the subcommands: report, inject, annotate and script.

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170718042549.145161-16-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18 12:25:48 +08:00
[PERF_RECORD_HEADER_FEATURE] = "FEATURE",
};
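The PERF_RECORD_LOST_SAMPLES commit message attached to the table above says a warning is printed when more than 5% of samples were dropped. The text does not spell out how the percentage is computed, so the sketch below assumes it is lost / (processed + lost); `lost_sample_rate()`, `should_warn_lost()` and `LOST_WARN_THRESHOLD` are hypothetical names for illustration, not functions from this file.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold taken from the commit message: warn above 5% dropped samples. */
#define LOST_WARN_THRESHOLD 0.05

/*
 * Assumed definition of the drop rate: lost / (processed + lost).
 * Returns 0.0 when no samples were seen at all, to avoid dividing by zero.
 */
static double lost_sample_rate(uint64_t processed, uint64_t lost)
{
	uint64_t total = processed + lost;

	return total ? (double)lost / (double)total : 0.0;
}

/* True when the assumed drop rate exceeds the 5% warning threshold. */
static bool should_warn_lost(uint64_t processed, uint64_t lost)
{
	return lost_sample_rate(processed, lost) > LOST_WARN_THRESHOLD;
}
```

With the figures from the first example in that commit message (120243 samples, 24 lost), `should_warn_lost(120243, 24)` is false under this assumed formula, which is consistent with no warning appearing in that run.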
static const char *perf_ns__names[] = {
	[NET_NS_INDEX] = "net",
	[UTS_NS_INDEX] = "uts",
	[IPC_NS_INDEX] = "ipc",
	[PID_NS_INDEX] = "pid",
	[USER_NS_INDEX] = "user",
	[MNT_NS_INDEX] = "mnt",
	[CGROUP_NS_INDEX] = "cgroup",
};

unsigned int proc_map_timeout = DEFAULT_PROC_MAP_PARSE_TIMEOUT;

const char *perf_event__name(unsigned int id)
{
	if (id >= ARRAY_SIZE(perf_event__names))
		return "INVALID";
	if (!perf_event__names[id])
		return "UNKNOWN";
	return perf_event__names[id];
}
static const char *perf_ns__name(unsigned int id)
{
	if (id >= ARRAY_SIZE(perf_ns__names))
		return "UNKNOWN";
	return perf_ns__names[id];
}
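
The two lookup helpers above share one pattern: a name table filled with designated initializers, guarded by a bounds check before indexing. A minimal, self-contained sketch of that pattern follows. Every `demo_`/`DEMO_` name is hypothetical and chosen for illustration; none of them is one of perf's real identifiers.

```c
#include <stddef.h>

/* Hypothetical record ids, for illustration only (not perf's real enum). */
enum demo_record {
	DEMO_RECORD_MMAP = 1,
	DEMO_RECORD_COMM = 3,
	DEMO_RECORD_MAX  = 4,
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Sparse table: slots without a designated initializer are NULL. */
static const char *demo_names[DEMO_RECORD_MAX] = {
	[DEMO_RECORD_MMAP] = "MMAP",
	[DEMO_RECORD_COMM] = "COMM",
};

const char *demo_record_name(unsigned int id)
{
	if (id >= DEMO_ARRAY_SIZE(demo_names))
		return "INVALID";	/* index past the end of the table */
	if (!demo_names[id])
		return "UNKNOWN";	/* in range, but no name registered */
	return demo_names[id];
}
```

The NULL check matters only for sparse tables. perf_ns__name() can skip it because every slot of perf_ns__names is populated.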

int perf_tool__process_synth_event(struct perf_tool *tool,
				   union perf_event *event,
				   struct machine *machine,
				   perf_event__handler_t process)
{
	struct perf_sample synth_sample = {
		.pid = -1,
		.tid = -1,
		.time = -1,
		.stream_id = -1,
		.cpu = -1,
		.period = 1,
		.cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK,
	};

	return process(tool, event, &synth_sample, machine);
}

/*
 * Assumes that the first 4095 bytes of /proc/pid/status contain
 * the comm, tgid and ppid.
 */
static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
				    pid_t *tgid, pid_t *ppid)
{
	char filename[PATH_MAX];
	char bf[4096];
	int fd;
	size_t size = 0;
	ssize_t n;
	char *name, *tgids, *ppids;

	*tgid = -1;
	*ppid = -1;

	snprintf(filename, sizeof(filename), "/proc/%d/status", pid);

	fd = open(filename, O_RDONLY);
	if (fd < 0) {
		pr_debug("couldn't open %s\n", filename);
		return -1;
	}

	n = read(fd, bf, sizeof(bf) - 1);
	close(fd);
	if (n <= 0) {
		pr_warning("Couldn't get COMM, tgid and ppid for pid %d\n",
			   pid);
		return -1;
	}
	bf[n] = '\0';

	name = strstr(bf, "Name:");
	tgids = strstr(bf, "Tgid:");
	ppids = strstr(bf, "PPid:");

	if (name) {
		char *nl;

		name += 5; /* strlen("Name:") */
		name = ltrim(name);

		nl = strchr(name, '\n');
		if (nl)
			*nl = '\0';

		/* Truncate the name so it always fits in comm with its NUL. */
		size = strlen(name);
		if (size >= len)
			size = len - 1;
		memcpy(comm, name, size);
		comm[size] = '\0';
	} else {
		pr_debug("Name: string not found for pid %d\n", pid);
	}

	if (tgids) {
		tgids += 5; /* strlen("Tgid:") */
		*tgid = atoi(tgids);
	} else {
		pr_debug("Tgid: string not found for pid %d\n", pid);
	}

	if (ppids) {
		ppids += 5; /* strlen("PPid:") */
		*ppid = atoi(ppids);
	} else {
		pr_debug("PPid: string not found for pid %d\n", pid);
	}

	return 0;
}
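
The parsing done by perf_event__get_comm_ids() can be tried in isolation on an in-memory copy of a status file. The sketch below assumes the usual "Key:<tab>value" line layout of /proc/<pid>/status; the helpers demo_status_field() and demo_status_name() are hypothetical names introduced here, not part of perf.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return the integer that follows "key" in a NUL-terminated copy of a
 * status file, or -1 if the key is absent. atoi() skips the whitespace
 * (a tab in the real file) between the colon and the digits.
 */
int demo_status_field(const char *buf, const char *key)
{
	const char *p = strstr(buf, key);

	return p ? atoi(p + strlen(key)) : -1;
}

/*
 * Copy the "Name:" value into comm, truncated to fit len bytes including
 * the terminating NUL. Returns the number of characters copied.
 */
size_t demo_status_name(const char *buf, char *comm, size_t len)
{
	const char *name = strstr(buf, "Name:");
	size_t size;

	if (len == 0)
		return 0;
	if (!name) {
		comm[0] = '\0';
		return 0;
	}

	name += strlen("Name:");
	while (*name == ' ' || *name == '\t')
		name++;			/* skip the separator after the colon */

	size = strcspn(name, "\n");	/* the value ends at the newline */
	if (size >= len)
		size = len - 1;
	memcpy(comm, name, size);
	comm[size] = '\0';
	return size;
}
```

Unlike the function above, this sketch never writes into the input buffer, so it can also be run on a string literal.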

static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
				    struct machine *machine,
				    pid_t *tgid, pid_t *ppid)
{
	size_t size;

	*ppid = -1;

	memset(&event->comm, 0, sizeof(event->comm));

	if (machine__is_host(machine)) {
		if (perf_event__get_comm_ids(pid, event->comm.comm,
					     sizeof(event->comm.comm),
					     tgid, ppid) != 0) {
			return -1;
		}
	} else {
		*tgid = machine->pid;
	}

	if (*tgid < 0)
		return -1;

	event->comm.pid = *tgid;
perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME: $ perf record -aT $ perf report -D | grep PERF_RECORD_ <SNIP> 3 5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3 3 5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3 3 5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811) 3 5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3 3 5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853 3 5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find 3 5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3 3 5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so 3 5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso] 3 5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1 3 5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so 3 5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so 3 5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1 3 5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3 <SNIP> First column is the cpu and the second the timestamp. That way we can investigate problems in the event stream. If the new perf binary is run on an older kernel, it will disable this feature automatically. 
Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-02 20:25:28 +08:00
	event->comm.header.type = PERF_RECORD_COMM;

	size = strlen(event->comm.comm) + 1;
perf tools: fix ALIGN redefinition in system headers On some systems (e.g. Android), ALIGN is defined in system headers as ALIGN(p). The definition of ALIGN used in perf takes 2 parameters: ALIGN(x,a). This leads to redefinition conflicts. Redefinition error on Android: In file included from util/include/linux/list.h:1:0, from util/callchain.h:5, from util/hist.h:6, from util/session.h:4, from util/build-id.h:4, from util/annotate.c:11: util/include/linux/kernel.h:11:0: error: "ALIGN" redefined [-Werror] bionic/libc/include/sys/param.h:38:0: note: this is the location of the previous definition Conflics with system defined ALIGN in Android: util/event.c: In function 'perf_event__synthesize_comm': util/event.c:115:32: error: macro "ALIGN" passed 2 arguments, but takes just 1 util/event.c:115:9: error: 'ALIGN' undeclared (first use in this function) util/event.c:115:9: note: each undeclared identifier is reported only once for each function it appears in In order to avoid this redefinition, ALIGN is renamed to PERF_ALIGN. Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Irina Tirdea <irina.tirdea@intel.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1347315303-29906-5-git-send-email-irina.tirdea@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11 06:15:01 +08:00
	size = PERF_ALIGN(size, sizeof(u64));
	memset(event->comm.comm + size, 0, machine->id_hdr_size);
perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME: $ perf record -aT $ perf report -D | grep PERF_RECORD_ <SNIP> 3 5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3 3 5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3 3 5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811) 3 5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3 3 5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853 3 5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find 3 5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3 3 5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so 3 5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso] 3 5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1 3 5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so 3 5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so 3 5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1 3 5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3 <SNIP> First column is the cpu and the second the timestamp. That way we can investigate problems in the event stream. If the new perf binary is run on an older kernel, it will disable this feature automatically. 
Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-02 20:25:28 +08:00
	event->comm.header.size = (sizeof(event->comm) -
				(sizeof(event->comm.comm) - size) +
				machine->id_hdr_size);
	event->comm.tid = pid;

	return 0;
perf tools: Fix FORK after COMM when synthesizing records for pre-existing threads In this commit: commit 363b785f3805a2632eb09a8b430842461c21a640 Author: Don Zickus <dzickus@redhat.com> Date: Fri Mar 14 10:43:44 2014 -0400 perf tools: Speed up thread map generation We ended up emitting PERF_RECORD_FORK events after their corresponding PERF_RECORD_COMM, so the code below will remove the "existing thread" and then recreates it, unnecessarily: [root@ssdandy ~]# perf probe -x ~/bin/perf -L machine__process_fork_event <machine__process_fork_event@/home/acme/git/linux/tools/perf/util/machine.c:0> 0 int machine__process_fork_event(struct machine *machine, union perf_event *event, struct perf_sample *sample) 2 { 3 struct thread *thread = machine__find_thread(machine, event->fork.pid, event->fork.tid); 6 struct thread *parent = machine__findnew_thread(machine, event->fork.ppid, event->fork.ptid); /* if a thread currently exists for the thread id remove it */ if (thread != NULL) 12 machine__remove_thread(machine, thread); 14 thread = machine__findnew_thread(machine, event->fork.pid, event->fork.tid); 16 if (dump_trace) 17 perf_event__fprintf_task(event, stdout); 19 if (thread == NULL || parent == NULL || 20 thread__fork(thread, parent, sample->time) < 0) { 21 dump_printf("problem processing PERF_RECORD_FORK, skipping event.\n"); 22 return -1; } 25 return 0; 26 } [root@ssdandy ~]# perf probe -x ~/bin/perf fork_after_comm=machine__process_fork_event:12 Added new event: probe_perf:fork_after_comm (on machine__process_fork_event:12 in /home/acme/bin/perf) You can now use it in all perf tools, such as: perf record -e probe_perf:fork_after_comm -aR sleep 1 [root@ssdandy ~]# [root@ssdandy ~]# perf record -g -e probe_perf:* trace -o /tmp/bla ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.021 MB perf.data (30 samples) ] Terminated [root@ssdandy ~]# [root@ssdandy ~]# perf report --no-children --show-total-period --stdio # To display the 
perf.data header info, please use --header/--header-only options. # # Samples: 30 of event 'probe_perf:fork_after_comm' # Event count (approx.): 30 # # Overhead Period Command Shared Object Symbol # ........ ............ ....... ............. ............................... # 100.00% 30 trace trace [.] machine__process_fork_event | ---machine__process_fork_event __event__synthesize_thread.part.2 perf_event__synthesize_threads cmd_trace main __libc_start_main [root@ssdandy ~]# And Looking at 'perf report -D' output we see it: 0 0 0x8698 [0x30]: PERF_RECORD_COMM: auditd:703/707 0 0 0x86c8 [0x38]: PERF_RECORD_FORK(703:707):(703:703) Fix it by more closely mimicking how the kernel generates those records when a new fork happens, i.e. first a PERF_RECORD_FORK, then a PERF_RECORD_COMM. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-h0emvymi2t3mw8dlqd6d6z73@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-28 06:52:10 +08:00
}
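
The size arithmetic in perf_event__prepare_comm() trims the unused tail of the fixed-size comm buffer, keeps the name padded to a multiple of 8 bytes, and then appends room for the sample id header. A rough standalone sketch of that calculation follows. struct demo_comm_event and DEMO_ALIGN are hypothetical stand-ins, and the 16-byte name buffer is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round x up to the next multiple of a, where a is a power of two. */
#define DEMO_ALIGN(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical stand-in for a COMM record: fixed fields, then the name. */
struct demo_comm_event {
	uint64_t header;	/* placeholder for the type/misc/size header */
	uint32_t pid, tid;
	char comm[16];
};

/*
 * On-disk size of the record for a given name. The caller must ensure
 * strlen(comm) is smaller than the comm buffer, as the real code does by
 * truncating the name when it is read.
 */
size_t demo_comm_event_size(const char *comm, size_t id_hdr_size)
{
	size_t used = DEMO_ALIGN(strlen(comm) + 1, sizeof(uint64_t));

	return sizeof(struct demo_comm_event)
	       - (sizeof(((struct demo_comm_event *)0)->comm) - used)
	       + id_hdr_size;
}
```

For the name "find", strlen() + 1 is 5, which rounds up to 8, so the record is 8 bytes shorter than the full structure before the id header is added.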
perf record: Synthesize COMM event for a command line workload When perf creates a new child to profile, the events are enabled on exec(). And in this case, it doesn't synthesize any event for the child since they'll be generated during exec(). But there's an window between the enabling and the event generation. It used to be overcome since samples are only in kernel (so we always have the map) and the comm is overridden by a later COMM event. However it won't work if events are processed and displayed before the COMM event overrides like in 'perf script'. This leads to those early samples (like native_write_msr_safe) not having a comm but pid (like ':15328'). So it needs to synthesize COMM event for the child explicitly before enabling so that it can have a correct comm. But at this time, the comm will be "perf" since it's not exec-ed yet. Committer note: Before this patch: # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # perf script --show-task-events :4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429 usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4. 
usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so) usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429) After: # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # perf script --show-task perf 0 0.000000: PERF_RECORD_COMM: perf:8446/8446 perf 8446 30154.038944: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038948: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038949: 9 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038950: 230 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038951: 6772 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. usleep 8446 30154.038952: PERF_RECORD_COMM exec: usleep:8446/8446 usleep 8446 30154.038954: 196923 cycles: ffffffff81766440 _raw_spin_lock (/lib/modules/4.3.0-rc1 usleep 8446 30154.039021: 2292130 cycles: 7f609a173dc4 memcpy (/usr/lib64/ld-2.20.so) usleep 8446 30154.039349: PERF_RECORD_EXIT(8446:8446):(8446:8446) # Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1442881495-2928-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-22 08:24:55 +08:00
pid_t perf_event__synthesize_comm(struct perf_tool *tool,
				  union perf_event *event, pid_t pid,
				  perf_event__handler_t process,
				  struct machine *machine)
{
	pid_t tgid, ppid;

	if (perf_event__prepare_comm(event, pid, machine, &tgid, &ppid) != 0)
		return -1;

	if (perf_tool__process_synth_event(tool, event, machine, process) != 0)
		return -1;

	return tgid;
}
perf record: Synthesize namespace events for current processes Synthesize PERF_RECORD_NAMESPACES events for processes that were running prior to invocation of perf record. The data for this is taken from /proc/$PID/ns. These changes make way for analyzing events with regard to namespaces. Committer notes: Check if 'tool' is NULL in perf_event__synthesize_namespaces(), as in the test__mmap_thread_lookup case, i.e. 'perf test Lookup mmap thread". Testing it: # ps axH > /tmp/allthreads # perf record -a --namespaces usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.169 MB perf.data (8 samples) ] # perf report -D | grep PERF_RECORD_NAMESPACES | wc -l 602 # wc -l /tmp/allthreads 601 /tmp/allthreads # tail /tmp/allthreads 16951 pts/4 T 0:00 git rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 16952 pts/4 T 0:00 /bin/sh /usr/libexec/git-core/git-rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 17176 pts/4 T 0:00 git commit --amend --no-post-rewrite 17204 pts/4 T 0:00 vim /home/acme/git/linux/.git/COMMIT_EDITMSG 18939 ? S 0:00 [kworker/2:1] 18947 ? S 0:00 [kworker/3:0] 18974 ? S 0:00 [kworker/1:0] 19047 ? 
S 0:00 [kworker/0:1] 19152 pts/6 S+ 0:00 weechat 19153 pts/7 R+ 0:00 ps axH # perf report -D | grep PERF_RECORD_NAMESPACES | tail 0 0 0x125068 [0xa0]: PERF_RECORD_NAMESPACES 17176/17176 - nr_namespaces: 7 0 0 0x1255b8 [0xa0]: PERF_RECORD_NAMESPACES 17204/17204 - nr_namespaces: 7 0 0 0x125df0 [0xa0]: PERF_RECORD_NAMESPACES 18939/18939 - nr_namespaces: 7 0 0 0x125f00 [0xa0]: PERF_RECORD_NAMESPACES 18947/18947 - nr_namespaces: 7 0 0 0x126010 [0xa0]: PERF_RECORD_NAMESPACES 18974/18974 - nr_namespaces: 7 0 0 0x126120 [0xa0]: PERF_RECORD_NAMESPACES 19047/19047 - nr_namespaces: 7 0 0 0x126230 [0xa0]: PERF_RECORD_NAMESPACES 19152/19152 - nr_namespaces: 7 0 0 0x129330 [0xa0]: PERF_RECORD_NAMESPACES 19154/19154 - nr_namespaces: 7 0 0 0x12a1f8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 0 0 0x12b0b8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 # Humm, investigate why we got two record for the 19155 pid/tid... Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891931111.25309.11073854609798681633.stgit@hbathini.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-08 04:41:51 +08:00
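static void perf_event__get_ns_link_info(pid_t pid, const char *ns,
					 struct perf_ns_link_info *ns_link_info)
{
	struct stat64 st;
	char proc_ns[128];

	/* pid_t is signed, so format it with %d; bound the write to the buffer. */
	snprintf(proc_ns, sizeof(proc_ns), "/proc/%d/ns/%s", pid, ns);
	/* On failure the entry keeps the zeroes written by the caller's memset(). */
	if (stat64(proc_ns, &st) == 0) {
		ns_link_info->dev = st.st_dev;
		ns_link_info->ino = st.st_ino;
	}
}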
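
A namespace is identified here by the device and inode numbers of its /proc/<pid>/ns/<name> link, so two tasks can be treated as sharing a namespace when both numbers match. The following is a small sketch of the same lookup using plain stat(), with an explicit error return. demo_ns_id() and struct demo_ns_id are hypothetical names, and the 64-byte path buffer is an arbitrary choice for the example.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

struct demo_ns_id {
	unsigned long long dev;
	unsigned long long ino;
};

/*
 * Fill id with the device/inode pair of /proc/<pid>/ns/<ns>.
 * Returns 0 on success, -1 if the path is too long or cannot be stat()ed.
 */
int demo_ns_id(pid_t pid, const char *ns, struct demo_ns_id *id)
{
	char path[64];
	struct stat st;
	int n;

	n = snprintf(path, sizeof(path), "/proc/%d/ns/%s", (int)pid, ns);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;	/* encoding error or truncated path */

	if (stat(path, &st) != 0)
		return -1;	/* no such task, no such namespace, or no access */

	id->dev = (unsigned long long)st.st_dev;
	id->ino = (unsigned long long)st.st_ino;
	return 0;
}
```

Calling demo_ns_id(getpid(), "net", &id) on a Linux system with /proc mounted should fill in the caller's network namespace identity. Whether the call succeeds depends on the kernel configuration and on permissions.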

int perf_event__synthesize_namespaces(struct perf_tool *tool,
				      union perf_event *event,
				      pid_t pid, pid_t tgid,
				      perf_event__handler_t process,
				      struct machine *machine)
{
	u32 idx;
	struct perf_ns_link_info *ns_link_info;

	if (!tool || !tool->namespace_events)
		return 0;

	memset(&event->namespaces, 0, (sizeof(event->namespaces) +
	       (NR_NAMESPACES * sizeof(struct perf_ns_link_info)) +
	       machine->id_hdr_size));

	event->namespaces.pid = tgid;
	event->namespaces.tid = pid;

	event->namespaces.nr_namespaces = NR_NAMESPACES;

	ns_link_info = event->namespaces.link_info;

	for (idx = 0; idx < event->namespaces.nr_namespaces; idx++)
		perf_event__get_ns_link_info(pid, perf_ns__name(idx),
					     &ns_link_info[idx]);

	event->namespaces.header.type = PERF_RECORD_NAMESPACES;

	event->namespaces.header.size = (sizeof(event->namespaces) +
			(NR_NAMESPACES * sizeof(struct perf_ns_link_info)) +
			machine->id_hdr_size);

	if (perf_tool__process_synth_event(tool, event, machine, process) != 0)
		return -1;

	return 0;
}
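
The namespaces record is variable-length: a fixed part, then one link_info entry per namespace, then the sample id header. The same expression is used above both for the memset() and for header.size. A hedged sketch of that sizing with a flexible array member follows. The struct layouts are hypothetical stand-ins modelled on the fields used above, not copies of perf's definitions.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_link_info {
	uint64_t dev;
	uint64_t ino;
};

/* Hypothetical stand-in for the namespaces record. */
struct demo_ns_event {
	uint64_t header;	/* placeholder for the type/misc/size header */
	uint32_t pid, tid;
	uint64_t nr_namespaces;
	struct demo_link_info link_info[];	/* flexible array member */
};

/* Total record size: fixed part + nr entries + trailing sample id header. */
size_t demo_ns_event_size(size_t nr, size_t id_hdr_size)
{
	return sizeof(struct demo_ns_event)
	       + nr * sizeof(struct demo_link_info)
	       + id_hdr_size;
}
```

Because sizeof() of a structure ending in a flexible array member excludes the array itself, the entries have to be added explicitly, which is what the NR_NAMESPACES term does in the function above.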
static int perf_event__synthesize_fork(struct perf_tool *tool,
				       union perf_event *event,
				       pid_t pid, pid_t tgid, pid_t ppid,
				       perf_event__handler_t process,
				       struct machine *machine)
{
	memset(&event->fork, 0, sizeof(event->fork) + machine->id_hdr_size);

	/*
	 * For the main thread, set the parent to the ppid read from the
	 * status file. For other threads, set the parent pid to the main
	 * thread, i.e., assume the main thread spawns all threads in a
	 * process.
	 */
	if (tgid == pid) {
		event->fork.ppid = ppid;
		event->fork.ptid = ppid;
	} else {
		event->fork.ppid = tgid;
		event->fork.ptid = tgid;
	}
	event->fork.pid = tgid;
	event->fork.tid = pid;

	event->fork.header.type = PERF_RECORD_FORK;
	event->fork.header.misc = PERF_RECORD_MISC_FORK_EXEC;

	event->fork.header.size = (sizeof(event->fork) + machine->id_hdr_size);

	if (perf_tool__process_synth_event(tool, event, machine, process) != 0)
		return -1;

	return 0;
}

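The parent selection rule in the function above can be isolated as a small pure function. This is only an illustrative sketch: `struct fork_ids` and `fork_ids_for()` are hypothetical names, and the field names merely mirror those of the fork event.

```c
#include <sys/types.h>

/* Hypothetical container mirroring the four id fields of the fork event. */
struct fork_ids {
	pid_t pid;	/* thread group id */
	pid_t tid;	/* thread id */
	pid_t ppid;	/* parent process id */
	pid_t ptid;	/* parent thread id */
};

/*
 * tid:  the thread being described
 * tgid: its thread group leader
 * ppid: parent pid as read from the status file
 */
static struct fork_ids fork_ids_for(pid_t tid, pid_t tgid, pid_t ppid)
{
	struct fork_ids ids;
	/* The main thread keeps its real parent; other threads are
	 * assumed to have been spawned by the main thread. */
	pid_t parent = (tgid == tid) ? ppid : tgid;

	ids.pid = tgid;
	ids.tid = tid;
	ids.ppid = parent;
	ids.ptid = parent;
	return ids;
}
```

For example, a group leader 100 with parent 1 keeps parent 1, while its worker thread 101 is reported with parent 100.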
int perf_event__synthesize_mmap_events(struct perf_tool *tool,
				       union perf_event *event,
				       pid_t pid, pid_t tgid,
				       perf_event__handler_t process,
				       struct machine *machine,
				       bool mmap_data)
{
	char filename[PATH_MAX];
	FILE *fp;
	unsigned long long t;
	bool truncation = false;
	unsigned long long timeout = proc_map_timeout * 1000000ULL;
	int rc = 0;
	const char *hugetlbfs_mnt = hugetlbfs__mountpoint();
	int hugetlbfs_mnt_len = hugetlbfs_mnt ? strlen(hugetlbfs_mnt) : 0;

	if (machine__is_default_guest(machine))
		return 0;
perf tools: Make perf_event__synthesize_mmap_events() scale This patch significantly improves the execution time of perf_event__synthesize_mmap_events() when running perf record on systems where processes have lots of threads. It just happens that cat /proc/pid/maps support uses an O(N^2) algorithm to generate each map line in the maps file. If you have 1000 threads, then you have necessarily 1000 stacks. For each vma, you need to check if it corresponds to a thread's stack. With a large number of threads, this can take a very long time. I have seen latencies >> 10mn. As of today, perf does not use the fact that a mapping is a stack, therefore we can work around the issue by using /proc/pid/tasks/pid/maps. This entry does not try to map a vma to stack and is thus much faster with no loss of functionality. The proc-map-timeout logic is kept in case users still want some upper limit. In V2, we fix the file path from /proc/pid/tasks/pid/maps to actual /proc/pid/task/pid/maps, tasks -> task. Thanks Arnaldo for catching this. Committer note: This problem seems to have been eliminated in the kernel since commit b18cb64ead40 ("fs/proc: Stop trying to report thread stacks"). Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170315135059.GC2177@redhat.com Link: http://lkml.kernel.org/r/1489598233-25586-1-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-16 01:17:13 +08:00
	snprintf(filename, sizeof(filename), "%s/proc/%d/task/%d/maps",
		 machine->root_dir, pid, pid);

	fp = fopen(filename, "r");
	if (fp == NULL) {
		/*
		 * We raced with a task exiting - just return:
		 */
		pr_debug("couldn't open %s\n", filename);
		return -1;
	}

	event->header.type = PERF_RECORD_MMAP2;
	t = rdclock();
perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME: $ perf record -aT $ perf report -D | grep PERF_RECORD_ <SNIP> 3 5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3 3 5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3 3 5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811) 3 5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3 3 5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853 3 5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find 3 5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3 3 5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so 3 5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso] 3 5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1 3 5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so 3 5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so 3 5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1 3 5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3 <SNIP> First column is the cpu and the second the timestamp. That way we can investigate problems in the event stream. If the new perf binary is run on an older kernel, it will disable this feature automatically. 
Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-02 20:25:28 +08:00
	while (1) {
		char bf[BUFSIZ];
		char prot[5];
		char execname[PATH_MAX];
		char anonstr[] = "//anon";
		unsigned int ino;
		size_t size;
		ssize_t n;

		if (fgets(bf, sizeof(bf), fp) == NULL)
			break;

		if ((rdclock() - t) > timeout) {
			pr_warning("Reading %s timed out. "
				   "You may want to increase "
				   "the time limit by --proc-map-timeout\n",
				   filename);
			truncation = true;
			goto out;
		}

		/* ensure null termination since stack will be reused. */
		strcpy(execname, "");

		/* 00400000-0040c000 r-xp 00000000 fd:01 41038 /bin/cat */
		n = sscanf(bf, "%"PRIx64"-%"PRIx64" %s %"PRIx64" %x:%x %u %[^\n]\n",
			   &event->mmap2.start, &event->mmap2.len, prot,
			   &event->mmap2.pgoff, &event->mmap2.maj,
			   &event->mmap2.min,
			   &ino, execname);

		/*
		 * Anon maps don't have the execname.
		 */
		if (n < 7)
			continue;

		event->mmap2.ino = (u64)ino;

		/*
		 * Just like the kernel, see __perf_event_mmap in kernel/perf_event.c
		 */
		if (machine__is_host(machine))
			event->header.misc = PERF_RECORD_MISC_USER;
		else
			event->header.misc = PERF_RECORD_MISC_GUEST_USER;

		/* map protection and flags bits */
		event->mmap2.prot = 0;
		event->mmap2.flags = 0;
		if (prot[0] == 'r')
			event->mmap2.prot |= PROT_READ;
		if (prot[1] == 'w')
			event->mmap2.prot |= PROT_WRITE;
		if (prot[2] == 'x')
			event->mmap2.prot |= PROT_EXEC;

		if (prot[3] == 's')
			event->mmap2.flags |= MAP_SHARED;
		else
			event->mmap2.flags |= MAP_PRIVATE;

		if (prot[2] != 'x') {
			if (!mmap_data || prot[0] != 'r')
				continue;

			event->header.misc |= PERF_RECORD_MISC_MMAP_DATA;
		}

out:
		if (truncation)
			event->header.misc |= PERF_RECORD_MISC_PROC_MAP_PARSE_TIMEOUT;

		if (!strcmp(execname, ""))
			strcpy(execname, anonstr);
perf tools: Fix MMAP event synthesis broken by MAP_HUGETLB change Patch "perf record: Mark MAP_HUGETLB when synthesizing mmap events") breaks MMAP event synthesis. The executable name comparison will match any name if the length is zero, resulting in all the user space maps becoming anonymous. This is particularly noticeable with system-wide traces. Example: perf record -a sleep 1 perf script --show-mmap-events Committer note: That is not the case when, say, one has a qemu instance and libvirt actually mounts hugetlbfs. To test this I had to first umount it: [root@jouet ~]# mount | grep hugetlbfs hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel) [root@jouet ~]# After unmount it the error fixed by this patch manifests itself: # perf record -a sleep 1 # perf script --show-mmap-events | grep PERF_RECORD_MMAP2 | head -5 systemd 0 [000] 0.000000: PERF_RECORD_MMAP2 1/1: [0x557d47ed8000(0x167000) @ 0 fd:00 3146896 7362875424355726126]: r-xp //anon systemd 0 [000] 0.000000: PERF_RECORD_MMAP2 1/1: [0x7f96c488d000(0x4000) @ 0 fd:00 3153214 7362875424355726126]: r-xp //anon systemd 0 [000] 0.000000: PERF_RECORD_MMAP2 1/1: [0x7f96c4a92000(0x3d000) @ 0 fd:00 3159276 7362875424355726126]: r-xp //anon systemd 0 [000] 0.000000: PERF_RECORD_MMAP2 1/1: [0x7f96c4cd5000(0x15000) @ 0 fd:00 3153725 7362875424355726126]: r-xp //anon systemd 0 [000] 0.000000: PERF_RECORD_MMAP2 1/1: [0x7f96c4eeb000(0x25000) @ 0 fd:00 3153260 7362875424355726126]: r-xp //anon # Fixed version: # perf record -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.419 MB perf.data (182 samples) ] # perf script --show-mmap-events | grep PERF_RECORD_MMAP2 | head -5 systemd 0 [000] 0.000000: PERF_RECORD_MMAP2 1/1: [0x557d47ed8000(0x167000) @ 0 fd:00 3146896 7362875424355726126]: r-xp /usr/lib/systemd/systemd systemd 0 [000] 0.000000: PERF_RECORD_MMAP2 1/1: [0x7f96c488d000(0x4000) @ 0 fd:00 3153214 7362875424355726126]: r-xp /usr/lib64/libuuid.so.1.3.0 systemd 
0 [000] 0.000000: PERF_RECORD_MMAP2 1/1: [0x7f96c4a92000(0x3d000) @ 0 fd:00 3159276 7362875424355726126]: r-xp /usr/lib64/libblkid.so.1.1.0 systemd 0 [000] 0.000000: PERF_RECORD_MMAP2 1/1: [0x7f96c4cd5000(0x15000) @ 0 fd:00 3153725 7362875424355726126]: r-xp /usr/lib64/libz.so.1.2.8 systemd 0 [000] 0.000000: PERF_RECORD_MMAP2 1/1: [0x7f96c4eeb000(0x25000) @ 0 fd:00 3153260 7362875424355726126]: r-xp /usr/lib64/liblzma.so.5.2.2 [root@jouet ~]# Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Link: http://lkml.kernel.org/r/1474641528-18776-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-23 22:38:34 +08:00
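The failure mode described in the commit message above comes from `strncmp()` with a length of zero: it compares no characters and returns 0, so every name appears to match. A minimal sketch of the difference, using hypothetical helper names (`has_prefix_unguarded`, `has_prefix_guarded`) that do not exist in perf:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Unguarded check: when prefix_len is 0, strncmp() compares nothing and
 * returns 0, so this reports a match for any string s.
 */
static bool has_prefix_unguarded(const char *s, const char *prefix,
				 size_t prefix_len)
{
	return strncmp(s, prefix, prefix_len) == 0;
}

/*
 * Guarded check: a missing or empty prefix never matches, which is the
 * effect of testing the length before calling strncmp().
 */
static bool has_prefix_guarded(const char *s, const char *prefix)
{
	size_t len = prefix ? strlen(prefix) : 0;

	return len != 0 && strncmp(s, prefix, len) == 0;
}
```

With no hugetlbfs mount point, the guarded form leaves names such as /usr/lib/systemd/systemd untouched instead of rewriting them to //anon.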
		if (hugetlbfs_mnt_len &&
		    !strncmp(execname, hugetlbfs_mnt, hugetlbfs_mnt_len)) {
			strcpy(execname, anonstr);
			event->mmap2.flags |= MAP_HUGETLB;
		}

		size = strlen(execname) + 1;
		memcpy(event->mmap2.filename, execname, size);
		size = PERF_ALIGN(size, sizeof(u64));
		/* sscanf() stored the end address in len; convert it to a length. */
		event->mmap2.len -= event->mmap2.start;
		event->mmap2.header.size = (sizeof(event->mmap2) -
					(sizeof(event->mmap2.filename) - size));
		memset(event->mmap2.filename + size, 0, machine->id_hdr_size);
		event->mmap2.header.size += machine->id_hdr_size;
		event->mmap2.pid = tgid;
		event->mmap2.tid = pid;

		if (perf_tool__process_synth_event(tool, event, machine, process) != 0) {
			rc = -1;
			break;
		}

		if (truncation)
			break;
	}

	fclose(fp);
	return rc;
}

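The per-line parsing done in the loop above can be tried in isolation. The sketch below is an assumption-laden simplification rather than the perf code: `struct map_line` and `parse_maps_line()` are hypothetical, field widths are added to the `sscanf()` format to bound the string conversions, and the end address is converted to a length immediately instead of in a later step.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical result type; perf stores these fields in the mmap2 event. */
struct map_line {
	uint64_t start, len, pgoff;
	unsigned int maj, min, ino;
	int prot, flags;
	char name[4096];	/* empty for anonymous mappings */
};

/*
 * Parse one line in the /proc/<pid>/maps format, e.g.
 *   00400000-0040c000 r-xp 00000000 fd:01 41038 /bin/cat
 * Returns 0 on success, -1 if fewer than the 7 mandatory fields match.
 */
static int parse_maps_line(const char *line, struct map_line *m)
{
	uint64_t end;
	char perms[5] = "";
	int n;

	m->name[0] = '\0';
	n = sscanf(line,
		   "%" SCNx64 "-%" SCNx64 " %4s %" SCNx64 " %x:%x %u %4095[^\n]",
		   &m->start, &end, perms, &m->pgoff,
		   &m->maj, &m->min, &m->ino, m->name);
	if (n < 7)
		return -1;

	m->len = end - m->start;

	/* Decode the "rwxp"-style permission string. */
	m->prot = 0;
	if (perms[0] == 'r')
		m->prot |= PROT_READ;
	if (perms[1] == 'w')
		m->prot |= PROT_WRITE;
	if (perms[2] == 'x')
		m->prot |= PROT_EXEC;
	m->flags = (perms[3] == 's') ? MAP_SHARED : MAP_PRIVATE;

	return 0;
}
```

A line without a trailing path still parses, with `name` left empty, which corresponds to the case where the loop above substitutes //anon.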
int perf_event__synthesize_modules(struct perf_tool *tool,
				   perf_event__handler_t process,
				   struct machine *machine)
{
	int rc = 0;
	struct map *pos;
	struct maps *maps = machine__kernel_maps(machine);
	union perf_event *event = zalloc((sizeof(event->mmap) +
					  machine->id_hdr_size));
	if (event == NULL) {
		pr_debug("Not enough memory synthesizing mmap event "
			 "for kernel modules\n");
		return -1;
	}

	event->header.type = PERF_RECORD_MMAP;

	/*
	 * kernel uses 0 for user space maps, see kernel/perf_event.c
	 * __perf_event_mmap
	 */
	if (machine__is_host(machine))
		event->header.misc = PERF_RECORD_MISC_KERNEL;
	else
		event->header.misc = PERF_RECORD_MISC_GUEST_KERNEL;

	for (pos = maps__first(maps); pos; pos = map__next(pos)) {
		size_t size;

		if (!__map__is_kmodule(pos))
			continue;
perf tools: fix ALIGN redefinition in system headers On some systems (e.g. Android), ALIGN is defined in system headers as ALIGN(p). The definition of ALIGN used in perf takes 2 parameters: ALIGN(x,a). This leads to redefinition conflicts. Redefinition error on Android: In file included from util/include/linux/list.h:1:0, from util/callchain.h:5, from util/hist.h:6, from util/session.h:4, from util/build-id.h:4, from util/annotate.c:11: util/include/linux/kernel.h:11:0: error: "ALIGN" redefined [-Werror] bionic/libc/include/sys/param.h:38:0: note: this is the location of the previous definition Conflicts with system defined ALIGN in Android: util/event.c: In function 'perf_event__synthesize_comm': util/event.c:115:32: error: macro "ALIGN" passed 2 arguments, but takes just 1 util/event.c:115:9: error: 'ALIGN' undeclared (first use in this function) util/event.c:115:9: note: each undeclared identifier is reported only once for each function it appears in In order to avoid this redefinition, ALIGN is renamed to PERF_ALIGN. Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Irina Tirdea <irina.tirdea@intel.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1347315303-29906-5-git-send-email-irina.tirdea@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11 06:15:01 +08:00
		size = PERF_ALIGN(pos->dso->long_name_len + 1, sizeof(u64));
		event->mmap.header.type = PERF_RECORD_MMAP;
		event->mmap.header.size = (sizeof(event->mmap) -
					(sizeof(event->mmap.filename) - size));
		memset(event->mmap.filename + size, 0, machine->id_hdr_size);
		event->mmap.header.size += machine->id_hdr_size;
		event->mmap.start = pos->start;
		event->mmap.len = pos->end - pos->start;
		event->mmap.pid = machine->pid;

		memcpy(event->mmap.filename, pos->dso->long_name,
		       pos->dso->long_name_len + 1);
		if (perf_tool__process_synth_event(tool, event, machine, process) != 0) {
			rc = -1;
			break;
		}
	}

	free(event);
	return rc;
}

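The record size computed in the loop above is the full structure size, minus the unused tail of the filename buffer, plus the trailing id header. A small sketch of that arithmetic follows, under the assumption that PERF_ALIGN rounds up to a multiple of a power-of-two alignment; `ALIGN_UP`, `struct toy_mmap_record` and `toy_record_size()` are hypothetical and much simpler than the real event layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round x up to a multiple of a; a must be a power of two (assumed
 * to match what PERF_ALIGN does). */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Toy record: a fixed part followed by a maximally sized name buffer. */
struct toy_mmap_record {
	uint64_t start;
	uint64_t len;
	char filename[4096];
};

/*
 * Size actually emitted for a record carrying `name`: the fixed part,
 * the NUL-terminated name padded to 8 bytes, and id_hdr_size bytes of
 * trailing sample id data.
 */
static size_t toy_record_size(const char *name, size_t id_hdr_size)
{
	size_t name_size = ALIGN_UP(strlen(name) + 1, sizeof(uint64_t));

	return sizeof(struct toy_mmap_record)
	       - (sizeof(((struct toy_mmap_record *)0)->filename) - name_size)
	       + id_hdr_size;
}
```

Padding the name to a multiple of 8 bytes keeps whatever follows the name, here the id header, 64-bit aligned.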
static int __event__synthesize_thread(union perf_event *comm_event,
				      union perf_event *mmap_event,
				      union perf_event *fork_event,
				      union perf_event *namespaces_event,
perf tools: Fix comm for processes with named threads perf does not properly handle monitoring of processes with named threads. For example: $ ps -C myapp -L PID LWP TTY TIME CMD 25118 25118 ? 00:00:00 myapp 25118 25119 ? 00:00:00 myapp:worker perf record -e cs -c 1 -fo /tmp/perf.data -p 25118 -- sleep 10 perf report --stdio -i /tmp/perf.data 100.00% myapp:worker [kernel.kallsyms] [k] perf_event_task_sched_out The process name is set to the name of the last thread it finds for the process. The Problem: perf-top and perf-record both create a thread_map of threads to be monitored. That map is used in perf_event__synthesize_thread_map which loops over the entries in thread_map and calls __event__synthesize_thread to generate COMM and MMAP events. __event__synthesize_thread calls perf_event__synthesize_comm which opens /proc/pid/status, reads the name of the task and its thread group id. That's all fine. The problem is that it then reads /proc/pid/task and generates COMM events for each task it finds - but using the name found in /proc/pid/status where pid is the thread of interest. The end result (looping over thread_map + synthesizing comm events for each thread each time) means the name of the last thread processed sets the name for all threads in the process - which is not good for multithreaded processes with named threads. The Fix: perf_event__synthesize_comm has an input argument (full) that decides whether to process task entries for each pid it is passed. It currently never set to 0 (perf_event__synthesize_comm has a single caller and it always passes the value 1). Let's fix that. Add the full input argument to __event__synthesize_thread which passes it to perf_event__synthesize_comm. For thread/process monitoring set full to 0 which means COMM and MMAP events are only generated for the pid passed to it. For system wide monitoring set full to 1 so that COMM events are generated for all threads in a process. 
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1324578603-12762-2-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-12-23 02:30:01 +08:00
					 pid_t pid, int full,
perf record: Synthesize namespace events for current processes Synthesize PERF_RECORD_NAMESPACES events for processes that were running prior to invocation of perf record. The data for this is taken from /proc/$PID/ns. These changes make way for analyzing events with regard to namespaces. Committer notes: Check if 'tool' is NULL in perf_event__synthesize_namespaces(), as in the test__mmap_thread_lookup case, i.e. 'perf test Lookup mmap thread". Testing it: # ps axH > /tmp/allthreads # perf record -a --namespaces usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.169 MB perf.data (8 samples) ] # perf report -D | grep PERF_RECORD_NAMESPACES | wc -l 602 # wc -l /tmp/allthreads 601 /tmp/allthreads # tail /tmp/allthreads 16951 pts/4 T 0:00 git rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 16952 pts/4 T 0:00 /bin/sh /usr/libexec/git-core/git-rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 17176 pts/4 T 0:00 git commit --amend --no-post-rewrite 17204 pts/4 T 0:00 vim /home/acme/git/linux/.git/COMMIT_EDITMSG 18939 ? S 0:00 [kworker/2:1] 18947 ? S 0:00 [kworker/3:0] 18974 ? S 0:00 [kworker/1:0] 19047 ? 
S 0:00 [kworker/0:1] 19152 pts/6 S+ 0:00 weechat 19153 pts/7 R+ 0:00 ps axH # perf report -D | grep PERF_RECORD_NAMESPACES | tail 0 0 0x125068 [0xa0]: PERF_RECORD_NAMESPACES 17176/17176 - nr_namespaces: 7 0 0 0x1255b8 [0xa0]: PERF_RECORD_NAMESPACES 17204/17204 - nr_namespaces: 7 0 0 0x125df0 [0xa0]: PERF_RECORD_NAMESPACES 18939/18939 - nr_namespaces: 7 0 0 0x125f00 [0xa0]: PERF_RECORD_NAMESPACES 18947/18947 - nr_namespaces: 7 0 0 0x126010 [0xa0]: PERF_RECORD_NAMESPACES 18974/18974 - nr_namespaces: 7 0 0 0x126120 [0xa0]: PERF_RECORD_NAMESPACES 19047/19047 - nr_namespaces: 7 0 0 0x126230 [0xa0]: PERF_RECORD_NAMESPACES 19152/19152 - nr_namespaces: 7 0 0 0x129330 [0xa0]: PERF_RECORD_NAMESPACES 19154/19154 - nr_namespaces: 7 0 0 0x12a1f8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 0 0 0x12b0b8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 # Humm, investigate why we got two record for the 19155 pid/tid... Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891931111.25309.11073854609798681633.stgit@hbathini.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-08 04:41:51 +08:00
					 perf_event__handler_t process,
					 struct perf_tool *tool,
					 struct machine *machine,
					 bool mmap_data)
{
	char filename[PATH_MAX];
	DIR *tasks;
perf tools: Use readdir() instead of deprecated readdir_r() The readdir() function is thread safe as long as just one thread uses a DIR, which is the case when synthesizing events for pre-existing threads by traversing /proc, so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. CC /tmp/build/perf/util/event.o util/event.c: In function '__event__synthesize_thread': util/event.c:466:2: error: 'readdir_r' is deprecated [-Werror=deprecated-declarations] while (!readdir_r(tasks, &dirent, &next) && next) { ^~~~~ In file included from /usr/include/features.h:368:0, from /usr/include/stdint.h:25, from /usr/lib/gcc/x86_64-redhat-linux/6.0.0/include/stdint.h:9, from /git/linux/tools/include/linux/types.h:6, from util/event.c:1: /usr/include/dirent.h:189:12: note: declared here Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i1vj7nyjp2p750rirxgrfd3c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08 22:32:15 +08:00
	struct dirent *dirent;
	pid_t tgid, ppid;
	int rc = 0;

	/* special case: only send one comm event using passed in pid */
	if (!full) {
		tgid = perf_event__synthesize_comm(tool, comm_event, pid,
						   process, machine);

		if (tgid == -1)
			return -1;
		if (perf_event__synthesize_namespaces(tool, namespaces_event, pid,
						      tgid, process, machine) < 0)
			return -1;

		/*
		 * send mmap only for thread group leader
		 * see thread__init_map_groups
		 */
		if (pid == tgid &&
		    perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
						       process, machine, mmap_data))
			return -1;
		return 0;
	}

	if (machine__is_default_guest(machine))
		return 0;

	snprintf(filename, sizeof(filename), "%s/proc/%d/task",
		 machine->root_dir, pid);

	tasks = opendir(filename);
	if (tasks == NULL) {
		pr_debug("couldn't open %s\n", filename);
		return 0;
	}
	while ((dirent = readdir(tasks)) != NULL) {
		char *end;
		pid_t _pid;
		_pid = strtol(dirent->d_name, &end, 10);
		if (*end)
			continue;

		rc = -1;
		if (perf_event__prepare_comm(comm_event, _pid, machine,
					     &tgid, &ppid) != 0)
			break;
perf tools: Fix FORK after COMM when synthesizing records for pre-existing threads In this commit: commit 363b785f3805a2632eb09a8b430842461c21a640 Author: Don Zickus <dzickus@redhat.com> Date: Fri Mar 14 10:43:44 2014 -0400 perf tools: Speed up thread map generation We ended up emitting PERF_RECORD_FORK events after their corresponding PERF_RECORD_COMM, so the code below will remove the "existing thread" and then recreates it, unnecessarily: [root@ssdandy ~]# perf probe -x ~/bin/perf -L machine__process_fork_event <machine__process_fork_event@/home/acme/git/linux/tools/perf/util/machine.c:0> 0 int machine__process_fork_event(struct machine *machine, union perf_event *event, struct perf_sample *sample) 2 { 3 struct thread *thread = machine__find_thread(machine, event->fork.pid, event->fork.tid); 6 struct thread *parent = machine__findnew_thread(machine, event->fork.ppid, event->fork.ptid); /* if a thread currently exists for the thread id remove it */ if (thread != NULL) 12 machine__remove_thread(machine, thread); 14 thread = machine__findnew_thread(machine, event->fork.pid, event->fork.tid); 16 if (dump_trace) 17 perf_event__fprintf_task(event, stdout); 19 if (thread == NULL || parent == NULL || 20 thread__fork(thread, parent, sample->time) < 0) { 21 dump_printf("problem processing PERF_RECORD_FORK, skipping event.\n"); 22 return -1; } 25 return 0; 26 } [root@ssdandy ~]# perf probe -x ~/bin/perf fork_after_comm=machine__process_fork_event:12 Added new event: probe_perf:fork_after_comm (on machine__process_fork_event:12 in /home/acme/bin/perf) You can now use it in all perf tools, such as: perf record -e probe_perf:fork_after_comm -aR sleep 1 [root@ssdandy ~]# [root@ssdandy ~]# perf record -g -e probe_perf:* trace -o /tmp/bla ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.021 MB perf.data (30 samples) ] Terminated [root@ssdandy ~]# [root@ssdandy ~]# perf report --no-children --show-total-period --stdio # To display the 
perf.data header info, please use --header/--header-only options. # # Samples: 30 of event 'probe_perf:fork_after_comm' # Event count (approx.): 30 # # Overhead Period Command Shared Object Symbol # ........ ............ ....... ............. ............................... # 100.00% 30 trace trace [.] machine__process_fork_event | ---machine__process_fork_event __event__synthesize_thread.part.2 perf_event__synthesize_threads cmd_trace main __libc_start_main [root@ssdandy ~]# And Looking at 'perf report -D' output we see it: 0 0 0x8698 [0x30]: PERF_RECORD_COMM: auditd:703/707 0 0 0x86c8 [0x38]: PERF_RECORD_FORK(703:707):(703:703) Fix it by more closely mimicking how the kernel generates those records when a new fork happens, i.e. first a PERF_RECORD_FORK, then a PERF_RECORD_COMM. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-h0emvymi2t3mw8dlqd6d6z73@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-28 06:52:10 +08:00
		if (perf_event__synthesize_fork(tool, fork_event, _pid, tgid,
						ppid, process, machine) < 0)
			break;
		if (perf_event__synthesize_namespaces(tool, namespaces_event, _pid,
						      tgid, process, machine) < 0)
			break;
		/*
		 * Send the prepared comm event
		 */
		if (perf_tool__process_synth_event(tool, comm_event, machine, process) != 0)
			break;
		rc = 0;
		if (_pid == pid) {
			/* process the parent's maps too */
			rc = perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
								process, machine, mmap_data);
			if (rc)
				break;
		}
	}

	closedir(tasks);
	return rc;
}
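The task-directory walk in `__event__synthesize_thread` can be tried outside of perf. What follows is a minimal standalone sketch, not perf code: `parse_tid` and `count_tasks` are hypothetical helper names introduced only for this illustration, and the buffer size of 4096 stands in for `PATH_MAX`. It keeps the same pattern as the loop above: build `<root_dir>/proc/<pid>/task`, iterate with `readdir()`, and skip any entry whose name `strtol()` does not consume completely (which filters out `.` and `..`). The extra empty-string check is an addition of this sketch; the original loop relies on directory entry names never being empty.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of perf): parse a directory entry name
 * as a decimal thread id. Returns 0 on success, -1 if the name is empty
 * or contains anything other than a number strtol() accepts.
 */
static int parse_tid(const char *name, pid_t *tid)
{
	char *end;
	long val;

	assert(name != NULL && tid != NULL);

	if (*name == '\0')
		return -1;

	val = strtol(name, &end, 10);
	if (*end != '\0')	/* same test as "if (*end) continue;" above */
		return -1;

	*tid = (pid_t)val;
	return 0;
}

/*
 * Hypothetical helper (not part of perf): count the numeric entries in
 * <root_dir>/proc/<pid>/task. Returns -1 if the directory cannot be
 * opened. Note that the perf function above returns 0 in that case and
 * only logs a debug message.
 */
static int count_tasks(const char *root_dir, pid_t pid)
{
	char filename[4096];	/* stand-in for PATH_MAX */
	struct dirent *dirent;
	DIR *tasks;
	int n = 0;

	snprintf(filename, sizeof(filename), "%s/proc/%d/task",
		 root_dir, (int)pid);

	tasks = opendir(filename);
	if (tasks == NULL)
		return -1;

	while ((dirent = readdir(tasks)) != NULL) {
		pid_t tid;

		if (parse_tid(dirent->d_name, &tid) != 0)
			continue;	/* skips "." and ".." */
		n++;
	}

	closedir(tasks);
	return n;
}
```

On a Linux system with `/proc` mounted, calling `count_tasks("", getpid())` (with `<unistd.h>` included) would be expected to return the number of threads in the calling process. The empty `root_dir` plays the role that `machine->root_dir` plays for the host machine in this sketch.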

int perf_event__synthesize_thread_map(struct perf_tool *tool,
				      struct thread_map *threads,
				      perf_event__handler_t process,
				      struct machine *machine,
				      bool mmap_data)
perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME: $ perf record -aT $ perf report -D | grep PERF_RECORD_ <SNIP> 3 5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3 3 5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3 3 5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811) 3 5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3 3 5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853 3 5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find 3 5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3 3 5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so 3 5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso] 3 5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1 3 5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so 3 5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so 3 5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1 3 5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3 <SNIP> First column is the cpu and the second the timestamp. That way we can investigate problems in the event stream. If the new perf binary is run on an older kernel, it will disable this feature automatically. 
Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-02 20:25:28 +08:00
{
	union perf_event *comm_event, *mmap_event, *fork_event;
	union perf_event *namespaces_event;
perf tools: Fix comm for processes with named threads

perf does not properly handle monitoring of processes with named threads. For example:

  $ ps -C myapp -L
  PID LWP TTY TIME CMD
  25118 25118 ? 00:00:00 myapp
  25118 25119 ? 00:00:00 myapp:worker

  perf record -e cs -c 1 -fo /tmp/perf.data -p 25118 -- sleep 10
  perf report --stdio -i /tmp/perf.data
  100.00% myapp:worker [kernel.kallsyms] [k] perf_event_task_sched_out

The process name is set to the name of the last thread it finds for the process.

The Problem:

perf-top and perf-record both create a thread_map of threads to be monitored. That map is used in perf_event__synthesize_thread_map, which loops over the entries in thread_map and calls __event__synthesize_thread to generate COMM and MMAP events.

__event__synthesize_thread calls perf_event__synthesize_comm, which opens /proc/pid/status and reads the name of the task and its thread group id. That's all fine. The problem is that it then reads /proc/pid/task and generates COMM events for each task it finds - but using the name found in /proc/pid/status, where pid is the thread of interest.

The end result (looping over thread_map + synthesizing comm events for each thread each time) means the name of the last thread processed sets the name for all threads in the process - which is not good for multithreaded processes with named threads.

The Fix:

perf_event__synthesize_comm has an input argument (full) that decides whether to process task entries for each pid it is passed. It is currently never set to 0 (perf_event__synthesize_comm has a single caller and it always passes the value 1). Let's fix that.

Add the full input argument to __event__synthesize_thread, which passes it to perf_event__synthesize_comm. For thread/process monitoring set full to 0, which means COMM and MMAP events are only generated for the pid passed to it. For system wide monitoring set full to 1 so that COMM events are generated for all threads in a process.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1324578603-12762-2-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-12-23 02:30:01 +08:00
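The last-writer-wins effect described in this commit message can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions: `struct task`, `emit_comm`, `synthesize_comm`, `walk_thread_map` and `recorded_comm` are hypothetical stand-ins, not the perf functions, and `/proc` is replaced by a fixed array holding the two tasks from the `ps` output above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for /proc: one entry per task (thread). */
struct task {
	int tid;           /* thread id (LWP) */
	int tgid;          /* thread group id (PID) */
	const char *name;  /* name that /proc/<tid>/status would report */
	char comm[32];     /* name carried by the last COMM event seen */
};

static struct task tasks[] = {
	{ 25118, 25118, "myapp",        "" },
	{ 25119, 25118, "myapp:worker", "" },
};
#define NR_TASKS (sizeof(tasks) / sizeof(tasks[0]))

static struct task *find_task(int tid)
{
	for (size_t i = 0; i < NR_TASKS; i++)
		if (tasks[i].tid == tid)
			return &tasks[i];
	return NULL;
}

/* Model of processing a COMM event: remember the name for that task. */
static void emit_comm(struct task *t, const char *name)
{
	snprintf(t->comm, sizeof(t->comm), "%s", name);
}

/*
 * full == 0: emit one COMM event, for 'tid' only.
 * full != 0: emit a COMM event for every task in the thread group, all
 *            carrying the name read for 'tid' (the behaviour the commit
 *            message identifies as the problem for per-process monitoring).
 */
static int synthesize_comm(int tid, int full)
{
	struct task *self = find_task(tid);

	if (self == NULL)
		return -1;

	if (!full) {
		emit_comm(self, self->name);
		return 0;
	}

	for (size_t i = 0; i < NR_TASKS; i++)
		if (tasks[i].tgid == self->tgid)
			emit_comm(&tasks[i], self->name);
	return 0;
}

/* Walk a thread map one tid at a time, stopping at the first failure. */
static int walk_thread_map(const int *map, int nr, int full)
{
	for (int i = 0; i < nr; i++)
		if (synthesize_comm(map[i], full))
			return -1;
	return 0;
}

/* Name currently recorded for 'tid', or NULL if the tid is unknown. */
static const char *recorded_comm(int tid)
{
	struct task *t = find_task(tid);

	return t ? t->comm : NULL;
}
```

In this model, walking the map `{25118, 25119}` with `full` set to 1 leaves both tasks recorded as `myapp:worker`, while walking it with `full` set to 0 leaves each task with its own name.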
	int err = -1, thread, j;
	comm_event = malloc(sizeof(comm_event->comm) + machine->id_hdr_size);
	if (comm_event == NULL)
		goto out;

	mmap_event = malloc(sizeof(mmap_event->mmap2) + machine->id_hdr_size);
	if (mmap_event == NULL)
		goto out_free_comm;

	fork_event = malloc(sizeof(fork_event->fork) + machine->id_hdr_size);
	if (fork_event == NULL)
		goto out_free_mmap;
	namespaces_event = malloc(sizeof(namespaces_event->namespaces) +
				  (NR_NAMESPACES * sizeof(struct perf_ns_link_info)) +
				  machine->id_hdr_size);
	if (namespaces_event == NULL)
		goto out_free_fork;
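The allocation sequence above jumps to a different label depending on how many buffers were already obtained. The labels are defined later in the function, outside this excerpt. A minimal sketch of that staged-cleanup idiom follows, assuming the usual arrangement in which labels release resources in reverse order of acquisition. All names here (`xalloc`, `xfree`, `with_three_buffers`, `run_with_failure`, `fail_at`) are hypothetical and exist only to make the unwinding observable; they are not part of perf.

```c
#include <stdlib.h>

/* Hypothetical test hook: make the Nth allocation (1-based) fail; 0 = never. */
static int fail_at;
static int alloc_count;
static int live_allocs;	/* allocations not yet freed */

static void *xalloc(size_t size)
{
	void *p;

	if (++alloc_count == fail_at)
		return NULL;	/* simulated allocation failure */
	p = malloc(size);
	if (p)
		live_allocs++;
	return p;
}

static void xfree(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

/* Allocate three buffers; on failure, free only what was already obtained. */
static int with_three_buffers(size_t hdr_size)
{
	char *a, *b, *c;
	int err = -1;

	a = xalloc(16 + hdr_size);
	if (a == NULL)
		goto out;
	b = xalloc(32 + hdr_size);
	if (b == NULL)
		goto out_free_a;
	c = xalloc(64 + hdr_size);
	if (c == NULL)
		goto out_free_b;

	err = 0;	/* all three buffers are valid here */

	/* The success path falls through the same labels as the error paths. */
	xfree(c);
out_free_b:
	xfree(b);
out_free_a:
	xfree(a);
out:
	return err;
}

/* Run once with the nth allocation forced to fail (0 = no failure). */
static int run_with_failure(int nth)
{
	fail_at = nth;
	alloc_count = 0;
	return with_three_buffers(8);
}
```

Whichever allocation fails, `live_allocs` returns to zero, because each label frees exactly the buffers acquired before the jump.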
perf tools: Fix thread_map event synthesizing in top and record Jeff Moyer reported these messages: Warning: ... trying to fall back to cpu-clock-ticks couldn't open /proc/-1/status couldn't open /proc/-1/maps [ls output] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.008 MB perf.data (~363 samples) ] That lead me and David Ahern to see that something was fishy on the thread synthesizing routines, at least for the case where the workload is started from 'perf record', as -1 is the default for target_tid in 'perf record --tid' parameter, so somehow we were trying to synthesize the PERF_RECORD_MMAP and PERF_RECORD_COMM events for the thread -1, a bug. So I investigated this and noticed that when we introduced support for recording a process and its threads using --pid some bugs were introduced and that the way to fix it was to instead of passing the target_tid to the event synthesizing routines we should better pass the thread_map that has the list of threads for a --pid or just the single thread for a --tid. 
Checked in the following ways: On a 8-way machine run cyclictest: [root@emilia ~]# perf record cyclictest -a -t -n -p99 -i100 -d50 policy: fifo: loadavg: 0.00 0.13 0.31 2/139 28798 T: 0 (28791) P:99 I:100 C: 25072 Min: 4 Act: 5 Avg: 6 Max: 122 T: 1 (28792) P:98 I:150 C: 16715 Min: 4 Act: 6 Avg: 5 Max: 27 T: 2 (28793) P:97 I:200 C: 12534 Min: 4 Act: 5 Avg: 4 Max: 8 T: 3 (28794) P:96 I:250 C: 10028 Min: 4 Act: 5 Avg: 5 Max: 96 T: 4 (28795) P:95 I:300 C: 8357 Min: 5 Act: 6 Avg: 5 Max: 12 T: 5 (28796) P:94 I:350 C: 7163 Min: 5 Act: 6 Avg: 5 Max: 12 T: 6 (28797) P:93 I:400 C: 6267 Min: 4 Act: 5 Avg: 5 Max: 9 T: 7 (28798) P:92 I:450 C: 5571 Min: 4 Act: 5 Avg: 5 Max: 9 ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.108 MB perf.data (~4719 samples) ] [root@emilia ~]# This will create one extra thread per CPU: [root@emilia ~]# tuna -t cyclictest -CP thread ctxt_switches pid SCHED_ rtpri affinity voluntary nonvoluntary cmd 28825 OTHER 0 0xff 2169 671 cyclictest 28832 FIFO 93 6 52338 1 cyclictest 28833 FIFO 92 7 46524 1 cyclictest 28826 FIFO 99 0 209360 1 cyclictest 28827 FIFO 98 1 139577 1 cyclictest 28828 FIFO 97 2 104686 0 cyclictest 28829 FIFO 96 3 83751 1 cyclictest 28830 FIFO 95 4 69794 1 cyclictest 28831 FIFO 94 5 59825 1 cyclictest [root@emilia ~]# So we should expect only samples for the above 9 threads when using the --dump-raw-trace|-D perf report switch to look at the column with the tid: [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c 629 28825 110 28826 491 28827 308 28828 198 28829 621 28830 225 28831 203 28832 89 28833 [root@emilia ~]# So for workloads started by 'perf record' seems to work, now for existing workloads, just run cyclictest first, without 'perf record': [root@emilia ~]# tuna -t cyclictest -CP thread ctxt_switches pid SCHED_ rtpri affinity voluntary nonvoluntary cmd 28859 OTHER 0 0xff 594 200 cyclictest 28864 FIFO 95 4 16587 1 cyclictest 28865 FIFO 94 5 14219 
1 cyclictest 28866 FIFO 93 6 12443 0 cyclictest 28867 FIFO 92 7 11062 1 cyclictest 28860 FIFO 99 0 49779 1 cyclictest 28861 FIFO 98 1 33190 1 cyclictest 28862 FIFO 97 2 24895 1 cyclictest 28863 FIFO 96 3 19918 1 cyclictest [root@emilia ~]# and then later did: [root@emilia ~]# perf record --pid 28859 sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.027 MB perf.data (~1195 samples) ] [root@emilia ~]# To collect 3 seconds worth of samples for pid 28859 and its children: [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c 15 28859 33 28860 19 28861 13 28862 13 28863 10 28864 11 28865 9 28866 255 28867 [root@emilia ~]# Works, last thing is to check if looking at just one of those threads also works: [root@emilia ~]# perf record --tid 28866 sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.006 MB perf.data (~242 samples) ] [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c 3 28866 [root@emilia ~]# Works too. Reported-by: Jeff Moyer <jmoyer@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-10 22:52:47 +08:00
	err = 0;
	for (thread = 0; thread < threads->nr; ++thread) {
		if (__event__synthesize_thread(comm_event, mmap_event,
					       fork_event, namespaces_event,
					       thread_map__pid(threads, thread), 0,
					       process, tool, machine,
					       mmap_data)) {
			err = -1;
			break;
		}
		/*
		 * comm.pid is set to thread group id by
		 * perf_event__synthesize_comm
		 */
		if ((int) comm_event->comm.pid != thread_map__pid(threads, thread)) {
			bool need_leader = true;

			/* is thread group leader in thread_map? */
			for (j = 0; j < threads->nr; ++j) {
				if ((int) comm_event->comm.pid == thread_map__pid(threads, j)) {
need_leader = false;
break;
}
}
/* if not, generate events for it */
if (need_leader &&
__event__synthesize_thread(comm_event, mmap_event,
perf record: Synthesize namespace events for current processes Synthesize PERF_RECORD_NAMESPACES events for processes that were running prior to invocation of perf record. The data for this is taken from /proc/$PID/ns. These changes make way for analyzing events with regard to namespaces. Committer notes: Check if 'tool' is NULL in perf_event__synthesize_namespaces(), as in the test__mmap_thread_lookup case, i.e. 'perf test Lookup mmap thread". Testing it: # ps axH > /tmp/allthreads # perf record -a --namespaces usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.169 MB perf.data (8 samples) ] # perf report -D | grep PERF_RECORD_NAMESPACES | wc -l 602 # wc -l /tmp/allthreads 601 /tmp/allthreads # tail /tmp/allthreads 16951 pts/4 T 0:00 git rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 16952 pts/4 T 0:00 /bin/sh /usr/libexec/git-core/git-rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 17176 pts/4 T 0:00 git commit --amend --no-post-rewrite 17204 pts/4 T 0:00 vim /home/acme/git/linux/.git/COMMIT_EDITMSG 18939 ? S 0:00 [kworker/2:1] 18947 ? S 0:00 [kworker/3:0] 18974 ? S 0:00 [kworker/1:0] 19047 ? 
S 0:00 [kworker/0:1] 19152 pts/6 S+ 0:00 weechat 19153 pts/7 R+ 0:00 ps axH # perf report -D | grep PERF_RECORD_NAMESPACES | tail 0 0 0x125068 [0xa0]: PERF_RECORD_NAMESPACES 17176/17176 - nr_namespaces: 7 0 0 0x1255b8 [0xa0]: PERF_RECORD_NAMESPACES 17204/17204 - nr_namespaces: 7 0 0 0x125df0 [0xa0]: PERF_RECORD_NAMESPACES 18939/18939 - nr_namespaces: 7 0 0 0x125f00 [0xa0]: PERF_RECORD_NAMESPACES 18947/18947 - nr_namespaces: 7 0 0 0x126010 [0xa0]: PERF_RECORD_NAMESPACES 18974/18974 - nr_namespaces: 7 0 0 0x126120 [0xa0]: PERF_RECORD_NAMESPACES 19047/19047 - nr_namespaces: 7 0 0 0x126230 [0xa0]: PERF_RECORD_NAMESPACES 19152/19152 - nr_namespaces: 7 0 0 0x129330 [0xa0]: PERF_RECORD_NAMESPACES 19154/19154 - nr_namespaces: 7 0 0 0x12a1f8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 0 0 0x12b0b8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 # Humm, investigate why we got two record for the 19155 pid/tid... Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891931111.25309.11073854609798681633.stgit@hbathini.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-08 04:41:51 +08:00
fork_event, namespaces_event,
comm_event->comm.pid, 0,
process, tool, machine,
mmap_data)) {
err = -1;
break;
}
}
perf tools: Fix thread_map event synthesizing in top and record Jeff Moyer reported these messages: Warning: ... trying to fall back to cpu-clock-ticks couldn't open /proc/-1/status couldn't open /proc/-1/maps [ls output] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.008 MB perf.data (~363 samples) ] That lead me and David Ahern to see that something was fishy on the thread synthesizing routines, at least for the case where the workload is started from 'perf record', as -1 is the default for target_tid in 'perf record --tid' parameter, so somehow we were trying to synthesize the PERF_RECORD_MMAP and PERF_RECORD_COMM events for the thread -1, a bug. So I investigated this and noticed that when we introduced support for recording a process and its threads using --pid some bugs were introduced and that the way to fix it was to instead of passing the target_tid to the event synthesizing routines we should better pass the thread_map that has the list of threads for a --pid or just the single thread for a --tid. 
Checked in the following ways: On a 8-way machine run cyclictest: [root@emilia ~]# perf record cyclictest -a -t -n -p99 -i100 -d50 policy: fifo: loadavg: 0.00 0.13 0.31 2/139 28798 T: 0 (28791) P:99 I:100 C: 25072 Min: 4 Act: 5 Avg: 6 Max: 122 T: 1 (28792) P:98 I:150 C: 16715 Min: 4 Act: 6 Avg: 5 Max: 27 T: 2 (28793) P:97 I:200 C: 12534 Min: 4 Act: 5 Avg: 4 Max: 8 T: 3 (28794) P:96 I:250 C: 10028 Min: 4 Act: 5 Avg: 5 Max: 96 T: 4 (28795) P:95 I:300 C: 8357 Min: 5 Act: 6 Avg: 5 Max: 12 T: 5 (28796) P:94 I:350 C: 7163 Min: 5 Act: 6 Avg: 5 Max: 12 T: 6 (28797) P:93 I:400 C: 6267 Min: 4 Act: 5 Avg: 5 Max: 9 T: 7 (28798) P:92 I:450 C: 5571 Min: 4 Act: 5 Avg: 5 Max: 9 ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.108 MB perf.data (~4719 samples) ] [root@emilia ~]# This will create one extra thread per CPU: [root@emilia ~]# tuna -t cyclictest -CP thread ctxt_switches pid SCHED_ rtpri affinity voluntary nonvoluntary cmd 28825 OTHER 0 0xff 2169 671 cyclictest 28832 FIFO 93 6 52338 1 cyclictest 28833 FIFO 92 7 46524 1 cyclictest 28826 FIFO 99 0 209360 1 cyclictest 28827 FIFO 98 1 139577 1 cyclictest 28828 FIFO 97 2 104686 0 cyclictest 28829 FIFO 96 3 83751 1 cyclictest 28830 FIFO 95 4 69794 1 cyclictest 28831 FIFO 94 5 59825 1 cyclictest [root@emilia ~]# So we should expect only samples for the above 9 threads when using the --dump-raw-trace|-D perf report switch to look at the column with the tid: [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c 629 28825 110 28826 491 28827 308 28828 198 28829 621 28830 225 28831 203 28832 89 28833 [root@emilia ~]# So for workloads started by 'perf record' seems to work, now for existing workloads, just run cyclictest first, without 'perf record': [root@emilia ~]# tuna -t cyclictest -CP thread ctxt_switches pid SCHED_ rtpri affinity voluntary nonvoluntary cmd 28859 OTHER 0 0xff 594 200 cyclictest 28864 FIFO 95 4 16587 1 cyclictest 28865 FIFO 94 5 14219 
1 cyclictest 28866 FIFO 93 6 12443 0 cyclictest 28867 FIFO 92 7 11062 1 cyclictest 28860 FIFO 99 0 49779 1 cyclictest 28861 FIFO 98 1 33190 1 cyclictest 28862 FIFO 97 2 24895 1 cyclictest 28863 FIFO 96 3 19918 1 cyclictest [root@emilia ~]# and then later did: [root@emilia ~]# perf record --pid 28859 sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.027 MB perf.data (~1195 samples) ] [root@emilia ~]# To collect 3 seconds worth of samples for pid 28859 and its children: [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c 15 28859 33 28860 19 28861 13 28862 13 28863 10 28864 11 28865 9 28866 255 28867 [root@emilia ~]# Works, last thing is to check if looking at just one of those threads also works: [root@emilia ~]# perf record --tid 28866 sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.006 MB perf.data (~242 samples) ] [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c 3 28866 [root@emilia ~]# Works too. Reported-by: Jeff Moyer <jmoyer@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-10 22:52:47 +08:00
}
free(namespaces_event);
out_free_fork:
free(fork_event);
out_free_mmap:
perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME: $ perf record -aT $ perf report -D | grep PERF_RECORD_ <SNIP> 3 5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3 3 5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3 3 5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811) 3 5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3 3 5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853 3 5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find 3 5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3 3 5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so 3 5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso] 3 5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1 3 5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so 3 5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so 3 5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1 3 5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3 <SNIP> First column is the cpu and the second the timestamp. That way we can investigate problems in the event stream. If the new perf binary is run on an older kernel, it will disable this feature automatically. 
Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-02 20:25:28 +08:00
free(mmap_event);
out_free_comm:
free(comm_event);
out:
return err;
}
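The leader check in the loop above can be sketched in isolation. This is a minimal illustration, not the perf implementation: `struct tid_list` and `leader_needs_events` are hypothetical names standing in for the thread map and the `need_leader` logic, on the assumption that the map is simply an array of thread ids.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for perf's thread map: a plain array of tids. */
struct tid_list {
	size_t nr;
	const pid_t *tids;
};

/*
 * Return true when the thread group leader (tgid) is not among the
 * monitored tids, i.e. when events for the leader would have to be
 * synthesized separately.
 */
static bool leader_needs_events(const struct tid_list *list, pid_t tgid)
{
	for (size_t i = 0; i < list->nr; ++i) {
		if (list->tids[i] == tgid)
			return false; /* leader is already in the map */
	}
	return true;
}
```

With a list holding only worker tids, the function reports that the leader still needs its own events; once the leader's tid is in the list, it reports that nothing extra is needed.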
perf top: Implement multithreading for perf_event__synthesize_threads The proc files which is sorted with alphabetical order are evenly assigned to several synthesize threads to be processed in parallel. For 'perf top', the threads number hard code to online CPU number. The following patch will introduce an option to set it. For other perf tools, the thread number is 1. Because the process function is not ready for multithreading, e.g. process_synthesized_event. This patch series only support event synthesize multithreading for 'perf top'. For other tools, it can be done separately later. With multithread applied, the total processing time can get up to 1.56x speedup on Knights Mill for 'perf top'. For specific single event processing, the processing time could increase because of the lock contention. So proc_map_timeout may need to be increased. Otherwise some proc maps will be truncated. Based on my test, increasing the proc_map_timeout has small impact on the total processing time. The total processing time still get 1.49x speedup on Knights Mill after increasing the proc_map_timeout. The patch itself doesn't increase the proc_map_timeout. Doesn't need to implement multithreading for per task monitoring, perf_event__synthesize_thread_map. It doesn't have performance issue. 
Committer testing: # getconf _NPROCESSORS_ONLN 4 # perf trace --no-inherit -e clone -o /tmp/output perf top # tail -4 /tmp/bla 0.124 ( 0.041 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3eb3a8f30, parent_tidptr: 0x7fc3eb3a99d0, child_tidptr: 0x7fc3eb3a99d0, tls: 0x7fc3eb3a9700) = 9548 (perf) 0.246 ( 0.023 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3eaba7f30, parent_tidptr: 0x7fc3eaba89d0, child_tidptr: 0x7fc3eaba89d0, tls: 0x7fc3eaba8700) = 9549 (perf) 0.286 ( 0.019 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3ea3a6f30, parent_tidptr: 0x7fc3ea3a79d0, child_tidptr: 0x7fc3ea3a79d0, tls: 0x7fc3ea3a7700) = 9550 (perf) 246.540 ( 0.047 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3ea3a6f30, parent_tidptr: 0x7fc3ea3a79d0, child_tidptr: 0x7fc3ea3a79d0, tls: 0x7fc3ea3a7700) = 9551 (perf) # Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1506696477-146932-4-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-29 22:47:54 +08:00
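The commit message above describes splitting the sorted /proc entries evenly across synthesize threads, each of which receives a `start` index and a `num` count. The sketch below shows one way such a split could be computed. It is an assumption for illustration only: `struct chunk` and `chunk_for_worker` are hypothetical names, and handing the remainder to the first few workers is a choice made here, not necessarily how perf distributes it.

```c
#include <stddef.h>

/* Hypothetical description of one worker's slice of the entry array. */
struct chunk {
	int start; /* index of the first entry for this worker */
	int num;   /* number of entries for this worker */
};

/*
 * Split n entries across nr_workers so that slice sizes differ by at
 * most one. The first (n % nr_workers) workers take one extra entry.
 * Assumes nr_workers > 0 and 0 <= worker < nr_workers.
 */
static struct chunk chunk_for_worker(int n, int nr_workers, int worker)
{
	int base = n / nr_workers;
	int extra = n % nr_workers;
	struct chunk c;

	c.num = base + (worker < extra ? 1 : 0);
	c.start = worker * base + (worker < extra ? worker : extra);
	return c;
}
```

For 10 entries and 4 workers this yields slices of 3, 3, 2 and 2 entries starting at indexes 0, 3, 6 and 8, so every entry is covered exactly once.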
static int __perf_event__synthesize_threads(struct perf_tool *tool,
perf_event__handler_t process,
struct machine *machine,
bool mmap_data,
struct dirent **dirent,
int start,
int num)
{
union perf_event *comm_event, *mmap_event, *fork_event;
union perf_event *namespaces_event;
int err = -1;
char *end;
pid_t pid;
int i;
comm_event = malloc(sizeof(comm_event->comm) + machine->id_hdr_size);
if (comm_event == NULL)
goto out;
mmap_event = malloc(sizeof(mmap_event->mmap2) + machine->id_hdr_size);
if (mmap_event == NULL)
goto out_free_comm;
fork_event = malloc(sizeof(fork_event->fork) + machine->id_hdr_size);
if (fork_event == NULL)
goto out_free_mmap;
perf record: Synthesize namespace events for current processes Synthesize PERF_RECORD_NAMESPACES events for processes that were running prior to invocation of perf record. The data for this is taken from /proc/$PID/ns. These changes make way for analyzing events with regard to namespaces. Committer notes: Check if 'tool' is NULL in perf_event__synthesize_namespaces(), as in the test__mmap_thread_lookup case, i.e. 'perf test Lookup mmap thread". Testing it: # ps axH > /tmp/allthreads # perf record -a --namespaces usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.169 MB perf.data (8 samples) ] # perf report -D | grep PERF_RECORD_NAMESPACES | wc -l 602 # wc -l /tmp/allthreads 601 /tmp/allthreads # tail /tmp/allthreads 16951 pts/4 T 0:00 git rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 16952 pts/4 T 0:00 /bin/sh /usr/libexec/git-core/git-rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 17176 pts/4 T 0:00 git commit --amend --no-post-rewrite 17204 pts/4 T 0:00 vim /home/acme/git/linux/.git/COMMIT_EDITMSG 18939 ? S 0:00 [kworker/2:1] 18947 ? S 0:00 [kworker/3:0] 18974 ? S 0:00 [kworker/1:0] 19047 ? 
S 0:00 [kworker/0:1] 19152 pts/6 S+ 0:00 weechat 19153 pts/7 R+ 0:00 ps axH # perf report -D | grep PERF_RECORD_NAMESPACES | tail 0 0 0x125068 [0xa0]: PERF_RECORD_NAMESPACES 17176/17176 - nr_namespaces: 7 0 0 0x1255b8 [0xa0]: PERF_RECORD_NAMESPACES 17204/17204 - nr_namespaces: 7 0 0 0x125df0 [0xa0]: PERF_RECORD_NAMESPACES 18939/18939 - nr_namespaces: 7 0 0 0x125f00 [0xa0]: PERF_RECORD_NAMESPACES 18947/18947 - nr_namespaces: 7 0 0 0x126010 [0xa0]: PERF_RECORD_NAMESPACES 18974/18974 - nr_namespaces: 7 0 0 0x126120 [0xa0]: PERF_RECORD_NAMESPACES 19047/19047 - nr_namespaces: 7 0 0 0x126230 [0xa0]: PERF_RECORD_NAMESPACES 19152/19152 - nr_namespaces: 7 0 0 0x129330 [0xa0]: PERF_RECORD_NAMESPACES 19154/19154 - nr_namespaces: 7 0 0 0x12a1f8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 0 0 0x12b0b8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 # Humm, investigate why we got two record for the 19155 pid/tid... Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891931111.25309.11073854609798681633.stgit@hbathini.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-08 04:41:51 +08:00
namespaces_event = malloc(sizeof(namespaces_event->namespaces) +
(NR_NAMESPACES * sizeof(struct perf_ns_link_info)) +
machine->id_hdr_size);
if (namespaces_event == NULL)
goto out_free_fork;
perf top: Implement multithreading for perf_event__synthesize_threads The proc files which is sorted with alphabetical order are evenly assigned to several synthesize threads to be processed in parallel. For 'perf top', the threads number hard code to online CPU number. The following patch will introduce an option to set it. For other perf tools, the thread number is 1. Because the process function is not ready for multithreading, e.g. process_synthesized_event. This patch series only support event synthesize multithreading for 'perf top'. For other tools, it can be done separately later. With multithread applied, the total processing time can get up to 1.56x speedup on Knights Mill for 'perf top'. For specific single event processing, the processing time could increase because of the lock contention. So proc_map_timeout may need to be increased. Otherwise some proc maps will be truncated. Based on my test, increasing the proc_map_timeout has small impact on the total processing time. The total processing time still get 1.49x speedup on Knights Mill after increasing the proc_map_timeout. The patch itself doesn't increase the proc_map_timeout. Doesn't need to implement multithreading for per task monitoring, perf_event__synthesize_thread_map. It doesn't have performance issue. 
Committer testing: # getconf _NPROCESSORS_ONLN 4 # perf trace --no-inherit -e clone -o /tmp/output perf top # tail -4 /tmp/bla 0.124 ( 0.041 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3eb3a8f30, parent_tidptr: 0x7fc3eb3a99d0, child_tidptr: 0x7fc3eb3a99d0, tls: 0x7fc3eb3a9700) = 9548 (perf) 0.246 ( 0.023 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3eaba7f30, parent_tidptr: 0x7fc3eaba89d0, child_tidptr: 0x7fc3eaba89d0, tls: 0x7fc3eaba8700) = 9549 (perf) 0.286 ( 0.019 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3ea3a6f30, parent_tidptr: 0x7fc3ea3a79d0, child_tidptr: 0x7fc3ea3a79d0, tls: 0x7fc3ea3a7700) = 9550 (perf) 246.540 ( 0.047 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3ea3a6f30, parent_tidptr: 0x7fc3ea3a79d0, child_tidptr: 0x7fc3ea3a79d0, tls: 0x7fc3ea3a7700) = 9551 (perf) # Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1506696477-146932-4-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-29 22:47:54 +08:00
for (i = start; i < start + num; i++) {
if (!isdigit(dirent[i]->d_name[0]))
continue;
pid = (pid_t)strtol(dirent[i]->d_name, &end, 10);
/* only interested in proper numerical dirents */
if (*end)
continue;
/*
* We may race with exiting thread, so don't stop just because
* one thread couldn't be synthesized.
*/
__event__synthesize_thread(comm_event, mmap_event, fork_event,
namespaces_event, pid, 1, process,
tool, machine, mmap_data);
}
err = 0;
free(namespaces_event);
out_free_fork:
free(fork_event);
out_free_mmap:
free(mmap_event);
out_free_comm:
free(comm_event);
out:
return err;
}
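The loop above keeps only `/proc` entries whose names are purely numeric, using an `isdigit()` pre-check followed by `strtol()` and a test on the end pointer. A minimal standalone sketch of that filter follows; `parse_proc_pid` is a hypothetical helper name used for illustration and is not part of perf.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not a perf function): returns true and stores the
 * parsed pid only when 'name' consists entirely of decimal digits, which
 * is how per-process directories under /proc are named.
 */
static bool parse_proc_pid(const char *name, pid_t *pid)
{
	char *end;
	long val;

	assert(name != NULL && pid != NULL);

	/* Cheap pre-filter: rejects entries such as "self" or "cpuinfo". */
	if (!isdigit((unsigned char)name[0]))
		return false;

	val = strtol(name, &end, 10);

	/* Any trailing character means the name was not purely numeric. */
	if (*end != '\0')
		return false;

	*pid = (pid_t)val;
	return true;
}
```

The end-pointer check is what rejects a name like `12abc`, which the leading-digit test alone would let through.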
struct synthesize_threads_arg {
struct perf_tool *tool;
perf_event__handler_t process;
struct machine *machine;
bool mmap_data;
struct dirent **dirent;
int num;
int start;
};
static void *synthesize_threads_worker(void *arg)
{
struct synthesize_threads_arg *args = arg;
__perf_event__synthesize_threads(args->tool, args->process,
args->machine, args->mmap_data,
args->dirent,
args->start, args->num);
return NULL;
}
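The worker above receives its whole job description through the single `void *` argument that `pthread_create()` passes along. A minimal sketch of that pattern, assuming a simplified argument struct: `worker_arg`, `worker` and `run_workers` are illustrative names, not perf code. It also tracks how many threads were actually started so that only those are joined.

```c
#include <pthread.h>
#include <stdlib.h>

/* Illustrative per-thread argument; the real struct carries perf state. */
struct worker_arg {
	int id;
	int done;
};

static void *worker(void *arg)
{
	struct worker_arg *a = arg;

	a->done = 1;	/* stand-in for the real per-slice work */
	return NULL;
}

/* Returns 0 if every worker was started, -1 otherwise. */
static int run_workers(struct worker_arg *args, int nr)
{
	pthread_t *threads;
	int created, i, err = 0;

	threads = calloc(nr, sizeof(*threads));
	if (threads == NULL)
		return -1;

	for (created = 0; created < nr; created++) {
		args[created].id = created;
		if (pthread_create(&threads[created], NULL, worker,
				   &args[created])) {
			err = -1;
			break;
		}
	}

	/* Join only the threads that pthread_create() actually started. */
	for (i = 0; i < created; i++)
		pthread_join(threads[i], NULL);

	free(threads);
	return err;
}
```

Each thread gets its own element of the `args` array, so the workers never share a mutable argument.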
int perf_event__synthesize_threads(struct perf_tool *tool,
perf_event__handler_t process,
struct machine *machine,
bool mmap_data,
unsigned int nr_threads_synthesize)
{
struct synthesize_threads_arg *args = NULL;
pthread_t *synthesize_threads = NULL;
char proc_path[PATH_MAX];
struct dirent **dirent;
int num_per_thread;
int m, n, i, j;
int thread_nr;
int base = 0;
int err = -1;
if (machine__is_default_guest(machine))
return 0;
snprintf(proc_path, sizeof(proc_path), "%s/proc", machine->root_dir);
n = scandir(proc_path, &dirent, 0, alphasort);
if (n < 0)
return err;
perf top: Add option to set the number of thread for event synthesize Using UINT_MAX to indicate the default thread#, which is the max number of online CPU. Committer testing: # perf trace --no-inherit -e clone -o /tmp/output perf top --num-thread-synthesize 9 # cat /tmp/output ? ( ? ): ... [continued]: clone()) = 26651 (perf) 0.059 ( 0.010 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f5bfac44f30, parent_tidptr: 0x7f5bfac459d0, child_tidptr: 0x7f5bfac459d0, tls: 0x7f5bfac45700) = 26652 (perf) 0.116 ( 0.014 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f5bfa443f30, parent_tidptr: 0x7f5bfa4449d0, child_tidptr: 0x7f5bfa4449d0, tls: 0x7f5bfa444700) = 26653 (perf) 0.141 ( 0.009 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f5bf9c42f30, parent_tidptr: 0x7f5bf9c439d0, child_tidptr: 0x7f5bf9c439d0, tls: 0x7f5bf9c43700) = 26654 (perf) 0.160 ( 0.012 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f5bf9441f30, parent_tidptr: 0x7f5bf94429d0, child_tidptr: 0x7f5bf94429d0, tls: 0x7f5bf9442700) = 26655 (perf) 0.232 ( 0.013 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f5bf8c40f30, parent_tidptr: 0x7f5bf8c419d0, child_tidptr: 0x7f5bf8c419d0, tls: 0x7f5bf8c41700) = 26656 (perf) 0.393 ( 0.011 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f5be3ffef30, parent_tidptr: 0x7f5be3fff9d0, child_tidptr: 0x7f5be3fff9d0, tls: 0x7f5be3fff700) = 26657 (perf) 0.802 ( 0.012 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f5be37fdf30, parent_tidptr: 0x7f5be37fe9d0, child_tidptr: 0x7f5be37fe9d0, tls: 0x7f5be37fe700) = 26658 (perf) 1.411 ( 0.022 ms): clone(flags: 
VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f5be2ffcf30, parent_tidptr: 0x7f5be2ffd9d0, child_tidptr: 0x7f5be2ffd9d0, tls: 0x7f5be2ffd700) = 26659 (perf) 246.422 ( 0.042 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f5be2ffcf30, parent_tidptr: 0x7f5be2ffd9d0, child_tidptr: 0x7f5be2ffd9d0, tls: 0x7f5be2ffd700) = 26660 (perf) # Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1506696477-146932-5-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-29 22:47:55 +08:00
if (nr_threads_synthesize == UINT_MAX)
thread_nr = sysconf(_SC_NPROCESSORS_ONLN);
else
thread_nr = nr_threads_synthesize;
if (thread_nr <= 1) {
err = __perf_event__synthesize_threads(tool, process,
machine, mmap_data,
dirent, base, n);
goto free_dirent;
}
if (thread_nr > n)
thread_nr = n;
synthesize_threads = calloc(thread_nr, sizeof(pthread_t));
if (synthesize_threads == NULL)
goto free_dirent;
args = calloc(thread_nr, sizeof(*args));
if (args == NULL)
goto free_threads;
num_per_thread = n / thread_nr;
m = n % thread_nr;
for (i = 0; i < thread_nr; i++) {
args[i].tool = tool;
args[i].process = process;
args[i].machine = machine;
args[i].mmap_data = mmap_data;
args[i].dirent = dirent;
}
for (i = 0; i < m; i++) {
args[i].num = num_per_thread + 1;
args[i].start = i * args[i].num;
}
if (i != 0)
base = args[i-1].start + args[i-1].num;
for (j = i; j < thread_nr; j++) {
args[j].num = num_per_thread;
args[j].start = base + (j - i) * args[i].num;
}
for (i = 0; i < thread_nr; i++) {
if (pthread_create(&synthesize_threads[i], NULL,
synthesize_threads_worker, &args[i]))
goto out_join;
}
err = 0;
out_join:
for (i = 0; i < thread_nr; i++)
pthread_join(synthesize_threads[i], NULL);
free(args);
free_threads:
free(synthesize_threads);
free_dirent:
for (i = 0; i < n; i++)
free(dirent[i]);
free(dirent);
return err;
}
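The work split in the function above can be illustrated in isolation. The following is a minimal sketch, not perf code: `struct slice` and `split_entries()` are hypothetical names introduced only for illustration, and the sketch assumes `0 < nr <= n`, which the function above arranges by clamping `thread_nr` to `n`.

```c
/* Hypothetical helper type, not part of perf: one worker's share. */
struct slice {
	int start;	/* index of the first entry this worker handles */
	int num;	/* number of entries this worker handles */
};

/*
 * Split n entries across nr workers so that the first (n % nr)
 * workers take one extra entry and the slices stay contiguous.
 * Assumes 0 < nr <= n.
 */
static void split_entries(int n, int nr, struct slice *out)
{
	int per = n / nr;	/* corresponds to num_per_thread above */
	int rem = n % nr;	/* corresponds to m above */
	int next = 0;
	int i;

	for (i = 0; i < nr; i++) {
		out[i].start = next;
		out[i].num = per + (i < rem ? 1 : 0);
		next += out[i].num;
	}
}
```

With 10 entries and 3 workers this should yield slices of 4, 3 and 3 entries starting at indexes 0, 4 and 7, which is what the two loops above compute with `base` and `m`.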
perf tools: Handle relocatable kernels DSOs don't have this problem because the kernel emits a PERF_MMAP for each new executable mapping it performs on monitored threads. To fix the kernel case we simulate the same behaviour, by having 'perf record' to synthesize a PERF_MMAP for the kernel, encoded like this: [root@doppio ~]# perf record -a -f sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.344 MB perf.data (~15038 samples) ] [root@doppio ~]# perf report -D | head -10 0xd0 [0x40]: event: 1 . . ... raw event: size 64 bytes . 0000: 01 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 ......@........ . 0010: 00 00 00 81 ff ff ff ff 00 00 00 00 00 00 00 00 ............... . 0020: 00 00 00 00 00 00 00 00 5b 6b 65 72 6e 65 6c 2e ........ [kernel . 0030: 6b 61 6c 6c 73 79 6d 73 2e 5f 74 65 78 74 5d 00 kallsyms._text] . 0xd0 [0x40]: PERF_RECORD_MMAP 0/0: [0xffffffff81000000((nil)) @ (nil)]: [kernel.kallsyms._text] I.e. we identify such event as having: .pid = 0 .filename = [kernel.kallsyms.REFNAME] .start = REFNAME addr in /proc/kallsyms at 'perf record' time and use now a hardcoded value of '.text' for REFNAME. Then, later, in 'perf report', if there are any kernel hits and thus we need to resolve kernel symbols, we search for REFNAME and if its address changed, relocation happened and we thus must change the kernel mapping routines to one that uses .pgoff as the relocation to apply. This way we use the same mechanism used for the other DSOs and don't have to do a two pass in all the kernel symbols. Reported-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: "H. 
Peter Anvin" <hpa@zytor.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> LKML-Reference: <1262717431-1246-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-06 02:50:31 +08:00
struct process_symbol_args {
	const char *name;
	u64	   start;
};

static int find_symbol_cb(void *arg, const char *name, char type,
			  u64 start)
{
	struct process_symbol_args *args = arg;

	/*
	 * Must be a function or at least an alias, as in PARISC64, where "_text" is
	 * an 'A' to the same address as "_stext".
	 */
	if (!(kallsyms__is_function(type) ||
	      type == 'A') || strcmp(name, args->name))
		return 0;

	args->start = start;
	return 1;
}
perf symbols: Accept symbols starting at address 0 That is the case of _text on s390, and we have some functions that return an address, using address zero to report problems, oops. This would lead the symbol loading routines to not use "_text" as the reference relocation symbol, or the first symbol for the kernel, but use instead "_stext", that is at the same address on x86_64 and others, but not on s390: [acme@localhost perf-4.11.0-rc6]$ head -15 /proc/kallsyms 0000000000000000 T _text 0000000000000418 t iplstart 0000000000000800 T start 000000000000080a t .base 000000000000082e t .sk8x8 0000000000000834 t .gotr 0000000000000842 t .cmd 0000000000000846 t .parm 000000000000084a t .lowcase 0000000000010000 T startup 0000000000010010 T startup_kdump 0000000000010214 t startup_kdump_relocated 0000000000011000 T startup_continue 00000000000112a0 T _ehead 0000000000100000 T _stext [acme@localhost perf-4.11.0-rc6]$ Which in turn would make 'perf test vmlinux' to fail because it wouldn't find the symbols before "_stext" in kallsyms. Fix it by using the return value only for errors and storing the address, when the symbol is successfully found, in a provided pointer arg. 
Before this patch: After: [acme@localhost perf-4.11.0-rc6]$ tools/perf/perf test -v 1 1: vmlinux symtab matches kallsyms : --- start --- test child forked, pid 40693 Looking at the vmlinux_path (8 entries long) Using /usr/lib/debug/lib/modules/3.10.0-654.el7.s390x/vmlinux for symbols ERR : 0: _text not on kallsyms ERR : 0x418: iplstart not on kallsyms ERR : 0x800: start not on kallsyms ERR : 0x80a: .base not on kallsyms ERR : 0x82e: .sk8x8 not on kallsyms ERR : 0x834: .gotr not on kallsyms ERR : 0x842: .cmd not on kallsyms ERR : 0x846: .parm not on kallsyms ERR : 0x84a: .lowcase not on kallsyms ERR : 0x10000: startup not on kallsyms ERR : 0x10010: startup_kdump not on kallsyms ERR : 0x10214: startup_kdump_relocated not on kallsyms ERR : 0x11000: startup_continue not on kallsyms ERR : 0x112a0: _ehead not on kallsyms <SNIP warnings> test child finished with -1 ---- end ---- vmlinux symtab matches kallsyms: FAILED! [acme@localhost perf-4.11.0-rc6]$ After: [acme@localhost perf-4.11.0-rc6]$ tools/perf/perf test -v 1 1: vmlinux symtab matches kallsyms : --- start --- test child forked, pid 47160 <SNIP warnings> test child finished with 0 ---- end ---- vmlinux symtab matches kallsyms: Ok [acme@localhost perf-4.11.0-rc6]$ Reported-by: Michael Petlan <mpetlan@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-9x9bwgd3btwdk1u51xie93fz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-28 08:21:09 +08:00
int kallsyms__get_function_start(const char *kallsyms_filename,
				 const char *symbol_name, u64 *addr)
{
	struct process_symbol_args args = { .name = symbol_name, };

	if (kallsyms__parse(kallsyms_filename, &args, find_symbol_cb) <= 0)
		return -1;

	*addr = args.start;
	return 0;
}

int __weak perf_event__synthesize_extra_kmaps(struct perf_tool *tool __maybe_unused,
					      perf_event__handler_t process __maybe_unused,
					      struct machine *machine __maybe_unused)
{
	return 0;
}

static int __perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
						perf_event__handler_t process,
						struct machine *machine)
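The convention used by `kallsyms__get_function_start()` (return value only for errors, address through a pointer, so that address 0 stays valid) can be sketched without the perf helpers. This is an illustration under stated assumptions, not the perf implementation: `find_symbol_start()` is a hypothetical name, it scans an in-memory buffer of "address type name" lines instead of calling `kallsyms__parse()`, and it assumes `kallsyms__is_function()` accepts the 'T' and 'W' symbol types in either case.

```c
#include <ctype.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the lookup above. Returns 0 and stores the
 * symbol address in *addr when found, -1 otherwise, so a symbol located
 * at address 0 (such as _text on s390) is not mistaken for a failure.
 */
static int find_symbol_start(const char *text, const char *symbol_name,
			     uint64_t *addr)
{
	const char *p = text;

	while (*p != '\0') {
		const char *eol = strchr(p, '\n');
		size_t len = eol ? (size_t)(eol - p) : strlen(p);
		char line[512];
		char name[256];
		uint64_t start;
		char type;

		/* Copy one line so sscanf() cannot read past its end. */
		if (len < sizeof(line)) {
			memcpy(line, p, len);
			line[len] = '\0';
			if (sscanf(line, "%" SCNx64 " %c %255s",
				   &start, &type, name) == 3) {
				int t = toupper((unsigned char)type);

				/* Assumed filter: text symbols or 'A' aliases. */
				if ((t == 'T' || t == 'W' || type == 'A') &&
				    strcmp(name, symbol_name) == 0) {
					*addr = start;
					return 0;
				}
			}
		}
		if (eol == NULL)
			break;
		p = eol + 1;
	}
	return -1;
}
```

A caller checks the return value first and reads `*addr` only on success, mirroring how the function above leaves `*addr` untouched when the symbol is missing.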
{
	size_t size;
	struct map *map = machine__kernel_map(machine);
	struct kmap *kmap;
perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME: $ perf record -aT $ perf report -D | grep PERF_RECORD_ <SNIP> 3 5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3 3 5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3 3 5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811) 3 5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3 3 5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853 3 5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find 3 5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3 3 5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so 3 5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso] 3 5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1 3 5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so 3 5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so 3 5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1 3 5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3 <SNIP> First column is the cpu and the second the timestamp. That way we can investigate problems in the event stream. If the new perf binary is run on an older kernel, it will disable this feature automatically. 
Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-02 20:25:28 +08:00
	int err;
	union perf_event *event;

	if (symbol_conf.kptr_restrict)
		return -1;
	if (map == NULL)
		return -1;

	/*
	 * We should get this from /sys/kernel/sections/.text, but till that is
	 * available use this, and after it is use this as a fallback for older
	 * kernels.
	 */
	event = zalloc((sizeof(event->mmap) + machine->id_hdr_size));
	if (event == NULL) {
		pr_debug("Not enough memory synthesizing mmap event "
			 "for kernel modules\n");
		return -1;
	}

	if (machine__is_host(machine)) {
		/*
		 * kernel uses PERF_RECORD_MISC_USER for user space maps,
		 * see kernel/perf_event.c __perf_event_mmap
		 */
		event->header.misc = PERF_RECORD_MISC_KERNEL;
	} else {
event->header.misc = PERF_RECORD_MISC_GUEST_KERNEL;
}
perf tools: Handle relocatable kernels DSOs don't have this problem because the kernel emits a PERF_MMAP for each new executable mapping it performs on monitored threads. To fix the kernel case we simulate the same behaviour, by having 'perf record' to synthesize a PERF_MMAP for the kernel, encoded like this: [root@doppio ~]# perf record -a -f sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.344 MB perf.data (~15038 samples) ] [root@doppio ~]# perf report -D | head -10 0xd0 [0x40]: event: 1 . . ... raw event: size 64 bytes . 0000: 01 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 ......@........ . 0010: 00 00 00 81 ff ff ff ff 00 00 00 00 00 00 00 00 ............... . 0020: 00 00 00 00 00 00 00 00 5b 6b 65 72 6e 65 6c 2e ........ [kernel . 0030: 6b 61 6c 6c 73 79 6d 73 2e 5f 74 65 78 74 5d 00 kallsyms._text] . 0xd0 [0x40]: PERF_RECORD_MMAP 0/0: [0xffffffff81000000((nil)) @ (nil)]: [kernel.kallsyms._text] I.e. we identify such event as having: .pid = 0 .filename = [kernel.kallsyms.REFNAME] .start = REFNAME addr in /proc/kallsyms at 'perf record' time and use now a hardcoded value of '.text' for REFNAME. Then, later, in 'perf report', if there are any kernel hits and thus we need to resolve kernel symbols, we search for REFNAME and if its address changed, relocation happened and we thus must change the kernel mapping routines to one that uses .pgoff as the relocation to apply. This way we use the same mechanism used for the other DSOs and don't have to do a two pass in all the kernel symbols. Reported-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: "H. 
Peter Anvin" <hpa@zytor.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> LKML-Reference: <1262717431-1246-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-06 02:50:31 +08:00
kmap = map__kmap(map);
size = snprintf(event->mmap.filename, sizeof(event->mmap.filename),
"%s%s", machine->mmap_name, kmap->ref_reloc_sym->name) + 1;
perf tools: fix ALIGN redefinition in system headers On some systems (e.g. Android), ALIGN is defined in system headers as ALIGN(p). The definition of ALIGN used in perf takes 2 parameters: ALIGN(x,a). This leads to redefinition conflicts. Redefinition error on Android: In file included from util/include/linux/list.h:1:0, from util/callchain.h:5, from util/hist.h:6, from util/session.h:4, from util/build-id.h:4, from util/annotate.c:11: util/include/linux/kernel.h:11:0: error: "ALIGN" redefined [-Werror] bionic/libc/include/sys/param.h:38:0: note: this is the location of the previous definition Conflics with system defined ALIGN in Android: util/event.c: In function 'perf_event__synthesize_comm': util/event.c:115:32: error: macro "ALIGN" passed 2 arguments, but takes just 1 util/event.c:115:9: error: 'ALIGN' undeclared (first use in this function) util/event.c:115:9: note: each undeclared identifier is reported only once for each function it appears in In order to avoid this redefinition, ALIGN is renamed to PERF_ALIGN. Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Irina Tirdea <irina.tirdea@intel.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1347315303-29906-5-git-send-email-irina.tirdea@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11 06:15:01 +08:00
size = PERF_ALIGN(size, sizeof(u64));
event->mmap.header.type = PERF_RECORD_MMAP;
event->mmap.header.size = (sizeof(event->mmap) -
(sizeof(event->mmap.filename) - size) + machine->id_hdr_size);
event->mmap.pgoff = kmap->ref_reloc_sym->addr;
event->mmap.start = map->start;
event->mmap.len = map->end - event->mmap.start;
event->mmap.pid = machine->pid;
err = perf_tool__process_synth_event(tool, event, machine, process);
free(event);
return err;
}
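The size arithmetic used for the synthesized kernel mmap record can be sketched in isolation. This is a minimal illustration, not the perf implementation: `struct demo_mmap_event`, `demo_align()` and `demo_mmap_record_size()` are hypothetical names, and the struct layout is an assumption chosen only to have a fixed-size `filename` tail. The idea shown is the one in the code above: write the name with `snprintf()`, round the used length (including the NUL) up to a multiple of 8, then subtract the unused part of the `filename` array from the full struct size and add the trailing ID header size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_PATH_MAX 4096

/* Hypothetical stand-in for an mmap record; not the real perf layout. */
struct demo_mmap_event {
	uint64_t header;
	uint32_t pid, tid;
	uint64_t start, len, pgoff;
	char filename[DEMO_PATH_MAX];
};

/* Round x up to a multiple of a; a must be a power of two. */
static size_t demo_align(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Store "<prefix><sym>" in ev->filename and return the number of bytes
 * the record occupies once the unused tail of filename is trimmed off.
 */
static size_t demo_mmap_record_size(struct demo_mmap_event *ev,
				    const char *prefix, const char *sym,
				    size_t id_hdr_size)
{
	int n = snprintf(ev->filename, sizeof(ev->filename), "%s%s", prefix, sym);
	size_t used;

	/* The sketch simply refuses names that do not fit. */
	assert(n >= 0 && (size_t)n < sizeof(ev->filename));

	/* +1 for the terminating NUL, then pad to a u64 boundary. */
	used = demo_align((size_t)n + 1, sizeof(uint64_t));

	return sizeof(*ev) - (sizeof(ev->filename) - used) + id_hdr_size;
}
```

Padding the name to a multiple of 8 bytes keeps whatever follows the record 64-bit aligned, which is presumably why the original code applies `PERF_ALIGN(size, sizeof(u64))` before computing `header.size`.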
int perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
perf_event__handler_t process,
struct machine *machine)
{
int err;
err = __perf_event__synthesize_kernel_mmap(tool, process, machine);
if (err < 0)
return err;
return perf_event__synthesize_extra_kmaps(tool, process, machine);
}
int perf_event__synthesize_thread_map2(struct perf_tool *tool,
struct thread_map *threads,
perf_event__handler_t process,
struct machine *machine)
{
union perf_event *event;
int i, err, size;
size = sizeof(event->thread_map);
size += threads->nr * sizeof(event->thread_map.entries[0]);
event = zalloc(size);
if (!event)
return -ENOMEM;
event->header.type = PERF_RECORD_THREAD_MAP;
event->header.size = size;
event->thread_map.nr = threads->nr;
for (i = 0; i < threads->nr; i++) {
struct thread_map_event_entry *entry = &event->thread_map.entries[i];
char *comm = thread_map__comm(threads, i);
if (!comm)
comm = (char *) "";
entry->pid = thread_map__pid(threads, i);
strncpy((char *) &entry->comm, comm, sizeof(entry->comm));
}
err = process(tool, event, NULL, machine);
free(event);
return err;
}
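The allocation pattern in `perf_event__synthesize_thread_map2()` (a fixed header followed by `nr` fixed-size entries, each holding a bounded copy of the command name) can be sketched on its own. The types and the function `demo_thread_map_new()` below are hypothetical, and the 16-byte `comm` field is an assumption. Unlike the code above, this sketch copies at most `sizeof(comm) - 1` bytes so that the zeroed buffer always leaves a terminating NUL; that is a choice made for the example, not a statement about what the perf record format requires.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry and container types, for illustration only. */
struct demo_thread_entry {
	uint64_t pid;
	char comm[16];
};

struct demo_thread_map {
	uint64_t nr;
	struct demo_thread_entry entries[];	/* flexible array member */
};

/*
 * Allocate one zeroed block holding the header plus nr entries and fill
 * it in. A NULL command name is stored as the empty string.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct demo_thread_map *demo_thread_map_new(const int *pids,
						   const char *const *comms,
						   size_t nr)
{
	struct demo_thread_map *m;
	size_t size = sizeof(*m) + nr * sizeof(m->entries[0]);

	assert(nr == 0 || (pids != NULL && comms != NULL));

	m = calloc(1, size);
	if (!m)
		return NULL;

	m->nr = nr;
	for (size_t i = 0; i < nr; i++) {
		const char *comm = comms[i] ? comms[i] : "";

		m->entries[i].pid = (uint64_t)pids[i];
		/* Last byte stays NUL because the block was zeroed. */
		strncpy(m->entries[i].comm, comm, sizeof(m->entries[i].comm) - 1);
	}
	return m;
}
```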
static void synthesize_cpus(struct cpu_map_entries *cpus,
struct cpu_map *map)
{
int i;
cpus->nr = map->nr;
for (i = 0; i < map->nr; i++)
cpus->cpu[i] = map->map[i];
}
static void synthesize_mask(struct cpu_map_mask *mask,
struct cpu_map *map, int max)
{
int i;
mask->nr = BITS_TO_LONGS(max);
mask->long_size = sizeof(long);
for (i = 0; i < map->nr; i++)
set_bit(map->map[i], mask->mask);
}
static size_t cpus_size(struct cpu_map *map)
{
return sizeof(struct cpu_map_entries) + map->nr * sizeof(u16);
}
static size_t mask_size(struct cpu_map *map, int *max)
{
int i;
*max = 0;
for (i = 0; i < map->nr; i++) {
/* bit position of the cpu is + 1 */
int bit = map->map[i] + 1;
if (bit > *max)
*max = bit;
}
return sizeof(struct cpu_map_mask) + BITS_TO_LONGS(*max) * sizeof(long);
}
void *cpu_map_data__alloc(struct cpu_map *map, size_t *size, u16 *type, int *max)
{
size_t size_cpus, size_mask;
bool is_dummy = cpu_map__empty(map);
/*
* Both array and mask data have variable size based
* on the number of cpus and their actual values.
* The size of the 'struct cpu_map_data' is:
*
* array = size of 'struct cpu_map_entries' +
* number of cpus * sizeof(u16)
*
* mask = size of 'struct cpu_map_mask' +
* maximum cpu bit converted to size of longs
*
* and finally + the size of 'struct cpu_map_data'.
*/
size_cpus = cpus_size(map);
size_mask = mask_size(map, max);
if (is_dummy || (size_cpus < size_mask)) {
*size += size_cpus;
*type = PERF_CPU_MAP__CPUS;
} else {
*size += size_mask;
*type = PERF_CPU_MAP__MASK;
}
*size += sizeof(struct cpu_map_data);
*size = PERF_ALIGN(*size, sizeof(u64));
return zalloc(*size);
}
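The choice made in `cpu_map_data__alloc()` between the two encodings comes down to comparing two byte counts. The sketch below is a simplified illustration under stated assumptions: all `demo_*` names are hypothetical, and it compares payload sizes only, ignoring the small fixed headers (`struct cpu_map_entries`, `struct cpu_map_mask`) and the empty-map special case that the real code also accounts for. A list costs one `u16` per CPU; a mask costs enough `long`s to cover the highest CPU number plus one.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BITS_PER_LONG (sizeof(long) * CHAR_BIT)

enum demo_cpu_map_type { DEMO_CPU_MAP_CPUS, DEMO_CPU_MAP_MASK };

/* Number of longs needed to hold nbits bits. */
static size_t demo_bits_to_longs(size_t nbits)
{
	return (nbits + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG;
}

/* Payload bytes for the list encoding: one u16 per CPU. */
static size_t demo_cpus_bytes(size_t nr)
{
	return nr * sizeof(uint16_t);
}

/* Payload bytes for the mask encoding: bits 0..max_cpu must fit. */
static size_t demo_mask_bytes(const int *cpus, size_t nr)
{
	int max = 0;

	for (size_t i = 0; i < nr; i++) {
		int bit = cpus[i] + 1;	/* bit position of the cpu is + 1 */

		if (bit > max)
			max = bit;
	}
	return demo_bits_to_longs((size_t)max) * sizeof(long);
}

/* Pick the list only when it is strictly smaller than the mask. */
static enum demo_cpu_map_type demo_pick_encoding(const int *cpus, size_t nr)
{
	assert(nr == 0 || cpus != NULL);
	return demo_cpus_bytes(nr) < demo_mask_bytes(cpus, nr) ?
		DEMO_CPU_MAP_CPUS : DEMO_CPU_MAP_MASK;
}
```

With these rules a dense set such as CPUs 0 to 7 is cheaper as a mask, while a single high-numbered CPU is cheaper as a list.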
void cpu_map_data__synthesize(struct cpu_map_data *data, struct cpu_map *map,
u16 type, int max)
{
data->type = type;
switch (type) {
case PERF_CPU_MAP__CPUS:
synthesize_cpus((struct cpu_map_entries *) data->data, map);
break;
case PERF_CPU_MAP__MASK:
synthesize_mask((struct cpu_map_mask *) data->data, map, max);
break;
default:
break;
}
}
static struct cpu_map_event *cpu_map_event__new(struct cpu_map *map)
{
size_t size = sizeof(struct cpu_map_event);
struct cpu_map_event *event;
int max;
u16 type;
event = cpu_map_data__alloc(map, &size, &type, &max);
if (!event)
return NULL;
event->header.type = PERF_RECORD_CPU_MAP;
event->header.size = size;
event->data.type = type;
cpu_map_data__synthesize(&event->data, map, type, max);
return event;
}
int perf_event__synthesize_cpu_map(struct perf_tool *tool,
struct cpu_map *map,
perf_event__handler_t process,
struct machine *machine)
{
struct cpu_map_event *event;
int err;
event = cpu_map_event__new(map);
if (!event)
return -ENOMEM;
err = process(tool, (union perf_event *) event, NULL, machine);
free(event);
return err;
}
int perf_event__synthesize_stat_config(struct perf_tool *tool,
struct perf_stat_config *config,
perf_event__handler_t process,
struct machine *machine)
{
struct stat_config_event *event;
int size, i = 0, err;
size = sizeof(*event);
size += (PERF_STAT_CONFIG_TERM__MAX * sizeof(event->data[0]));
event = zalloc(size);
if (!event)
return -ENOMEM;
event->header.type = PERF_RECORD_STAT_CONFIG;
event->header.size = size;
event->nr = PERF_STAT_CONFIG_TERM__MAX;
#define ADD(__term, __val) \
event->data[i].tag = PERF_STAT_CONFIG_TERM__##__term; \
event->data[i].val = __val; \
i++;
ADD(AGGR_MODE, config->aggr_mode)
ADD(INTERVAL, config->interval)
ADD(SCALE, config->scale)
WARN_ONCE(i != PERF_STAT_CONFIG_TERM__MAX,
"stat config terms unbalanced\n");
#undef ADD
err = process(tool, (union perf_event *) event, NULL, machine);
free(event);
return err;
}
int perf_event__synthesize_stat(struct perf_tool *tool,
u32 cpu, u32 thread, u64 id,
struct perf_counts_values *count,
perf_event__handler_t process,
struct machine *machine)
{
struct stat_event event;
event.header.type = PERF_RECORD_STAT;
event.header.size = sizeof(event);
event.header.misc = 0;
event.id = id;
event.cpu = cpu;
event.thread = thread;
event.val = count->val;
event.ena = count->ena;
event.run = count->run;
return process(tool, (union perf_event *) &event, NULL, machine);
}
int perf_event__synthesize_stat_round(struct perf_tool *tool,
u64 evtime, u64 type,
perf_event__handler_t process,
struct machine *machine)
{
struct stat_round_event event;
event.header.type = PERF_RECORD_STAT_ROUND;
event.header.size = sizeof(event);
event.header.misc = 0;
event.time = evtime;
event.type = type;
return process(tool, (union perf_event *) &event, NULL, machine);
}
void perf_event__read_stat_config(struct perf_stat_config *config,
struct stat_config_event *event)
{
unsigned i;
for (i = 0; i < event->nr; i++) {
switch (event->data[i].tag) {
#define CASE(__term, __val) \
case PERF_STAT_CONFIG_TERM__##__term: \
config->__val = event->data[i].val; \
break;
CASE(AGGR_MODE, aggr_mode)
CASE(SCALE, scale)
CASE(INTERVAL, interval)
#undef CASE
default:
pr_warning("unknown stat config term %" PRIu64 "\n",
event->data[i].tag);
}
}
}
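The stat config record is a list of (tag, value) pairs: `perf_event__synthesize_stat_config()` writes one pair per term and `perf_event__read_stat_config()` dispatches on the tag when reading them back. A minimal round-trip sketch of that scheme follows. The enum, structs and function names are hypothetical stand-ins, and counting unknown tags (rather than printing a warning, as the code above does) is a choice made here so the behaviour can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tags standing in for PERF_STAT_CONFIG_TERM__*. */
enum demo_term {
	DEMO_TERM_AGGR_MODE,
	DEMO_TERM_INTERVAL,
	DEMO_TERM_SCALE,
	DEMO_TERM_MAX
};

struct demo_term_entry {
	uint64_t tag;
	uint64_t val;
};

struct demo_config {
	uint64_t aggr_mode;
	uint64_t interval;
	uint64_t scale;
};

/* Write one (tag, value) pair per term; out must hold DEMO_TERM_MAX entries. */
static size_t demo_config_encode(const struct demo_config *cfg,
				 struct demo_term_entry *out)
{
	size_t i = 0;

	out[i].tag = DEMO_TERM_AGGR_MODE;
	out[i].val = cfg->aggr_mode;
	i++;
	out[i].tag = DEMO_TERM_INTERVAL;
	out[i].val = cfg->interval;
	i++;
	out[i].tag = DEMO_TERM_SCALE;
	out[i].val = cfg->scale;
	i++;

	/* Same balance check the original expresses with WARN_ONCE(). */
	assert(i == DEMO_TERM_MAX);
	return i;
}

/* Apply known tags to cfg; return how many unknown tags were skipped. */
static size_t demo_config_decode(struct demo_config *cfg,
				 const struct demo_term_entry *in, size_t nr)
{
	size_t unknown = 0;

	for (size_t i = 0; i < nr; i++) {
		switch (in[i].tag) {
		case DEMO_TERM_AGGR_MODE:
			cfg->aggr_mode = in[i].val;
			break;
		case DEMO_TERM_INTERVAL:
			cfg->interval = in[i].val;
			break;
		case DEMO_TERM_SCALE:
			cfg->scale = in[i].val;
			break;
		default:
			unknown++;
			break;
		}
	}
	return unknown;
}
```

Because the reader keys on the tag rather than on position, the order of terms in the record does not matter, and a reader that meets a tag it does not know can skip it without failing.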
size_t perf_event__fprintf_comm(union perf_event *event, FILE *fp)
{
const char *s;
if (event->header.misc & PERF_RECORD_MISC_COMM_EXEC)
s = " exec";
else
s = "";
return fprintf(fp, "%s: %s:%d/%d\n", s, event->comm.comm, event->comm.pid, event->comm.tid);
}
perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info Introduce a new option to record PERF_RECORD_NAMESPACES events emitted by the kernel when fork, clone, setns or unshare are invoked. And update perf-record documentation with the new option to record namespace events. Committer notes: Combined it with a later patch to allow printing it via 'perf report -D' and be able to test the feature introduced in this patch. Had to move here also perf_ns__name(), that was introduced in another later patch. Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt: util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=] ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx ^ Testing it: # perf record --namespaces -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ] # # perf report -D <SNIP> 3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7 [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc, 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb] 0x1151e0 [0x30]: event: 9 . . ... raw event: size 48 bytes . 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h.... . 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c.... . 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................ 
<SNIP> NAMESPACES events: 1 <SNIP> # Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-08 04:41:43 +08:00
size_t perf_event__fprintf_namespaces(union perf_event *event, FILE *fp)
{
size_t ret = 0;
struct perf_ns_link_info *ns_link_info;
u32 nr_namespaces, idx;
ns_link_info = event->namespaces.link_info;
nr_namespaces = event->namespaces.nr_namespaces;
ret += fprintf(fp, " %d/%d - nr_namespaces: %u\n\t\t[",
event->namespaces.pid,
event->namespaces.tid,
nr_namespaces);
for (idx = 0; idx < nr_namespaces; idx++) {
if (idx && (idx % 4 == 0))
ret += fprintf(fp, "\n\t\t ");
ret += fprintf(fp, "%u/%s: %" PRIu64 "/%#" PRIx64 "%s", idx,
perf_ns__name(idx), (u64)ns_link_info[idx].dev,
(u64)ns_link_info[idx].ino,
((idx + 1) != nr_namespaces) ? ", " : "]\n");
}
return ret;
}
perf tools: Use __maybe_used for unused variables perf defines both __used and __unused variables to use for marking unused variables. The variable __used is defined to __attribute__((__unused__)), which contradicts the kernel definition to __attribute__((__used__)) for new gcc versions. On Android, __used is also defined in system headers and this leads to warnings like: warning: '__used__' attribute ignored __unused is not defined in the kernel and is not a standard definition. If __unused is included everywhere instead of __used, this leads to conflicts with glibc headers, since glibc has a variables with this name in its headers. The best approach is to use __maybe_unused, the definition used in the kernel for __attribute__((unused)). In this way there is only one definition in perf sources (instead of 2 definitions that point to the same thing: __used and __unused) and it works on both Linux and Android. This patch simply replaces all instances of __used and __unused with __maybe_unused. Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com [ committer note: fixed up conflict with a116e05 in builtin-sched.c ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11 06:15:03 +08:00
int perf_event__process_comm(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
{
return machine__process_comm_event(machine, event, sample);
}
int perf_event__process_namespaces(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
{
return machine__process_namespaces_event(machine, event, sample);
}
int perf_event__process_lost(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
{
return machine__process_lost_event(machine, event, sample);
}
int perf_event__process_aux(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
struct machine *machine)
{
return machine__process_aux_event(machine, event);
}
int perf_event__process_itrace_start(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
struct machine *machine)
{
return machine__process_itrace_start_event(machine, event);
}
int perf_event__process_lost_samples(struct perf_tool *tool __maybe_unused,
				     union perf_event *event,
				     struct perf_sample *sample,
				     struct machine *machine)
{
	return machine__process_lost_samples_event(machine, event, sample);
}

int perf_event__process_switch(struct perf_tool *tool __maybe_unused,
			       union perf_event *event,
			       struct perf_sample *sample __maybe_unused,
			       struct machine *machine)
{
	return machine__process_switch_event(machine, event);
}

int perf_event__process_ksymbol(struct perf_tool *tool __maybe_unused,
				union perf_event *event,
				struct perf_sample *sample,
				struct machine *machine)
{
	return machine__process_ksymbol(machine, event, sample);
}

int perf_event__process_bpf_event(struct perf_tool *tool __maybe_unused,
				  union perf_event *event,
				  struct perf_sample *sample,
				  struct machine *machine)
{
	return machine__process_bpf_event(machine, event, sample);
}

size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp)
{
	/* 'r' marks a data (non-executable) mapping, 'x' an executable one. */
	return fprintf(fp, " %d/%d: [%#" PRIx64 "(%#" PRIx64 ") @ %#" PRIx64 "]: %c %s\n",
		       event->mmap.pid, event->mmap.tid, event->mmap.start,
		       event->mmap.len, event->mmap.pgoff,
		       (event->header.misc & PERF_RECORD_MISC_MMAP_DATA) ? 'r' : 'x',
		       event->mmap.filename);
}

size_t perf_event__fprintf_mmap2(union perf_event *event, FILE *fp)
{
	/* Protection is printed as rwx bits followed by 's'hared or 'p'rivate. */
	return fprintf(fp, " %d/%d: [%#" PRIx64 "(%#" PRIx64 ") @ %#" PRIx64
			   " %02x:%02x %"PRIu64" %"PRIu64"]: %c%c%c%c %s\n",
		       event->mmap2.pid, event->mmap2.tid, event->mmap2.start,
		       event->mmap2.len, event->mmap2.pgoff, event->mmap2.maj,
		       event->mmap2.min, event->mmap2.ino,
		       event->mmap2.ino_generation,
		       (event->mmap2.prot & PROT_READ) ? 'r' : '-',
		       (event->mmap2.prot & PROT_WRITE) ? 'w' : '-',
		       (event->mmap2.prot & PROT_EXEC) ? 'x' : '-',
		       (event->mmap2.flags & MAP_SHARED) ? 's' : 'p',
		       event->mmap2.filename);
}

size_t perf_event__fprintf_thread_map(union perf_event *event, FILE *fp)
{
	struct thread_map *threads = thread_map__new_event(&event->thread_map);
	size_t ret;

	ret = fprintf(fp, " nr: ");

	if (threads)
		ret += thread_map__fprintf(threads, fp);
	else
		ret += fprintf(fp, "failed to get threads from event\n");

	thread_map__put(threads);
	return ret;
}

size_t perf_event__fprintf_cpu_map(union perf_event *event, FILE *fp)
{
	struct cpu_map *cpus = cpu_map__new_data(&event->cpu_map.data);
	size_t ret;

	ret = fprintf(fp, ": ");

	if (cpus)
		ret += cpu_map__fprintf(cpus, fp);
	else
		ret += fprintf(fp, "failed to get cpumap from event\n");

	cpu_map__put(cpus);
	return ret;
}

int perf_event__process_mmap(struct perf_tool *tool __maybe_unused,
			     union perf_event *event,
			     struct perf_sample *sample,
			     struct machine *machine)
{
	return machine__process_mmap_event(machine, event, sample);
}

int perf_event__process_mmap2(struct perf_tool *tool __maybe_unused,
			      union perf_event *event,
			      struct perf_sample *sample,
			      struct machine *machine)
{
	return machine__process_mmap2_event(machine, event, sample);
}

size_t perf_event__fprintf_task(union perf_event *event, FILE *fp)
{
	return fprintf(fp, "(%d:%d):(%d:%d)\n",
		       event->fork.pid, event->fork.tid,
		       event->fork.ppid, event->fork.ptid);
}

int perf_event__process_fork(struct perf_tool *tool __maybe_unused,
			     union perf_event *event,
			     struct perf_sample *sample,
			     struct machine *machine)
{
	return machine__process_fork_event(machine, event, sample);
}

int perf_event__process_exit(struct perf_tool *tool __maybe_unused,
			     union perf_event *event,
			     struct perf_sample *sample,
			     struct machine *machine)
{
	return machine__process_exit_event(machine, event, sample);
}

size_t perf_event__fprintf_aux(union perf_event *event, FILE *fp)
{
	return fprintf(fp, " offset: %#"PRIx64" size: %#"PRIx64" flags: %#"PRIx64" [%s%s%s]\n",
		       event->aux.aux_offset, event->aux.aux_size,
		       event->aux.flags,
		       event->aux.flags & PERF_AUX_FLAG_TRUNCATED ? "T" : "",
		       event->aux.flags & PERF_AUX_FLAG_OVERWRITE ? "O" : "",
		       event->aux.flags & PERF_AUX_FLAG_PARTIAL ? "P" : "");
}

size_t perf_event__fprintf_itrace_start(union perf_event *event, FILE *fp)
{
	return fprintf(fp, " pid: %u tid: %u\n",
		       event->itrace_start.pid, event->itrace_start.tid);
}

size_t perf_event__fprintf_switch(union perf_event *event, FILE *fp)
{
	bool out = event->header.misc & PERF_RECORD_MISC_SWITCH_OUT;
	const char *in_out = !out ? "IN " :
		!(event->header.misc & PERF_RECORD_MISC_SWITCH_OUT_PREEMPT) ?
		"OUT " : "OUT preempt";

	/* PERF_RECORD_SWITCH carries no pid/tid; only the CPU-wide variant does. */
	if (event->header.type == PERF_RECORD_SWITCH)
		return fprintf(fp, " %s\n", in_out);

	return fprintf(fp, " %s %s pid/tid: %5u/%-5u\n",
		       in_out, out ? "next" : "prev",
		       event->context_switch.next_prev_pid,
		       event->context_switch.next_prev_tid);
}

static size_t perf_event__fprintf_lost(union perf_event *event, FILE *fp)
{
	return fprintf(fp, " lost %" PRIu64 "\n", event->lost.lost);
}

size_t perf_event__fprintf_ksymbol(union perf_event *event, FILE *fp)
{
	return fprintf(fp, " ksymbol event with addr %" PRIx64 " len %u type %u flags 0x%x name %s\n",
		       event->ksymbol_event.addr, event->ksymbol_event.len,
		       event->ksymbol_event.ksym_type,
		       event->ksymbol_event.flags, event->ksymbol_event.name);
}

size_t perf_event__fprintf_bpf_event(union perf_event *event, FILE *fp)
{
	return fprintf(fp, " bpf event with type %u, flags %u, id %u\n",
		       event->bpf_event.type, event->bpf_event.flags,
		       event->bpf_event.id);
}

size_t perf_event__fprintf(union perf_event *event, FILE *fp)
{
	size_t ret = fprintf(fp, "PERF_RECORD_%s",
			     perf_event__name(event->header.type));

	switch (event->header.type) {
	case PERF_RECORD_COMM:
		ret += perf_event__fprintf_comm(event, fp);
		break;
	case PERF_RECORD_FORK:
	case PERF_RECORD_EXIT:
		ret += perf_event__fprintf_task(event, fp);
		break;
	case PERF_RECORD_MMAP:
		ret += perf_event__fprintf_mmap(event, fp);
		break;
	case PERF_RECORD_NAMESPACES:
		ret += perf_event__fprintf_namespaces(event, fp);
		break;
	case PERF_RECORD_MMAP2:
		ret += perf_event__fprintf_mmap2(event, fp);
		break;
	case PERF_RECORD_AUX:
		ret += perf_event__fprintf_aux(event, fp);
		break;
	case PERF_RECORD_ITRACE_START:
		ret += perf_event__fprintf_itrace_start(event, fp);
		break;
	case PERF_RECORD_SWITCH:
	case PERF_RECORD_SWITCH_CPU_WIDE:
		ret += perf_event__fprintf_switch(event, fp);
		break;
	case PERF_RECORD_LOST:
		ret += perf_event__fprintf_lost(event, fp);
		break;
	case PERF_RECORD_KSYMBOL:
		ret += perf_event__fprintf_ksymbol(event, fp);
		break;
	case PERF_RECORD_BPF_EVENT:
		ret += perf_event__fprintf_bpf_event(event, fp);
		break;
	default:
		/* No type-specific printer: just terminate the line. */
		ret += fprintf(fp, "\n");
	}

	return ret;
}

int perf_event__process(struct perf_tool *tool __maybe_unused,
			union perf_event *event,
			struct perf_sample *sample,
			struct machine *machine)
{
	return machine__process_event(machine, event, sample);
}

struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr,
			     struct addr_location *al)
{
	struct map_groups *mg = thread->mg;
	struct machine *machine = mg->machine;
	bool load_map = false;
	al->machine = machine;
	al->thread = thread;
	al->addr = addr;
	al->cpumode = cpumode;
	al->filtered = 0;
	if (machine == NULL) {
		al->map = NULL;
		return NULL;
	}
	if (cpumode == PERF_RECORD_MISC_KERNEL && perf_host) {
		al->level = 'k';
		mg = &machine->kmaps;
		load_map = true;
	} else if (cpumode == PERF_RECORD_MISC_USER && perf_host) {
		al->level = '.';
	} else if (cpumode == PERF_RECORD_MISC_GUEST_KERNEL && perf_guest) {
		al->level = 'g';
		mg = &machine->kmaps;
		load_map = true;
	} else if (cpumode == PERF_RECORD_MISC_GUEST_USER && perf_guest) {
		al->level = 'u';
	} else {
		al->level = 'H';
		al->map = NULL;
		if ((cpumode == PERF_RECORD_MISC_GUEST_USER ||
		     cpumode == PERF_RECORD_MISC_GUEST_KERNEL) &&
		    !perf_guest)
			al->filtered |= (1 << HIST_FILTER__GUEST);
		if ((cpumode == PERF_RECORD_MISC_USER ||
		     cpumode == PERF_RECORD_MISC_KERNEL) &&
		    !perf_host)
			al->filtered |= (1 << HIST_FILTER__HOST);
		return NULL;
	}
	al->map = map_groups__find(mg, al->addr);
	if (al->map != NULL) {
		/*
		 * Kernel maps might be changed when loading symbols so loading
		 * must be done prior to using kernel maps.
		 */
		if (load_map)
			map__load(al->map);
		al->addr = al->map->map_ip(al->map, al->addr);
	}

	return al->map;
}
/*
 * For branch stacks or branch samples, the sample cpumode might not be correct
 * because it applies only to the sample 'ip' and not necessarily to 'addr' or
 * branch stack addresses. If possible, use a fallback to deal with those cases.
 */
struct map *thread__find_map_fb(struct thread *thread, u8 cpumode, u64 addr,
				struct addr_location *al)
{
	struct map *map = thread__find_map(thread, cpumode, addr, al);
	struct machine *machine = thread->mg->machine;
	u8 addr_cpumode = machine__addr_cpumode(machine, cpumode, addr);

	if (map || addr_cpumode == cpumode)
		return map;

	return thread__find_map(thread, addr_cpumode, addr, al);
}
struct symbol *thread__find_symbol(struct thread *thread, u8 cpumode,
				   u64 addr, struct addr_location *al)
{
	al->sym = NULL;
	if (thread__find_map(thread, cpumode, addr, al))
		al->sym = map__find_symbol(al->map, al->addr);
	return al->sym;
}
struct symbol *thread__find_symbol_fb(struct thread *thread, u8 cpumode,
				      u64 addr, struct addr_location *al)
{
	al->sym = NULL;
	if (thread__find_map_fb(thread, cpumode, addr, al))
		al->sym = map__find_symbol(al->map, al->addr);
	return al->sym;
}
/*
* Callers need to drop the reference to al->thread, obtained in
* machine__findnew_thread()
*/
int machine__resolve(struct machine *machine, struct addr_location *al,
		     struct perf_sample *sample)
{
	struct thread *thread = machine__findnew_thread(machine, sample->pid,
							sample->tid);
	if (thread == NULL)
		return -1;

	dump_printf(" ... thread: %s:%d\n", thread__comm_str(thread), thread->tid);
	thread__find_map(thread, sample->cpumode, sample->ip, al);
	dump_printf(" ...... dso: %s\n",
		    al->map ? al->map->dso->long_name :
			al->level == 'H' ? "[hypervisor]" : "<not found>");

	if (thread__is_filtered(thread))
		al->filtered |= (1 << HIST_FILTER__THREAD);

	al->sym = NULL;
	al->cpu = sample->cpu;
	al->socket = -1;
	al->srcline = NULL;

	if (al->cpu >= 0) {
		struct perf_env *env = machine->env;

		if (env && env->cpu)
			al->socket = env->cpu[al->cpu].socket_id;
	}

	if (al->map) {
		struct dso *dso = al->map->dso;

		if (symbol_conf.dso_list &&
		    (!dso || !(strlist__has_entry(symbol_conf.dso_list,
						  dso->short_name) ||
			       (dso->short_name != dso->long_name &&
				strlist__has_entry(symbol_conf.dso_list,
						   dso->long_name))))) {
			al->filtered |= (1 << HIST_FILTER__DSO);
		}

		al->sym = map__find_symbol(al->map, al->addr);
	}

	if (symbol_conf.sym_list &&
	    (!al->sym || !strlist__has_entry(symbol_conf.sym_list,
					     al->sym->name))) {
		al->filtered |= (1 << HIST_FILTER__SYMBOL);
	}

	return 0;
}
/*
 * The preprocess_sample method (machine__resolve() above) returns with
 * reference counts held on the entries in 'al'. When done using them (and
 * after taking extra references on any entry whose pointer must be kept), pair
 * the call with addr_location__put() so that the refcounts can be decremented.
 */
void addr_location__put(struct addr_location *al)
{
	thread__zput(al->thread);
}
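
/*
 * Illustrative sketch only, not an in-tree caller: it shows the pairing of
 * machine__resolve() with addr_location__put(). The function name
 * example__process_sample() is hypothetical; the two routines it calls are
 * the ones defined in this file. On failure machine__resolve() returns
 * before a thread reference is taken, so the put is assumed to be needed
 * only on the success path.
 *
 *	static int example__process_sample(struct machine *machine,
 *					   struct perf_sample *sample)
 *	{
 *		struct addr_location al;
 *
 *		if (machine__resolve(machine, &al, sample) < 0)
 *			return -1;
 *
 *		// use al.map, al.sym, al.addr and al.filtered here
 *
 *		addr_location__put(&al);
 *		return 0;
 *	}
 */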

bool is_bts_event(struct perf_event_attr *attr)
{
	return attr->type == PERF_TYPE_HARDWARE &&
	       (attr->config & PERF_COUNT_HW_BRANCH_INSTRUCTIONS) &&
	       attr->sample_period == 1;
}

bool sample_addr_correlates_sym(struct perf_event_attr *attr)
{
	if (attr->type == PERF_TYPE_SOFTWARE &&
	    (attr->config == PERF_COUNT_SW_PAGE_FAULTS ||
	     attr->config == PERF_COUNT_SW_PAGE_FAULTS_MIN ||
	     attr->config == PERF_COUNT_SW_PAGE_FAULTS_MAJ))
		return true;

	if (is_bts_event(attr))
		return true;

	return false;
}

void thread__resolve(struct thread *thread, struct addr_location *al,
		     struct perf_sample *sample)
{
	thread__find_map_fb(thread, sample->cpumode, sample->addr, al);

	al->cpu = sample->cpu;
	al->sym = NULL;

	if (al->map)
		al->sym = map__find_symbol(al->map, al->addr);
}