OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Arnaldo Carvalho de Melo	a14390fde6	perf script: Allow creating per-event dump files Introduce a new option to dump trace output to files named by the monitored events and update perf-script documentation accordingly. Shown below is output of perf script command with the newly introduced option. $ perf record -e cycles -e cs -ag -- sleep 1 $ perf script --per-event-dump $ ls perf.data.cycles.dump perf.data.cs.dump Without per-event-dump support, drawing flamegraphs for different events would require post processing to separate events. You can monitor only one event at a time if you want to get flamegraphs for different events. Using this option, you can get the trace output files named by the monitored events, and could draw flamegraphs according to the event's name. Based-on-a-patch-by: yuzhoujian <yuzhoujian@didichuxing.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1508921599-10832-3-git-send-email-yuzhoujian@didichuxing.com Link: http://lkml.kernel.org/n/tip-8ngzsjdhgiovkupl3r5yy570@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-10-27 09:10:10 -03:00
Milian Wolff	d8a88dd243	perf util: Enable handling of inlined frames by default Now that we have caches in place to speed up the process of finding inlined frames and srcline information repeatedly, we can enable this useful option by default. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171019113836.5548-6-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-10-25 10:50:47 -03:00
Andi Kleen	98ad761bd3	perf list: Fix group description in the man page Fix an incorrect description in the 'perf list' manpage. When a group does not fit into the hardware it is partially scheduled, but does not error out. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20171010224322.15861-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-10-23 11:20:54 -03:00
Ingo Molnar	ca4b9c3b74	Merge branch 'perf/urgent' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-10-20 11:02:05 +02:00
Taeung Song	3f50f614d6	perf record: Fix documentation for a inexistent option '-l' 'perf record' had a '-l' option that meant "scale counter values" a very long time ago, but it currently belongs to 'perf stat' as '-c'. So remove it. I found this problem in the below case. $ perf record -e cycles -l sleep 3 Error: unknown switch `l Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1507907412-19813-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-10-17 09:05:36 -03:00
Kan Liang	0c6b499495	perf top: Add option to set the number of thread for event synthesize Using UINT_MAX to indicate the default thread#, which is the max number of online CPU. Committer testing: # perf trace --no-inherit -e clone -o /tmp/output perf top --num-thread-synthesize 9 # cat /tmp/output ? ( ? ): ... [continued]: clone()) = 26651 (perf) 0.059 ( 0.010 ms): clone(flags: VM\|FS\|FILES\|SIGHAND\|THREAD\|SYSVSEM\|SETTLS\|PARENT_SETTID\|CHILD_CLEARTID, child_stack: 0x7f5bfac44f30, parent_tidptr: 0x7f5bfac459d0, child_tidptr: 0x7f5bfac459d0, tls: 0x7f5bfac45700) = 26652 (perf) 0.116 ( 0.014 ms): clone(flags: VM\|FS\|FILES\|SIGHAND\|THREAD\|SYSVSEM\|SETTLS\|PARENT_SETTID\|CHILD_CLEARTID, child_stack: 0x7f5bfa443f30, parent_tidptr: 0x7f5bfa4449d0, child_tidptr: 0x7f5bfa4449d0, tls: 0x7f5bfa444700) = 26653 (perf) 0.141 ( 0.009 ms): clone(flags: VM\|FS\|FILES\|SIGHAND\|THREAD\|SYSVSEM\|SETTLS\|PARENT_SETTID\|CHILD_CLEARTID, child_stack: 0x7f5bf9c42f30, parent_tidptr: 0x7f5bf9c439d0, child_tidptr: 0x7f5bf9c439d0, tls: 0x7f5bf9c43700) = 26654 (perf) 0.160 ( 0.012 ms): clone(flags: VM\|FS\|FILES\|SIGHAND\|THREAD\|SYSVSEM\|SETTLS\|PARENT_SETTID\|CHILD_CLEARTID, child_stack: 0x7f5bf9441f30, parent_tidptr: 0x7f5bf94429d0, child_tidptr: 0x7f5bf94429d0, tls: 0x7f5bf9442700) = 26655 (perf) 0.232 ( 0.013 ms): clone(flags: VM\|FS\|FILES\|SIGHAND\|THREAD\|SYSVSEM\|SETTLS\|PARENT_SETTID\|CHILD_CLEARTID, child_stack: 0x7f5bf8c40f30, parent_tidptr: 0x7f5bf8c419d0, child_tidptr: 0x7f5bf8c419d0, tls: 0x7f5bf8c41700) = 26656 (perf) 0.393 ( 0.011 ms): clone(flags: VM\|FS\|FILES\|SIGHAND\|THREAD\|SYSVSEM\|SETTLS\|PARENT_SETTID\|CHILD_CLEARTID, child_stack: 0x7f5be3ffef30, parent_tidptr: 0x7f5be3fff9d0, child_tidptr: 0x7f5be3fff9d0, tls: 0x7f5be3fff700) = 26657 (perf) 0.802 ( 0.012 ms): clone(flags: VM\|FS\|FILES\|SIGHAND\|THREAD\|SYSVSEM\|SETTLS\|PARENT_SETTID\|CHILD_CLEARTID, child_stack: 0x7f5be37fdf30, parent_tidptr: 0x7f5be37fe9d0, child_tidptr: 0x7f5be37fe9d0, tls: 0x7f5be37fe700) = 26658 (perf) 1.411 ( 0.022 ms): clone(flags: VM\|FS\|FILES\|SIGHAND\|THREAD\|SYSVSEM\|SETTLS\|PARENT_SETTID\|CHILD_CLEARTID, child_stack: 0x7f5be2ffcf30, parent_tidptr: 0x7f5be2ffd9d0, child_tidptr: 0x7f5be2ffd9d0, tls: 0x7f5be2ffd700) = 26659 (perf) 246.422 ( 0.042 ms): clone(flags: VM\|FS\|FILES\|SIGHAND\|THREAD\|SYSVSEM\|SETTLS\|PARENT_SETTID\|CHILD_CLEARTID, child_stack: 0x7f5be2ffcf30, parent_tidptr: 0x7f5be2ffd9d0, child_tidptr: 0x7f5be2ffd9d0, tls: 0x7f5be2ffd700) = 26660 (perf) # Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1506696477-146932-5-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-10-03 09:27:54 -03:00
Andi Kleen	b1491ace8e	perf script: Support user regs Teach perf script to print user regs. % perf record --user-regs=ip,sp ... % perf script -F ip,sym,uregs ... ffffffff9e060c24 native_write_msr ABI:2 SP:0x7ffd0ea06c38 IP:0x7fe77f55b637 ffffffff9e060c24 native_write_msr ABI:2 SP:0x7ffd0ea06c38 IP:0x7fe77f55b637 ffffffff9e060c24 native_write_msr ABI:2 SP:0x7ffd0ea06c38 IP:0x7fe77f55b637 ffffffff9e060c24 native_write_msr ABI:2 SP:0x7ffd0ea06c38 IP:0x7fe77f55b637 ffffffff9e00cc12 intel_pmu_handle_irq ABI:2 SP:0x7ffd0ea06c38 IP:0x7fe77f55b637 v2: Rebased on top of phys-addr patches Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20170905184057.26135-1-andi@firstfloor.org [ Use PRIu64 for regs->abi in print_sample_uregs() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-13 09:49:14 -03:00
Andi Kleen	84c4174227	perf record: Support direct --user-regs arguments USER_REGS can currently only be collected implicitly with call graph recording. Sometimes it is useful to see them separately, and filter them. Add a new --user-regs option to record that is similar to --intr-regs, but acts on user regs. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170905170029.19722-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-13 09:49:14 -03:00
Andi Kleen	71b0acce78	perf list: Add metric groups to perf list Add code to perf list to print metric groups, and metrics that don't have an event name. The metricgroup code collects the eventgroups and events into a rblist, and then prints them according to the configured filters. The metricgroups are printed by default, but can be limited by perf list metric or perf list metricgroup % perf list metricgroup .. Metric Groups: DSB: DSB_Coverage [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)] FLOPS: GFLOPs [Giga Floating Point Operations Per Second] Frontend: IFetch_Line_Utilization [Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions] Frontend_Bandwidth: DSB_Coverage [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)] Memory_BW: MLP [Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)] v2: Check return value of asprintf to fix warning on FC26 Fix key in lookup/addition for the groups list Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170831194036.30146-8-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-13 09:49:13 -03:00
Andi Kleen	b18f3e3650	perf stat: Support JSON metrics in perf stat Add generic support for standalone metrics specified in JSON files to perf stat. A metric is a formula that uses multiple events to compute a higher level result (e.g. IPC). Previously metrics were always tied to an event and automatically enabled with that event. But now change it that we can have standalone metrics. They are in the same JSON data structure as events, but don't have an event name. We also allow to organize the metrics in metric groups, which allows a short cut to select several related metrics at once. Add a new -M / --metrics option to perf stat that adds the metrics or metric groups specified. Add the core code to manage and parse the metric groups. They are collected from the JSON data structures into a separate rblist. When computing shadow values look for metrics in that list. Then they are computed using the existing saved values infrastructure in stat-shadow.c The actual JSON metrics are in a separate pull request. % perf stat -M Summary --metric-only -a sleep 1 Performance counter stats for 'system wide': Instructions CLKS CPU_Utilization GFLOPs SMT_2T_Utilization Kernel_Utilization 317614222.0 1392930775.0 0.0 0.0 0.2 0.1 1.001497549 seconds time elapsed % perf stat -M GFLOPs flops Performance counter stats for 'flops': 3,999,541,471 fp_comp_ops_exe.sse_scalar_single # 1.2 GFLOPs (66.65%) 14 fp_comp_ops_exe.sse_scalar_double (66.65%) 0 fp_comp_ops_exe.sse_packed_double (66.67%) 0 fp_comp_ops_exe.sse_packed_single (66.70%) 0 simd_fp_256.packed_double (66.70%) 0 simd_fp_256.packed_single (66.67%) 0 duration_time 3.238372845 seconds time elapsed v2: Add missing header file v3: Move find_map to pmu.c Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170831194036.30146-7-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-13 09:49:13 -03:00
Andi Kleen	5a5dfe4b85	perf tools: Support weak groups in 'perf stat' Setting up groups can be complicated due to the complicated scheduling restrictions of different PMUs. User tools usually don't understand all these restrictions. Still in many cases it is useful to set up groups and they work most of the time. However if the group is set up wrong some members will not report any value because they never get scheduled. Add a concept of a 'weak group': try to set up a group, but if it's not schedulable fallback to not using a group. That gives us the best of both worlds: groups if they work, but still a usable fallback if they don't. In theory it would be possible to have more complex fallback strategies (e.g. try to split the group in half), but the simple fallback of not using a group seems to work for now. So far the weak group is only implemented for perf stat, not for record. Here's an unschedulable group (on IvyBridge with SMT on) % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1 73,806,067 branches 4,848,144 branch-misses # 6.57% of all branches 14,754,458 l1d.replacement 24,905,558 l2_lines_in.all <not supported> l2_rqsts.all_code_rd <------- will never report anything With the weak group: % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1 125,366,055 branches (80.02%) 9,208,402 branch-misses # 7.35% of all branches (80.01%) 24,560,249 l1d.replacement (80.00%) 43,174,971 l2_lines_in.all (80.05%) 31,891,457 l2_rqsts.all_code_rd (79.92%) The extra event scheduled with some extra multiplexing v2: Move fallback code to separate function. Add comment on for_each_group_member Adjust to new perf_evsel__close interface v3: Fix debug print out. Committer testing: Before: # perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1 Performance counter stats for 'system wide': <not counted> branches <not counted> branch-misses <not counted> l1d.replacement <not counted> l2_lines_in.all <not supported> l2_rqsts.all_code_rd 1.002147212 seconds time elapsed # perf stat -e '{branches,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1 Performance counter stats for 'system wide': 83,207,892 branches 11,065,444 l1d.replacement 28,484,024 l2_lines_in.all 12,186,179 l2_rqsts.all_code_rd 1.001739493 seconds time elapsed After: # perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}':W -a sleep 1 Performance counter stats for 'system wide': 543,323,909 branches (80.01%) 27,100,512 branch-misses # 4.99% of all branches (80.02%) 50,402,905 l1d.replacement (80.03%) 67,385,892 l2_lines_in.all (80.01%) 21,352,885 l2_rqsts.all_code_rd (79.94%) 1.001086658 seconds time elapsed # Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20170831194036.30146-2-andi@firstfloor.org [ Add a "'perf stat' only, for now" comment in the man page, suggested by Jiri ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-13 09:49:12 -03:00
David Ahern	0f59d7a352	perf sched timehist: Add pid and tid options Add options to only show event for specific pid(s) and tid(s). Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1504288152-19690-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-13 09:49:12 -03:00
Kan Liang	49d58f04eb	perf script: Support physical address Display the physical address at the tail if it is available. Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1504026672-7304-5-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-01 14:46:29 -03:00
Kan Liang	c35aeb9dfe	perf mem: Support physical address Add option phys-data in "perf mem" to record/report physical address. The default mem sort order for physical address is changed accordingly. Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1504026672-7304-4-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-01 14:46:23 -03:00
Kan Liang	8780fb25ab	perf sort: Add sort option for physical address Add a new sort option "phys_daddr" for --mem-mode sort. With this option applied, perf can sort and report by sample's physical address. Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1504026672-7304-3-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-01 14:46:11 -03:00
Kan Liang	3b0a5daa06	perf tools: Support new sample type for physical address Support new sample type PERF_SAMPLE_PHYS_ADDR for physical address. Add new option --phys-data to record sample physical address. Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1504026672-7304-2-git-send-email-kan.liang@intel.com [ Added missing printing in evsel.c patch sent by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-01 14:46:00 -03:00
Jack Henschel	4fb2053920	perf intel-pt: Fix syntax in documentation of config option As specified in tools/perf/Documentation/perf-config.txt, perf configuration items must be in 'key = value' format, otherwise the following error message occurs: $ perf record -e intel_pt//u -- ls bad config file line 2 in ~/.perfconfig $ cat .perfconfig [intel-pt] mispred-all Assigning a value to the key 'mispred-all' fixes the issue: $ perf record -e intel_pt//u -- ls [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.031 MB perf.data] $ cat .perfconfig [intel-pt] mispred-all = true Signed-off-by: Jack Henschel <jackdev@mailbox.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170831080535.2157-1-jackdev@mailbox.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-01 14:45:59 -03:00
Arnaldo Carvalho de Melo	27702bcfe8	perf trace: Support syscall name globbing So now we can use: # perf trace -e pkey_* 532.784 ( 0.006 ms): pkey/16018 pkey_alloc(init_val: DISABLE_WRITE) = -1 EINVAL Invalid argument 532.795 ( 0.004 ms): pkey/16018 pkey_mprotect(start: 0x7f380d0a6000, len: 4096, prot: READ\|WRITE, pkey: -1) = 0 532.801 ( 0.002 ms): pkey/16018 pkey_free(pkey: -1 ) = -1 EINVAL Invalid argument ^C[root@jouet ~]# Or '-e epoll', '-e msg', etc. Combining syscall names with perf events, tracepoints, etc, continues to be valid, i.e. this is possible: # perf probe -L sys_nanosleep <SyS_nanosleep@/home/acme/git/linux/kernel/time/hrtimer.c:0> 0 SYSCALL_DEFINE2(nanosleep, struct timespec __user *, rqtp, struct timespec __user *, rmtp) { struct timespec64 tu; 5 if (get_timespec64(&tu, rqtp)) 6 return -EFAULT; if (!timespec64_valid(&tu)) 9 return -EINVAL; 11 current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE; 12 current->restart_block.nanosleep.rmtp = rmtp; 13 return hrtimer_nanosleep(&tu, HRTIMER_MODE_REL, CLOCK_MONOTONIC); } # perf probe my_probe="sys_nanosleep:12 rmtp" Added new event: probe:my_probe (on sys_nanosleep:12 with rmtp) You can now use it in all perf tools, such as: perf record -e probe:my_probe -aR sleep 1 # # perf trace -e probe:my_probe/max-stack=5/,sleep sleep 1 0.427 ( 0.003 ms): sleep/16690 nanosleep(rqtp: 0x7ffefc245090) ... 0.430 ( ): probe:my_probe:(ffffffffbd112923) rmtp=0) sys_nanosleep ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) __nanosleep_nocancel (/usr/lib64/libc-2.25.so) 0.427 (1000.208 ms): sleep/16690 ... [continued]: nanosleep()) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-elycoi8wy6y0w9dkj7ox1mzz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-09-01 14:45:58 -03:00
Jack Henschel	726647d052	perf stat: Fix path to PMU formats in documentation As defined in tools/perf/util/pmu.c, the EVENT_SOURCE_DEVICE_PATH is /sys/bus/event_source/devices/ (no trailing 's' in event_source). This patch corrects the path in the perf stat documentation. Signed-off-by: Jack Henschel <jackdev@mailbox.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jack Henschel <jackdev@mailbox.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: trivial@kernel.org Link: http://lkml.kernel.org/r/20170824132022.10934-1-jackdev@mailbox.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-08-28 11:05:09 -03:00
Konstantin Khlebnikov	2826478a66	perf tools: Really install manpages via 'make install-man' The install-man target builds the manpages but forgets to install them. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: `af3df2cf17` ("perf tools: Try to build Documentation when installing") Link: http://lkml.kernel.org/r/150322915300.129715.13645857235229756834.stgit@buzz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-08-22 13:24:53 -03:00
Taeung Song	01c85629f5	perf annotate: Document --show-total-period option When the --show-total-period option was introduced we forgot to add an entry in the man page, fix it. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Martin Liška <mliska@suse.cz> Fixes: `0c4a5bcea4` ("perf annotate: Display total number of samples with --show-total-period") Link: http://lkml.kernel.org/r/1503046013-5555-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-08-18 10:34:08 -03:00
Taeung Song	1ac39372e0	perf annotate stdio: Support --show-nr-samples option Add --show-nr-samples option to "perf annotate" so that it matches "perf report". Committer note: Note that it can't be used together with --show-total-period, which seems like a silly limitation, that can be lifted at some point. Made it bail out if not on --stdio. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1503046008-5511-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-08-18 10:31:53 -03:00
Adrian Hunter	69e6e410f1	perf script python: Rename call-graph-from-postgresql.py to call-graph-from-sql.py Rename call-graph-from-postgresql.py to call-graph-from-sql.py in preparation for adding support to it for SQLite 3. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/1501749090-20357-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-08-15 16:38:06 -03:00
Adrian Hunter	564b9527d1	perf script python: Add support for exporting to sqlite3 Add support for exporting to SQLite 3 the same data as the PostgreSQL export. Committer note: Tested on RHEL 7.4 using the 1.2.2-4el python-pyside packages from EPEL. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1501749090-20357-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-08-15 16:37:55 -03:00
Krister Johansen	868a832918	perf top: Support lookup of symbols in other mount namespaces. The perf top command needs to unshare its fs from the helper threads in order to successfully setns(2) during its symbol lookup. It also needs to implement a force flag to ignore ownership of perf-<pid>.map files. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1499305693-1599-6-git-send-email-kjlx@templeofstupid.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-07-25 22:43:16 -03:00
Jin Yao	60f83fa634	perf record: Create a new option save_type in --branch-filter The option indicates the kernel to save branch type during sampling. One example: perf record -g --branch-filter any,save_type <command> Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1500379995-6449-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-07-18 23:14:39 -03:00
David Carrillo-Cisneros	e9def1b2e7	perf tools: Add feature header record to pipe-mode Add header record types to pipe-mode, reusing the functions used in file-mode and leveraging the new struct feat_fd. For alignment, check that synthesized events don't exceed pagesize. Add the perf_event__synthesize_feature event call back to process the new header records. Before this patch: $ perf record -o - -e cycles sleep 1 \| perf report --stdio --header [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] ... After this patch: $ perf record -o - -e cycles sleep 1 \| perf report --stdio --header # ======== # captured on: Mon May 22 16:33:43 2017 # ======== # # hostname : my_hostname # os release : 4.11.0-dbx-up_perf # perf version : 4.11.rc6.g6277c80 # arch : x86_64 # nrcpus online : 72 # nrcpus avail : 72 # cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz # cpuid : GenuineIntel,6,63,2 # total memory : 263457192 kB # cmdline : /root/perf record -o - -e cycles -c 100000 sleep 1 # HEADER_CPU_TOPOLOGY info available, use -I to display # HEADER_NUMA_TOPOLOGY info available, use -I to display # pmu mappings: intel_bts = 6, uncore_imc_4 = 22, uncore_sbox_1 = 47, uncore_cbox_5 = 33, uncore_ha_0 = 16, uncore_cbox [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] ... Support added for the subcommands: report, inject, annotate and script. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-16-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-07-18 23:14:36 -03:00
Krister Johansen	f045b8c4b3	perf buildid-cache: Support binary objects from other namespaces Teach buildid-cache how to add, remove, and update binary objects from other mount namespaces. Allow probe events tracing binaries in different namespaces to add their objects to the probe and build-id caches too. As a handy side effect, this also lets us access SDT probes in binaries from alternate mount namespaces. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1499305693-1599-5-git-send-email-kjlx@templeofstupid.com [ Add util/namespaces.c to tools/perf/util/python-ext-sources, to fix the python binding 'perf test' ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-07-18 23:14:11 -03:00
Krister Johansen	544abd44c7	perf probe: Allow placing uprobes in alternate namespaces. Teaches perf how to place a uprobe on a file that's in a different mount namespace. The user must add the probe using the --target-ns argument to perf probe. Once it has been placed, it may be recorded against without further namespace-specific commands. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> [ PPC build fixed by Ravi: ] Link: http://lkml.kernel.org/r/1500287542-6219-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> [ Fix !HAVE_DWARF_SUPPORT build ] Link: http://lkml.kernel.org/r/1499305693-1599-4-git-send-email-kjlx@templeofstupid.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-07-18 23:14:10 -03:00
Adrian Hunter	ead2bfdb85	perf intel-pt: Update documentation to include new ptwrite and power events Update documentation to include new ptwrite and power events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-36-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-30 11:50:54 -03:00
Adrian Hunter	47e780848e	perf script: Add 'synth' field for synthesized event payloads Add a field to display the content the raw_data of a synthesized event. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-22-git-send-email-adrian.hunter@intel.com [ Resolved conflict with `106dacd86f` ("perf script: Support -F brstackoff,dso") ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-27 12:19:10 -03:00
Adrian Hunter	70d110d775	perf auxtrace: Add itrace option to output power events Add itrace option to output power events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-25-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-27 12:09:58 -03:00
Adrian Hunter	3bdafdffa9	perf auxtrace: Add itrace option to output ptwrite events Add itrace option to output ptwrite events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-24-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-27 12:09:20 -03:00
Adrian Hunter	2bc60ffd66	perf intel-pt: Add documentation for new config terms Add documentation for new config terms. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-13-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-21 11:35:49 -03:00
Kan Liang	daefd0bc0b	perf stat: Add support to measure SMI cost Implement a new --smi-cost mode in perf stat to measure SMI cost. During the measurement, /sys/devices/cpu/freeze_on_smi will be set. The measurement can be done with one counter (unhalted core cycles), and two free running MSR counters (IA32_APERF and SMI_COUNT). In practice, the percentage of SMI core cycles should be more useful than the absolute value. So the output will be the percentage of SMI core cycles and SMI#. metric_only will be set by default. SMI cycles% = (aperf - unhalted core cycles) / aperf Here is an example output. Performance counter stats for 'sudo echo ': SMI cycles% SMI# 0.1% 1 0.010858678 seconds time elapsed Users who want the actual values can additionally pass --no-metric-only. Signed-off-by: Kan Liang <Kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Elliott <elliott@hpe.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1495825538-5230-3-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-21 11:35:35 -03:00
Namhyung Kim	1096c35aa8	perf ftrace: Add -D option for depth filter The -D/--graph-depth option sets the maximum graph depth. The following example traces the page fault handler to a maximum depth of 2. $ sudo perf ftrace -G __do_page_fault -D 2 -- hello ... 0) \| __do_page_fault() { 0) 0.063 us \| down_read_trylock(); 0) 0.251 us \| find_vma(); 0) 5.374 us \| handle_mm_fault(); 0) 0.054 us \| up_read(); 0) 7.463 us \| } ... Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170618142302.25390-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-19 22:05:54 -03:00
Namhyung Kim	78b83e8b12	perf ftrace: Add option for function filtering The -T/--trace-funcs and -N/--notrace-funcs options are to specify functions to enable/disable tracing dynamically. The -G/--graph-funcs and -g/--nograph-funcs options are to set filters for function graph tracer. For example, to trace fault handling functions only: $ sudo perf ftrace -T fault hello 0) \| __do_page_fault() { 0) \| handle_mm_fault() { 0) 2.117 us \| __handle_mm_fault(); 0) 3.627 us \| } 0) 7.811 us \| } 0) \| __do_page_fault() { 0) \| handle_mm_fault() { 0) 2.014 us \| __handle_mm_fault(); 0) 2.424 us \| } 0) 2.951 us \| } ... To trace all functions executed in __do_page_fault: $ sudo perf ftrace -G __do_page_fault hello 2) \| __do_page_fault() { 3) 0.060 us \| down_read_trylock(); 3) \| find_vma() { 3) 0.075 us \| vmacache_find(); 3) 0.053 us \| vmacache_update(); 3) 1.246 us \| } 3) \| handle_mm_fault() { 3) 0.063 us \| __rcu_read_lock(); 3) 0.056 us \| mem_cgroup_from_task(); 3) 0.057 us \| __rcu_read_unlock(); 3) \| __handle_mm_fault() { 3) \| filemap_map_pages() { 3) 0.058 us \| __rcu_read_lock(); 3) \| alloc_set_pte() { ... But don't want to show details in handle_mm_fault: $ sudo perf ftrace -G __do_page_fault -g handle_mm_fault hello 3) \| __do_page_fault() { 3) 0.049 us \| down_read_trylock(); 3) \| find_vma() { 3) 0.048 us \| vmacache_find(); 3) 0.041 us \| vmacache_update(); 3) 0.680 us \| } 3) 0.036 us \| up_read(); 3) 4.547 us \| } /* __do_page_fault */ ... Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170618142302.25390-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-19 22:05:53 -03:00
Mark Santaniello	106dacd86f	perf script: Support -F brstackoff,dso The idea here is to make AutoFDO easier in cloud environment with ASLR. It's easiest to show how this is useful by example. I built a small test akin to "while(1) { do_nothing(); }" where the do_nothing function is loaded from a dso: $ cat burncpu.cpp #include <dlfcn.h> int main() { void* handle = dlopen("./dso.so", RTLD_LAZY); if (!handle) return -1; typedef void (*fp)(); fp do_nothing = (fp) dlsym(handle, "do_nothing"); while(1) { do_nothing(); } } $ cat dso.cpp extern "C" void do_nothing() {} $ cat build.sh #!/bin/bash g++ -shared dso.cpp -o dso.so g++ burncpu.cpp -o burncpu -ldl I sampled the execution of this program with perf record -b. Using the existing "brstack,dso", we get absolute addresses that are affected by ASLR, and could be different on different hosts. The address does not uniquely identify a branch/target in the binary: $ perf script -F brstack,dso \| sed 's/\/0 /\/0\n/g' \| grep burncpu \| grep dso.so \| head -n 1 0x7f967139b6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0 Using the existing "brstacksym,dso" is a little better, because the symbol plus offset and dso name *does* uniquely identify a branch/target in the binary. Ultimately, however, AutoFDO wants a simple offset into the binary, so we'd have to undo all the work perf did to symbolize in the first place: $ perf script -F brstacksym,dso \| sed 's/\/0 /\/0\n/g' \| grep burncpu \| grep dso.so \| head -n 1 do_nothing+0x5(/tmp/burncpu/dso.so)/main+0x44(/tmp/burncpu/exe)/P/-/-/0 With the new "brstackoff,dso" we get what we need: a simple offset into a specific dso/binary that uniquely identifies a branch/target: $ perf script -F brstackoff,dso \| sed 's/\/0 /\/0\n/g' \| grep burncpu \| grep dso.so \| head -n 1 0x6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0 Signed-off-by: Mark Santaniello <marksan@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170619163825.2012979-2-marksan@fb.com [ Updated documentation about 'brstackoff' using text from above ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-19 22:05:46 -03:00
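The relationship between the "brstack" and "brstackoff" values in the commit above can be illustrated with a minimal Python sketch. It assumes the dso was mapped at 0x7f967139b000 with a zero file offset, which is consistent with the sample output but is not stated in the commit; the helper name `dso_offset` and its parameters are hypothetical and are not perf's API.

```python
def dso_offset(addr: int, map_start: int, map_pgoff: int = 0) -> int:
    """Convert an absolute (ASLR-affected) address into an offset within a dso.

    addr      -- sampled absolute address
    map_start -- address where the dso mapping begins (assumed known)
    map_pgoff -- file offset of the mapping (assumed 0 here)
    """
    if addr < map_start:
        raise ValueError("address lies below the start of the mapping")
    return addr - map_start + map_pgoff


# Absolute address taken from the "brstack,dso" sample output.
sampled = 0x7F967139B6AA
# Assumption: the mapping start is inferred from the example, not given in it.
assumed_map_start = 0x7F967139B000

print(hex(dso_offset(sampled, assumed_map_start)))
```

Because the offset no longer depends on where the loader placed the dso, the same branch yields the same value on every host, which is the property the commit says AutoFDO needs.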
Andi Kleen	36ce565114	perf script: Allow adding and removing fields With 'perf script' it is common that we just want to add or remove a field. Currently this requires figuring out the long list of default fields and specifying them first, and then adding/removing the new field. This patch adds a new + - syntax to merely add or remove fields, that allows more succinct and clearer command lines. For example to remove the comm field from PMU samples: Previously $ perf script -F tid,cpu,time,event,sym,ip,dso,period \| head -1 swapper 0 [000] 504345.383126: 1 cycles: ffffffff90060c66 native_write_msr ([kernel.kallsyms]) with the new syntax perf script -F -comm \| head -1 0 [000] 504345.383126: 1 cycles: ffffffff90060c66 native_write_msr ([kernel.kallsyms]) The new syntax cannot be mixed with normal overriding. v2: Fix example in description. Use tid vs pid. No functional changes. v3: Don't skip initialization when user specified explicit type. v4: Rebase. Remove empty line. Committer testing: # perf record -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.748 MB perf.data (14 samples) ] Without an explicit field list specified via -F, defaults to: # perf script \| head -2 perf 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) swapper 0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) # Which is equivalent to: # perf script -F comm,tid,cpu,time,period,event,ip,sym,dso \| head -2 perf 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) swapper 0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) # So if we want to remove the comm, as in your original example, we would have to figure out the default field list and remove ' comm' from it: # perf script -F tid,cpu,time,period,event,ip,sym,dso \| head -2 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) 0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) # With your patch this becomes simpler, one can remove fields by prefixing them with '-': # perf script -F -comm \| head -2 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) 0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux) # Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Milian Wolff <milian.wolff@kdab.com> Link: http://lkml.kernel.org/r/20170602154810.15875-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-19 15:14:58 -03:00
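The +/- field semantics described in the commit above can be modelled with a short Python sketch. It is not perf's parser: the function name `resolve_fields` is hypothetical, the default list is the one shown in the committer testing, and the rule that every token must carry a prefix once any does is an assumption drawn from the statement that the new syntax cannot be mixed with normal overriding.

```python
DEFAULT_FIELDS = ["comm", "tid", "cpu", "time", "period",
                  "event", "ip", "sym", "dso"]


def resolve_fields(spec, defaults=DEFAULT_FIELDS):
    """Return the field list selected by a -F style specification.

    Plain names override the defaults; '+name' appends to the defaults
    and '-name' removes from them. Mixing the two styles is rejected.
    """
    tokens = [tok for tok in spec.split(",") if tok]
    if not tokens:
        return list(defaults)

    prefixed = [tok[0] in "+-" for tok in tokens]
    if any(prefixed) and not all(prefixed):
        raise ValueError("cannot mix +/- with a plain override list")
    if not any(prefixed):
        return tokens  # normal overriding

    fields = list(defaults)
    for tok in tokens:
        name = tok[1:]
        if tok[0] == "+" and name not in fields:
            fields.append(name)
        elif tok[0] == "-" and name in fields:
            fields.remove(name)
    return fields


print(",".join(resolve_fields("-comm")))
```

With this model, `-comm` resolves to the same list that previously had to be typed out in full, which is the convenience the commit describes.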
SeongJae Park	14fc42fa1b	perf script python: Remove dups in documentation examples A few shell command examples in perf-script-python.txt have minor problems, including: - the tools/perf/scripts/python directory listing command is unnecessarily repeated. - a few examples contain additional information in the command prompt unnecessarily and inconsistently. This commit fixes them to enhance readability of the document. Signed-off-by: SeongJae Park <sj38.park@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Zanussi <tzanussi@gmail.com> Fixes: `cff68e5822` ("perf/scripts: Add perf-trace-python Documentation") Link: http://lkml.kernel.org/r/20170530111827.21732-4-sj38.park@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-07 20:36:12 -03:00
SeongJae Park	1bf8d5a4a5	perf script python: Updated trace_unhandled() signature The default function signature of trace_unhandled() was changed to include a field dict, but its documentation, perf-script-python.txt, has not been updated. Fix it. Signed-off-by: SeongJae Park <sj38.park@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pierre Tardy <tardyp@gmail.com> Fixes: `c02514850d` ("perf scripts python: Give field dict to unhandled callback") Link: http://lkml.kernel.org/r/20170530111827.21732-6-sj38.park@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-07 20:27:32 -03:00
SeongJae Park	26ddb8722d	perf script python: Fix wrong code snippets in documentation This commit fixes wrong code snippets for trace_begin() and trace_end() function example definition. Signed-off-by: SeongJae Park <sj38.park@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Zanussi <tzanussi@gmail.com> Fixes: `cff68e5822` ("perf/scripts: Add perf-trace-python Documentation") Link: http://lkml.kernel.org/r/20170530111827.21732-5-sj38.park@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-07 20:27:26 -03:00
SeongJae Park	34d4453dac	perf script: Fix documentation errors This commit fixes two errors in documents for perf-script-python and perf-script-perl as below: - /sys/kernel/debug/tracing events -> /sys/kernel/debug/tracing/events/ - trace_handled -> trace_unhandled Signed-off-by: SeongJae Park <sj38.park@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Zanussi <tzanussi@gmail.com> Fixes: `cff68e5822` ("perf/scripts: Add perf-trace-python Documentation") Link: http://lkml.kernel.org/r/20170530111827.21732-3-sj38.park@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-07 20:27:20 -03:00
SeongJae Park	d89269a89e	perf probe: Fix examples section of documentation An example in the perf-probe documentation for adding probes by function name pattern does not provide an example command for that case. This commit fixes the example to give an appropriate example command. Signed-off-by: SeongJae Park <sj38.park@gmail.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Fixes: `ee391de876` ("perf probe: Update perf probe document") Link: http://lkml.kernel.org/r/20170507103642.30560-1-sj38.park@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-07 20:23:11 -03:00
Namhyung Kim	325fbff51f	perf script: Add --inline option for debugging The --inline option is to show inlined functions in callchains. For example: $ perf script a.out 5644 11611.467597: 309961 cycles:u: 790 main (/home/namhyung/tmp/perf/a.out) 20511 __libc_start_main (/usr/lib/libc-2.25.so) 8ba _start (/home/namhyung/tmp/perf/a.out) ... $ perf script --inline a.out 5644 11611.467597: 309961 cycles:u: 790 main (/home/namhyung/tmp/perf/a.out) std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > main 20511 __libc_start_main (/usr/lib/libc-2.25.so) 8ba _start (/home/namhyung/tmp/perf/a.out) ... Reviewed-and-tested-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170524062129.32529-5-namhyung@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-05-24 08:41:48 +02:00
Kim Phillips	1291927a49	perf tools: Fix spelling mistakes Mostly in the documentation. Signed-off-by: Kim Phillips <kim.phillips@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170503131350.cebeecd8bd0f2968417626ab@arm.com [ Fix spelling of "parameter" in one of the spell-checked lines ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-05-04 09:59:53 -03:00
Ravi Bangoria	739cf30551	perf trace: Add usage of --no-syscalls in man page perf trace supports --no-syscalls option but it's not listed in the man page. (Though, I see an example using --no-syscalls in EXAMPLES section.) Committer note: The --no-syscalls option tells 'perf trace' not to automagically ask for raw_syscalls:sys_{enter,exit} to then format it in a strace like way. This became more widely used as 'perf trace' got support for arbitrary events, such as tracepoints, so more and more we use: # perf trace --no-syscalls -e nmi:* 0.000 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 36649 handled: 1) 0.019 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 2907 handled: 0) 0.676 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 9401 handled: 1) 0.680 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 288 handled: 0) 0.701 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 4977 handled: 1) 0.703 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 67 handled: 0) 0.736 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 8549 handled: 1) ^C# Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1492063332-5745-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-04-13 10:54:04 -03:00
David Carrillo-Cisneros	6d13491e2d	perf tools: Describe pipe mode in perf.data-file-format.txt Add a minimal description of pipe's data format. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170410201432.24807-4-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-04-11 15:23:41 -03:00
Milian Wolff	5dfa210e40	perf report: Enable sorting by srcline as key Often it is interesting to know how costly a given source line is in total. Previously, one had to build these sums manually based on all addresses that pointed to the same source line. This patch introduces srcline as a sort key, which will do the aggregation for us. Paired with the recent addition of showing inline frames, this makes perf report much more useful for many C++ work loads. The following shows the new feature in action. First, let's show the status quo output when we sort by address. The result contains many hist entries that generate the same output: ~~~~~~~~~~~~~~~~ $ perf report --stdio --inline -g address # Children Self Command Shared Object Symbol # ........ ........ ............ ................... ......................................... # 99.89% 35.34% cpp-inlining cpp-inlining [.] main \| \|--64.55%--main complex:655 \| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) \| /usr/include/c++/6.3.1/complex:664 (inline) \| \| \| \|--60.31%--hypot +20 \| \| \| \| \| \|--8.52%--__hypot_finite +273 \| \| \| \| \| \|--7.32%--__hypot_finite +411 ... 
--35.34%--_start +4194346 __libc_start_main +241 \| \|--6.65%--main random.tcc:3326 \| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) \| /usr/include/c++/6.3.1/bits/random.h:1809 (inline) \| /usr/include/c++/6.3.1/bits/random.h:1818 (inline) \| /usr/include/c++/6.3.1/bits/random.h:185 (inline) \| \|--2.70%--main random.tcc:3326 \| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) \| /usr/include/c++/6.3.1/bits/random.h:1809 (inline) \| /usr/include/c++/6.3.1/bits/random.h:1818 (inline) \| /usr/include/c++/6.3.1/bits/random.h:185 (inline) \| \|--1.69%--main random.tcc:3326 \| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) \| /usr/include/c++/6.3.1/bits/random.h:1809 (inline) \| /usr/include/c++/6.3.1/bits/random.h:1818 (inline) \| /usr/include/c++/6.3.1/bits/random.h:185 (inline) ... ~~~~~~~~~~~~~~~~ With this patch and `-g srcline` we instead get the following output: ~~~~~~~~~~~~~~~~ $ perf report --stdio --inline -g srcline # Children Self Command Shared Object Symbol # ........ ........ ............ ................... ......................................... # 99.89% 35.34% cpp-inlining cpp-inlining [.] main \| \|--64.55%--main complex:655 \| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) \| /usr/include/c++/6.3.1/complex:664 (inline) \| \| \| \|--64.02%--hypot \| \| \| \| \| --59.81%--__hypot_finite \| \| \| --0.53%--cabs \| --35.34%--_start __libc_start_main \| \|--12.48%--main random.tcc:3326 \| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) \| /usr/include/c++/6.3.1/bits/random.h:1809 (inline) \| /usr/include/c++/6.3.1/bits/random.h:1818 (inline) \| /usr/include/c++/6.3.1/bits/random.h:185 (inline) ... 
~~~~~~~~~~~~~~~~ Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Yao Jin <yao.jin@linux.intel.com> Link: http://lkml.kernel.org/r/20170318214928.9047-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-03-27 12:13:28 -03:00
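The aggregation that the srcline sort key performs can be sketched in Python. This is a simplified model rather than perf's histogram code: the function name `aggregate_by_srcline` is hypothetical, and the input is limited to the three `random.tcc:3326` entries visible in the address-sorted output above, so the sum is lower than the 12.48% reported with `-g srcline`, which also covers entries elided by "...".

```python
from collections import defaultdict


def aggregate_by_srcline(entries):
    """Sum per-address costs into one total per source line.

    entries -- iterable of (srcline, cost_percent) pairs, one pair per
               address-level entry.
    """
    totals = defaultdict(float)
    for srcline, cost in entries:
        totals[srcline] += cost
    return dict(totals)


# Address-level entries copied from the "-g address" output; each one
# resolves to the same source line but was reported separately.
by_address = [
    ("random.tcc:3326", 6.65),
    ("random.tcc:3326", 2.70),
    ("random.tcc:3326", 1.69),
]

for srcline, total in aggregate_by_srcline(by_address).items():
    print(f"{total:.2f}%  {srcline}")
```

This is the manual summing the commit says was previously required; with srcline as a key, perf report does the grouping itself.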
Jin Yao	f3a60646cc	perf report: Introduce --inline option It takes some time to look up the inline stack for callgraph addresses, so this patch provides a new option, "--inline", to let the user decide whether to enable this feature. --inline: If a callgraph address belongs to an inlined function, the inline stack will be printed. Each entry is the inline function name or file/line. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Tested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/1490474069-15823-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-03-27 12:01:46 -03:00

641 Commits