linux-sg2042/tools/perf
Jin Yao cebf7d51a6 perf diff: Report noisy for cycles diff
This patch prints the stddev and a histogram for the cycles diff of each
program block. This helps us understand whether the cycles are noisy.

This patch is inspired by Andi Kleen's patch:

  https://lwn.net/Articles/600471/

We add a new option, '--cycles-hist'.

Example:

  perf record -b ./div
  perf record -b ./div
  perf diff -c cycles

  # Baseline                                [Program Block Range] Cycles Diff  Shared Object      Symbol
  # ........  .......................................................... ....  .................  ............................
  #
      46.72%                                      [div.c:40 -> div.c:40]    0  div                [.] main
      46.72%                                      [div.c:42 -> div.c:44]    0  div                [.] main
      46.72%                                      [div.c:42 -> div.c:39]    0  div                [.] main
      20.54%                          [random_r.c:357 -> random_r.c:394]    1  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:357 -> random_r.c:380]    0  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:388]    0  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:391]    0  libc-2.27.so       [.] __random_r
      17.04%                              [random.c:288 -> random.c:291]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:291 -> random.c:291]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:293 -> random.c:293]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:298 -> random.c:298]    0  libc-2.27.so       [.] __random
       8.40%                                      [div.c:22 -> div.c:25]    0  div                [.] compute_flag
       8.40%                                      [div.c:27 -> div.c:28]    0  div                [.] compute_flag
       5.14%                                    [rand.c:26 -> rand.c:27]    0  libc-2.27.so       [.] rand
       5.14%                                    [rand.c:28 -> rand.c:28]    0  libc-2.27.so       [.] rand
       2.15%                                  [rand@plt+0 -> rand@plt+0]    0  div                [.] rand@plt
       0.00%                                                                   [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
       0.00%                                [do_mmap+714 -> do_mmap+732]  -10  [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+737 -> do_mmap+765]    1  [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+262 -> do_mmap+299]    0  [kernel.kallsyms]  [k] do_mmap
       0.00%  [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0]    7  [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
       0.00%            [native_sched_clock+0 -> native_sched_clock+119]   -1  [kernel.kallsyms]  [k] native_sched_clock
       0.00%                 [native_write_msr+0 -> native_write_msr+16]  -13  [kernel.kallsyms]  [k] native_write_msr

With the option '--cycles-hist' enabled, the output is:

  perf diff -c cycles --cycles-hist

  # Baseline                                [Program Block Range] Cycles Diff        stddev/Hist  Shared Object      Symbol
  # ........  .......................................................... ....  .................  .................  ............................
  #
      46.72%                                      [div.c:40 -> div.c:40]    0  ± 37.8% ▁█▁▁██▁█   div                [.] main
      46.72%                                      [div.c:42 -> div.c:44]    0  ± 49.4% ▁▁▂█▂▂▂▂   div                [.] main
      46.72%                                      [div.c:42 -> div.c:39]    0  ± 24.1% ▃█▂▄▁▃▂▁   div                [.] main
      20.54%                          [random_r.c:357 -> random_r.c:394]    1  ± 33.5% ▅▂▁█▃▁▂▁   libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:357 -> random_r.c:380]    0  ± 39.4% ▁▁█▁██▅▁   libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:388]    0                     libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:391]    0  ± 41.2% ▁▃▁▂█▄▃▁   libc-2.27.so       [.] __random_r
      17.04%                              [random.c:288 -> random.c:291]    0  ± 48.8% ▁▁▁▁███▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:291 -> random.c:291]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:293 -> random.c:293]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0                     libc-2.27.so       [.] __random
      17.04%                              [random.c:298 -> random.c:298]    0  ± 75.6% ▃█▁▁▁▁▁▁   libc-2.27.so       [.] __random
       8.40%                                      [div.c:22 -> div.c:25]    0  ± 42.1% ▁▃▁▁███▁   div                [.] compute_flag
       8.40%                                      [div.c:27 -> div.c:28]    0  ± 41.8% ██▁▁▄▁▁▄   div                [.] compute_flag
       5.14%                                    [rand.c:26 -> rand.c:27]    0  ± 37.8% ▁▁▁████▁   libc-2.27.so       [.] rand
       5.14%                                    [rand.c:28 -> rand.c:28]    0                     libc-2.27.so       [.] rand
       2.15%                                  [rand@plt+0 -> rand@plt+0]    0                     div                [.] rand@plt
       0.00%                                                                                      [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
       0.00%                                [do_mmap+714 -> do_mmap+732]  -10                     [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+737 -> do_mmap+765]    1                     [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+262 -> do_mmap+299]    0                     [kernel.kallsyms]  [k] do_mmap
       0.00%  [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0]    7                     [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
       0.00%            [native_sched_clock+0 -> native_sched_clock+119]   -1  ± 38.5% ▄█▁        [kernel.kallsyms]  [k] native_sched_clock
       0.00%                 [native_write_msr+0 -> native_write_msr+16]  -13  ± 47.1% ▁█▇▃▁▁     [kernel.kallsyms]  [k] native_write_msr
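
The Hist half of the new column can be pictured as a sparkline: each sample is scaled between the smallest and largest value and drawn as one of eight block characters. The following is a minimal sketch under that assumption, not the code from this patch. The names spark() and ticks[] are hypothetical, and the linear min/max scaling is chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SPARK_LEVELS 8

/* UTF-8 block characters, lowest to highest; each one is 3 bytes long. */
static const char *const ticks[SPARK_LEVELS] = {
	"▁", "▂", "▃", "▄", "▅", "▆", "▇", "█"
};

/*
 * Hypothetical helper: write one tick per value into buf.
 * Values are scaled linearly between the smallest and largest sample.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or 0 if there are no values or buf is too small.
 * Note: very large values could overflow the multiplication below; a
 * real implementation would need to guard against that.
 */
static size_t spark(const long *vals, size_t n, char *buf, size_t len)
{
	long min, max;
	size_t i, off = 0;

	assert(vals != NULL && buf != NULL);

	if (n == 0 || len < n * 3 + 1)
		return 0;

	min = max = vals[0];
	for (i = 1; i < n; i++) {
		if (vals[i] < min)
			min = vals[i];
		if (vals[i] > max)
			max = vals[i];
	}

	for (i = 0; i < n; i++) {
		/* A flat series maps every value to the lowest tick. */
		size_t lvl = (max == min) ? 0 :
			(size_t)((vals[i] - min) * (SPARK_LEVELS - 1) / (max - min));

		memcpy(buf + off, ticks[lvl], 3);
		off += 3;
	}
	buf[off] = '\0';
	return off;
}
```

For the samples {0, 7, 0, 0} this sketch writes "▁█▁▁" into buf, the same shape as the single-spike rows in the output above.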

 v8:
 ---
 Rebase to perf/core branch

 v7:
 ---
 1. v6 got Jiri's ACK.
 2. Rebase to latest perf/core branch.

 v6:
 ---
 1. Jiri provided better code for using data__hpp_register() in ui_init().
    Use this code in v6.

 v5:
 ---
 1. Refine the use of data__hpp_register() in ui_init() according to
    Jiri's suggestion.

 v4:
 ---
 1. Rename the new option from '--noisy' to '--cycles-hist'
 2. Remove the option '-n'.
 3. Only update the spark value and stats when '--cycles-hist' is enabled.
 4. Remove the code that prints '..'.

 v3:
 ---
 1. Move the histogram to a separate column
 2. Move the svals[] out of struct stats

 v2:
 ---
 Jiri got a compile error:

  CC       builtin-diff.o
  builtin-diff.c: In function ‘compute_cycles_diff’:
  builtin-diff.c:712:10: error: taking the absolute value of unsigned type ‘u64’ {aka ‘long unsigned int’} has no effect [-Werror=absolute-value]
  712 |          labs(pair->block_info->cycles_spark[i] -
      |          ^~~~

 This happens because the result of u64 - u64 is still u64. We now change
 the type of cycles_spark[] to s64.
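
The effect behind that warning can be reproduced in isolation. The sketch below is not taken from builtin-diff.c. The function names are hypothetical, the u64/s64 typedefs are local stand-ins for the kernel types, and llabs() is used instead of labs() so the example does not depend on the width of long.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Local stand-ins for the kernel's u64/s64 types. */
typedef uint64_t u64;
typedef int64_t s64;

/* u64 - u64 is evaluated modulo 2^64, so the result is never negative. */
static u64 cycles_delta_u64(u64 a, u64 b)
{
	return a - b;
}

/* With s64 operands the difference keeps its sign, so llabs() does real work. */
static s64 cycles_abs_delta_s64(s64 a, s64 b)
{
	return llabs(a - b);
}

/* Self-check: 3 - 5 wraps around for u64 but gives a distance of 2 for s64. */
static void check_delta(void)
{
	assert(cycles_delta_u64(3, 5) == UINT64_MAX - 1);
	assert(cycles_abs_delta_s64(3, 5) == 2);
}
```

Taking the absolute value of the unsigned result would return the wrapped value unchanged, which is why the compiler reports that the call has no effect.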

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20190925011446.30678-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-11 10:57:00 -03:00
Documentation perf diff: Report noisy for cycles diff 2019-10-11 10:57:00 -03:00
arch libperf: Adopt perf_mmap__read_event() from tools/perf 2019-10-10 11:49:46 -03:00
bench perf env: Remove needless cpumap.h header 2019-09-20 09:19:21 -03:00
examples/bpf perf augmented_raw_syscalls: Reduce perf_event_output() boilerplate 2019-08-26 11:58:29 -03:00
include/bpf perf include bpf: Add bpf_tail_call() prototype 2019-07-29 18:34:40 -03:00
jvmti perf jvmti: Link against tools/lib/string.o to have weak strlcpy() 2019-09-20 09:18:45 -03:00
lib perf tools: Propagate CFLAGS to libperf 2019-10-11 10:55:22 -03:00
pmu-events perf jevents: Fix period for Intel fixed counters 2019-09-30 17:29:53 -03:00
python treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 407 2019-06-05 17:37:14 +02:00
scripts perf scripts python: exported-sql-viewer.py: Add Time chart by CPU 2019-10-07 12:22:17 -03:00
tests libperf: Adopt perf_mmap__read_event() from tools/perf 2019-10-10 11:49:46 -03:00
trace perf beauty: Introduce strtoul() for x86 MSRs 2019-10-09 16:25:02 -03:00
ui libperf: Add perf_evlist__first()/last() functions 2019-09-25 09:51:48 -03:00
util perf diff: Report noisy for cycles diff 2019-10-11 10:57:00 -03:00
.gitignore perf: Update .gitignore file 2019-08-31 22:27:52 -03:00
Build perf tools: Rename build libperf to perf 2019-02-14 15:18:08 -03:00
CREDITS
MANIFEST tools lib: Adopt zalloc()/zfree() from tools/perf 2019-07-09 10:13:26 -03:00
Makefile perf tools: Disable parallelism for 'make clean' 2018-08-20 08:54:58 -03:00
Makefile.config perf tools: Propagate CFLAGS to libperf 2019-10-11 10:55:22 -03:00
Makefile.perf perf tools: Propagate CFLAGS to libperf 2019-10-11 10:55:22 -03:00
builtin-annotate.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-bench.c perf tools: Remove perf.h from source files not needing it 2019-08-29 17:38:32 -03:00
builtin-buildid-cache.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-buildid-list.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-c2c.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-config.c perf tools: Remove util.h from where it is not needed 2019-09-20 09:19:20 -03:00
builtin-data.c perf debug: Remove needless include directives from debug.h 2019-08-31 19:10:19 -03:00
builtin-diff.c perf diff: Report noisy for cycles diff 2019-10-11 10:57:00 -03:00
builtin-evlist.c perf evsel: Introduce evsel_fprintf.h 2019-09-25 16:26:34 -03:00
builtin-ftrace.c perf auxtrace: Uninline functions that touch perf_session 2019-08-31 22:24:10 -03:00
builtin-help.c perf debug: Remove needless include directives from debug.h 2019-08-31 19:10:19 -03:00
builtin-inject.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-kallsyms.c perf dsos: Move the dsos struct and its methods to separate source files 2019-08-31 22:24:10 -03:00
builtin-kmem.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-kvm.c libperf: Adopt perf_mmap__read_event() from tools/perf 2019-10-10 11:49:46 -03:00
builtin-list.c perf list: Allow plurals for metric, metricgroup 2019-09-25 09:51:42 -03:00
builtin-lock.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-mem.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-probe.c perf probe: No need for symbol.h, symbol_conf is enough 2019-08-31 22:24:10 -03:00
builtin-record.c libperf: Adopt perf_mmap__put() function from tools/perf 2019-10-10 10:09:25 -03:00
builtin-report.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-sched.c perf evsel: Introduce evsel_fprintf.h 2019-09-25 16:26:34 -03:00
builtin-script.c perf script: Allow --time with --reltime 2019-10-07 12:22:18 -03:00
builtin-stat.c libperf: Move 'sample_id' from 'struct evsel' to 'struct perf_evsel' 2019-09-25 09:51:47 -03:00
builtin-timechart.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-top.c libperf: Adopt perf_mmap__read_event() from tools/perf 2019-10-10 11:49:46 -03:00
builtin-trace.c libperf: Adopt perf_mmap__read_event() from tools/perf 2019-10-10 11:49:46 -03:00
builtin-version.c perf symbols: Move mem_info and branch_info out of symbol.h 2019-08-31 22:27:48 -03:00
builtin.h perf tools: Remove needless util.h include from builtin.h 2019-08-28 17:19:34 -03:00
check-headers.sh tools arch x86: Grab a copy of the file containing the MSR numbers 2019-10-07 12:22:18 -03:00
command-list.txt perf help: Add missing subcommand `version` 2018-09-19 14:53:36 -03:00
design.txt perf/doc: Update design.txt for exclude_{host|guest} flags 2019-01-21 11:01:18 +01:00
perf-archive.sh License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
perf-completion.sh perf tools: Auto-complete for events with ':' 2017-12-27 12:16:00 -03:00
perf-read-vdso.c perf tools: Make find_vdso_map() more modular 2019-01-08 13:28:13 -03:00
perf-sys.h perf tools: Make usage of test_attr__* optional for perf-sys.h 2019-10-07 12:22:17 -03:00
perf-with-kcore.sh Merge branch 'x86/cpu' into perf/core, to pick up dependent changes 2019-06-17 12:29:16 +02:00
perf.c libperf: Merge libperf_set_print() into libperf_init() 2019-09-25 09:51:49 -03:00
perf.h perf time-utils: Adopt rdclock() from perf.h 2019-08-29 17:38:32 -03:00