New features:
- Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)
E.g.:
# perf report -s symbol_size,symbol
Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
Overhead Symbol size Symbol
14.55% 326 [k] flush_tlb_mm_range
7.20% 1045 [k] filemap_map_pages
5.82% 124 [k] vma_interval_tree_insert
5.18% 2430 [k] unmap_page_range
2.57% 571 [k] vma_interval_tree_remove
1.94% 494 [k] page_add_file_rmap
1.82% 740 [k] page_remove_rmap
1.66% 1017 [k] release_pages
1.57% 1636 [k] update_blocked_averages
1.57% 76 [k] unlock_page
- Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)
Change in behaviour:
- Make system wide (-a) the default option if no target was specified and one
of following conditions is met:
- No workload specified (current behaviour)
- A workload is specified but all requested events are system wide ones,
like uncore ones. (Jiri Olsa)
Fixes:
- Add missing initialization to the instruction decoder used in the
intel PT/BTS code, which was causing lots of failures in 'perf test',
looking for a value when there was none (Adrian Hunter)
Infrastructure:
- Add arch code needed to adopt the kernel's refcount_t to aid in
catching bugs when using atomic_t as a reference counter, basically
cmpxchg related functions (Arnaldo Carvalho de Melo)
- Convert the code using atomic_t as reference counts to refcount_t
(Elena Rashetova)
- Add feature test for sched_getcpu() to more easily check for its
presence in the many libc implementations and accross different
versions of such C libraries (Arnaldo Carvalho de Melo)
- Issue a HW watchdog disable hint in 'perf stat' for when some of the
requested events can't get counted because a PMU counter is taken by that
watchdog (Borislav Petkov).
- Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)
Documentation:
- Clarify the term 'convergence' in:
perf bench numa numa-mem -h --show_convergence (Jiri Olsa)
Kernel code:
- Ensure probe location is at function entry in kretprobes (Naveen N. Rao)
- Allow return probes with offsets and absolute addresses (Naveen N. Rao)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYvbmTAAoJENZQFvNTUqpAN80QAJ2ETcTosR9fo06VrT2HqRT4
+iGe55wSu261TOekIkXOEW+ww321eNPfy4rIZeLCEFcCd9p03n5JceVbFnOjBuAz
Lk6jrKpaH+Ajp56nCLyWH4r3LYLJXdoIydNay4PZ08rl0GgagGqvevD8ZZCEO0sx
vjD1TFH2uSOq3UTxKapO++FHwhy+XqZ5S5I+rMuLxg6Qi+rLubXDztzIlcCfQPGx
g+zFkaJ/ms9TAtWK25xoj34QXsaqpBsF8qkCE1P8Zdjtnkp6zM2Rx3HvvbRDmgVx
/h0b1iua5IVElgnai/84ttJG3Bi6ovRbf/PFy+IceM4Qfx0eQeWmA3CAtcGOh9Gv
GTDCcJ7xWZBpM0g1wCk3ks2oApFTA6GkcnIt5alhTse5U3gNmImv3uvuN8d265KL
2oGKps7MH1nWMgpL4G4BNuZg2oqmM/uX9ERiuNjtCqj6WoHy2QSDyEMJN5Od3lYj
ar2PPGofHmiacsW3NNMT+LwQ/wL/d2dVfZTopMafeaxRDGTxdqkhLkB/sZT/wexQ
ySVijQPO+x0eLSIK/BWdEmD8K6JiYGdpIRDWVW+D043I7iiXFvPgui1bFABJEqn5
mZFa1qT4EQMuuSaLkkxtoOrdoF6YzJJA57sIx2IrouGDapJ2BDegiUOfE0PSp5l0
oeRuFcJYfpITC/TzntE3
=h2yo
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-4.11-20170306' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)
E.g.:
# perf report -s symbol_size,symbol
Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
Overhead Symbol size Symbol
14.55% 326 [k] flush_tlb_mm_range
7.20% 1045 [k] filemap_map_pages
5.82% 124 [k] vma_interval_tree_insert
5.18% 2430 [k] unmap_page_range
2.57% 571 [k] vma_interval_tree_remove
1.94% 494 [k] page_add_file_rmap
1.82% 740 [k] page_remove_rmap
1.66% 1017 [k] release_pages
1.57% 1636 [k] update_blocked_averages
1.57% 76 [k] unlock_page
- Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)
Change in behaviour:
- Make system wide (-a) the default option if no target was specified and one
of following conditions is met:
- No workload specified (current behaviour)
- A workload is specified but all requested events are system wide ones,
like uncore ones. (Jiri Olsa)
Fixes:
- Add missing initialization to the instruction decoder used in the
intel PT/BTS code, which was causing lots of failures in 'perf test',
looking for a value when there was none (Adrian Hunter)
Infrastructure changes:
- Add arch code needed to adopt the kernel's refcount_t to aid in
catching bugs when using atomic_t as a reference counter, basically
cmpxchg related functions (Arnaldo Carvalho de Melo)
- Convert the code using atomic_t as reference counts to refcount_t
(Elena Rashetova)
- Add feature test for sched_getcpu() to more easily check for its
presence in the many libc implementations and accross different
versions of such C libraries (Arnaldo Carvalho de Melo)
- Issue a HW watchdog disable hint in 'perf stat' for when some of the
requested events can't get counted because a PMU counter is taken by that
watchdog (Borislav Petkov).
- Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)
Documentation changes:
- Clarify the term 'convergence' in:
perf bench numa numa-mem -h --show_convergence (Jiri Olsa)
Kernel code changes:
- Ensure probe location is at function entry in kretprobes (Naveen N. Rao)
- Allow return probes with offsets and absolute addresses (Naveen N. Rao)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>