perf tools fixes for v5.15: 2nd batch

- Fix 'perf test' DWARF unwind for optimized builds.

- Fix 'perf test' 'Object code reading' when dealing with samples in @plt symbols.

- Fix off-by-one directory paths in the ARM support code.

- Fix error message to eliminate confusion in 'perf config' when first creating a config file.

- 'perf iostat' fix for system wide operation.

- Fix printing of metrics when 'perf iostat' is used with one or more iio_root_ports and unconnected cpus (using -C).

- Fix several typos in the documentation files.

- Fix spelling mistake "icach" -> "icache" in the power8 JSON vendor files.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----

iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYVIO3wAKCRCyPKLppCJ+
J9piAP4jmxYEnimD6qvVHjOLio2LvwGI0u7MakZCHWVKQZKHbgEArb8l3+D2+YXw
U7RxDmXoSE+0EjTV8o13sQlerRTU3wM=
=oVI7
-----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.15-2021-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull more perf tools fixes from Arnaldo Carvalho de Melo (the same eight fixes listed in the tag message above).

* tag 'perf-tools-fixes-for-v5.15-2021-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf iostat: Fix Segmentation fault from NULL 'struct perf_counts_values *'
  perf iostat: Use system-wide mode if the target cpu_list is unspecified
  perf config: Refine error message to eliminate confusion
  perf doc: Fix typos all over the place
  perf arm: Fix off-by-one directory paths.
  perf vendor events powerpc: Fix spelling mistake "icach" -> "icache"
  perf tests: Fix flaky test 'Object code reading'
  perf test: Fix DWARF unwind for optimized builds.

commit 0513e464f9

@@ -164,7 +164,7 @@ const char unwinding_data[n]: an array of unwinding data, consisting of the EH F
 The EH Frame header follows the Linux Standard Base (LSB) specification as described in the document at https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/ehframehdr.html
 
 
-The EH Frame follows the LSB specicfication as described in the document at https://refspecs.linuxbase.org/LSB_3.0.0/LSB-PDA/LSB-PDA/ehframechpt.html
+The EH Frame follows the LSB specification as described in the document at https://refspecs.linuxbase.org/LSB_3.0.0/LSB-PDA/LSB-PDA/ehframechpt.html
 
 
 NOTE: The mapped_size is generally either the same as unwind_data_size (if the unwinding data was mapped in memory by the running process) or zero (if the unwinding data is not mapped by the process). If the unwinding data was not mapped, then only the EH Frame Header will be read, which can be used to specify FP based unwinding for a function which does not have unwinding information.
@@ -261,7 +261,7 @@ COALESCE
 User can specify how to sort offsets for cacheline.
 
 Following fields are available and governs the final
-output fields set for caheline offsets output:
+output fields set for cacheline offsets output:
 
 tid - coalesced by process TIDs
 pid - coalesced by process PIDs
@@ -883,7 +883,7 @@ and "r" can be combined to get calls and returns.
 
 "Transactions" events correspond to the start or end of transactions. The
 'flags' field can be used in perf script to determine whether the event is a
-tranasaction start, commit or abort.
+transaction start, commit or abort.
 
 Note that "instructions", "branches" and "transactions" events depend on code
 flow packets which can be disabled by using the config term "branch=0". Refer
@@ -44,7 +44,7 @@ COMMON OPTIONS
 
 -f::
 --force::
-Don't complan, do it.
+Don't complain, do it.
 
 REPORT OPTIONS
 --------------
@@ -54,7 +54,7 @@ all sched_wakeup events in the system:
 Traces meant to be processed using a script should be recorded with
 the above option: -a to enable system-wide collection.
 
-The format file for the sched_wakep event defines the following fields
+The format file for the sched_wakeup event defines the following fields
 (see /sys/kernel/debug/tracing/events/sched/sched_wakeup/format):
 
 ----
@@ -448,7 +448,7 @@ all sched_wakeup events in the system:
 Traces meant to be processed using a script should be recorded with
 the above option: -a to enable system-wide collection.
 
-The format file for the sched_wakep event defines the following fields
+The format file for the sched_wakeup event defines the following fields
 (see /sys/kernel/debug/tracing/events/sched/sched_wakeup/format):
 
 ----
@@ -385,7 +385,7 @@ Aggregate counts per physical processor for system-wide mode measurements.
 Print metrics or metricgroups specified in a comma separated list.
 For a group all metrics from the group are added.
 The events from the metrics are automatically measured.
-See perf list output for the possble metrics and metricgroups.
+See perf list output for the possible metrics and metricgroups.
 
 -A::
 --no-aggr::
@@ -2,7 +2,7 @@ Using TopDown metrics in user space
 -----------------------------------
 
 Intel CPUs (since Sandy Bridge and Silvermont) support a TopDown
-methology to break down CPU pipeline execution into 4 bottlenecks:
+methodology to break down CPU pipeline execution into 4 bottlenecks:
 frontend bound, backend bound, bad speculation, retiring.
 
 For more details on Topdown see [1][5]
@@ -8,10 +8,10 @@
 #include <linux/coresight-pmu.h>
 #include <linux/zalloc.h>
 
-#include "../../util/auxtrace.h"
-#include "../../util/debug.h"
-#include "../../util/evlist.h"
-#include "../../util/pmu.h"
+#include "../../../util/auxtrace.h"
+#include "../../../util/debug.h"
+#include "../../../util/evlist.h"
+#include "../../../util/pmu.h"
 #include "cs-etm.h"
 #include "arm-spe.h"
 
@@ -16,19 +16,19 @@
 #include <linux/zalloc.h>
 
 #include "cs-etm.h"
-#include "../../util/debug.h"
-#include "../../util/record.h"
-#include "../../util/auxtrace.h"
-#include "../../util/cpumap.h"
-#include "../../util/event.h"
-#include "../../util/evlist.h"
-#include "../../util/evsel.h"
-#include "../../util/perf_api_probe.h"
-#include "../../util/evsel_config.h"
-#include "../../util/pmu.h"
-#include "../../util/cs-etm.h"
+#include "../../../util/debug.h"
+#include "../../../util/record.h"
+#include "../../../util/auxtrace.h"
+#include "../../../util/cpumap.h"
+#include "../../../util/event.h"
+#include "../../../util/evlist.h"
+#include "../../../util/evsel.h"
+#include "../../../util/perf_api_probe.h"
+#include "../../../util/evsel_config.h"
+#include "../../../util/pmu.h"
+#include "../../../util/cs-etm.h"
 #include <internal/lib.h> // page_size
-#include "../../util/session.h"
+#include "../../../util/session.h"
 
 #include <errno.h>
 #include <stdlib.h>
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-#include "../../util/perf_regs.h"
+#include "../../../util/perf_regs.h"
 
 const struct sample_reg sample_reg_masks[] = {
 	SMPL_REG_END
@@ -10,7 +10,7 @@
 #include <linux/string.h>
 
 #include "arm-spe.h"
-#include "../../util/pmu.h"
+#include "../../../util/pmu.h"
 
 struct perf_event_attr
 *perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
@@ -1,8 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <elfutils/libdwfl.h>
-#include "../../util/unwind-libdw.h"
-#include "../../util/perf_regs.h"
-#include "../../util/event.h"
+#include "../../../util/unwind-libdw.h"
+#include "../../../util/perf_regs.h"
+#include "../../../util/event.h"
 
 bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
 {
@@ -3,8 +3,8 @@
 #include <errno.h>
 #include <libunwind.h>
 #include "perf_regs.h"
-#include "../../util/unwind.h"
-#include "../../util/debug.h"
+#include "../../../util/unwind.h"
+#include "../../../util/debug.h"
 
 int libunwind__arch_reg_id(int regnum)
 {
@@ -432,7 +432,7 @@ void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
 	u8 die = ((struct iio_root_port *)evsel->priv)->die;
 	struct perf_counts_values *count = perf_counts(evsel->counts, die, 0);
 
-	if (count->run && count->ena) {
+	if (count && count->run && count->ena) {
 		if (evsel->prev_raw_counts && !out->force_header) {
 			struct perf_counts_values *prev_count =
 				perf_counts(evsel->prev_raw_counts, die, 0);
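The guard added in the hunk above can be sketched in isolation. This is a minimal illustration, not the perf code: `demo_counts_values`, `demo_counts()` and `demo_has_sample()` are hypothetical names, and the idea that the lookup returns NULL for an out-of-range index is an assumption modelled on the commit title ("Fix Segmentation fault from NULL 'struct perf_counts_values *'").

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for 'struct perf_counts_values'. */
struct demo_counts_values {
	uint64_t val;
	uint64_t ena;
	uint64_t run;
};

/* Hypothetical lookup: assumed to return NULL when idx is out of range. */
static struct demo_counts_values *demo_counts(struct demo_counts_values *tbl,
					      size_t n, size_t idx)
{
	return idx < n ? &tbl[idx] : NULL;
}

/*
 * Returns 1 when the entry exists and was both enabled and running.
 * Testing the pointer first keeps the member accesses from
 * dereferencing NULL, which is the shape of the fix above.
 */
int demo_has_sample(struct demo_counts_values *tbl, size_t n, size_t idx)
{
	struct demo_counts_values *count = demo_counts(tbl, n, idx);

	return count && count->run && count->ena;
}
```

Because `&&` short-circuits, `count->run` is only evaluated once `count` is known to be non-NULL.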
@@ -2408,6 +2408,8 @@ int cmd_stat(int argc, const char **argv)
 			goto out;
 		} else if (verbose)
 			iostat_list(evsel_list, &stat_config);
+		if (iostat_mode == IOSTAT_RUN && !target__has_cpu(&target))
+			target.system_wide = true;
 	}
 
 	if (add_default_attributes())
@@ -1046,7 +1046,7 @@
   {
     "EventCode": "0x4e010",
     "EventName": "PM_GCT_NOSLOT_IC_L3MISS",
-    "BriefDescription": "Gct empty for this thread due to icach l3 miss",
+    "BriefDescription": "Gct empty for this thread due to icache l3 miss",
     "PublicDescription": ""
   },
   {
@@ -229,8 +229,8 @@ static int read_object_code(u64 addr, size_t len, u8 cpumode,
 			    struct thread *thread, struct state *state)
 {
 	struct addr_location al;
-	unsigned char buf1[BUFSZ];
-	unsigned char buf2[BUFSZ];
+	unsigned char buf1[BUFSZ] = {0};
+	unsigned char buf2[BUFSZ] = {0};
 	size_t ret_len;
 	u64 objdump_addr;
 	const char *objdump_name;
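Why zero-initialising both buffers can make a comparison deterministic may be easier to see in a standalone sketch. The code below is an assumption-laden illustration, not the perf test: `DEMO_BUFSZ` and `demo_compare()` are hypothetical, and the scenario (two readers filling different numbers of bytes before a full-length `memcmp`) is one plausible reading of the "flaky test" fix, not something the diff itself states.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_BUFSZ 16	/* hypothetical size, stands in for BUFSZ */

/*
 * Copy a_len bytes into one buffer and b_len bytes into another, then
 * compare the full buffers. Because both start out zeroed, bytes that a
 * shorter reader never wrote compare as zero instead of stack garbage.
 */
int demo_compare(const unsigned char *a, size_t a_len,
		 const unsigned char *b, size_t b_len)
{
	unsigned char buf1[DEMO_BUFSZ] = {0};
	unsigned char buf2[DEMO_BUFSZ] = {0};

	if (a_len > DEMO_BUFSZ)
		a_len = DEMO_BUFSZ;
	if (b_len > DEMO_BUFSZ)
		b_len = DEMO_BUFSZ;

	memcpy(buf1, a, a_len);
	memcpy(buf2, b, b_len);

	return memcmp(buf1, buf2, DEMO_BUFSZ);
}
```

Without the `= {0}` initialisers, the bytes past the shorter copy would be indeterminate, so the result of the full-length comparison could vary from run to run.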
@@ -20,6 +20,23 @@
 /* For bsearch. We try to unwind functions in shared object. */
 #include <stdlib.h>
 
+/*
+ * The test will assert frames are on the stack but tail call optimizations lose
+ * the frame of the caller. Clang can disable this optimization on a called
+ * function but GCC currently (11/2020) lacks this attribute. The barrier is
+ * used to inhibit tail calls in these cases.
+ */
+#ifdef __has_attribute
+#if __has_attribute(disable_tail_calls)
+#define NO_TAIL_CALL_ATTRIBUTE __attribute__((disable_tail_calls))
+#define NO_TAIL_CALL_BARRIER
+#endif
+#endif
+#ifndef NO_TAIL_CALL_ATTRIBUTE
+#define NO_TAIL_CALL_ATTRIBUTE
+#define NO_TAIL_CALL_BARRIER __asm__ __volatile__("" : : : "memory");
+#endif
+
 static int mmap_handler(struct perf_tool *tool __maybe_unused,
 			union perf_event *event,
 			struct perf_sample *sample,
@@ -91,7 +108,7 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
 	return strcmp((const char *) symbol, funcs[idx]);
 }
 
-noinline int test_dwarf_unwind__thread(struct thread *thread)
+NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__thread(struct thread *thread)
 {
 	struct perf_sample sample;
 	unsigned long cnt = 0;
@@ -122,7 +139,7 @@ noinline int test_dwarf_unwind__thread(struct thread *thread)
 
 static int global_unwind_retval = -INT_MAX;
 
-noinline int test_dwarf_unwind__compare(void *p1, void *p2)
+NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__compare(void *p1, void *p2)
 {
 	/* Any possible value should be 'thread' */
 	struct thread *thread = *(struct thread **)p1;
@@ -141,7 +158,7 @@ noinline int test_dwarf_unwind__compare(void *p1, void *p2)
 	return p1 - p2;
 }
 
-noinline int test_dwarf_unwind__krava_3(struct thread *thread)
+NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__krava_3(struct thread *thread)
 {
 	struct thread *array[2] = {thread, thread};
 	void *fp = &bsearch;
@@ -160,14 +177,22 @@ noinline int test_dwarf_unwind__krava_3(struct thread *thread)
 	return global_unwind_retval;
 }
 
-noinline int test_dwarf_unwind__krava_2(struct thread *thread)
+NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__krava_2(struct thread *thread)
 {
-	return test_dwarf_unwind__krava_3(thread);
+	int ret;
+
+	ret = test_dwarf_unwind__krava_3(thread);
+	NO_TAIL_CALL_BARRIER;
+	return ret;
 }
 
-noinline int test_dwarf_unwind__krava_1(struct thread *thread)
+NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__krava_1(struct thread *thread)
 {
-	return test_dwarf_unwind__krava_2(thread);
+	int ret;
+
+	ret = test_dwarf_unwind__krava_2(thread);
+	NO_TAIL_CALL_BARRIER;
+	return ret;
 }
 
 int test__dwarf_unwind(struct test *test __maybe_unused, int subtest __maybe_unused)
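The tail-call workaround in the hunks above can be tried outside the perf tree with a small sketch. The macro selection logic is copied from the patch; `demo_leaf()` and `demo_caller()` are hypothetical functions added only for illustration, and `__attribute__((noinline))` is spelled out on the assumption that perf's `noinline` macro expands to it.

```c
/* Prefer Clang's attribute; otherwise fall back to a compiler barrier. */
#ifdef __has_attribute
#if __has_attribute(disable_tail_calls)
#define NO_TAIL_CALL_ATTRIBUTE __attribute__((disable_tail_calls))
#define NO_TAIL_CALL_BARRIER
#endif
#endif
#ifndef NO_TAIL_CALL_ATTRIBUTE
#define NO_TAIL_CALL_ATTRIBUTE
#define NO_TAIL_CALL_BARRIER __asm__ __volatile__("" : : : "memory");
#endif

/* Hypothetical callee; kept out of line so a real call is emitted. */
__attribute__((noinline)) int demo_leaf(int x)
{
	return x + 1;
}

/*
 * Hypothetical caller. Storing the result and placing the barrier
 * before 'return' means the call is no longer the last operation in
 * the function, so it cannot be turned into a jump and the caller's
 * frame stays on the stack for an unwinder to find.
 */
NO_TAIL_CALL_ATTRIBUTE __attribute__((noinline)) int demo_caller(int x)
{
	int ret;

	ret = demo_leaf(x);
	NO_TAIL_CALL_BARRIER;
	return ret;
}
```

With the attribute available the barrier macro expands to nothing, so the extra `;` is an empty statement. In the fallback, the empty `asm` with a "memory" clobber emits no instructions but still has to execute after the call returns.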
@@ -801,7 +801,7 @@ int perf_config_set(struct perf_config_set *set,
 			  section->name, item->name);
 		ret = fn(key, value, data);
 		if (ret < 0) {
-			pr_err("Error: wrong config key-value pair %s=%s\n",
+			pr_err("Error in the given config file: wrong config key-value pair %s=%s\n",
 			       key, value);
 			/*
 			 * Can't be just a 'break', as perf_config_set__for_each_entry()