And for consistency, rename it to bpf_map__def(), leaving "get" for
reference counting.
Also make it return a const pointer, as suggested by Wang.
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-mer00xqkiho0ymg66b5i9luw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For consistency, leaving "get" for reference counting.
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-crnflv84ejyhpba933ec71gs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To try to, over time, consistently use the IS_ERR() interface instead of
using two return values, i.e. the integer return value for an error and
the pointer address to return the bpf_map->priv pointer.
Also rename it to bpf__priv(), to leave the "get" term for reference
counting.
Noticed while working on using BPF for collecting non-integer syscall
argument payloads (struct sockaddr in calls such as connect(), for
instance), where we need to use BPF maps and thus generalise
bpf__setup_stdout() to connect bpf_output events with maps in a bpf
proggie.
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-saypxyd6ptrct379jqgxx4bl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
collect_config() collect all config key-value pairs from config files
and put each config info in config set. But if config set (i.e. 'set'
variable at collect_config()) is NULL, this is wrong so handle it.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1465210380-26749-4-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If a config file has wrong key-value pairs, the perf process will be
forcibly terminated by die() at perf_parse_file() called by
perf_config() so terminal settings can be crushed because of unusual
termination.
For example:
If user config file has a wrong value 'red;default' instead of a normal
value like 'red, default' for a key 'colors.top',
# cat ~/.perfconfig
[colors]
medium = red;default # wrong value
and if running sub-command 'top',
# perf top
perf process is dead by force and terminal setting is broken
with a messge like below.
Fatal: bad config file line 2 in /root/.perfconfig
So fix it.
If perf_config() can return on failure without calling die()
at perf_parse_file(), this problem can be solved.
And if a config file has wrong values, show the error message
and then use default config values instead of wrong config values.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1465210380-26749-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When in CSV mode --metric-only outputs an header, unlike the other
modes. Previously it did not properly print headers for the aggregation
columns, so the headers were actually shifted against the real values.
Fix this here by outputting the correct headers for CSV.
v2: Indent array.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464119559-17203-4-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When --metric-only is enabled there were no headers for the topology in
interval mode. Also when headers were printed they were on a separate
line.
Before:
$ perf stat --metric-only -A -I 1000 -a
1.001038376 frontend cycles idle insn per cycle stalled cycles per insn branch-misses of all branches
1.001038376 CPU0 123.54% 0.23 5.29 7.61%
1.001038376 CPU1 137.78% 0.24 5.13 10.07%
1.001038376 CPU2 64.48% 0.22 5.50 6.84%
After:
$ perf stat --metric-only -A -I 1000 -a
1.001111114 CPU0 82.46% 0.32 2.60 7.64%
1.001111114 CPU1 126.63% 0.02 42.83 0.15%
1.001111114 CPU2 193.54% 0.32 2.59 6.92%
v2: Move all headers on a single line
Reported-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464119559-17203-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement the TopDown formulas in 'perf stat'. The topdown basic metrics
reported by the kernel are collected, and the formulas are computed and
output as normal metrics.
See the kernel commit exporting the events for details on the used
metrics.
Committer note:
Output example:
# perf stat --topdown -a usleep 1
Performance counter stats for 'system wide':
retiring bad speculation frontend bound backend bound
S0-C0 2 23.8% 11.6% 28.3% 36.3%
S0-C1 2 16.2% 15.7% 36.5% 31.6%
0.000579956 seconds time elapsed
#
v2: Always print all metrics, only use thresholds for coloring.
v3: Mark retiring over threshold green, not red.
v4: Only print one decimal digit
Fix color printing of one metric
v5: Avoid printing -0.0
v6: Remove extra frontend event lookup
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464119559-17203-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add basic plumbing for TopDown in perf stat
TopDown is intended to replace the frontend cycles idle/ backend cycles
idle metrics in standard perf stat output. These metrics are not
reliable in many workloads, due to out of order effects.
This implements a new --topdown mode in perf stat (similar to
--transaction) that measures the pipe line bottlenecks using
standardized formulas. The measurement can be all done with 5 counters
(one fixed counter)
The result are four metrics:
FrontendBound, BackendBound, BadSpeculation, Retiring
that describe the CPU pipeline behavior on a high level.
The full top down methology has many hierarchical metrics. This
implementation only supports level 1 which can be collected without
multiplexing. A full implementation of top down on top of perf is
available in pmu-tools toplev. (http://github.com/andikleen/pmu-tools)
The current version works on Intel Core CPUs starting with Sandy Bridge,
and Atom CPUs starting with Silvermont. In principle the generic
metrics should be also implementable on other out of order CPUs.
TopDown level 1 uses a set of abstracted metrics which are generic to
out of order CPU cores (although some CPUs may not implement all of
them):
topdown-total-slots Available slots in the pipeline
topdown-slots-issued Slots issued into the pipeline
topdown-slots-retired Slots successfully retired
topdown-fetch-bubbles Pipeline gaps in the frontend
topdown-recovery-bubbles Pipeline gaps during recovery
from misspeculation
These metrics then allow to compute four useful metrics:
FrontendBound, BackendBound, Retiring, BadSpeculation.
Add a new --topdown options to enable events. When --topdown is
specified set up events for all topdown events supported by the kernel.
Add topdown-* as a special case to the event parser, as is needed for
all events containing -.
The actual code to compute the metrics is in follow-on patches.
v2: Use standard sysctl read function.
v3: Move x86 specific code to arch/
v4: Enable --metric-only implicitly for topdown.
v5: Add --single-thread option to not force per core mode
v6: Fix output order of topdown metrics
v7: Allow combining with -d
v8: Remove --single-thread again
v9: Rename functions, adding arch_ and topdown_.
v10: Expand man page and describe TopDown better
Paste intro into commit description.
Print error when malloc fails.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464119559-17203-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf test' tries to parse all entries in /sys/devices/cpu/events/.
Ignore the special entries like '.scale', which cannot be directly
parsed as an event. This patch assumes all files containing a '.' are
special and can be ignored.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1465223766-29902-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's a display inconsistency when there are multiple tracepoint
events, some of which have the 'call-graph' config option set but the
first one hasn't, i.e. the whole logic for call graph processing is
enabled only if the first tracepoint event has call-graph set.
For instance, if we record signal_deliver with call-graph and
signal_generate without:
$ perf record -g -a -e signal:signal_deliver -e signal:signal_generate/call-graph=no/
[ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ]
$ perf script
kworker/u2:1 13 [000] 6563.875949: signal:signal_generate: sig=2 errno=0 code=128 comm=perf pid=1313 grp=1 res=0 ff61cc __send_signal+0x3ec ([kernel.kallsyms])
perf 1313 [000] 6563.877584: signal:signal_deliver: sig=2 errno=0 code=128 sa_handler=43115e sa_flags=14000000
7ffff314 get_signal+0x80007f0023a4 ([kernel.kallsyms])
7fffe358 do_signal+0x80007f002028 ([kernel.kallsyms])
7fffa5e8 exit_to_usermode_loop+0x80007f002053 ([kernel.kallsyms])
...
Then we exchange the order of these two events in commandline, and keep
signal_generate without call-graph.
$ perf record -g -a -e signal:signal_generate/call-graph=no/ -e signal:signal_deliver
[ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ]
$ perf script
kworker/u2:2 1314 [000] 6933.353060: signal:signal_generate: sig=2 errno=0 code=128 comm=perf pid=1321 grp=1 res=0
perf 1321 [000] 6933.353872: signal:signal_deliver: sig=2 errno=0 code=128 sa_handler=43115e sa_flags=14000000
This time, the callchain of the event signal_deliver disappeared. The
problem is caused by that perf only checks for the first evsel in evlist
and decides if callchain should be printed.
This patch traverses all evsels in evlist to see if any of them have
callchains, and shows the right result:
$ perf script
kworker/u2:2 1314 [000] 6933.353060: signal:signal_generate: sig=2 errno=0 code=128 comm=perf pid=1321 grp=1 res=0 ff61cc __send_signal+0x3ec ([kernel.kallsyms])
perf 1321 [000] 6933.353872: signal:signal_deliver: sig=2 errno=0 code=128 sa_handler=43115e sa_flags=14000000
7ffff314 get_signal+0x80007f0023a4 ([kernel.kallsyms])
7fffe358 do_signal+0x80007f002028 ([kernel.kallsyms])
7fffa5e8 exit_to_usermode_loop+0x80007f002053 ([kernel.kallsyms])
...
Signed-off-by: He Kuang <hekuang@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1463374279-97209-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If zalloc fail, setting evlist->mmap[i].fd is unsafe and
perf_evlist__alloc_mmap() should bail out right after that.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Fixes: d4c6fb36ac ("perf evsel: Record fd into perf_mmap")
Link: http://lkml.kernel.org/r/1464699975-230440-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Out of perf_evsel__intval(), that requires passing the variable name,
that will then be searched in the list of tracepoint variables for the
given evsel.
In cases such as syscall file descriptor ("fd") tracking, this is
wasteful, we need just to use perf_evsel__field() and cache the
format_field.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-r6f89jx9j5nkx037d0naviqy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that we have topology_max_smt_threads() use it
to detect the HT workarounds for older CPUs.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add topdown event declarations to Silvermont / Airmont.
These cores do not support the full Top Down metrics, but an useful
subset (FrontendBound, Retiring, Backend Bound/Bad Speculation).
The perf stat tool automatically handles the missing events
and combines the available metrics.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add declarations for the events needed for topdown to the
Intel big core CPUs starting with Sandy Bridge. We need
to report different values if HyperThreading is on or off.
The only thing this patch does is to export some events
in sysfs.
topdown level 1 uses a set of abstracted metrics which
are generic to out of order CPU cores (although some
CPUs may not implement all of them):
topdown-total-slots Available slots in the pipeline
topdown-slots-issued Slots issued into the pipeline
topdown-slots-retired Slots successfully retired
topdown-fetch-bubbles Pipeline gaps in the frontend
topdown-recovery-bubbles Pipeline gaps during recovery
from misspeculation
A slot is a single operation in the CPU pipe line.
These metrics then allow to compute four useful metrics:
FrontendBound, BackendBound, Retiring, BadSpeculation.
The formulas to compute the metrics are generic, they
only change based on the availability on the abstracted
input values.
The kernel declares the events supported by the current
CPU and their scaling factors (such as the pipeline width)
and perf stat then computes the formulas based on the
available metrics. This is similar how existing
perf metrics, such as TSC metrics or IPC, are implemented.
This abstracts all CPU pipe line specific knowledge in the
kernel driver, but still avoids the need for larger scale perf
interface changes.
For HyperThreading the any bit is needed to get accurate
values when both threads are executing. This implies that
the events can only be collected as root or with
perf_event_paranoid=-1 for now.
The basic scheme is based on the following paper:
Yasin, A Top Down Method for Performance analysis and Counter architecture ISPASS14 (pdf available via google)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add a way to show different sysfs events attributes depending on
HyperThreading is on or off. This is difficult to determine
early at boot, so we just do it dynamically when the sysfs
attribute is read.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For SMT specific workarounds it is useful to know if SMT is active
on any online CPU in the system. This currently requires a loop
over all online CPUs.
Add a global variable that is updated with the maximum number
of smt threads on any CPU on online/offline, and use it for
topology_max_smt_threads()
The single call is easier to use than a loop.
Not exported to user space because user space already can use
the existing sibling interfaces to find this out.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This allows (with a previous change to the perf error return ABI) for
calling out in userspace the exact reason for perf record failing
when PMU doesn't support overflow interrupts.
Note that this needs to be put ahead of existing precise_ip check as
that gets hit otherwise for the sampling fail case as well.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <acme@redhat.com>
Cc: <linux-snps-arc@lists.infradead.org>
Cc: <vincent.weaver@maine.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Link: http://lkml.kernel.org/r/1462786660-2900-2-git-send-email-vgupta@synopsys.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Change the return code for sampling event not supported from -ENOTSUPP
to -EOPNOTSUPP.
This allows userspace to identify this case specifically, instead of
printing the catch-all error message it did previously.
Technically this is an ABI change, but we think we can get away
with it.
Old behavior:
-------
| # perf record ls
| Error:
| The sys_perf_event_open() syscall returned with 524 (Unknown error 524)
| for event (cycles:ppp).
| /bin/dmesg may provide additional information.
| No CONFIG_PERF_EVENTS=y kernel support configured?
New behavior:
-------
| # perf record ls
| Error:
| PMU Hardware doesn't support sampling/overflow-interrupts.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <acme@redhat.com>
Cc: <linux-snps-arc@lists.infradead.org>
Cc: <vincent.weaver@maine.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Link: http://lkml.kernel.org/r/1462786660-2900-3-git-send-email-vgupta@synopsys.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some platforms, e.g. Knights Landing, use a common PCI device ID for
multiple instances of an uncore PMU device type. So it is impossible to
locate the specific instances only by PCI device ID.
The current code specially handles Knights Landing by arbitrarily pointing
an instance to an unused uncore box. However, we still have no idea
which uncore device is mapped to which box.
Furthermore, there could be more platforms which use a common PCI device ID
for uncore devices. We have to specially handle them one by one.
This patch records full device information (slot, func, and device ID)
in id_table[]. So the probe function can point the instance to a specific
uncore box by checking the full device information.
Tested-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: tglx@linutronix.de
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: bp@suse.de
Cc: harish.chegondi@intel.com
Cc: hubert.chrzaniuk@intel.com
Cc: lawrence.f.meadows@intel.com
Link: http://lkml.kernel.org/r/1463379504-39003-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch fixes an issue which was introduced by commit:
91a612eea9 ("perf/core: Fix dynamic interrupt throttle")
... which commit unconditionally sets the perf_sample_allowed_ns value
to !0. But that could trigger a bug in the following corner case:
The user can disable the dynamic interrupt throttle mechanism by setting
perf_cpu_time_max_percent to 0. Then they change perf_event_max_sample_rate.
For this case, the mechanism will be enabled implicitly, because
perf_sample_allowed_ns becomes !0 - which is not what we want.
This patch only updates perf_sample_allowed_ns when the dynamic
interrupt throttle mechanism is enabled.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Link: http://lkml.kernel.org/r/1462260366-3160-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There are now two different things called AUX in perf, the
infrastructure to deliver the mmap/comm/task records and the
AUX part in the mmap buffer (with associated AUX_RECORD).
Since the former is internal, rename it to side-band to reduce
the confusion factor.
No change in functionality.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The perf_event_aux() function iterates all PMUs and all events in
their respective per-CPU contexts to find the events to deliver
side-band records to.
For example, the brk test case in lkp triggers many mmap() operations,
which, if we're also running perf, results in many perf_event_aux()
invocations.
If we enable uncore PMU support (even when uncore events are not used),
dozens of uncore PMUs will be iterated, which can significantly
decrease brk_test's throughput.
For example, the brk throughput:
without uncore PMUs: 2647573 ops_per_sec
with uncore PMUs: 1768444 ops_per_sec
... a 33% reduction.
To get at the per-CPU events that need side-band records, this patch
puts these events on a per-CPU list, this avoids iterating the PMUs
and any events that do not need side-band records.
Per task events are unchanged to avoid extra overhead on the context
switch paths.
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Huang, Ying <ying.huang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1458757477-3781-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
User visible/kernel ABI:
- Per event callchain limit: Recently we introduced a sysctl to tune the
max-stack for all events for which callchains were requested:
$ sysctl kernel.perf_event_max_stack
kernel.perf_event_max_stack = 127
Now this patch introduces a way to configure this per event, i.e. this
becomes possible:
$ perf record -e sched:*/max-stack=2/ -e block:*/max-stack=10/ -a
allowing finer tuning of how much buffer space callchains use.
This uses an u16 from the reserved space at the end, leaving another
u16 for future use.
There has been interest in even finer tuning, namely to control the
max stack for kernel and userspace callchains separately. Further
discussion is needed, we may for instance use the remaining u16 for
that and when it is present, assume that the sample_max_stack introduced
in this patch applies for the kernel, and the u16 left is used for
limiting the userspace callchain. (Arnaldo Carvalho de Melo)
Infrastructure:
- Adopt get_main_thread from db-export.c (Andi Kleen)
- More prep work for backward ring buffer support (Wang Nan)
- Prep work for supporting SDT (Statically Defined Tracing)
tracepoints (Masami Hiramatsu)
- Add arch/*/include/generated/ to .gitignore (Taeung Song)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXTGsMAAoJENZQFvNTUqpA94AP/jMhs4/+eFe9v/Q/ZR7ap6Fr
0tFHqztfYC+07wz0tafxmRo4EmapgZTCvCjTjLSmzwaYWG/W4trulP+ypW+jkb2c
RNt8yQcJZwQ2f0hz9fFUMt5IIz7rM3m4h1O2g6zTxa9aYWgdsNIQsY64V4Oyf6AS
sq+OX3xxYL9IvxpbPidG/FeNnnsgj7BjIBkT1dzpyDZkXouFjiA+oNmfRCKlMdHA
0qTtYr7nG0KAhAHXNWjJosuDCblQVnWaxGC+EvVeK3j5HOPj7s9ALKg9gt9t6jtk
oOdzKZn2suXPcUHsevoG7hb5ORPxm8Vt9t9lWWGtJU4hJyGyybO9BFaANg+EDuL6
GZ6urN5S5R2yrrTPasB8OFol8lpn8XMh60I8chA7uBcaNhH+bjLuzovczhYuOHO+
rZePVPOkOYLnmWe7ODUNlz6DbNPi5xalUxSUjnCDb2e68xXB7psduH7uvdeqtn/+
cBlkDr4WlYHRYlR+3+4nxXVnaYG6Wt4E6nzLS4hg3pQuvShHGbBGM1QgIsQJQGsd
i4mR5ufDkW8tCHfl847LZv9FOwghKsOdO1wwzo1aifh4nOdyz9/IVsco8wNu5cWN
jriIXTCfnqpvTPJ6OKzWXor2WqgYSWG9z0moCzPxkKGfyxUOEmDv7Y321x9r6x3a
rdGIg3WVMCgZ+JmmfZ5h
=k8d1
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-20160530' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible/kernel ABI changes:
- Per event callchain limit: Recently we introduced a sysctl to tune the
max-stack for all events for which callchains were requested:
$ sysctl kernel.perf_event_max_stack
kernel.perf_event_max_stack = 127
Now this patch introduces a way to configure this per event, i.e. this
becomes possible:
$ perf record -e sched:*/max-stack=2/ -e block:*/max-stack=10/ -a
allowing finer tuning of how much buffer space callchains use.
This uses an u16 from the reserved space at the end, leaving another
u16 for future use.
There has been interest in even finer tuning, namely to control the
max stack for kernel and userspace callchains separately. Further
discussion is needed, we may for instance use the remaining u16 for
that and when it is present, assume that the sample_max_stack introduced
in this patch applies for the kernel, and the u16 left is used for
limiting the userspace callchain. (Arnaldo Carvalho de Melo)
Infrastructure changes:
- Adopt get_main_thread from db-export.c (Andi Kleen)
- More prep work for backward ring buffer support (Wang Nan)
- Prep work for supporting SDT (Statically Defined Tracing)
tracepoints (Masami Hiramatsu)
- Add arch/*/include/generated/ to .gitignore (Taeung Song)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use path/to/bin/buildid/elf instead of path/to/bin/buildid
to store corresponding elf binary.
This also stores vdso in buildid/vdso, kallsyms in buildid/kallsyms.
Note that the existing caches are not updated until user adds
or updates the cache. Anyway, if there is the old style build-id
cache it falls back to use it. (IOW, it is backward compatible)
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160528151537.16098.85815.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cleanup the code flow of dso__find_kallsyms() to remove redundant
checking code and add some comment for readability.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160528151522.16098.43446.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce filename__readable to check readability by opening the file
directly. Since the access(R_OK) just checks the readability based on
real UID/GID, it is ignored that the effective UID/GID and capabilities
for some special file (e.g. /proc/kcore).
filename__readable() directly opens given file with O_RDONLY so that the
kernel checks it by effective UID/GID and capabilities.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160528151513.16098.97576.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before this patch there's no way to pass arguments to fdarray__filter's
call back function.
This improvement will be used by 'perf record' to support unmapping ring
buffer for both main evlist and overwrite evlist. Without this patch
there's no way to track overwrite evlist from 'struct fdarray'.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464183898-174512-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now we have evlist->backward to indicate the mmap direction. Make
perf_evlist__mmap_read() choose right direction automatically.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464183898-174512-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
evlist->mmap[i]->refcnt could be 0 if an evlist has no evsel or if all
evsels don't match the evlist during mmap. For example, when all evsels
are overwritable but the evlist itself is normal. To avoid crashing,
perf should check 'base' pointer before checking refcnt, and raise bug
only when base is not NULL.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464183898-174512-2-git-send-email-wangnan0@huawei.com
[ Renamed 'mmap' variable, it is reserved in old distros such as Ubuntu 12.04, breaking the build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's no need to receive events from overwritable ring buffer.
Instead, perf should make them run in background until some external
event of interest takes place. This patch makes ignores normal events from
overwrite evlists.
Overwritable events must be mapped readonly and backward, so if evlist
and evsel doesn't match (evsel->overwrite is true but either evlist is
read/write or evlist is not backward, and vice versa), skip mapping it.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464056944-166978-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is possible that all events in an evlist are overwritable.
perf_event__synth_time_conv() should not crash in this case.
record__pick_pc() is used to check avaliability.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464056944-166978-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Additionally to being able to control the system wide maximum depth via
/proc/sys/kernel/perf_event_max_stack, now we are able to ask for
different depths per event, using perf_event_attr.sample_max_stack for
that.
This uses an u16 hole at the end of perf_event_attr, that, when
perf_event_attr.sample_type has the PERF_SAMPLE_CALLCHAIN, if
sample_max_stack is zero, means use perf_event_max_stack, otherwise
it'll be bounds checked under callchain_mutex.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the get_main_thread function from db-export.c to thread.c so that
it can be used elsewhere.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464051145-19968-2-git-send-email-andi@firstfloor.org
[ Removed leftover bits from db-export.h ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before this patch, a simple 'perf record' could fail if kptr_restrict is
set to 1 (for normal user) or 2 (for root):
# perf record ls
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Segmentation fault (core dumped)
This patch skips perf_event__synthesize_kernel_mmap() when kptr is not
available.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes: 45e9005690 ("perf machine: Do not bail out if not managing to read ref reloc symbol")
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464081688-167940-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If kptr_restrict is set to 2, even root is not allowed to see pointers.
This patch checks kptr_restrict even if euid == 0. For root, report
error if kptr_restrict is 2.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464081688-167940-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On rapl cleanup path, kfree() is given by mistake the address of the
pointer of the structure to free (rapl_pmus->pmus + i). Pass the pointer
instead (rapl_pmus->pmus[i]).
Fixes: 9de8d68695 "perf/x86/intel/rapl: Convert it to a per package facility"
Signed-off-by: Vincent Stehlé <vincent.stehle@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1464101629-14905-1-git-send-email-vincent.stehle@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
User visible:
- Add "srcline_from" and "srcline_to" branch sort keys to 'perf top' and
'perf report' (Andi Kleen)
Infrastructure:
- Make 'perf trace' auto-attach fd->name and ptr->name beautifiers based
on the name of syscall arguments, this way new syscalls that have
'const char * (path,pathname,filename)' will use the fd->name beautifier
(vfs_getname perf probe, if in place) and the 'fd->name' (vfs_getname
or via /proc/PID/fd/) (Arnaldo Carvalho de Melo)
- Infrastructure to read from a ring buffer in backward write mode (Wang Nan)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXQ33GAAoJENZQFvNTUqpAbLUP/1fyH/IxpYfGBvsukbthQlxC
oRSJTxAGP6cxm7n9LYIO01EIcHAx7VKVrf2WrKJp6QtQPQflpSFWdvUB3imKE6q7
/+Tcg9eVp88Oqc5MvSmV//ICjKn6Cdr86I/4PuBcLa+nAJcSP3BZK9HSVkKu5Ot7
wHfbnbHiQo3pRZAL7t/newpmO4wZ135DDlmITcfXXTozjgcxW6fjoJaWwnjycEHd
sdZDMeQs1Avos5THMfyiK4b5xWUDFj/rhjGcsbrk9bfzg0iPeReDfi68uf7zqh5k
pAHpdIspHp2DfBxNWKwHFD1fzngAkK9i4IRocikqtnYD06lKpFTv4rTncn0UfUra
cOIaDTVNA8QIt/cOeSvKHPMWIDdLybBPva7iUHUDPh1fRHZn10TqHP3PUAlGB6QC
0PRIET1Iw0ugsDUeeA7EfUD/w30FkKrDEuSEUBCDV8k1VLPfz09QnaRiE9D/OYsf
VqwRDU9/Hy4E+L/DZDEyJII6frWeuTm6N1n83273zlyyHOPoH37NwJwKIGojxu0k
9sBtd4Y7g+flNzatdtybLPiOipTpxo2tHz/GC9C8EHLchuub4CXpQzMtcffkd285
Uj55QEMwV5UCh9bJSIoS5rfM1djB3h4eMO7dE5f1qgo2JUcmHwxxbpIcbi648rR2
KDtB2LnaC24PzSGN0NMK
=MhyE
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-20160523' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements from Arnaldo Carvalho de Melo:
User visible changes:
- Add "srcline_from" and "srcline_to" branch sort keys to 'perf top' and
'perf report' (Andi Kleen)
Infrastructure changes:
- Make 'perf trace' auto-attach fd->name and ptr->name beautifiers based
on the name of syscall arguments, this way new syscalls that have
'const char * (path,pathname,filename)' will use the fd->name beautifier
(vfs_getname perf probe, if in place) and the 'fd->name' (vfs_getname
or via /proc/PID/fd/) (Arnaldo Carvalho de Melo)
- Infrastructure to read from a ring buffer in backward write mode (Wang Nan)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Introduce rb_find_range() to find start and end position from a backward
ring buffer.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-5-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
record__mmap_read() writes data from ring buffer into perf.data. 'head'
is maintained by the kernel, points to the last written record.
'old' is maintained by perf, points to the record read in previous
round. record__mmap_read() saves data from 'old' to 'head' to
perf.data.
The names of these variables are not very intutive. In addition,
when dealing with backward writing ring buffer, the md->prev pointer
should point to 'head' instead of the last byte it got.
Add 'start' and 'end' pointer to make code clear and set md->prev to
'head' instead of the moved 'old' pointer. This patch doesn't change
behavior since:
buf = &data[old & md->mask];
size = head - old;
old += size; <--- Here, old == head
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-4-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When record__mmap_read() requires data more than the size of ring
buffer, drop those data to avoid accessing invalid memory.
This can happen when reading from overwritable ring buffer, which
should be avoided. However, check this for robustness.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__toggle_{pause,resume}() are introduced to pause/resume
events in an evlist. Utilize PERF_EVENT_IOC_PAUSE_OUTPUT ioctl.
Following commits use them to ensure overwrite ring buffer is paused
before reading.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-2-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Return -1, like all other ioctl() usage in evlist.c, rename 'pause'
arg to avoid breaking the build on ubuntu 12.04 and other old systems ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>