Print out the cache-miss percentage as well if the cache refs were
collected, for all the generic cache event types.
Before:
11,103,723,230 dTLB-loads # 622.471 M/sec ( +- 0.30% )
87,065,337 dTLB-load-misses # 4.881 M/sec ( +- 0.90% )
After:
11,353,713,242 dTLB-loads # 626.020 M/sec ( +- 0.35% )
113,393,472 dTLB-load-misses # 1.00% of all dTLB cache hits ( +- 0.49% )
Also highlight too-high percentages with ASCII colors when it's executed on the console.
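For illustration, the percentage in the "After" output can be reproduced
from the two counts. This is a minimal sketch in Python, not perf's C
code; the function name miss_percentage is hypothetical.

```python
def miss_percentage(misses: int, refs: int) -> float:
    """Return misses as a percentage of refs, or 0.0 if no refs were collected."""
    if refs == 0:
        return 0.0
    return 100.0 * misses / refs


# Counts taken from the "After" output above.
dtlb_loads = 11_353_713_242
dtlb_load_misses = 113_393_472
print(f"{miss_percentage(dtlb_load_misses, dtlb_loads):.2f}% of all dTLB cache hits")
```

The zero check mirrors the "if the cache refs were collected" condition:
without a reference count there is nothing to compute a ratio against.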
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-lkhwxsevdbd9a8nymx0vxc3y@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
perf bench needs this to build the kernel's memcpy routine:
In file included from bench/mem-memcpy-x86-64-asm.S:2:0:
bench/../../../arch/x86/lib/memcpy_64.S:7:33: fatal error: asm/alternative-asm.h: No such file or directory
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-c5d41xibgullk8h2280q4gv0@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes an issue with event parsing.
The following commit appears to have broken the
ability to specify a comma separated list of events:
commit ceb53fbf6d
Author: Ingo Molnar <mingo@elte.hu>
Date: Wed Apr 27 04:06:33 2011 +0200
perf stat: Fail more clearly when an invalid modifier is specified
This patch fixes this while preserving the desired effect:
$ perf stat -e instructions:u,instructions:k ls /dev/null /dev/null
Performance counter stats for 'ls /dev/null':
365956 instructions:u # 0.00 insns per cycle
731806 instructions:k # 0.00 insns per cycle
0.001108862 seconds time elapsed
$ perf stat -e task-clock-msecs true
invalid event modifier: '-msecs'
Run 'perf list' for a list of valid events and modifiers
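The two behaviours shown above (a comma separated list is accepted, an
unknown modifier is rejected with a message) can be sketched as follows.
This is a simplified Python model, not the actual parser in perf; the
event table, the function names and the set of modifier characters are
assumptions made for the example.

```python
KNOWN_EVENTS = {"instructions", "cycles", "task-clock"}  # illustrative subset
VALID_MODIFIER_CHARS = set("ukh")  # user, kernel, hypervisor (assumed subset)


def parse_event(spec):
    """Split one 'name[:modifiers]' spec; raise ValueError on a bad modifier."""
    # Try longer names first so a longer event name wins over its prefix.
    for name in sorted(KNOWN_EVENTS, key=len, reverse=True):
        if spec.startswith(name):
            rest = spec[len(name):]
            break
    else:
        raise ValueError(f"invalid or unsupported event: '{spec}'")
    if rest == "":
        return name, ""
    if rest.startswith(":") and len(rest) > 1 and set(rest[1:]) <= VALID_MODIFIER_CHARS:
        return name, rest[1:]
    raise ValueError(f"invalid event modifier: '{rest}'")


def parse_event_list(arg):
    """Parse a comma separated list of event specs."""
    return [parse_event(spec) for spec in arg.split(",")]


print(parse_event_list("instructions:u,instructions:k"))
```

Splitting on the comma before looking at modifiers keeps the rest of one
event from being mistaken for a modifier of the previous one.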
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: acme@redhat.com
Cc: peterz@infradead.org
Cc: fweisbec@gmail.com
Link: http://lkml.kernel.org/r/20110517133619.GA6999@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The perf_evlist__create_maps was discarding the --cpu parameter when a
--pid or --tid was specified, fix that.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
pubname_callback_param::found should be initialized to 0 in
fastpath lookup, the structure is on the stack and
uninitialized otherwise.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Link: http://lkml.kernel.org/r/1304066518-30420-2-git-send-email-ming.m.lin@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Including "../../annotate.h" once in
tools/perf/util/ui/browsers/annotate.c is enough. No need to do it twice.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The original Makefile uses "uname -m" to determine ARCH.
This causes a problem on x86 when compiling the perf tool in a 32-bit
userspace with a 64-bit kernel.
bench/../../../arch/x86/lib/memcpy_64.S: Assembler messages:
bench/../../../arch/x86/lib/memcpy_64.S:28: Error: bad register name `%rdi'
This is because "uname -m" returns x86_64 and memcpy_64.S is
included in the 32-bit build.
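The mismatch can be illustrated outside of make. The commit message does
not say how the fix detects the userspace word size, so the following is
only a sketch of the idea in Python: the kernel architecture and the
userspace word size are queried separately, and the hypothetical helper
build_arch() picks the 32-bit variant when they disagree.

```python
import platform
import struct


def build_arch(machine, userspace_bits):
    """Pick the architecture to build for (hypothetical helper)."""
    # A 64-bit kernel with a 32-bit userspace must not pull in 64-bit asm.
    if machine == "x86_64" and userspace_bits == 32:
        return "i386"
    return machine


machine = platform.machine()               # same information as `uname -m`
userspace_bits = struct.calcsize("P") * 8  # pointer size of this process
print(build_arch(machine, userspace_bits))
```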
Reported-by: Riccardo Magliocchetti <riccardo.magliocchetti@gmail.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Link: http://lkml.kernel.org/r/1304743274.3132.17.camel@localhost
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If RCU priority boosting is to be meaningful, callback invocation must
be boosted in addition to preempted RCU readers. Otherwise, in presence
of CPU real-time threads, the grace period ends, but the callbacks don't
get invoked. If the callbacks don't get invoked, the associated memory
doesn't get freed, so the system is still subject to OOM.
But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
moves the callback invocations to a kthread, which can be boosted easily.
Also add comments and properly synchronize all accesses to
rcu_cpu_kthread_task, as suggested by Lai Jiangshan.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Similar to perf-record, tell user about unsupported events
that will not be counted if invoked in verbose mode.
e.g.,
$ perf stat -e dTLB-prefetch-misses -v -- sleep 1
dTLB-prefetch-misses event is not supported by the kernel.
dTLB-prefetch-misses: 0 0 0
Performance counter stats for 'sleep 1':
<not counted> dTLB-prefetch-misses
1.001884783 seconds time elapsed
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1304114655-10600-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Recent stalled-cycles event names were larger than the 40 chars printout
used by perf list.
Extend that, make it robust for future extensions and also adjust alignments
in face of wider event names.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n009io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
David Ahern reported this perf stat failure:
> # /tmp/build-perf/perf stat -- sleep 1
> Error: stalled-cycles-frontend event is not supported.
> Fatal: Not all events could be opened.
>
> This is a Dell R410 with an E5620 processor.
Fail in a softer fashion on unknown/unsupported events.
Reported-by: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n006io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adjust the color thresholds to better match the percentages seen in
real workloads. Both are now a bit more sensitive.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n004io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of failing on an unknown event, when new perf stat is run on
older kernels:
$ ./perf stat true
Error: open_counter returned with 22 (Invalid argument). /bin/dmesg
may provide additional information.
Fatal: Not all events could be opened.
Just ignore EINVAL and ENOSYS, we'll print the results as not counted:
Performance counter stats for 'true':
0.239483 task-clock # 0.493 CPUs utilized
0 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
86 page-faults # 0.359 M/sec
704,766 cycles # 2.943 GHz
<not counted> stalled-cycles
381,961 instructions # 0.54 insns per cycle
69,626 branches # 290.735 M/sec
4,594 branch-misses # 6.60% of all branches
0.000485883 seconds time elapsed
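A minimal sketch of the softer failure mode, in Python rather than the
C used by perf stat. The open_counter callable is hypothetical and is
assumed to raise OSError with the errno of the failed syscall.

```python
import errno

# Errors that mean "this kernel does not know the event", per the text above.
IGNORED_ERRNOS = {errno.EINVAL, errno.ENOSYS}


def open_counters(events, open_counter):
    """Try to open each event; map it to a handle, or None if unsupported."""
    opened = {}
    for event in events:
        try:
            opened[event] = open_counter(event)
        except OSError as err:
            if err.errno not in IGNORED_ERRNOS:
                raise  # any other error is still fatal
            opened[event] = None  # later printed as "<not counted>"
    return opened
```

Only the two listed errors are swallowed, so permission problems and
similar failures are still reported.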
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio5hjpn3dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
--sync will tell perf stat to run sync() before starting a command.
This allows IO-heavy tests to be used with --repeat, without one
iteration impacting the other.
Elapsed time will stabilize, for example:
before: 3.971525714 seconds time elapsed ( +- 8.56% )
after: 3.211098537 seconds time elapsed ( +- 1.52% )
So measurements will be more accurate.
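The effect of --sync with --repeat can be modelled as a hook that runs
before every iteration. The sketch below is Python, not perf code, and
repeat_timed() is a hypothetical name; os.sync() is the Python wrapper
for the sync() call mentioned above.

```python
import os
import time


def repeat_timed(workload, repeat, pre_run=None):
    """Run workload `repeat` times; call pre_run (if given) before each run."""
    elapsed = []
    for _ in range(repeat):
        if pre_run is not None:
            pre_run()  # e.g. flush dirty pages left over by the previous run
        start = time.perf_counter()
        workload()
        elapsed.append(time.perf_counter() - start)
    return elapsed


# Three runs of a trivial workload, each preceded by sync().
timings = repeat_timed(lambda: sum(range(1000)), 3, pre_run=os.sync)
print(len(timings))
```

The hook runs outside the timed region, so the cost of the flush itself
is not charged to the measured command.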
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Print out this kind of l1-dcache-misses percentage:
Performance counter stats for './bw_tcp localhost':
29,956,262,201 cycles # 3.002 GHz (scaled from 85.14%)
8,255,209,558 stalled-cycles # 27.56% of all cycles are idle (scaled from 86.56%)
1,206,130,308 l1-dcache-misses # 40.49% of all L1-dcache hits (scaled from 86.30%)
2,978,756,779 l1-dcache-refs # 298.512 M/sec (scaled from 70.02%)
8,861,956,159 instructions # 0.30 insns per cycle
# 0.93 stalled cycles per insn (scaled from 84.27%)
1,644,306,068 branches # 164.782 M/sec (scaled from 86.43%)
74,778,443 branch-misses # 4.55% of all branches (scaled from 70.69%)
9978.695711 task-clock # 0.693 CPUs utilized
14.404347983 seconds time elapsed
And color the result depending on the severity of cache-thrashing.
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-54gmz0zymaid84zcs7joq02p@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Print the missed-branches percentage with different warning level ASCII colors,
as the percentage passes the 5%/10%/20% thresholds.
These thresholds are set to relatively low levels, because on most CPUs even a
moderate percentage of branch-misses already shows up as a slowdown.
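The threshold logic can be sketched as below. This is an illustration in
Python, not the perf stat source: the strict ">" comparison and the
mapping of warning levels to particular colors are assumptions, only the
5%/10%/20% values come from the text above.

```python
def warning_level(percent, thresholds=(5.0, 10.0, 20.0)):
    """Return 0..3: how many of the thresholds the percentage exceeds."""
    return sum(percent > t for t in thresholds)


# ANSI escape sequences; which color belongs to which level is assumed.
COLORS = ["", "\033[33m", "\033[35m", "\033[31m"]  # none, yellow, magenta, red
RESET = "\033[0m"


def colorize(percent, thresholds=(5.0, 10.0, 20.0)):
    """Format a percentage, wrapped in a color if it exceeds a threshold."""
    level = warning_level(percent, thresholds)
    text = f"{percent:.2f}%"
    return f"{COLORS[level]}{text}{RESET}" if level else text


print(colorize(4.55))
print(colorize(6.60))
```

Passing a different thresholds tuple lets the same helper serve other
ratios that need less sensitive limits.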
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-ybqukg7p86leiup7gl03ecgk@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Print the stalled-cycles percentage with different warning level ASCII colors,
as the percentage passes the 25%/50%/75% thresholds.
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-e25zz44rcms7mu9az4fu5zp0@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The new default output looks like this:
Performance counter stats for './loop_1b_instructions':
236.010686 task-clock # 0.996 CPUs utilized
0 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
99 page-faults # 0.000 M/sec
756,487,646 cycles # 3.205 GHz
354,938,996 stalled-cycles # 46.92% of all cycles are idle
1,001,403,797 instructions # 1.32 insns per cycle
# 0.35 stalled cycles per insn
100,279,773 branches # 424.895 M/sec
12,646 branch-misses # 0.013 % of all branches
0.236902540 seconds time elapsed
We dropped cache-refs and cache-misses and added stalled-cycles - this is a
more generic "how well utilized is the CPU" metric.
If the stalled-cycles ratio is too high then more specific measurements can be
taken to figure out the source of the inefficiency.
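The right-hand column of the output above can be re-derived from the raw
counts. The following Python sketch only reproduces the arithmetic; the
function name derived_metrics is hypothetical, and task-clock is assumed
to be in milliseconds.

```python
def derived_metrics(task_clock_ms, cycles, stalled, instructions):
    """Compute the ratios printed next to cycles, stalled-cycles and instructions."""
    return {
        # cycles per nanosecond of task clock is the same number as GHz
        "ghz": cycles / (task_clock_ms * 1e6),
        "stalled_pct": 100.0 * stalled / cycles,
        "insns_per_cycle": instructions / cycles,
        "stalled_per_insn": stalled / instructions,
    }


# Counts from the './loop_1b_instructions' run shown above.
m = derived_metrics(236.010686, 756_487_646, 354_938_996, 1_001_403_797)
print(f"{m['ghz']:.3f} GHz")
print(f"{m['stalled_pct']:.2f}% of all cycles are idle")
print(f"{m['insns_per_cycle']:.2f} insns per cycle")
print(f"{m['stalled_per_insn']:.2f} stalled cycles per insn")
```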
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-pbpl2l4mn797s69bclfpwkwn@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add stalled cycles accounting and use it to print the "cycles stalled per
instruction" value.
Also change the unit of the cycles output from M/sec to GHz - this is more
intuitive.
Prettify the output to:
Performance counter stats for './loop_1b_instructions':
239.775036 task-clock # 0.997 CPUs utilized
761,903,912 cycles # 3.178 GHz
356,620,620 stalled-cycles # 46.81% of all cycles are idle
1,001,578,351 instructions # 1.31 insns per cycle
# 0.36 stalled cycles per insn
14,782 cache-references # 0.062 M/sec
5,694 cache-misses # 38.520 % of all cache refs
0.240493656 seconds time elapsed
Also adjust the --repeat output to make the percentages align vertically:
Performance counter stats for './loop_1b_instructions' (10 runs):
236.096793 task-clock # 0.997 CPUs utilized ( +- 0.011% )
756,553,086 cycles # 3.204 GHz ( +- 0.002% )
354,942,692 stalled-cycles # 46.92% of all cycles are idle ( +- 0.008% )
1,001,389,700 instructions # 1.32 insns per cycle
# 0.35 stalled cycles per insn ( +- 0.000% )
10,166 cache-references # 0.043 M/sec ( +- 0.742% )
468 cache-misses # 4.608 % of all cache refs ( +- 13.385% )
0.236874136 seconds time elapsed ( +- 0.01% )
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-uapziqny39601apdmmhoz7hk@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Create update_shadow_stats() which is then used in both read_counter_aggr()
and read_counter().
This not only simplifies the code but also fixes a bug: HW_CACHE_REFERENCES
was not updated in read_counter().
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-9uc55z3g88r47exde7zxjm6p@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Right now we display this by default:
0.202204 task-clock-msecs # 0.282 CPUs
0 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
85 page-faults # 0.420 M/sec
The task-clock-msecs event cannot actually be passed back as an
event name, the event name we recognize is 'task-clock'.
So change the output of the cpu-clock and task-clock events
to be idempotent.
( Units should be printed out in the right-side column, if needed. )
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-lexrnbzy09asscgd4f7oac4i@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently we fail without printing any error message on "perf stat -e task-clock-msecs".
The reason is that the task-clock event is matched and the "-msecs" postfix is assumed
to be an event modifier - but is not recognized.
This patch changes the code to be more informative:
$ perf stat -e task-clock-msecs true
invalid event modifier: '-msecs'
Run 'perf list' for a list of valid events and modifiers
And restructures the return value of parse_event_modifier() to allow
the printing of all variants of invalid event modifiers.
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-wlaw3dvz1ly6wple8l52cfca@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We currently fail on something like '-e CPU-migrations', with:
invalid or unsupported event: 'CPU-migrations'
While 'CPU-migrations' is how we actually print out the event
in the default perf stat output:
Performance counter stats for 'true':
0.202204 task-clock-msecs # 0.282 CPUs
0 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
So change the matching to be case-insensitive.
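As a sketch of what case-insensitive matching means here (Python for
illustration; in C this would typically be a strcasecmp()-style
comparison), with a hypothetical event table and function name:

```python
EVENT_NAMES = ["task-clock", "context-switches", "cpu-migrations", "page-faults"]


def find_event(name):
    """Return the canonical event name matching `name`, ignoring case."""
    wanted = name.lower()
    for event in EVENT_NAMES:
        if event.lower() == wanted:
            return event
    return None


print(find_event("CPU-migrations"))
```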
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-omcm3edjjtx83a4kh2e244se@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The new PERF_COUNT_HW_STALLED_CYCLES event tries to approximate
cycles the CPU does nothing useful, because it is stalled on a
cache-miss or some other condition.
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-fue11vymwqsoo5to72jxxjyl@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Neil Brown pointed out that lock_depth somehow escaped the BKL
removal work. Let's get rid of it now.
Note that the perf scripting utilities still have a bunch of
code for dealing with common_lock_depth in tracepoints; I have
left that in place in case anybody wants to use that code with
older kernels.
Suggested-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110422111910.456c0e84@bike.lwn.net
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Check for required sample attributes using evsel rather than sample_type
in the session header. If the attribute for a default field is not
present for the event type (e.g., new command operating on file from
older kernel) the field is removed from the output list.
Expected event types must exist. For example, if a user specifies
-f trace:time,trace -f sw:time,cpu,sym
the perf.data file must contain both tracepoints and software events
(i.e., it is an error if either does not exist in the file).
Attribute checking is done once at the beginning of perf-script rather
than for each sample.
v1 -> v2:
- addressed comments from acme
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1302148460-570-1-git-send-email-daahern@cisco.com
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The `try-cc' user-defined function was in tools/perf/feature-tests.mak;
this commit moves it to tools/perf/config/utilities.mak.
Signed-off-by: Michael Witten <mfwitten@gmail.com>
Link: http://lkml.kernel.org/n/tip-bqhwcuxsrve0iodn6q4ejaoi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, Python 3 is not supported by perf's code; this
can cause the build to fail for systems that have Python 3
installed as the default python:
python{,-config}
The Correct Solution is to write compatibility code so that
Python 3 works out-of-the-box.
However, users often have an ancillary Python 2 installed:
python2{,-config}
Therefore, a quick fix is to allow the user to specify those
ancillary paths as the python binaries that Makefile should
use, thereby avoiding Python 3 altogether; as an added benefit,
the Python binaries may be installed in non-standard locations
without the need for updating any PATH variable.
This commit adds the ability to set PYTHON and/or PYTHON_CONFIG
either as environment variables or as make variables on the
command line; the paths may be relative, and usually only PYTHON
is necessary in order for PYTHON_CONFIG to be defined implicitly.
Some rudimentary error checking is performed when the user
explicitly specifies a value for any of these variables.
In addition, this commit introduces significantly robust makefile
infrastructure for working with paths and communicating with the
shell; it's currently only used for handling Python, but I hope
it will prove useful in refactoring the makefiles.
Thanks to:
Raghavendra D Prabhu <rprabhu@wnohang.net>
for motivating this patch.
Acked-by: Raghavendra D Prabhu <rprabhu@wnohang.net>
Link: http://lkml.kernel.org/r/e987828e-87ec-4973-95e7-47f10f5d9bab-mfwitten@gmail.com
Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One more installment on an area that is mostly dormant.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf stat doesn't mmap and it's perfectly fine for it to use task-bound
counters with inheritance.
So set the attr.inherit on the caller and leave the syscall itself to
validate it.
When the mmap fails perf_evlist__mmap will just emit a warning if this
is the failure reason.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110414170121.GC3229@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In hists browser, press hotkey 'a' to annotate current symbol.
Currently this causes a segmentation fault if 'a' is pressed on a null symbol.
Here are 2 small bugs:
- In perf_evsel__hists_browse, the condition check after 'a' is pressed
is not correct, we should check ->sym instead of ->map.
- In symbol__tui_annotate we must check whether sym is NULL or not
before getting annotation structure.
This patch fixes both bugs.
Link: http://lkml.kernel.org/r/1302244286.4106.36.camel@minggr.sh.intel.com
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix this:
util/cgroup.c: In function ‘open_cgroup’:
util/cgroup.c:16:16: error: ‘saved_ptr’ may be used uninitialized in this function
util/cgroup.c:16:16: note: ‘saved_ptr’ was declared here
Apparently newer GCC (4.6) can figure out that this variable is properly
initialized - but some versions of GCC (such as 4.5.2) need help.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-32, fpu: Fix FPU exception handling on non-SSE systems
x86, hibernate: Initialize mmu_cr4_features during boot
x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
x86: visws: Fixup irq overhaul fallout
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Clean up rebalance_domains() load-balance interval calculation
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Fix cpumask leak in __setup_irq()
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf probe: Fix listing incorrect line number with inline function
perf probe: Fix to find recursively inlined function
perf probe: Fix multiple --vars options behavior
perf probe: Fix to remove redundant close
perf probe: Fix to ensure function declared file
Fix a bug showing incorrect line number when a probe is put on the head of an
inline function. This patch updates find_perf_probe_point() and introduces new
rules to get correct line number.
- If debuginfo doesn't have a correct file name, we shouldn't return a line
number either, because without a file name a line number is meaningless.
- If the address is in a function, it stores the function name and the offset
from the function entry.
- If the address is on a line, it tries to get the relative line number from
the function entry line, unless the address is the same as the entry
address of the function (in this case, the relative line number should
be 0).
- If the address is in an inline function entry (call-site), it uses the
inline function call line number as the line on which the address is.
- If the address is in an inline function body, it stores the inline
function name and offset from the inline function call site instead of the
(non-inlined) function.
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092605.2132.11629.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix die_find_inlinefunc() to return the correct innermost inlined
function at the given address. Without this fix, it returns the
outermost inlined function.
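The difference between outermost and innermost can be shown on a toy
model of nested inlined instances. This is a Python sketch with
hypothetical types and names, not the libdw-based code in perf: each
instance covers an address range and may contain further inlined
instances.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InlineInstance:
    name: str
    low_pc: int
    high_pc: int  # exclusive upper bound
    children: List["InlineInstance"] = field(default_factory=list)

    def contains(self, addr: int) -> bool:
        return self.low_pc <= addr < self.high_pc


def find_innermost(instances, addr) -> Optional[InlineInstance]:
    """Return the deepest inlined instance whose range covers addr."""
    for inst in instances:
        if inst.contains(addr):
            # Keep descending; fall back to this level if no child matches.
            deeper = find_innermost(inst.children, addr)
            return deeper if deeper is not None else inst
    return None


inner = InlineInstance("inner", 0x120, 0x140)
outer = InlineInstance("outer", 0x100, 0x200, [inner])
print(find_innermost([outer], 0x130).name)
```

Stopping at the first match without descending would return "outer" for
the same address, which is the behaviour described as the bug.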
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092559.2132.78634.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix a bug where perf-probe fails to initialize libdwfl and shows an
incorrect error when the user gives multiple --vars options.
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092553.2132.42691.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since dwfl_end() closes the fd given to dwfl, the caller doesn't need to
close its fd when finishing the process.
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092547.2132.93728.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ensure that the file in which a function is declared matches the given
file name. This fixes a potential bug.
As I've commented on Lin Ming's fastpath enhancement, decl_file should
be checked on each probe point if user gives a probe point as func@file.
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092541.2132.3584.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The default setting of perf record is to mmap 128 pages if the user
did not override with -m.
However, the page size may vary across different architecture
settings, giving a different default size on each.
Moreover, the kernel side still has a default maximum of 512 kiB + 1 page
of mlocked memory for unprivileged users. 128 + 1 pages with a page
size > 4096 exceed this threshold.
Thus, better adapt to this limitation and set the default number of
pages to fit those 512 kiB + 1 page.
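The arithmetic behind the new default, as a small Python sketch (the
function name is hypothetical, and the extra "+ 1 page" is left out of
the data page count):

```python
import os

MLOCK_LIMIT_BYTES = 512 * 1024  # the 512 kiB limit mentioned above


def default_mmap_pages(page_size):
    """Number of data pages that fit into the 512 kiB limit."""
    return MLOCK_LIMIT_BYTES // page_size


print(default_mmap_pages(4096))    # 128, the old hard-coded default
print(default_mmap_pages(65536))   # 8 with 64 kiB pages
print(default_mmap_pages(os.sysconf("SC_PAGE_SIZE")))  # value for this machine
```

With 4 kiB pages the result equals the previous default of 128, so only
configurations with larger pages see a change.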
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1301535324-9735-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Allow:
perf script -f <fields>
to be equivalent to:
perf script -f trace:<fields> -f sw:<fields> -f hw:<fields>
i.e., the specified fields apply to all event types if the type string
is not given.
The field (-f) arguments are processed in the order received. A later
usage can reset a prior request. e.g.,
-f trace: -f comm,tid,time,sym
The first -f suppresses trace events (field list is ""), but then the second
invocation sets the fields to comm,tid,time,sym. In this case a warning is
given to the user:
"Overriding previous field request for all events."
Alternatively, consider the order:
-f comm,tid,time,sym -f trace:
The first -f sets the fields for all events and the second -f suppresses trace
events. The user is given a warning message about the override, and the result
of the above is that only S/W and H/W events are displayed with the given
fields.
For the 'wildcard' option if a user selected field is invalid for an event
type, a message is displayed to the user that the option is ignored for that
type. For example:
perf script -f comm,tid,trace 2>&1 | less
'trace' not valid for hardware events. Ignoring.
'trace' not valid for software events. Ignoring.
Alternatively, if the type is given and an invalid field is specified, it
is an error. For example:
perf script -v -f sw:comm,tid,trace 2>&1 | less
'trace' not valid for software events.
At this point usage is displayed, and perf-script exits.
Finally, a user may not set fields to none for all event types.
i.e., -f "" is not allowed.
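The ordering and override rules above can be modelled with a short
Python sketch. It is not the perf script option parser: the field sets
per event type, the function name and the wording of the per-type
override warning are assumptions; the other messages follow the examples
above.

```python
EVENT_TYPES = ("hw", "sw", "trace")
TYPE_LABEL = {"hw": "hardware", "sw": "software", "trace": "trace"}
VALID_FIELDS = {  # illustrative field sets, assumed from the examples above
    "hw": {"comm", "tid", "time", "cpu", "sym"},
    "sw": {"comm", "tid", "time", "cpu", "sym"},
    "trace": {"comm", "tid", "time", "cpu", "sym", "trace"},
}


def apply_field_args(args):
    """Process -f arguments in order; return ({type: [fields]}, [warnings])."""
    selected, warnings = {}, []
    for arg in args:
        if ":" in arg:  # typed form, e.g. "sw:comm,tid"
            etype, _, spec = arg.partition(":")
            if etype not in EVENT_TYPES:
                raise ValueError(f"unknown event type '{etype}'")
            fields = [f for f in spec.split(",") if f]
            for f in fields:
                if f not in VALID_FIELDS[etype]:
                    raise ValueError(f"'{f}' not valid for {TYPE_LABEL[etype]} events.")
            if etype in selected:
                warnings.append(
                    f"Overriding previous field request for {TYPE_LABEL[etype]} events.")
            selected[etype] = fields  # an empty list suppresses this type
        else:  # wildcard form, applies to all event types
            fields = [f for f in arg.split(",") if f]
            if not fields:
                raise ValueError("fields cannot be set to none for all event types")
            if selected:
                warnings.append("Overriding previous field request for all events.")
            for etype in EVENT_TYPES:
                kept = []
                for f in fields:
                    if f in VALID_FIELDS[etype]:
                        kept.append(f)
                    else:
                        warnings.append(
                            f"'{f}' not valid for {TYPE_LABEL[etype]} events. Ignoring.")
                selected[etype] = kept
    return selected, warnings


print(apply_field_args(["comm,tid,time,sym", "trace:"]))
```

An invalid field is an error in the typed form but only a warning in the
wildcard form, since there the user did not ask for a specific type.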
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
LKML-Reference: <1300377801-27246-1-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pick up these two commits from Arnaldo's perf/core tree:
ca6a42586fae: perf tools: Emit clearer message for sys_perf_event_open ENOENT return
c286c419c784: perf tools: Fixup exit path when not able to open events
As they are really fixes we want to have sooner than later.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix the following build error:
GEN python/perf.so
In file included from util/evsel.h:10,
from util/python.c:6:
util/hist.h:106:18: error: newt.h: No such file or directory
error: command 'x86_64-pc-linux-gnu-gcc' failed with exit status 1
make: *** [python/perf.so] Error 1
by passing BASIC_CFLAGS to setup.py. The BASIC_CFLAGS variable contains
the -DNO_NEWT_SUPPORT switch, which prevents the Python C extension
from being built with newt.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <20110329180236.GA19366@erda.amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If symbol_conf.priv_size is not a multiple of "sizeof(u64)" we'll bus
error on sparc64 in symbol__new because the "struct symbol *" pointer
is computed by adding symbol_conf.priv_size to the memory allocated.
We cannot isolate the fix to symbol__new and symbol__delete since the
private area is computed by subtracting the priv_size value from a
"struct symbol" pointer, so then the private area can still be
potentially unaligned.
So, simply align the symbol_conf.priv_size value in symbol__init()
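Rounding a size up to a multiple of sizeof(u64) is the usual
power-of-two alignment trick. A Python sketch of the arithmetic (the
helper name align_up is hypothetical):

```python
U64_SIZE = 8  # sizeof(u64)


def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


for priv_size in (0, 1, 8, 12):
    print(priv_size, "->", align_up(priv_size, U64_SIZE))
```

Because the aligned size is used both when adding to and when
subtracting from the pointer, the private area and the symbol stay
aligned together.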
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110328.175849.112593455.davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Resend of patch sent back in January 2011 in light of recent confusion around
unsupported events for a given platform.
Improve sys_perf_event_open ENOENT return handling in top and record, just
like 5a3446b does for stat.
Retry of Arnaldo's patch using ui_warning instead of die which allows the
fallback from hardware cycles to software clock.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
LKML-Reference: <1301080271-20945-1-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
[ committer note: Some adjustments to make it apply to newer codebase ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
v3 -> v2:
- Make pubname_search_cb more generic
- Add fastpath to find_probes also
v2 -> v1:
- Don't compare file names with cu_find_realpath(...), instead, compare
them with the name returned by dwarf_decl_file(sp_die)
The vmlinux file may have thousands of CUs.
We can look up the function name in the .debug_pubnames section
to avoid the slow loop over CUs.
1. Improvement data for find_line_range
./perf stat -e cycles -r 10 -- ./perf probe -k /home/mlin/vmlinux \
-s /home/mlin/linux-2.6 \
--line csum_partial_copy_to_user > tmp.log
before patch applied
=====================
847,988,276 cycles
0.355075856 seconds time elapsed
after patch applied
=====================
206,102,622 cycles
0.086883555 seconds time elapsed
2. Improvement data for find_probes
./perf stat -e cycles -r 10 -- ./perf probe -k /home/mlin/vmlinux \
-s /home/mlin/linux-2.6 \
--vars csum_partial_copy_to_user > tmp.log
before patch applied
=====================
848,490,844 cycles
0.355307901 seconds time elapsed
after patch applied
=====================
205,684,469 cycles
0.086694010 seconds time elapsed
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-kernel <linux-kernel@vger.kernel.org>
LKML-Reference: <1301041668.14111.52.camel@minggr.sh.intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have to deal with the TUI mode in perf top, so that we don't end up
with a garbled screen when, say, a non root user on a machine with a
paranoid setting (the default) tries to use 'perf top'.
Introduce a ui__warning_paranoid() routine shared by top and record that
tells the user the valid values for /proc/sys/kernel/perf_event_paranoid.
Cc: David Ahern <daahern@cisco.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf can't currently trace into the vsyscall page. It looks like it was
meant to work.
Tested on 2.6.38 and today's -git.
The bug is easy to reproduce. Compile this:
#include <time.h>

int main()
{
	int i;
	struct timespec t;

	for (i = 0; i < 10000000; i++)
		clock_gettime(CLOCK_MONOTONIC, &t);
	return 0;
}
and run it through perf record; perf report. The top entry shows
"[unknown]" and you can't zoom in.
It looks like there are two issues. The first is that a test for user
mode executing in kernel space is backwards. (That's the first hunk
below). The second (I think) is that something's wrong with the code
that generates lots of little struct dso objects for different sections
-- when it runs on vmlinux it results in bogus long_name values which
cause objdump to fail.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <AANLkTikxSw5+wJZUWNz++nL7mgivCh_Zf=2Kq6=f9Ce_@mail.gmail.com>
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue
perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
perf symbols: Look at .dynsym again if .symtab not found
perf build-id: Add quirk to deal with perf.data file format breakage
perf session: Pass evsel in event_ops->sample()
perf: Better fit max unprivileged mlock pages for tools needs
perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx()
perf top: Fix uninitialized 'counter' variable
tracing: Fix set_ftrace_filter probe function display
perf, x86: Fix Intel fixed counters base initialization
The original intent of the code was to repeat the search with
want_symtab = 0. But as the code stands now, we never hit the "default"
case of the switch statement. Which means we never repeat the search.
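The intended retry can be sketched as follows. This is not the code from the patch: the loader is abstracted behind a hypothetical function pointer, and load_with_fallback() and stripped_binary_loader() are illustrative names only.

```c
/* Hypothetical loader: returns the number of symbols found, 0 if none. */
typedef int (*load_fn)(int want_symtab);

/* First insist on .symtab; if nothing is found, repeat once accepting .dynsym. */
static int load_with_fallback(load_fn try_load)
{
	int want_symtab = 1;
	int nr;

	for (;;) {
		nr = try_load(want_symtab);
		if (nr > 0 || !want_symtab)
			return nr;
		want_symtab = 0;	/* second and final pass */
	}
}

/* Fake loader for illustration: a binary that only carries .dynsym. */
static int stripped_binary_loader(int want_symtab)
{
	return want_symtab ? 0 : 42;
}
```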
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Arun Sharma <asharma@fb.com>
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Dave Martin <dave.martin@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The a1645ce1 changeset:
"perf: 'perf kvm' tool for monitoring guest performance from host"
Added a field to struct build_id_event that broke the file format.
Since the kernel build-id is the first entry, process the table using
the old format if the well known '[kernel.kallsyms]' string for the
kernel build-id has the first 4 characters chopped off (where the pid_t
sits).
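A sketch of the detection described above, assuming the file name has already been read using the new layout. The function name is hypothetical; the only fact taken from the text is that '[kernel.kallsyms]' loses its first four characters when an old-format table is read with the new struct.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical check: "[kernel.kallsyms]" with its first 4 characters
 * ("[ker") swallowed by the pid_t field reads as "nel.kallsyms]".
 */
static bool filename_looks_chopped(const char *filename)
{
	static const char chopped[] = "nel.kallsyms]";

	return strncmp(filename, chopped, sizeof(chopped) - 1) == 0;
}
```

If the first entry matches, the table would be reprocessed with the old, shorter record layout.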
Reported-by: Han Pingtian <phan@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Cc: stable@kernel.org
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Resolve the sample->id to an evsel, since the most advanced tools,
report and annotate, need it, and the others will too when they evolve
to properly support multi-event perf.data files.
Good also because it does an extra validation, checking that the ID is
valid when present. When that is not the case, the overhead is just a
branch + function call (perf_evlist__id2evsel).
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
builtin-top.c has an uninitialized variable.
gcc (version 4.5.1) warns about it, which results in a build failure:
builtin-top.c: In function 'display_thread':
builtin-top.c:518:9: error: 'counter' may be used uninitialized
This situation can indeed trigger, if the getline() call in
prompt_integer() fails.
Signed-off-by: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110323072939.11638.50173.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest:
ktest: Add STOP_TEST_AFTER to stop the test after a period of time
ktest: Monitor kernel while running of user tests
ktest: Fix bug where the test would not end after failure
ktest: Add BISECT_FILES to run git bisect on paths
ktest: Add BISECT_SKIP
ktest: Add manual bisect
ktest: Handle kernels before make oldnoconfig
ktest: Start failure timeout on panic too
ktest: Print logfile name on failure
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
trace, filters: Initialize the match variable in process_ops() properly
trace, documentation: Fix branch profiling location in debugfs
oprofile, s390: Cleanups
oprofile, s390: Remove hwsampler_files.c and merge it into init.c
perf: Fix tear-down of inherited group events
perf: Reorder & optimize perf_event_context to remove alignment padding on 64 bit builds
perf: Handle stopped state with tracepoints
perf: Fix the software events state check
perf, powerpc: Handle events that raise an exception without overflowing
perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints
perf, x86: Clean up SandyBridge PEBS events
perf lock: Fix sorting by wait_min
perf tools: Version incorrect with some versions of grep
perf evlist: New command to list the names of events present in a perf.data file
perf script: Add support for H/W and S/W events
perf script: Add support for dumping symbols
perf script: Support custom field selection for output
perf script: Move printing of 'common' data from print_event and rename
perf tracing: Remove print_graph_cpu and print_graph_proc from trace-event-parse
perf script: Change process_event prototype
...
* 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (172 commits)
USB: Add support for SuperSpeed isoc endpoints
xhci: Clean up cycle bit math used during stalls.
xhci: Fix cycle bit calculation during stall handling.
xhci: Update internal dequeue pointers after stalls.
USB: Disable auto-suspend for USB 3.0 hubs.
USB: Remove bogus USB_PORT_STAT_SUPER_SPEED symbol.
xhci: Return canceled URBs immediately when host is halted.
xhci: Fixes for suspend/resume of shared HCDs.
xhci: Fix re-init on power loss after resume.
xhci: Make roothub functions deal with device removal.
xhci: Limit roothub ports to 15 USB3 & 31 USB2 ports.
xhci: Return a USB 3.0 hub descriptor for USB3 roothub.
xhci: Register second xHCI roothub.
xhci: Change xhci_find_slot_id_by_port() API.
xhci: Refactor bus suspend state into a struct.
xhci: Index with a port array instead of PORTSC addresses.
USB: Set usb_hcd->state and flags for shared roothubs.
usb: Make core allocate resources per PCI-device.
usb: Store bus type in usb_hcd, not in driver flags.
usb: Change usb_hcd->bandwidth_mutex to a pointer.
...
If lock was uncontended, wait_time_min == ULLONG_MAX, so we need to
handle this case differently to show high wait times first
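One way to express that special case is sketched below. The struct and function names are hypothetical simplifications, not the perf lock data structures; the only assumption carried over from the text is that ULLONG_MAX marks a lock that never waited.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for a per-lock statistics record. */
struct lock_stat_sketch {
	unsigned long long wait_time_min;
};

/* Treat the "never contended" sentinel as a zero wait. */
static unsigned long long effective_wait_min(const struct lock_stat_sketch *st)
{
	return st->wait_time_min == ULLONG_MAX ? 0 : st->wait_time_min;
}

/* Sort key: true when 'one' should be shown before 'two'. */
static bool wait_min_bigger(const struct lock_stat_sketch *one,
			    const struct lock_stat_sketch *two)
{
	return effective_wait_min(one) > effective_wait_min(two);
}
```

Without the sentinel handling, every uncontended lock would compare as having the largest minimum wait and crowd out the real entries.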
Acked-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110222174715.GC9687@joi.lan>
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some versions of grep don't treat '\s' properly. When building perf on such
systems and using a kernel tarball, the perf version cannot be determined
from the main kernel Makefile and the user is left with a version of '..'.
Replacing the use of '\s' with '[[:space:]]', which should work in all grep
versions, gives a usable version number.
Reported-by: Tapan Dhimant <tdhimant@akamai.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tapan Dhimant <tdhimant@akamai.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@kernel.org
LKML-Reference: <1300241800-30281-1-git-send-email-johunt@akamai.com>
Signed-off-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (184 commits)
perf probe: Clean up probe_point_lazy_walker() return value
tracing: Fix irqoff selftest expanding max buffer
tracing: Align 4 byte ints together in struct tracer
tracing: Export trace_set_clr_event()
tracing: Explain about unstable clock on resume with ring buffer warning
ftrace/graph: Trace function entry before updating index
ftrace: Add .ref.text as one of the safe areas to trace
tracing: Adjust conditional expression latency formatting.
tracing: Fix event alignment: skb:kfree_skb
tracing: Fix event alignment: mce:mce_record
tracing: Fix event alignment: kvm:kvm_hv_hypercall
tracing: Fix event alignment: module:module_request
tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup
tracing: Remove lock_depth from event entry
perf header: Stop using 'self'
perf session: Use evlist/evsel for managing perf.data attributes
perf top: Don't let events to eat up whole header line
perf top: Fix events overflow in top command
ring-buffer: Remove unused #include <linux/trace_irq.h>
tracing: Add an 'overwrite' trace_option.
...
Newer compilers (gcc 4.6) complain about:
return ret < 0 ?: 0;
For the following reason:
util/probe-finder.c: In function ‘probe_point_lazy_walker’:
util/probe-finder.c:1331:18: error: the omitted middle operand in ?: will always be ‘true’, suggest explicit middle operand [-Werror=parentheses]
And indeed the return value is a somewhat obscure (but correct) value
of 'true', so return 'ret' instead - this is cleaner and unconfuses
GCC as well.
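To make the "value of 'true'" remark concrete, here is a small sketch with the omitted middle operand written out explicitly. The function names are hypothetical, and the second function simply follows the wording above ("return 'ret' instead"); it is not copied from the patch.

```c
/* What "ret < 0 ?: 0" evaluates to: the condition itself when it is non-zero. */
static int old_walker_result(int ret)
{
	int cond = ret < 0;

	return cond ? cond : 0;	/* 1 for any negative ret, otherwise 0 */
}

/* The replacement as described in the message: hand back ret itself. */
static int new_walker_result(int ret)
{
	return ret;
}
```

For a negative ret the old form yields 1, losing the error code, while the new form keeps it; for ret == 0 both yield 0.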
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Allow a user to select which fields to print to stdout for event data.
Options include comm (command name), tid (thread id), pid (process id),
time (perf timestamp), cpu, event (for event name), and trace (for
trace data).
Default is set to maintain compatibility with current output; this
feature does alter output format slightly -- no '-' between command
and pid/tid.
Thanks to Frederic Weisbecker for detailed suggestions on this approach.
Examples (output compressed)
1. trace, default format
perf record -ga -e sched:sched_switch
perf script
swapper 0 [000] 537.037184: sched_switch: prev_comm=swapper prev_pid=0...
sshd 1675 [000] 537.037309: sched_switch: prev_comm=sshd prev_pid=1675...
netstat 1692 [001] 537.038664: sched_switch: prev_comm=netstat prev_pid=1692...
2. trace, custom format
perf record -ga -e sched:sched_switch
perf script -f comm,pid,time,trace <--- omitting cpu and event name
swapper 0 537.037184: prev_comm=swapper prev_pid=0 prev_prio=120 ...
sshd 1675 537.037309: prev_comm=sshd prev_pid=1675 prev_prio=120 ...
netstat 1692 537.038664: prev_comm=netstat prev_pid=1692 prev_prio=120 ...
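As an illustration of how a field list such as "comm,pid,time,trace" could be turned into a set of output flags, consider the sketch below. The flag names, the table and parse_fields() are all hypothetical; only the field names themselves come from the option list above.

```c
#include <stddef.h>
#include <string.h>

enum {
	F_COMM  = 1 << 0,
	F_TID   = 1 << 1,
	F_PID   = 1 << 2,
	F_TIME  = 1 << 3,
	F_CPU   = 1 << 4,
	F_EVENT = 1 << 5,
	F_TRACE = 1 << 6,
};

static const struct {
	const char *name;
	unsigned int bit;
} field_names[] = {
	{ "comm", F_COMM }, { "tid", F_TID }, { "pid", F_PID },
	{ "time", F_TIME }, { "cpu", F_CPU }, { "event", F_EVENT },
	{ "trace", F_TRACE },
};

/* Parse a comma-separated field list; returns 0 if any field is unknown. */
static unsigned int parse_fields(const char *list)
{
	unsigned int mask = 0;
	const char *p = list;

	while (*p) {
		size_t len = strcspn(p, ",");
		unsigned int bit = 0;
		size_t i;

		for (i = 0; i < sizeof(field_names) / sizeof(field_names[0]); i++) {
			if (strlen(field_names[i].name) == len &&
			    strncmp(p, field_names[i].name, len) == 0) {
				bit = field_names[i].bit;
				break;
			}
		}
		if (!bit)
			return 0;
		mask |= bit;
		p += len;
		if (*p == ',')
			p++;
	}
	return mask;
}
```

The printing code would then test each bit before emitting the corresponding column, which is how "cpu" and "event" drop out of the second example.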
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-5-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This change does impact output: latency data is trace specific and is
now printed after the common data - comm, tid, cpu, time and event name.
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-4-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Next patch moves printing of 'common' data into perf-script which
removes the need for these functions.
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-3-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Prepare for handling of samples for any event type.
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-2-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now the --filter option is usable with perf stat too.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1300117230-8404-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
While going thru each of the sym_entry fields looking to reduce it to
the set of entries needed when in an active symbols list, 'skip' should
really be in symbol, as we set it when loading the symtab.
And the space used by the basic symbol allocation remains the same as
we had 5 bytes of padding.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And the DSO__ORIG_ enum to SYMTAB__, to clarify that this is about from
where the symtab was obtained.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can get it from syme->map->dso->kernel (that should be renamed to
origin, but leave this for another patch).
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can get that counter index from perf_top->sym_evsel->idx instead.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stop using this python/OOP convention, it doesn't really help. Will do
more from time to time till we get it cleaned up in all of tools/perf.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we can reuse things like the id to attr lookup routine
(perf_evlist__id2evsel) that uses a hash table instead of the linear
lookup done in the older perf_header_attr routines, etc.
Also to make evsels/evlist a more pervasive API, simplifying use of the
emerging perf lib.
Cc: Arun Sharma <arun@sharma-home.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Passing multiple events might force out information about pid/tid/cpu.
The attached patch leaves 30 characters for this info at the expense of the
events' names.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Han Pingtian <phan@redhat.com>
LKML-Reference: <1299528821-17521-3-git-send-email-jolsa@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The snprintf function returns the number of characters that would have
been printed, even if that exceeds the size parameter. So passing enough
events via the '-e' parameter will cause a segmentation fault.
It can be reproduced with the following command:
perf top -e `perf list | grep Tracepoint | awk -F'[' '\
{gsub(/[[:space:]]+/,"",$1);array[FNR]=$1}END{outputs=array[1];\
for (i=2;i<=FNR;i++){ outputs=outputs "," array[i];};print outputs}'`
The attached patch adds a SNPRINTF macro that provides the overflow check
and returns the actual number of printed characters.
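The same idea can be sketched as a function rather than a macro. bounded_snprintf() below is a hypothetical helper, not the SNPRINTF macro from the patch; it relies only on the standard vsnprintf() contract of returning the untruncated length.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: like snprintf(), but returns the number of
 * characters actually stored in buf (excluding the terminating NUL),
 * so a running offset can never move past the end of the buffer.
 */
static size_t bounded_snprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;			/* encoding error: nothing usable stored */
	if ((size_t)n >= size)
		return size - 1;		/* output was truncated */
	return (size_t)n;
}
```

A caller that accumulates "offset += bounded_snprintf(buf + offset, size - offset, ...)" then stays inside the buffer no matter how many event names are appended.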
Reported-by: Han Pingtian <phan@redhat.com>
Cc: Han Pingtian <phan@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299528821-17521-2-git-send-email-jolsa@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kallsyms has a virtual file name [kernel.kallsyms]. Currently, it can't
be added to buildid cache successfully because the code
(build_id_cache__add_s) tries to resolve [kernel.kallsyms] to a real
absolute pathname and that fails.
Fix it by not resolving the name and just using [kernel.kallsyms] as is,
so the directory ~/.debug/[kernel.kallsyms] is created.
Original bug report at:
https://lkml.org/lkml/2011/3/1/524
Tested-by: Han Pingtian <phan@redhat.com>
Cc: Han Pingtian <phan@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299165837-27817-1-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, if a test causes constant output but never reaches a
boot prompt, or crashes, the test will never stop. Add STOP_TEST_AFTER
to create a variable that will stop (and fail) the test after it has run
for this amount of time. The default is 10 minutes. Setting this
variable to -1 will disable it.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>