OpenCloudOS-Kernel

Frederic Weisbecker d6b17bebd7 perf: Provide a new deterministic events reordering algorithm

The current events reordering algorithm is based on a heuristic that breaks once we deal with a very fast flow of events.

Indeed, time period based flushing is no longer suitable in the following case, assuming we have a flush period of two seconds.

      CPU 0           |        CPU 1
                      |
    cnt1 timestamps   |   cnt1 timestamps
                      |
      0               |         0
      1               |         1
      2               |         2
      3               |         3
      [...]           |        [...]
     4 seconds later

If we spend too much time reading the buffers (when there are a lot of events to record in each buffer, or a lot of CPU buffers to read), in the next pass the CPU 0 buffer could contain a slice of several seconds of events. We'll read them all and notice we've reached the period to flush. In the above example we flush the first half of the CPU 0 buffer, then we read the CPU 1 buffer, which holds events that belonged to the slice already flushed, and the reordering fails.

It's simple to reproduce with:

    perf lock record perf bench sched messaging

To solve this, we use a new solution that no longer relies on a heuristic time slice period but on a deterministic basis derived from how perf record does its job.

perf record saves the buffers through passes. A pass is a tour of every buffer of every CPU. This is done in order: for each CPU we read the buffers of every counter. So the more buffers we visit, the later the timestamps of their events will be.

When perf record finishes a pass it records a PERF_RECORD_FINISHED_ROUND pseudo event. We record the max timestamp t found in pass n. Assuming these timestamps are monotonic across CPUs, we know that if a buffer still has events with timestamps below t, they will all be available and then read in pass n + 1. Hence when we start to read pass n + 2, we can safely flush all events with timestamps below t.

    ============ PASS n =================
       CPU 0         |   CPU 1
                     |
    cnt1 timestamps  |   cnt2 timestamps
          1          |         2
          2          |         3
          -          |         4  <--- max recorded

    ============ PASS n + 1 ==============
       CPU 0         |   CPU 1
                     |
    cnt1 timestamps  |   cnt2 timestamps
          3          |         5
          4          |         6
          5          |         7 <---- max recorded

          Flush all events below timestamp 4

    ============ PASS n + 2 ==============
       CPU 0         |   CPU 1
                     |
    cnt1 timestamps  |   cnt2 timestamps
          6          |         8
          7          |         9
          -          |         10

          Flush all events below timestamp 7
          etc...

It also works on perf.data versions that don't have PERF_RECORD_FINISHED_ROUND pseudo events. The difference is that the events will only be flushed at the end of the perf.data processing. It will then consume more memory and scale less well with large perf.data files.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>

2010-05-09 13:43:42 +02:00
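The round-based flushing rule described in the commit message above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code the commit adds under util: `struct reorder_queue`, `queue_event()`, `flush_below()`, `finished_round()`, `flush_all()` and the fixed `QUEUE_CAP` are hypothetical names invented here, events are reduced to bare timestamps, and "below t" is taken to mean strictly below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_CAP 64 /* hypothetical fixed capacity, enough for a demo */

/* Hypothetical reordering state; not the actual perf session structures. */
struct reorder_queue {
	uint64_t pending[QUEUE_CAP]; /* queued timestamps, sorted ascending */
	size_t   nr_pending;
	uint64_t max_timestamp;      /* highest timestamp seen so far */
	uint64_t next_flush;         /* max timestamp as of the previous round */
	uint64_t out[QUEUE_CAP];     /* timestamps delivered, in final order */
	size_t   nr_out;
};

/* Queue one event, keeping the pending array sorted (insertion step). */
static void queue_event(struct reorder_queue *q, uint64_t ts)
{
	size_t i = q->nr_pending;

	assert(q->nr_pending < QUEUE_CAP);
	while (i > 0 && q->pending[i - 1] > ts) {
		q->pending[i] = q->pending[i - 1];
		i--;
	}
	q->pending[i] = ts;
	q->nr_pending++;
	if (ts > q->max_timestamp)
		q->max_timestamp = ts;
}

/* Deliver every pending event whose timestamp is strictly below limit. */
static void flush_below(struct reorder_queue *q, uint64_t limit)
{
	size_t n = 0;

	while (n < q->nr_pending && q->pending[n] < limit) {
		assert(q->nr_out < QUEUE_CAP);
		q->out[q->nr_out++] = q->pending[n++];
	}
	/* Shift the events that stay queued to the front of the array. */
	memmove(q->pending, q->pending + n,
		(q->nr_pending - n) * sizeof(q->pending[0]));
	q->nr_pending -= n;
}

/*
 * Called when the end-of-pass marker is met in the stream: events older
 * than the previous round's max are complete, so deliver them, then
 * remember the current max for the next round.
 */
static void finished_round(struct reorder_queue *q)
{
	flush_below(q, q->next_flush);
	q->next_flush = q->max_timestamp;
}

/* End of input: nothing newer can arrive, so deliver everything left. */
static void flush_all(struct reorder_queue *q)
{
	for (size_t i = 0; i < q->nr_pending; i++) {
		assert(q->nr_out < QUEUE_CAP);
		q->out[q->nr_out++] = q->pending[i];
	}
	q->nr_pending = 0;
}
```

Starting from a zero-initialized `struct reorder_queue` and feeding the three passes from the commit message in buffer order, calling `finished_round()` after each, delivers nothing after pass n, the five events below timestamp 4 after pass n + 1, and the six events below timestamp 7 after pass n + 2. A stream with no end-of-pass markers would only ever reach `flush_all()`, which corresponds to the fallback the message describes: correct ordering at the cost of holding every event in memory until the end.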
Documentation	perf list: Improve the raw hw event descriptor documentation	2010-05-07 14:07:05 -03:00
arch	perf probe: Add PowerPC DWARF register number mappings	2010-04-22 13:48:31 +10:00
bench	perf: Fix endianness argument compatibility with OPT_BOOLEAN() and introduce OPT_INCR()	2010-04-14 11:26:44 +02:00
scripts	perf: Remove leftover useless options to record trace events from scripts	2010-04-30 19:55:00 +02:00
util	perf: Provide a new deterministic events reordering algorithm	2010-05-09 13:43:42 +02:00
.gitignore	perf: Ignore perf-archive temp file	2010-01-29 10:37:33 +01:00
CREDITS	perf_counter tools: Add CREDITS file for Git contributors	2009-06-24 19:54:29 +02:00
Makefile	perf: add perf-inject builtin	2010-05-02 13:36:56 -03:00
builtin-annotate.c	perf: add perf-inject builtin	2010-05-02 13:36:56 -03:00
builtin-bench.c	perf bench: Add "all" pseudo subsystem and "all" pseudo suite	2009-12-14 08:51:19 +01:00
builtin-buildid-cache.c	perf: Fix endianness argument compatibility with OPT_BOOLEAN() and introduce OPT_INCR()	2010-04-14 11:26:44 +02:00
builtin-buildid-list.c	perf: add perf-inject builtin	2010-05-02 13:36:56 -03:00
builtin-diff.c	perf: add perf-inject builtin	2010-05-02 13:36:56 -03:00
builtin-help.c	perf: Fix endianness argument compatibility with OPT_BOOLEAN() and introduce OPT_INCR()	2010-04-14 11:26:44 +02:00
builtin-inject.c	perf inject: Add missing bits	2010-05-04 10:48:22 -03:00
builtin-kmem.c	perf: add perf-inject builtin	2010-05-02 13:36:56 -03:00
builtin-kvm.c	perf: 'perf kvm' tool for monitoring guest performance from host	2010-04-19 12:37:24 +03:00
builtin-list.c	perf list: Fix large list output by using the pager	2009-08-13 09:05:48 +02:00
builtin-lock.c	perf: add perf-inject builtin	2010-05-02 13:36:56 -03:00
builtin-probe.c	perf probe: Add --max-probes option	2010-04-26 15:35:20 -03:00
builtin-record.c	perf: Introduce a new "round of buffers read" pseudo event	2010-05-09 13:43:42 +02:00
builtin-report.c	perf report: Document '--call-graph' better for usage	2010-05-08 18:11:44 +02:00
builtin-sched.c	perf: add perf-inject builtin	2010-05-02 13:36:56 -03:00
builtin-stat.c	perf: Fix endianness argument compatibility with OPT_BOOLEAN() and introduce OPT_INCR()	2010-04-14 11:26:44 +02:00
builtin-test.c	perf test: Initial regression testing command	2010-04-29 18:59:23 -03:00
builtin-timechart.c	perf: add perf-inject builtin	2010-05-02 13:36:56 -03:00
builtin-top.c	perf, x86: Improve the PEBS ABI	2010-05-07 11:31:02 +02:00
builtin-trace.c	perf: add perf-inject builtin	2010-05-02 13:36:56 -03:00
builtin.h	perf: add perf-inject builtin	2010-05-02 13:36:56 -03:00
command-list.txt	perf inject: Add missing bits	2010-05-04 10:48:22 -03:00
design.txt	perf: Fix few typos + cosmetics	2010-01-13 17:39:44 +01:00
perf-archive.sh	perf archive: Explain how to use the generated tarball	2010-03-23 20:33:08 +01:00
perf.c	perf: add perf-inject builtin	2010-05-02 13:36:56 -03:00
perf.h	perf: 'perf kvm' tool for monitoring guest performance from host	2010-04-19 12:37:24 +03:00