linux-sg2042/kernel
Frederic Weisbecker f413cdb80c perf_counter: Fix/complete ftrace event records sampling
This patch implements the kernel side support for ftrace event
record sampling.

A new counter sampling attribute is added:

   PERF_SAMPLE_TP_RECORD

which requests ftrace events record sampling. In this case
if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
fires, we emit the tracepoint binary record to the
perfcounter event buffer, as a sample.

Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
record:

 perf record -f -F 1 -a -e workqueue:workqueue_execution
 perf report -D

 0x21e18 [0x48]: event: 9
 .
 . ... raw event: size 72 bytes
 .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
 .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
 .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
 .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
 .  0040:  e0 b1 31 81 ff ff ff ff                          .......
.
0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33

The raw ftrace binary record starts at offset 0020.

Translation:

 struct trace_entry {
	type		= 0x2b = 43;
	flags		= 1;
	preempt_count	= 2;
	pid		= 0xa = 10;
	tgid		= 0xa = 10;
 }

 thread_comm = "events/1"
 thread_pid  = 0xa = 10;
 func	    = 0xffffffff8131b1e0 = flush_to_ldisc()

What will come next?

 - Userspace support ('perf trace'), 'flight data recorder' mode
   for perf trace, etc.

 - The unconditional copy from the profiling callback brings
   some costs however if someone wants no such sampling to
   occur, and needs to be fixed in the future. For that we need
   to have an instant access to the perf counter attribute.
   This is a matter of a flag to add in the struct ftrace_event.

 - Take care of the events recursivity! Don't ever try to record
   a lock event for example, it seems some locking is used in
   the profiling fast path and lead to a tracing recursivity.
   That will be fixed using raw spinlock or recursivity
   protection.

 - [...]

 - Profit! :-)

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:53:48 +02:00
..
gcov gcov: enable GCOV_PROFILE_ALL for x86_64 2009-06-18 13:03:58 -07:00
irq genirq: Fix UP compile failure caused by irq_thread_check_affinity 2009-07-22 23:18:46 +02:00
power headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
time clocksource: Prevent NULL pointer dereference 2009-07-19 17:15:54 +02:00
trace perf_counter: Fix/complete ftrace event records sampling 2009-08-09 12:53:48 +02:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.preempt rcu: provide RCU options on non-preempt architectures too 2008-12-25 09:31:28 +01:00
Makefile Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-06-28 11:05:04 -07:00
acct.c bsdacct: fix access to invalid filp in acct_on() 2009-06-30 18:56:00 -07:00
async.c async: Fix lack of boot-time console due to insufficient synchronization 2009-06-08 12:31:53 -07:00
audit.c Fix rule eviction order for AUDIT_DIR 2009-06-24 00:02:38 -04:00
audit.h Fix rule eviction order for AUDIT_DIR 2009-06-24 00:02:38 -04:00
audit_tree.c Fix rule eviction order for AUDIT_DIR 2009-06-24 00:02:38 -04:00
audit_watch.c Audit: clean up all op= output to include string quoting 2009-06-24 00:00:52 -04:00
auditfilter.c Audit: clean up all op= output to include string quoting 2009-06-24 00:00:52 -04:00
auditsc.c Fix rule eviction order for AUDIT_DIR 2009-06-24 00:02:38 -04:00
backtracetest.c
bounds.c
capability.c [CVE-2009-0029] System call wrappers part 04 2009-01-14 14:15:19 +01:00
cgroup.c cgroup avoid permanent sleep at rmdir 2009-07-29 19:10:35 -07:00
cgroup_debug.c debug cgroup: remove unneeded cgroup_lock 2009-04-02 19:04:54 -07:00
cgroup_freezer.c
compat.c signals: implement sys_rt_tgsigqueueinfo 2009-04-30 19:24:24 +02:00
configs.c
cpu.c mm/init: cpu_hotplug_init() must be initialized before SLAB 2009-06-22 21:18:12 -07:00
cpuset.c cpuset,mm: update tasks' mems_allowed in time 2009-06-16 19:47:31 -07:00
cred-internals.h
cred.c CRED: Rename cred_exec_mutex to reflect that it's a guard against ptrace 2009-05-11 08:15:36 +10:00
delayacct.c schedstat: consolidate per-task cpu runtime stats 2008-12-18 13:54:01 +01:00
dma-coherent.c dma-coherent: Restore dma_alloc_from_coherent() large alloc fall back policy. 2009-01-21 18:51:53 +09:00
dma.c
exec_domain.c Get rid of indirect include of fs_struct.h 2009-03-31 23:00:27 -04:00
exit.c headers: mnt_namespace.h redux 2009-07-08 09:31:56 -07:00
extable.c Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-04-05 11:04:19 -07:00
fork.c execve: must clear current->clear_child_tid 2009-08-07 10:39:56 -07:00
freezer.c sched: fix nr_uninterruptible accounting of frozen tasks really 2009-07-18 14:19:53 +02:00
futex.c futexes: Fix infinite loop in get_futex_key() on huge page 2009-07-11 12:40:44 +02:00
futex_compat.c
groups.c groups: move code to kernel/groups.c 2009-06-16 19:47:48 -07:00
hrtimer.c hrtimer: Fix migration expiry check 2009-07-10 17:32:55 +02:00
hung_task.c softlockup: ensure the task has been switched out once 2009-02-11 11:04:16 +01:00
itimer.c timers: split process wide cpu clocks/timers 2009-02-05 13:04:33 +01:00
kallsyms.c kernel/kallsyms.c: replace deprecated __initcall with device_initcall and fix whitespace 2009-06-09 22:37:52 +02:00
kexec.c kexec: fix omitting offset in extended crashkernel syntax 2009-07-29 19:10:34 -07:00
kfifo.c kernel/kfifo.c: replace conditional test with is_power_of_2() 2009-06-16 19:47:47 -07:00
kgdb.c sysrq, intel_fb: fix sysrq g collision 2009-05-15 07:56:24 -05:00
kmod.c headers: mnt_namespace.h redux 2009-07-08 09:31:56 -07:00
kprobes.c kprobes: Use kernel_text_address() for checking probe address 2009-07-30 16:44:06 -07:00
ksysfs.c kernel/ksysfs.c:fix dependence on CONFIG_NET 2009-01-06 10:44:31 -08:00
kthread.c update the comment in kthread_stop() 2009-07-27 12:15:46 -07:00
latencytop.c sched, latencytop: incorporate review feedback from Andrew Morton 2009-02-11 10:18:04 +01:00
lockdep.c Merge branch 'linus' into tracing/core 2009-05-07 11:17:34 +02:00
lockdep_internals.h lockdep: increase MAX_LOCKDEP_ENTRIES and MAX_LOCKDEP_CHAINS 2009-05-12 19:59:52 +02:00
lockdep_proc.c lockstat: warn about disabled lock debugging 2009-02-14 23:28:28 +01:00
lockdep_states.h lockdep: move state bit definitions around 2009-02-14 23:27:59 +01:00
marker.c
module.c module: use MODULE_SYMBOL_PREFIX with module_layout 2009-07-27 12:15:45 -07:00
mutex-debug.c mutex: implement adaptive spinning 2009-01-14 18:09:02 +01:00
mutex-debug.h mutex: implement adaptive spinning 2009-01-14 18:09:02 +01:00
mutex.c Merge branch 'linus' into perfcounters/core 2009-06-11 17:55:42 +02:00
mutex.h mutex: implement adaptive spinning 2009-01-14 18:09:02 +01:00
notifier.c Merge commit 'v2.6.28-rc6' into core/debug 2008-11-26 08:22:50 +01:00
ns_cgroup.c cgroups: relax ns_can_attach checks to allow attaching to grandchild cgroups 2009-04-02 19:04:53 -07:00
nsproxy.c nsproxy: extract create_nsproxy() 2009-06-18 13:03:56 -07:00
panic.c trace: stop tracer in oops_enter() 2009-07-24 15:30:45 -04:00
params.c module_param: allow 'bool' module_params to be bool, not just int. 2009-06-12 21:46:58 +09:30
perf_counter.c perf_counter: Fix/complete ftrace event records sampling 2009-08-09 12:53:48 +02:00
pid.c kmemleak: Remove alloc_bootmem annotations introduced in the past 2009-07-09 17:07:02 +01:00
pid_namespace.c pidns: rewrite copy_pid_ns() 2009-06-18 13:03:55 -07:00
pm_qos_params.c
posix-cpu-timers.c kernel/posix-cpu-timers.c: fix sparse warning 2009-04-30 08:08:31 +02:00
posix-timers.c posix-timers: Fix oops in clock_nanosleep() with CLOCK_MONOTONIC_RAW 2009-08-04 10:16:41 +02:00
printk.c printk: Add KERN_DEFAULT printk log-level 2009-06-16 11:02:28 -07:00
profile.c profile: suppress warning about large allocations when profile=1 is specified 2009-07-29 19:10:36 -07:00
ptrace.c cred_guard_mutex: do not return -EINTR to user-space 2009-07-06 13:57:04 -07:00
rcuclassic.c kmemtrace, rcu: fix linux/rcutree.h and linux/rcuclassic.h dependencies 2009-04-03 12:23:02 +02:00
rcupdate.c RCU: Don't try and predeclare inline funcs as it upsets some versions of gcc 2009-04-15 13:55:14 -07:00
rcupreempt.c rcu: rcu_sched_grace_period(): kill the bogus flush_signals() 2009-05-05 20:28:05 +02:00
rcupreempt_trace.c "Tree RCU": scalable classic RCU implementation 2008-12-18 21:56:04 +01:00
rcutorture.c cpumask: convert rcutorture.c 2009-03-30 22:05:16 +10:30
rcutree.c rcu: Mark Hierarchical RCU no longer experimental 2009-06-24 15:02:48 +02:00
rcutree.h kmemtrace, rcu: fix rcu_tree_trace.c data structure dependencies 2009-04-03 12:23:03 +02:00
rcutree_trace.c rcu: Add __rcu_pending tracing to hierarchical RCU 2009-04-14 11:33:43 +02:00
relay.c Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-04-05 11:04:19 -07:00
res_counter.c memcg: add interface to reset limits 2009-06-18 13:03:48 -07:00
resource.c kernel/resource.c: fix sign extension in reserve_setup() 2009-06-30 18:56:00 -07:00
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c trivial: fix ETIMEOUT -> ETIMEDOUT typos 2009-06-12 18:01:50 +02:00
rtmutex.h
rtmutex_common.h rt_mutex: add proxy lock routines 2009-04-06 11:14:02 +02:00
rwsem.c
sched.c sched: fix load average accounting vs. cpu hotplug 2009-07-18 14:19:52 +02:00
sched_clock.c sched: Fix fallback sched_clock()'s offset when using jiffies 2009-05-09 10:08:19 +02:00
sched_cpupri.c sched: Fix race in cpupri introduced by cpumask_var changes 2009-08-02 14:23:29 +02:00
sched_cpupri.h cpumask: remove cpumask_t from core 2009-03-30 22:05:17 +10:30
sched_debug.c sched: Hide runqueues from direct refer at source code level 2009-06-17 18:29:42 +02:00
sched_fair.c sched: Fix latencytop and sleep profiling vs group scheduling 2009-08-02 14:10:12 +02:00
sched_features.h Merge branch 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-03-30 17:17:35 -07:00
sched_idletask.c sched, timers: move calc_load() to scheduler 2009-05-15 15:32:45 +02:00
sched_rt.c sched_rt: Fix overload bug on rt group scheduling 2009-07-10 10:43:29 +02:00
sched_stats.h sched: remove unused fields from struct rq 2009-03-24 23:16:51 +01:00
seccomp.c x86-64: seccomp: fix 32/64 syscall hole 2009-03-02 15:41:30 -08:00
semaphore.c
signal.c do_sigaltstack: small cleanups 2009-08-01 11:18:56 -07:00
slow-work.c slow-work: use round_jiffies() for thread pool's cull and OOM timers 2009-06-16 19:47:49 -07:00
smp.c generic-ipi: fix hotplug_cfd() 2009-08-07 10:39:55 -07:00
softirq.c softirq: introduce tasklet_hrtimer infrastructure 2009-07-22 17:01:17 +02:00
softlockup.c softlockup: decouple hung tasks check from softlockup detection 2009-01-16 14:06:04 +01:00
spinlock.c Allow rwlocks to re-enable interrupts 2009-04-02 19:05:11 -07:00
srcu.c
stacktrace.c stacktrace: provide save_stack_trace_tsk() weak alias 2008-12-25 11:44:43 +01:00
stop_machine.c cpumask: remove cpumask_t from core 2009-03-30 22:05:17 +10:30
sys.c groups: move code to kernel/groups.c 2009-06-16 19:47:48 -07:00
sys_ni.c Merge commit 'v2.6.29-rc2' into perfcounters/core 2009-01-21 16:37:27 +01:00
sysctl.c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-06-28 11:05:28 -07:00
sysctl_check.c net: add ARP notify option for devices 2009-02-01 01:04:33 -08:00
taskstats.c cpumask: convert rest of files in kernel/ 2009-01-01 10:12:28 +10:30
test_kprobes.c kprobes: add tests for register_kprobes 2009-01-06 15:59:20 -08:00
time.c [CVE-2009-0029] System call wrappers part 01 2009-01-14 14:15:18 +01:00
timeconst.pl
timer.c timer: Avoid reading uninitialized data 2009-07-18 23:11:43 +02:00
tracepoint.c tracepoints: dont update zero-sized tracepoint sections 2009-03-18 19:55:00 +01:00
tsacct.c Fix fixpoint divide exception in acct_update_integrals 2009-03-09 08:13:35 -07:00
uid16.c [CVE-2009-0029] System call wrappers part 19 2009-01-14 14:15:26 +01:00
up.c smp_call_function_single(): be slightly less stupid, fix #2 2009-01-12 16:04:37 +01:00
user.c sched: delayed cleanup of user_struct 2009-06-15 21:30:23 -07:00
user_namespace.c Fix recursive lock in free_uid()/free_user_ns() 2009-02-27 16:26:21 -08:00
utsname.c utsns: extract creeate_uts_ns() 2009-06-18 13:03:55 -07:00
utsname_sysctl.c proc_sysctl: use CONFIG_PROC_SYSCTL around ipc and utsname proc_handlers 2009-04-02 19:05:01 -07:00
wait.c wait: don't use __wake_up_common() 2009-04-14 17:17:16 +02:00
workqueue.c ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro 2009-06-02 01:10:40 +02:00