OpenCloudOS-Kernel/kernel/trace/trace_branch.c

381 lines
8.2 KiB
C
Raw Normal View History

tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
/*
* unlikely profiler
*
* Copyright (C) 2008 Steven Rostedt <srostedt@redhat.com>
*/
#include <linux/kallsyms.h>
#include <linux/seq_file.h>
#include <linux/spinlock.h>
#include <linux/irqflags.h>
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
#include <linux/debugfs.h>
#include <linux/uaccess.h>
#include <linux/module.h>
#include <linux/ftrace.h>
#include <linux/hash.h>
#include <linux/fs.h>
#include <asm/local.h>
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
#include "trace.h"
#include "trace_stat.h"
#include "trace_output.h"
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
#ifdef CONFIG_BRANCH_TRACER
static struct tracer branch_trace;
static int branch_tracing_enabled __read_mostly;
static DEFINE_MUTEX(branch_tracing_mutex);
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
static struct trace_array *branch_tracer;
static void
probe_likely_condition(struct ftrace_branch_data *f, int val, int expect)
{
struct trace_array *tr = branch_tracer;
struct ring_buffer_event *event;
struct trace_branch *entry;
unsigned long flags;
int cpu, pc;
const char *p;
/*
* I would love to save just the ftrace_likely_data pointer, but
* this code can also be used by modules. Ugly things can happen
* if the module is unloaded, and then we go and read the
* pointer. This is slower, but much safer.
*/
if (unlikely(!tr))
return;
local_irq_save(flags);
cpu = raw_smp_processor_id();
if (atomic_inc_return(&tr->data[cpu]->disabled) != 1)
goto out;
pc = preempt_count();
event = trace_buffer_lock_reserve(tr, TRACE_BRANCH,
sizeof(*entry), flags, pc);
if (!event)
goto out;
entry = ring_buffer_event_data(event);
/* Strip off the path, only save the file */
p = f->file + strlen(f->file);
while (p >= f->file && *p != '/')
p--;
p++;
strncpy(entry->func, f->func, TRACE_FUNC_SIZE);
strncpy(entry->file, p, TRACE_FILE_SIZE);
entry->func[TRACE_FUNC_SIZE] = 0;
entry->file[TRACE_FILE_SIZE] = 0;
entry->line = f->line;
entry->correct = val == expect;
ring_buffer_unlock_commit(tr->buffer, event);
out:
atomic_dec(&tr->data[cpu]->disabled);
local_irq_restore(flags);
}
static inline
void trace_likely_condition(struct ftrace_branch_data *f, int val, int expect)
{
if (!branch_tracing_enabled)
return;
probe_likely_condition(f, val, expect);
}
int enable_branch_tracing(struct trace_array *tr)
{
mutex_lock(&branch_tracing_mutex);
branch_tracer = tr;
/*
* Must be seen before enabling. The reader is a condition
* where we do not need a matching rmb()
*/
smp_wmb();
branch_tracing_enabled++;
mutex_unlock(&branch_tracing_mutex);
return 0;
}
void disable_branch_tracing(void)
{
mutex_lock(&branch_tracing_mutex);
if (!branch_tracing_enabled)
goto out_unlock;
branch_tracing_enabled--;
out_unlock:
mutex_unlock(&branch_tracing_mutex);
}
static void start_branch_trace(struct trace_array *tr)
{
enable_branch_tracing(tr);
}
static void stop_branch_trace(struct trace_array *tr)
{
disable_branch_tracing();
}
static int branch_trace_init(struct trace_array *tr)
{
start_branch_trace(tr);
return 0;
}
static void branch_trace_reset(struct trace_array *tr)
{
stop_branch_trace(tr);
}
static enum print_line_t trace_branch_print(struct trace_iterator *iter,
int flags)
{
struct trace_branch *field;
trace_assign_type(field, iter->ent);
if (trace_seq_printf(&iter->seq, "[%s] %s:%s:%d\n",
field->correct ? " ok " : " MISS ",
field->func,
field->file,
field->line))
return TRACE_TYPE_PARTIAL_LINE;
return TRACE_TYPE_HANDLED;
}
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
static struct trace_event trace_branch_event = {
.type = TRACE_BRANCH,
.trace = trace_branch_print,
};
static struct tracer branch_trace __read_mostly =
{
.name = "branch",
.init = branch_trace_init,
.reset = branch_trace_reset,
#ifdef CONFIG_FTRACE_SELFTEST
.selftest = trace_selftest_startup_branch,
#endif /* CONFIG_FTRACE_SELFTEST */
};
__init static int init_branch_tracer(void)
{
int ret;
ret = register_ftrace_event(&trace_branch_event);
if (!ret) {
printk(KERN_WARNING "Warning: could not register "
"branch events\n");
return 1;
}
return register_tracer(&branch_trace);
}
device_initcall(init_branch_tracer);
#else
static inline
void trace_likely_condition(struct ftrace_branch_data *f, int val, int expect)
{
}
#endif /* CONFIG_BRANCH_TRACER */
void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect)
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
{
/*
* I would love to have a trace point here instead, but the
* trace point code is so inundated with unlikely and likely
* conditions that the recursive nightmare that exists is too
* much to try to get working. At least for now.
*/
trace_likely_condition(f, val, expect);
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
/* FIXME: Make this atomic! */
if (val == expect)
f->correct++;
else
f->incorrect++;
}
EXPORT_SYMBOL(ftrace_likely_update);
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
extern unsigned long __start_annotated_branch_profile[];
extern unsigned long __stop_annotated_branch_profile[];
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
static int annotated_branch_stat_headers(struct seq_file *m)
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
{
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
seq_printf(m, " correct incorrect %% ");
seq_printf(m, " Function "
" File Line\n"
" ------- --------- - "
" -------- "
" ---- ----\n");
return 0;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
}
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
static inline long get_incorrect_percent(struct ftrace_branch_data *p)
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
{
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
long percent;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
if (p->correct) {
percent = p->incorrect * 100;
percent /= p->correct + p->incorrect;
} else
percent = p->incorrect ? 100 : -1;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
return percent;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
}
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
static int branch_stat_show(struct seq_file *m, void *v)
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
{
struct ftrace_branch_data *p = v;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
const char *f;
long percent;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
/* Only print the file, not the path */
f = p->file + strlen(p->file);
while (f >= p->file && *f != '/')
f--;
f++;
/*
* The miss is overlayed on correct, and hit on incorrect.
*/
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
percent = get_incorrect_percent(p);
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
seq_printf(m, "%8lu %8lu ", p->correct, p->incorrect);
if (percent < 0)
seq_printf(m, " X ");
else
seq_printf(m, "%3ld ", percent);
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
seq_printf(m, "%-30.30s %-20.20s %d\n", p->func, f, p->line);
return 0;
}
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
static void *annotated_branch_stat_start(void)
{
return __start_annotated_branch_profile;
}
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
static void *
annotated_branch_stat_next(void *v, int idx)
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
{
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
struct ftrace_branch_data *p = v;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
++p;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
if ((void *)p >= (void *)__stop_annotated_branch_profile)
return NULL;
return p;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
}
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
static int annotated_branch_stat_cmp(void *p1, void *p2)
{
struct ftrace_branch_data *a = p1;
struct ftrace_branch_data *b = p2;
long percent_a, percent_b;
percent_a = get_incorrect_percent(a);
percent_b = get_incorrect_percent(b);
if (percent_a < percent_b)
return -1;
if (percent_a > percent_b)
return 1;
else
return 0;
}
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
static struct tracer_stat annotated_branch_stats = {
.name = "branch_annotated",
.stat_start = annotated_branch_stat_start,
.stat_next = annotated_branch_stat_next,
.stat_cmp = annotated_branch_stat_cmp,
.stat_headers = annotated_branch_stat_headers,
.stat_show = branch_stat_show
};
__init static int init_annotated_branch_stats(void)
{
int ret;
ret = register_stat_tracer(&annotated_branch_stats);
if (!ret) {
printk(KERN_WARNING "Warning: could not register "
"annotated branches stats\n");
return 1;
}
return 0;
}
fs_initcall(init_annotated_branch_stats);
#ifdef CONFIG_PROFILE_ALL_BRANCHES
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
extern unsigned long __start_branch_profile[];
extern unsigned long __stop_branch_profile[];
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
static int all_branch_stat_headers(struct seq_file *m)
{
seq_printf(m, " miss hit %% ");
seq_printf(m, " Function "
" File Line\n"
" ------- --------- - "
" -------- "
" ---- ----\n");
return 0;
}
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
static void *all_branch_stat_start(void)
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
{
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
return __start_branch_profile;
}
static void *
all_branch_stat_next(void *v, int idx)
{
struct ftrace_branch_data *p = v;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
++p;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
if ((void *)p >= (void *)__stop_branch_profile)
return NULL;
tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 13:14:39 +08:00
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
return p;
}
static struct tracer_stat all_branch_stats = {
.name = "branch_all",
.stat_start = all_branch_stat_start,
.stat_next = all_branch_stat_next,
.stat_headers = all_branch_stat_headers,
.stat_show = branch_stat_show
};
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
__init static int all_annotated_branch_stats(void)
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
{
int ret;
ret = register_stat_tracer(&all_branch_stats);
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
if (!ret) {
printk(KERN_WARNING "Warning: could not register "
"all branches stats\n");
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
return 1;
}
return 0;
tracing/branch-tracer: adapt to the stat tracing API Impact: refactor the branch tracer This patch adapts the branch tracer to the tracing API. This is a proof of concept because the branch tracer implements two "stat tracing" that were split in two files. So I added an option to the branch tracer: stat_all_branch. If it is set, then trace_stat will output all of the branches entries stats. Otherwise, it will print the annotated branches. Its is a kind of quick trick, waiting for a better solution. By default, the annotated branches stat are sorted by incorrect branch prediction percentage. Ie: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 1 100 native_smp_prepare_cpus smpboot.c 1228 0 1 100 hpet_rtc_timer_reinit hpet.c 1057 0 18032 100 sched_info_queued sched_stats.h 223 0 684 100 yield_task_fair sched_fair.c 984 0 282 100 pre_schedule_rt sched_rt.c 1263 0 13414 100 sched_info_dequeued sched_stats.h 178 0 21724 100 sched_info_switch sched_stats.h 270 0 1 100 get_signal_to_deliver signal.c 1820 0 8 100 __cancel_work_timer workqueue.c 560 0 212 100 verify_export_symbols module.c 1509 0 17 100 __rmqueue_fallback page_alloc.c 793 0 43 100 clear_page_mlock internal.h 129 0 124 100 try_to_unmap_anon rmap.c 1021 0 53 100 try_to_unmap_anon rmap.c 1013 0 6 100 vma_address rmap.c 232 0 3301 100 try_to_unmap_file rmap.c 1082 0 466 100 try_to_unmap_file rmap.c 1077 0 1 100 mem_cgroup_create memcontrol.c 1090 0 3 100 inotify_find_update_watch inotify.c 726 2 30163 99 perf_counter_task_sched_out perf_counter.c 385 1 2935 99 percpu_free allocpercpu.c 138 1544 297672 99 dentry_lru_del_init dcache.c 153 8 1074 99 input_pass_event input.c 86 1390 76781 98 mapping_unevictable pagemap.h 50 280 6665 95 pick_next_task_rt sched_rt.c 889 750 4826 86 next_pidmap pid.c 194 2 8 80 blocking_notifier_chain_regist notifier.c 220 36 130 78 ioremap_pte_range ioremap.c 22 1093 3247 74 IS_ERR err.h 34 1023 2908 73 sched_slice sched_fair.c 445 22 60 73 disk_put_part genhd.h 206 [...] It enables a developer to quickly address the source of incorrect branch predictions. Note that this sorting would be better with a second sort on the number of incorrect predictions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 06:25:38 +08:00
}
fs_initcall(all_annotated_branch_stats);
#endif /* CONFIG_PROFILE_ALL_BRANCHES */