linux-sg2042/arch
Steven Rostedt (Red Hat) 79922b8009 ftrace: Optimize function graph to be called directly
Function graph tracing is a bit different from the function tracers, as
it is processed after either ftrace_caller or ftrace_regs_caller, and
there is only one place to modify the jump to ftrace_graph_caller; that
jump needs to happen after the registers are restored.

The function graph tracer depends on the function tracer: even when
function graph tracing is running by itself, registers are still saved
and restored for function tracing, whether or not function tracing is
happening, before the function graph code is called.

If there's no function tracing happening, it is possible to just call
the function graph tracer directly, and avoid the wasted effort to save
and restore regs for function tracing.

This requires adding new flags to the dyn_ftrace records:

  FTRACE_FL_TRAMP
  FTRACE_FL_TRAMP_EN

The first is set if the count for the record is one, and the ftrace_ops
associated with that record has its own trampoline. That way the mcount
code can call that trampoline directly.
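
The rule for the first flag can be sketched in plain user-space C. This
is only an illustration of the condition described above, not the kernel
implementation: the struct layouts, the sketch_ names, and the bit value
are all hypothetical, and the second flag is left out because this
message does not describe when it is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit value, chosen only for this sketch. */
#define SKETCH_FL_TRAMP (1UL << 0)

/* Hypothetical stand-in for an ftrace_ops. */
struct sketch_ops {
	unsigned long trampoline;	/* 0 if the ops has no trampoline */
};

/* Hypothetical stand-in for a dyn_ftrace record. */
struct sketch_rec {
	unsigned long flags;
	unsigned int ref_count;		/* number of ops tracing this function */
};

/*
 * Set the flag only when exactly one ops traces the function and that
 * ops has its own trampoline; clear it otherwise.
 */
static void sketch_update_tramp_flag(struct sketch_rec *rec,
				     const struct sketch_ops *sole_ops)
{
	if (rec->ref_count == 1 && sole_ops && sole_ops->trampoline)
		rec->flags |= SKETCH_FL_TRAMP;
	else
		rec->flags &= ~SKETCH_FL_TRAMP;
}

/* Pick the address the call site should jump to. */
static unsigned long sketch_call_target(const struct sketch_rec *rec,
					const struct sketch_ops *sole_ops,
					unsigned long generic_caller)
{
	if (rec->flags & SKETCH_FL_TRAMP)
		return sole_ops->trampoline;
	return generic_caller;
}
```

In this sketch a record whose count rises above one falls back to the
generic caller, which is the case where the register save and restore
is still needed.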

In the future, trampolines can be added to arbitrary ftrace_ops, where you
can have two or more ftrace_ops registered to ftrace (like kprobes and perf)
and if they are not tracing the same functions, then instead of doing a
loop to check all registered ftrace_ops against their hashes, just call the
ftrace_ops trampoline directly, which would call the registered ftrace_ops
function directly.
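
A rough user-space sketch of that idea follows. It contrasts a loop over
every registered ops with a lookup that finds the single ops tracing a
function, so its callback could be reached without the loop. Everything
here is hypothetical: the sketch_ names, the flat filter array standing
in for the hashes, and the two demo callbacks are assumptions made for
illustration only.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*sketch_func_t)(unsigned long ip);

/* Hypothetical stand-in for an ftrace_ops with its function filter. */
struct sketch_list_ops {
	sketch_func_t func;
	const unsigned long *filter;	/* addresses this ops traces */
	size_t nfilter;
};

/* Demo callbacks that only count how often they were called. */
static int sketch_hits_a, sketch_hits_b;
static void sketch_cb_a(unsigned long ip) { (void)ip; sketch_hits_a++; }
static void sketch_cb_b(unsigned long ip) { (void)ip; sketch_hits_b++; }

static int sketch_ops_traces(const struct sketch_list_ops *ops,
			     unsigned long ip)
{
	for (size_t i = 0; i < ops->nfilter; i++)
		if (ops->filter[i] == ip)
			return 1;
	return 0;
}

/* Generic path: test every registered ops against the address. */
static void sketch_list_dispatch(const struct sketch_list_ops *all,
				 size_t n, unsigned long ip)
{
	for (size_t i = 0; i < n; i++)
		if (sketch_ops_traces(&all[i], ip))
			all[i].func(ip);
}

/*
 * Return the one ops that traces ip, or NULL when none or several do.
 * A non-NULL result is the case where a per-ops trampoline could be
 * called directly instead of running the loop above on every call.
 */
static const struct sketch_list_ops *
sketch_sole_ops(const struct sketch_list_ops *all, size_t n,
		unsigned long ip)
{
	const struct sketch_list_ops *found = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!sketch_ops_traces(&all[i], ip))
			continue;
		if (found)
			return NULL;	/* shared by several ops */
		found = &all[i];
	}
	return found;
}
```

The lookup would be done once, when the set of registered ops changes,
rather than on every traced call; only functions shared by several ops
would still need the loop.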

Without this patch perf showed:

  0.05%  hackbench  [kernel.kallsyms]  [k] ftrace_caller
  0.05%  hackbench  [kernel.kallsyms]  [k] arch_local_irq_save
  0.05%  hackbench  [kernel.kallsyms]  [k] native_sched_clock
  0.04%  hackbench  [kernel.kallsyms]  [k] __buffer_unlock_commit
  0.04%  hackbench  [kernel.kallsyms]  [k] preempt_trace
  0.04%  hackbench  [kernel.kallsyms]  [k] prepare_ftrace_return
  0.04%  hackbench  [kernel.kallsyms]  [k] __this_cpu_preempt_check
  0.04%  hackbench  [kernel.kallsyms]  [k] ftrace_graph_caller

Note that ftrace_caller took up more time than ftrace_graph_caller did.

With this patch:

  0.05%  hackbench  [kernel.kallsyms]  [k] __buffer_unlock_commit
  0.04%  hackbench  [kernel.kallsyms]  [k] call_filter_check_discard
  0.04%  hackbench  [kernel.kallsyms]  [k] ftrace_graph_caller
  0.04%  hackbench  [kernel.kallsyms]  [k] sched_clock

The ftrace_caller is nowhere to be found and ftrace_graph_caller still
takes up the same percentage.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-01 07:13:31 -04:00
..
alpha Merge commit '3cf2f34' into sched/core, to fix build error 2014-06-12 13:46:37 +02:00
arc ARC: [SMP] Enable icache coherency 2014-06-26 11:59:01 +05:30
arm Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2014-06-29 13:40:08 -07:00
arm64 arm64: mm: remove broken &= operator from pmd_mknotpresent 2014-06-18 16:34:30 +01:00
avr32 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next 2014-06-03 12:57:53 -07:00
blackfin blackfin updates for Linux 3.16 2014-06-12 20:08:47 -07:00
c6x DeviceTree for 3.16: 2014-06-04 10:02:38 -07:00
cris cris: update comments for generic idle conversion 2014-06-06 16:08:18 -07:00
frv sys_sgetmask/sys_ssetmask: add CONFIG_SGETMASK_SYSCALL 2014-06-04 16:54:14 -07:00
hexagon Hexagon: Delete stale barrier.h 2014-05-01 10:09:47 -07:00
ia64 ia64: arch/ia64/include/uapi/asm/fcntl.h needs personality.h 2014-06-23 16:47:44 -07:00
m32r arch,m32r: Convert smp_mb__*() 2014-04-18 14:20:36 +02:00
m68k Merge commit '3cf2f34' into sched/core, to fix build error 2014-06-12 13:46:37 +02:00
metag Merge commit '3cf2f34' into sched/core, to fix build error 2014-06-12 13:46:37 +02:00
microblaze Microblaze patches for 3.16-rc1 2014-06-05 16:15:33 -07:00
mips MIPS: Lasat: Fix build error if CRC32 is not enabled. 2014-06-26 14:43:01 +01:00
mn10300 sys_sgetmask/sys_ssetmask: add CONFIG_SGETMASK_SYSCALL 2014-06-04 16:54:14 -07:00
openrisc DeviceTree for 3.16: 2014-06-04 10:02:38 -07:00
parisc Merge commit '3cf2f34' into sched/core, to fix build error 2014-06-12 13:46:37 +02:00
powerpc powerpc: Don't skip ePAPR spin-table CPUs 2014-06-25 13:10:49 +10:00
s390 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2014-06-21 06:47:01 -10:00
score arch,score: Convert smp_mb__*() 2014-04-18 14:20:43 +02:00
sh Merge commit '3cf2f34' into sched/core, to fix build error 2014-06-12 13:46:37 +02:00
sparc nmi: provide the option to issue an NMI back trace to every cpu but current 2014-06-23 16:47:44 -07:00
tile Merge commit '3cf2f34' into sched/core, to fix build error 2014-06-12 13:46:37 +02:00
um Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2014-06-12 21:23:38 -07:00
unicore32 unicore32: Remove ARCH_HAS_CPUFREQ config option 2014-06-20 08:22:41 +08:00
x86 ftrace: Optimize function graph to be called directly 2014-07-01 07:13:31 -04:00
xtensa Merge commit '3cf2f34' into sched/core, to fix build error 2014-06-12 13:46:37 +02:00
.gitignore
Kconfig