2008-05-13 03:20:51 +08:00
|
|
|
|
|
|
|
# Do not instrument the tracer itself:
|
|
|
|
|
2008-10-07 07:06:12 +08:00
|
|
|
ifdef CONFIG_FUNCTION_TRACER
|
2008-05-13 03:20:51 +08:00
|
|
|
ORIG_CFLAGS := $(KBUILD_CFLAGS)
|
|
|
|
KBUILD_CFLAGS = $(subst -pg,,$(ORIG_CFLAGS))
|
2008-05-13 03:20:54 +08:00
|
|
|
|
2012-07-20 23:13:07 +08:00
|
|
|
ifdef CONFIG_FTRACE_SELFTEST
|
2008-05-13 03:20:54 +08:00
|
|
|
# selftest needs instrumentation
|
|
|
|
CFLAGS_trace_selftest_dynamic.o = -pg
|
|
|
|
obj-y += trace_selftest_dynamic.o
|
2008-05-13 03:20:51 +08:00
|
|
|
endif
|
2012-07-20 23:13:07 +08:00
|
|
|
endif
|
2008-05-13 03:20:51 +08:00
|
|
|
|
2008-11-12 13:14:40 +08:00
|
|
|
# If unlikely tracing is enabled, do not trace these files
|
2008-11-13 04:24:24 +08:00
|
|
|
ifdef CONFIG_TRACING_BRANCHES
|
|
|
|
KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
|
2008-11-12 13:14:40 +08:00
|
|
|
endif
|
|
|
|
|
2014-05-30 10:49:07 +08:00
|
|
|
CFLAGS_trace_benchmark.o := -I$(src)
|
2011-08-11 22:25:54 +08:00
|
|
|
CFLAGS_trace_events_filter.o := -I$(src)
|
|
|
|
|
trace: Stop compiling in trace_clock unconditionally
Commit 56449f437 "tracing: make the trace clocks available generally",
in April 2009, made trace_clock available unconditionally, since
CONFIG_X86_DS used it too.
Commit faa4602e47 "x86, perf, bts, mm: Delete the never used BTS-ptrace code",
in March 2010, removed CONFIG_X86_DS, and now only CONFIG_RING_BUFFER (split
out from CONFIG_TRACING for general use) has a dependency on trace_clock. So,
only compile in trace_clock with CONFIG_RING_BUFFER or CONFIG_TRACING
enabled.
Link: http://lkml.kernel.org/r/20120903024513.GA19583@leaf
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-09-03 10:45:14 +08:00
|
|
|
obj-$(CONFIG_TRACE_CLOCK) += trace_clock.o
|
2009-04-14 17:24:36 +08:00
|
|
|
|
2008-10-07 07:06:12 +08:00
|
|
|
obj-$(CONFIG_FUNCTION_TRACER) += libftrace.o
|
tracing: unified trace buffer
This is a unified tracing buffer that implements a ring buffer that
hopefully everyone will eventually be able to use.
The events recorded into the buffer have the following structure:
struct ring_buffer_event {
u32 type:2, len:3, time_delta:27;
u32 array[];
};
The minimum size of an event is 8 bytes. All events are 4 byte
aligned inside the buffer.
There are 4 types (all internal use for the ring buffer, only
the data type is exported to the interface users).
RINGBUF_TYPE_PADDING: this type is used to note extra space at the end
of a buffer page.
RINGBUF_TYPE_TIME_EXTENT: This type is used when the time between events
is greater than the 27 bit delta can hold. We add another
32 bits, and record that in its own event (8 byte size).
RINGBUF_TYPE_TIME_STAMP: (Not implemented yet). This will hold data to
help keep the buffer timestamps in sync.
RINGBUF_TYPE_DATA: The event actually holds user data.
The "len" field is only three bits. Since the data must be
4 byte aligned, this field is shifted left by 2, giving a
max length of 28 bytes. If the data load is greater than 28
bytes, the first array field holds the full length of the
data load and the len field is set to zero.
Example, data size of 7 bytes:
type = RINGBUF_TYPE_DATA
len = 2
time_delta: <time-stamp> - <prev_event-time-stamp>
array[0..1]: <7 bytes of data> <1 byte empty>
This event is saved in 12 bytes of the buffer.
An event with 82 bytes of data:
type = RINGBUF_TYPE_DATA
len = 0
time_delta: <time-stamp> - <prev_event-time-stamp>
array[0]: 84 (Note the alignment)
array[1..14]: <82 bytes of data> <2 bytes empty>
The above event is saved in 92 bytes (if my math is correct).
82 bytes of data, 2 bytes empty, 4 byte header, 4 byte length.
Do not reference the above event struct directly. Use the following
functions to gain access to the event table, since the
ring_buffer_event structure may change in the future.
ring_buffer_event_length(event): get the length of the event.
This is the size of the memory used to record this
event, and not the size of the data pay load.
ring_buffer_time_delta(event): get the time delta of the event
This returns the delta time stamp since the last event.
Note: Even though this is in the header, there should
be no reason to access this directly, accept
for debugging.
ring_buffer_event_data(event): get the data from the event
This is the function to use to get the actual data
from the event. Note, it is only a pointer to the
data inside the buffer. This data must be copied to
another location otherwise you risk it being written
over in the buffer.
ring_buffer_lock: A way to lock the entire buffer.
ring_buffer_unlock: unlock the buffer.
ring_buffer_alloc: create a new ring buffer. Can choose between
overwrite or consumer/producer mode. Overwrite will
overwrite old data, where as consumer producer will
throw away new data if the consumer catches up with the
producer. The consumer/producer is the default.
ring_buffer_free: free the ring buffer.
ring_buffer_resize: resize the buffer. Changes the size of each cpu
buffer. Note, it is up to the caller to provide that
the buffer is not being used while this is happening.
This requirement may go away but do not count on it.
ring_buffer_lock_reserve: locks the ring buffer and allocates an
entry on the buffer to write to.
ring_buffer_unlock_commit: unlocks the ring buffer and commits it to
the buffer.
ring_buffer_write: writes some data into the ring buffer.
ring_buffer_peek: Look at a next item in the cpu buffer.
ring_buffer_consume: get the next item in the cpu buffer and
consume it. That is, this function increments the head
pointer.
ring_buffer_read_start: Start an iterator of a cpu buffer.
For now, this disables the cpu buffer, until you issue
a finish. This is just because we do not want the iterator
to be overwritten. This restriction may change in the future.
But note, this is used for static reading of a buffer which
is usually done "after" a trace. Live readings would want
to use the ring_buffer_consume above, which will not
disable the ring buffer.
ring_buffer_read_finish: Finishes the read iterator and reenables
the ring buffer.
ring_buffer_iter_peek: Look at the next item in the cpu iterator.
ring_buffer_read: Read the iterator and increment it.
ring_buffer_iter_reset: Reset the iterator to point to the beginning
of the cpu buffer.
ring_buffer_iter_empty: Returns true if the iterator is at the end
of the cpu buffer.
ring_buffer_size: returns the size in bytes of each cpu buffer.
Note, the real size is this times the number of CPUs.
ring_buffer_reset_cpu: Sets the cpu buffer to empty
ring_buffer_reset: sets all cpu buffers to empty
ring_buffer_swap_cpu: swaps a cpu buffer from one buffer with a
cpu buffer of another buffer. This is handy when you
want to take a snap shot of a running trace on just one
cpu. Having a backup buffer, to swap with facilitates this.
Ftrace max latencies use this.
ring_buffer_empty: Returns true if the ring buffer is empty.
ring_buffer_empty_cpu: Returns true if the cpu buffer is empty.
ring_buffer_record_disable: disable all cpu buffers (read only)
ring_buffer_record_disable_cpu: disable a single cpu buffer (read only)
ring_buffer_record_enable: enable all cpu buffers.
ring_buffer_record_enabl_cpu: enable a single cpu buffer.
ring_buffer_entries: The number of entries in a ring buffer.
ring_buffer_overruns: The number of entries removed due to writing wrap.
ring_buffer_time_stamp: Get the time stamp used by the ring buffer
ring_buffer_normalize_time_stamp: normalize the ring buffer time stamp
into nanosecs.
I still need to implement the GTOD feature. But we need support from
the cpu frequency infrastructure. But this can be done at a later
time without affecting the ring buffer interface.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-30 11:02:38 +08:00
|
|
|
obj-$(CONFIG_RING_BUFFER) += ring_buffer.o
|
2009-05-06 10:47:18 +08:00
|
|
|
obj-$(CONFIG_RING_BUFFER_BENCHMARK) += ring_buffer_benchmark.o
|
2008-05-13 03:20:42 +08:00
|
|
|
|
2008-05-13 03:20:42 +08:00
|
|
|
obj-$(CONFIG_TRACING) += trace.o
|
2008-12-24 12:24:12 +08:00
|
|
|
obj-$(CONFIG_TRACING) += trace_output.o
|
2008-12-29 12:44:51 +08:00
|
|
|
obj-$(CONFIG_TRACING) += trace_stat.o
|
2009-03-07 00:21:49 +08:00
|
|
|
obj-$(CONFIG_TRACING) += trace_printk.o
|
2008-05-13 03:20:42 +08:00
|
|
|
obj-$(CONFIG_CONTEXT_SWITCH_TRACER) += trace_sched_switch.o
|
2008-10-07 07:06:12 +08:00
|
|
|
obj-$(CONFIG_FUNCTION_TRACER) += trace_functions.o
|
2008-05-13 03:20:42 +08:00
|
|
|
obj-$(CONFIG_IRQSOFF_TRACER) += trace_irqsoff.o
|
2008-05-13 03:20:42 +08:00
|
|
|
obj-$(CONFIG_PREEMPT_TRACER) += trace_irqsoff.o
|
2008-05-13 03:20:42 +08:00
|
|
|
obj-$(CONFIG_SCHED_TRACER) += trace_sched_wakeup.o
|
2008-09-19 18:06:43 +08:00
|
|
|
obj-$(CONFIG_NOP_TRACER) += trace_nop.o
|
2008-08-28 11:31:01 +08:00
|
|
|
obj-$(CONFIG_STACK_TRACER) += trace_stack.o
|
2008-05-13 03:20:57 +08:00
|
|
|
obj-$(CONFIG_MMIOTRACE) += trace_mmiotrace.o
|
2008-11-26 04:07:04 +08:00
|
|
|
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += trace_functions_graph.o
|
2008-11-13 05:18:45 +08:00
|
|
|
obj-$(CONFIG_TRACE_BRANCH_PROFILING) += trace_branch.o
|
tracing/events: convert block trace points to TRACE_EVENT()
TRACE_EVENT is a more generic way to define tracepoints. Doing so adds
these new capabilities to this tracepoint:
- zero-copy and per-cpu splice() tracing
- binary tracing without printf overhead
- structured logging records exposed under /debug/tracing/events
- trace events embedded in function tracer output and other plugins
- user-defined, per tracepoint filter expressions
...
Cons:
- no dev_t info for the output of plug, unplug_timer and unplug_io events.
no dev_t info for getrq and sleeprq events if bio == NULL.
no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.
This is mainly because we can't get the deivce from a request queue.
But this may change in the future.
- A packet command is converted to a string in TP_assign, not TP_print.
While blktrace do the convertion just before output.
Since pc requests should be rather rare, this is not a big issue.
- In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
has a unique format, which means we have some unused data in a trace entry.
The overhead is minimized by using __dynamic_array() instead of __array().
I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:
dd dd + ioctl blktrace dd + TRACE_EVENT (splice)
1 7.36s, 42.7 MB/s 7.50s, 42.0 MB/s 7.41s, 42.5 MB/s
2 7.43s, 42.3 MB/s 7.48s, 42.1 MB/s 7.43s, 42.4 MB/s
3 7.38s, 42.6 MB/s 7.45s, 42.2 MB/s 7.41s, 42.5 MB/s
So the overhead of tracing is very small, and no regression when using
those trace events vs blktrace.
And the binary output of TRACE_EVENT is much smaller than blktrace:
# ls -l -h
-rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
-rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
-rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out
Following are some comparisons between TRACE_EVENT and blktrace:
plug:
kjournald-480 [000] 303.084981: block_plug: [kjournald]
kjournald-480 [000] 303.084981: 8,0 P N [kjournald]
unplug_io:
kblockd/0-118 [000] 300.052973: block_unplug_io: [kblockd/0] 1
kblockd/0-118 [000] 300.052974: 8,0 U N [kblockd/0] 1
remap:
kjournald-480 [000] 303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384
kjournald-480 [000] 303.085043: 8,0 A W 102736992 + 8 <- (8,8) 33384
bio_backmerge:
kjournald-480 [000] 303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald]
kjournald-480 [000] 303.085086: 8,0 M W 102737032 + 8 [kjournald]
getrq:
kjournald-480 [000] 303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald]
kjournald-480 [000] 303.084975: 8,0 G W 102736984 + 8 [kjournald]
bash-2066 [001] 1072.953770: 8,0 G N [bash]
bash-2066 [001] 1072.953773: block_getrq: 0,0 N 0 + 0 [bash]
rq_complete:
konsole-2065 [001] 300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0]
konsole-2065 [001] 300.053191: 8,0 C W 103669040 + 16 [0]
ksoftirqd/1-7 [001] 1072.953811: 8,0 C N (5a 00 08 00 00 00 00 00 24 00) [0]
ksoftirqd/1-7 [001] 1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0]
rq_insert:
kjournald-480 [000] 303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald]
kjournald-480 [000] 303.084986: 8,0 I W 102736984 + 8 [kjournald]
Changelog from v2 -> v3:
- use the newly introduced __dynamic_array().
Changelog from v1 -> v2:
- use __string() instead of __array() to minimize the memory required
to store hex dump of rq->cmd().
- support large pc requests.
- add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.
- some cleanups.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-09 13:43:05 +08:00
|
|
|
obj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o
|
|
|
|
ifeq ($(CONFIG_BLOCK),y)
|
|
|
|
obj-$(CONFIG_EVENT_TRACING) += blktrace.o
|
|
|
|
endif
|
2009-04-08 16:14:01 +08:00
|
|
|
obj-$(CONFIG_EVENT_TRACING) += trace_events.o
|
|
|
|
obj-$(CONFIG_EVENT_TRACING) += trace_export.o
|
2009-03-07 12:52:59 +08:00
|
|
|
obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o
|
2009-12-21 14:27:35 +08:00
|
|
|
ifeq ($(CONFIG_PERF_EVENTS),y)
|
2010-03-05 12:35:37 +08:00
|
|
|
obj-$(CONFIG_EVENT_TRACING) += trace_event_perf.o
|
2009-12-21 14:27:35 +08:00
|
|
|
endif
|
2009-04-08 16:14:01 +08:00
|
|
|
obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
|
tracing: Add basic event trigger framework
Add a 'trigger' file for each trace event, enabling 'trace event
triggers' to be set for trace events.
'trace event triggers' are patterned after the existing 'ftrace
function triggers' implementation except that triggers are written to
per-event 'trigger' files instead of to a single file such as the
'set_ftrace_filter' used for ftrace function triggers.
The implementation is meant to be entirely separate from ftrace
function triggers, in order to keep the respective implementations
relatively simple and to allow them to diverge.
The event trigger functionality is built on top of SOFT_DISABLE
functionality. It adds a TRIGGER_MODE bit to the ftrace_event_file
flags which is checked when any trace event fires. Triggers set for a
particular event need to be checked regardless of whether that event
is actually enabled or not - getting an event to fire even if it's not
enabled is what's already implemented by SOFT_DISABLE mode, so trigger
mode directly reuses that. Event trigger essentially inherit the soft
disable logic in __ftrace_event_enable_disable() while adding a bit of
logic and trigger reference counting via tm_ref on top of that in a
new trace_event_trigger_enable_disable() function. Because the base
__ftrace_event_enable_disable() code now needs to be invoked from
outside trace_events.c, a wrapper is also added for those usages.
The triggers for an event are actually invoked via a new function,
event_triggers_call(), and code is also added to invoke them for
ftrace_raw_event calls as well as syscall events.
The main part of the patch creates a new trace_events_trigger.c file
to contain the trace event triggers implementation.
The standard open, read, and release file operations are implemented
here.
The open() implementation sets up for the various open modes of the
'trigger' file. It creates and attaches the trigger iterator and sets
up the command parser. If opened for reading set up the trigger
seq_ops.
The read() implementation parses the event trigger written to the
'trigger' file, looks up the trigger command, and passes it along to
that event_command's func() implementation for command-specific
processing.
The release() implementation does whatever cleanup is needed to
release the 'trigger' file, like releasing the parser and trigger
iterator, etc.
A couple of functions for event command registration and
unregistration are added, along with a list to add them to and a mutex
to protect them, as well as an (initially empty) registration function
to add the set of commands that will be added by future commits, and
call to it from the trace event initialization code.
also added are a couple trigger-specific data structures needed for
these implementations such as a trigger iterator and a struct for
trigger-specific data.
A couple structs consisting mostly of function meant to be implemented
in command-specific ways, event_command and event_trigger_ops, are
used by the generic event trigger command implementations. They're
being put into trace.h alongside the other trace_event data structures
and functions, in the expectation that they'll be needed in several
trace_event-related files such as trace_events_trigger.c and
trace_events.c.
The event_command.func() function is meant to be called by the trigger
parsing code in order to add a trigger instance to the corresponding
event. It essentially coordinates adding a live trigger instance to
the event, and arming the triggering the event.
Every event_command func() implementation essentially does the
same thing for any command:
- choose ops - use the value of param to choose either a number or
count version of event_trigger_ops specific to the command
- do the register or unregister of those ops
- associate a filter, if specified, with the triggering event
The reg() and unreg() ops allow command-specific implementations for
event_trigger_op registration and unregistration, and the
get_trigger_ops() op allows command-specific event_trigger_ops
selection to be parameterized. When a trigger instance is added, the
reg() op essentially adds that trigger to the triggering event and
arms it, while unreg() does the opposite. The set_filter() function
is used to associate a filter with the trigger - if the command
doesn't specify a set_filter() implementation, the command will ignore
filters.
Each command has an associated trigger_type, which serves double duty,
both as a unique identifier for the command as well as a value that
can be used for setting a trigger mode bit during trigger invocation.
The signature of func() adds a pointer to the event_command struct,
used to invoke those functions, along with a command_data param that
can be passed to the reg/unreg functions. This allows func()
implementations to use command-specific blobs and supports code
re-use.
The event_trigger_ops.func() command corrsponds to the trigger 'probe'
function that gets called when the triggering event is actually
invoked. The other functions are used to list the trigger when
needed, along with a couple mundane book-keeping functions.
This also moves event_file_data() into trace.h so it can be used
outside of trace_events.c.
Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Idea-by: Steve Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-24 21:59:24 +08:00
|
|
|
obj-$(CONFIG_EVENT_TRACING) += trace_events_trigger.o
|
2009-11-04 08:12:47 +08:00
|
|
|
obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o
|
2010-10-28 23:31:17 +08:00
|
|
|
obj-$(CONFIG_TRACEPOINTS) += power-traces.o
|
2011-09-30 04:07:23 +08:00
|
|
|
ifeq ($(CONFIG_PM_RUNTIME),y)
|
2011-09-28 04:53:27 +08:00
|
|
|
obj-$(CONFIG_TRACEPOINTS) += rpm-traces.o
|
2011-09-30 04:07:23 +08:00
|
|
|
endif
|
2010-08-05 22:22:23 +08:00
|
|
|
ifeq ($(CONFIG_TRACING),y)
|
|
|
|
obj-$(CONFIG_KGDB_KDB) += trace_kdb.o
|
|
|
|
endif
|
2012-04-09 17:11:44 +08:00
|
|
|
obj-$(CONFIG_PROBE_EVENTS) += trace_probe.o
|
2012-04-11 18:30:43 +08:00
|
|
|
obj-$(CONFIG_UPROBE_EVENT) += trace_uprobe.o
|
2008-05-13 03:20:42 +08:00
|
|
|
|
2014-05-30 10:49:07 +08:00
|
|
|
obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o
|
|
|
|
|
2008-05-13 03:20:42 +08:00
|
|
|
libftrace-y := ftrace.o
|