tracing: Add inter-event hist trigger Documentation

Add background and details on inter-event hist triggers, including
hist variables, synthetic events, and actions.

Link: http://lkml.kernel.org/r/b0414efb66535aa52aa7411f58c3d56724027fce.1516069914.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Baohong Liu <baohong.liu@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
This commit is contained in:
Tom Zanussi 2018-01-15 20:52:06 -06:00 committed by Steven Rostedt (VMware)
parent f404da6e1d
commit 033cbceefa
1 changed files with 381 additions and 0 deletions

View File

@ -1603,3 +1603,384 @@
Hits: 489
Entries: 7
Dropped: 0
2.2 Inter-event hist triggers
-----------------------------
Inter-event hist triggers are hist triggers that combine values from
one or more other events and create a histogram using that data. Data
from an inter-event histogram can in turn become the source for
further combined histograms, thus providing a chain of related
histograms, which is important for some applications.
The most important example of an inter-event quantity that can be used
in this manner is latency, which is simply a difference in timestamps
between two events. Although latency is the most important
inter-event quantity, note that because the support is completely
general across the trace event subsystem, any event field can be used
in an inter-event quantity.
An example of a histogram that combines data from other histograms
into a useful chain would be a 'wakeupswitch latency' histogram that
combines a 'wakeup latency' histogram and a 'switch latency'
histogram.
Normally, a hist trigger specification consists of a (possibly
compound) key along with one or more numeric values, which are
continually updated sums associated with that key. A histogram
specification in this case consists of individual key and value
specifications that refer to trace event fields associated with a
single event type.
The inter-event hist trigger extension allows fields from multiple
events to be referenced and combined into a multi-event histogram
specification. In support of this overall goal, a few enabling
features have been added to the hist trigger support:
- In order to compute an inter-event quantity, a value from one
event needs to saved and then referenced from another event. This
requires the introduction of support for histogram 'variables'.
- The computation of inter-event quantities and their combination
require some minimal amount of support for applying simple
expressions to variables (+ and -).
- A histogram consisting of inter-event quantities isn't logically a
histogram on either event (so having the 'hist' file for either
event host the histogram output doesn't really make sense). To
address the idea that the histogram is associated with a
combination of events, support is added allowing the creation of
'synthetic' events that are events derived from other events.
These synthetic events are full-fledged events just like any other
and can be used as such, as for instance to create the
'combination' histograms mentioned previously.
- A set of 'actions' can be associated with histogram entries -
these can be used to generate the previously mentioned synthetic
events, but can also be used for other purposes, such as for
example saving context when a 'max' latency has been hit.
- Trace events don't have a 'timestamp' associated with them, but
there is an implicit timestamp saved along with an event in the
underlying ftrace ring buffer. This timestamp is now exposed as a
a synthetic field named 'common_timestamp' which can be used in
histograms as if it were any other event field; it isn't an actual
field in the trace format but rather is a synthesized value that
nonetheless can be used as if it were an actual field. By default
it is in units of nanoseconds; appending '.usecs' to a
common_timestamp field changes the units to microseconds.
These features are decribed in more detail in the following sections.
2.2.1 Histogram Variables
-------------------------
Variables are simply named locations used for saving and retrieving
values between matching events. A 'matching' event is defined as an
event that has a matching key - if a variable is saved for a histogram
entry corresponding to that key, any subsequent event with a matching
key can access that variable.
A variable's value is normally available to any subsequent event until
it is set to something else by a subsequent event. The one exception
to that rule is that any variable used in an expression is essentially
'read-once' - once it's used by an expression in a subsequent event,
it's reset to its 'unset' state, which means it can't be used again
unless it's set again. This ensures not only that an event doesn't
use an uninitialized variable in a calculation, but that that variable
is used only once and not for any unrelated subsequent match.
The basic syntax for saving a variable is to simply prefix a unique
variable name not corresponding to any keyword along with an '=' sign
to any event field.
Either keys or values can be saved and retrieved in this way. This
creates a variable named 'ts0' for a histogram entry with the key
'next_pid':
# echo 'hist:keys=next_pid:vals=$ts0:ts0=common_timestamp ... >> \
event/trigger
The ts0 variable can be accessed by any subsequent event having the
same pid as 'next_pid'.
Variable references are formed by prepending the variable name with
the '$' sign. Thus for example, the ts0 variable above would be
referenced as '$ts0' in expressions.
Because 'vals=' is used, the common_timestamp variable value above
will also be summed as a normal histogram value would (though for a
timestamp it makes little sense).
The below shows that a key value can also be saved in the same way:
# echo 'hist:timer_pid=common_pid:key=timer_pid ...' >> event/trigger
If a variable isn't a key variable or prefixed with 'vals=', the
associated event field will be saved in a variable but won't be summed
as a value:
# echo 'hist:keys=next_pid:ts1=common_timestamp ... >> event/trigger
Multiple variables can be assigned at the same time. The below would
result in both ts0 and b being created as variables, with both
common_timestamp and field1 additionally being summed as values:
# echo 'hist:keys=pid:vals=$ts0,$b:ts0=common_timestamp,b=field1 ... >> \
event/trigger
Note that variable assignments can appear either preceding or
following their use. The command below behaves identically to the
command above:
# echo 'hist:keys=pid:ts0=common_timestamp,b=field1:vals=$ts0,$b ... >> \
event/trigger
Any number of variables not bound to a 'vals=' prefix can also be
assigned by simply separating them with colons. Below is the same
thing but without the values being summed in the histogram:
# echo 'hist:keys=pid:ts0=common_timestamp:b=field1 ... >> event/trigger
Variables set as above can be referenced and used in expressions on
another event.
For example, here's how a latency can be calculated:
# echo 'hist:keys=pid,prio:ts0=common_timestamp ... >> event1/trigger
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp-$ts0 ... >> event2/trigger
In the first line above, the event's timetamp is saved into the
variable ts0. In the next line, ts0 is subtracted from the second
event's timestamp to produce the latency, which is then assigned into
yet another variable, 'wakeup_lat'. The hist trigger below in turn
makes use of the wakeup_lat variable to compute a combined latency
using the same key and variable from yet another event:
# echo 'hist:key=pid:wakeupswitch_lat=$wakeup_lat+$switchtime_lat ... >> event3/trigger
2.2.2 Synthetic Events
----------------------
Synthetic events are user-defined events generated from hist trigger
variables or fields associated with one or more other events. Their
purpose is to provide a mechanism for displaying data spanning
multiple events consistent with the existing and already familiar
usage for normal events.
To define a synthetic event, the user writes a simple specification
consisting of the name of the new event along with one or more
variables and their types, which can be any valid field type,
separated by semicolons, to the tracing/synthetic_events file.
For instance, the following creates a new event named 'wakeup_latency'
with 3 fields: lat, pid, and prio. Each of those fields is simply a
variable reference to a variable on another event:
# echo 'wakeup_latency \
u64 lat; \
pid_t pid; \
int prio' >> \
/sys/kernel/debug/tracing/synthetic_events
Reading the tracing/synthetic_events file lists all the currently
defined synthetic events, in this case the event defined above:
# cat /sys/kernel/debug/tracing/synthetic_events
wakeup_latency u64 lat; pid_t pid; int prio
An existing synthetic event definition can be removed by prepending
the command that defined it with a '!':
# echo '!wakeup_latency u64 lat pid_t pid int prio' >> \
/sys/kernel/debug/tracing/synthetic_events
At this point, there isn't yet an actual 'wakeup_latency' event
instantiated in the event subsytem - for this to happen, a 'hist
trigger action' needs to be instantiated and bound to actual fields
and variables defined on other events (see Section 6.3.3 below).
Once that is done, an event instance is created, and a histogram can
be defined using it:
# echo 'hist:keys=pid,prio,lat.log2:sort=pid,lat' >> \
/sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger
The new event is created under the tracing/events/synthetic/ directory
and looks and behaves just like any other event:
# ls /sys/kernel/debug/tracing/events/synthetic/wakeup_latency
enable filter format hist id trigger
Like any other event, once a histogram is enabled for the event, the
output can be displayed by reading the event's 'hist' file.
2.2.3 Hist trigger 'actions'
----------------------------
A hist trigger 'action' is a function that's executed whenever a
histogram entry is added or updated.
The default 'action' if no special function is explicity specified is
as it always has been, to simply update the set of values associated
with an entry. Some applications, however, may want to perform
additional actions at that point, such as generate another event, or
compare and save a maximum.
The following additional actions are available. To specify an action
for a given event, simply specify the action between colons in the
hist trigger specification.
- onmatch(matching.event).<synthetic_event_name>(param list)
The 'onmatch(matching.event).<synthetic_event_name>(params)' hist
trigger action is invoked whenever an event matches and the
histogram entry would be added or updated. It causes the named
synthetic event to be generated with the values given in the
'param list'. The result is the generation of a synthetic event
that consists of the values contained in those variables at the
time the invoking event was hit.
The 'param list' consists of one or more parameters which may be
either variables or fields defined on either the 'matching.event'
or the target event. The variables or fields specified in the
param list may be either fully-qualified or unqualified. If a
variable is specified as unqualified, it must be unique between
the two events. A field name used as a param can be unqualified
if it refers to the target event, but must be fully qualified if
it refers to the matching event. A fully-qualified name is of the
form 'system.event_name.$var_name' or 'system.event_name.field'.
The 'matching.event' specification is simply the fully qualified
event name of the event that matches the target event for the
onmatch() functionality, in the form 'system.event_name'.
Finally, the number and type of variables/fields in the 'param
list' must match the number and types of the fields in the
synthetic event being generated.
As an example the below defines a simple synthetic event and uses
a variable defined on the sched_wakeup_new event as a parameter
when invoking the synthetic event. Here we define the synthetic
event:
# echo 'wakeup_new_test pid_t pid' >> \
/sys/kernel/debug/tracing/synthetic_events
# cat /sys/kernel/debug/tracing/synthetic_events
wakeup_new_test pid_t pid
The following hist trigger both defines the missing testpid
variable and specifies an onmatch() action that generates a
wakeup_new_test synthetic event whenever a sched_wakeup_new event
occurs, which because of the 'if comm == "cyclictest"' filter only
happens when the executable is cyclictest:
# echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\
wakeup_new_test($testpid) if comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger
Creating and displaying a histogram based on those events is now
just a matter of using the fields and new synthetic event in the
tracing/events/synthetic directory, as usual:
# echo 'hist:keys=pid:sort=pid' >> \
/sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/trigger
Running 'cyclictest' should cause wakeup_new events to generate
wakeup_new_test synthetic events which should result in histogram
output in the wakeup_new_test event's hist file:
# cat /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/hist
A more typical usage would be to use two events to calculate a
latency. The following example uses a set of hist triggers to
produce a 'wakeup_latency' histogram:
First, we define a 'wakeup_latency' synthetic event:
# echo 'wakeup_latency u64 lat; pid_t pid; int prio' >> \
/sys/kernel/debug/tracing/synthetic_events
Next, we specify that whenever we see a sched_waking event for a
cyclictest thread, save the timestamp in a 'ts0' variable:
# echo 'hist:keys=$saved_pid:saved_pid=pid:ts0=common_timestamp.usecs \
if comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_waking/trigger
Then, when the corresponding thread is actually scheduled onto the
CPU by a sched_switch event, calculate the latency and use that
along with another variable and an event field to generate a
wakeup_latency synthetic event:
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:\
onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,\
$saved_pid,next_prio) if next_comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_switch/trigger
We also need to create a histogram on the wakeup_latency synthetic
event in order to aggregate the generated synthetic event data:
# echo 'hist:keys=pid,prio,lat:sort=pid,lat' >> \
/sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger
Finally, once we've run cyclictest to actually generate some
events, we can see the output by looking at the wakeup_latency
synthetic event's hist file:
# cat /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/hist
- onmax(var).save(field,.. .)
The 'onmax(var).save(field,...)' hist trigger action is invoked
whenever the value of 'var' associated with a histogram entry
exceeds the current maximum contained in that variable.
The end result is that the trace event fields specified as the
onmax.save() params will be saved if 'var' exceeds the current
maximum for that hist trigger entry. This allows context from the
event that exhibited the new maximum to be saved for later
reference. When the histogram is displayed, additional fields
displaying the saved values will be printed.
As an example the below defines a couple of hist triggers, one for
sched_waking and another for sched_switch, keyed on pid. Whenever
a sched_waking occurs, the timestamp is saved in the entry
corresponding to the current pid, and when the scheduler switches
back to that pid, the timestamp difference is calculated. If the
resulting latency, stored in wakeup_lat, exceeds the current
maximum latency, the values specified in the save() fields are
recoreded:
# echo 'hist:keys=pid:ts0=common_timestamp.usecs \
if comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_waking/trigger
# echo 'hist:keys=next_pid:\
wakeup_lat=common_timestamp.usecs-$ts0:\
onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm) \
if next_comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_switch/trigger
When the histogram is displayed, the max value and the saved
values corresponding to the max are displayed following the rest
of the fields:
# cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist
{ next_pid: 2255 } hitcount: 239
common_timestamp-ts0: 0
max: 27
next_comm: cyclictest
prev_pid: 0 prev_prio: 120 prev_comm: swapper/1
{ next_pid: 2256 } hitcount: 2355
common_timestamp-ts0: 0
max: 49 next_comm: cyclictest
prev_pid: 0 prev_prio: 120 prev_comm: swapper/0
Totals:
Hits: 12970
Entries: 2
Dropped: 0