New features:

- Tom Zanussi's extended histogram work
    Adds synthetic events, which allow histograms to combine data from multiple events
    Adds the "onmatch" and "onmax" triggers to call the synthetic events
    Includes several resulting updates to the histogram code
 
  - Allow a way to nest ring buffer calls in the same context
 
  - Allow absolute time stamps in ring buffer
 
  - Rewrite of filter code parsing based on Al Viro's suggestions
 
  - Setting of trace_clock to global if TSC is unstable (on boot)
 
  - Better OOM handling when allocating large ring buffers
 
  - Added initcall tracepoints (consolidated initcall_debug code with them)
 
 And various other fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQHIBAABCgAyFiEEPm6V/WuN2kyArTUe1a05Y9njSUkFAlrLoCAUHHJvc3RlZHRA
 Z29vZG1pcy5vcmcACgkQ1a05Y9njSUks/QwAn/ky8WgfjcRdjKmBYuEwDedvm9iI
 V9G5kpv5JMw5dLz4l1pS3tA3M9Lyuc5z3Shw92FTy36vdU1wxEjQgHa7viB1xk9x
 KsiTyNjTsgrRd7GVHMy/8Be2RRiTRLaXKAsLCoj/c7QWzagV1P8XWlWK5mojYkh/
 DrSXyg9Avkp30+sU1bvcLWnmmZUFqMxs+bWipD9uFc98USMMyeP25nrnhrj0gDTg
 Q93cjXUuyVRC4lJ2YTW0GCSKhMKEw5f/ltEOT1hwScqYkCJj1EubKqS53R/9h21z
 IPUrYcqLnMRu0j2ejR+UAy5Vsy3gJUrPMQb0F6hlu1DwbMd0d/9SGh1c+Sm+zorh
 yftWTdCZsYrXkaOuB6V5M30X+KBwbWO0Xc9VCvgJ/IU5vMlgLSt5itTWbT/Fmfhb
 ll5/RXP7zhSXRv5sdl/BP3/4dd6F8jpyKyaR2Rk2+XjBOGIq5mvqNGr4Vj9AzxW8
 E0nvq7l7e0dbxZNM42gEm3cht1VUg7Zz0Y0+
 =91oN
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "New features:

   - Tom Zanussi's extended histogram work.

     This adds synthetic events, which allow histograms to combine data
     from multiple events. Adds triggers "onmatch" and "onmax" to call
     the synthetic events. Includes several resulting updates to the
     histogram code.

   - Allow a way to nest ring buffer calls in the same context

   - Allow absolute time stamps in ring buffer

   - Rewrite of filter code parsing based on Al Viro's suggestions

   - Setting of trace_clock to global if TSC is unstable (on boot)

   - Better OOM handling when allocating large ring buffers

   - Added initcall tracepoints (consolidated initcall_debug code with
     them)

  And various other fixes and cleanups"

* tag 'trace-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (68 commits)
  init: Have initcall_debug still work without CONFIG_TRACEPOINTS
  init, tracing: Have printk come through the trace events for initcall_debug
  init, tracing: instrument security and console initcall trace events
  init, tracing: Add initcall trace events
  tracing: Add rcu dereference annotation for test func that touches filter->prog
  tracing: Add rcu dereference annotation for filter->prog
  tracing: Fixup logic inversion on setting trace_global_clock defaults
  tracing: Hide global trace clock from lockdep
  ring-buffer: Add set/clear_current_oom_origin() during allocations
  ring-buffer: Check if memory is available before allocation
  lockdep: Add print_irqtrace_events() to __warn
  vsprintf: Do not preprocess non-dereferenced pointers for bprintf (%px and %pK)
  tracing: Uninitialized variable in create_tracing_map_fields()
  tracing: Make sure variable string fields are NULL-terminated
  tracing: Add action comparisons when testing matching hist triggers
  tracing: Don't add flag strings when displaying variable references
  tracing: Fix display of hist trigger expressions containing timestamps
  ftrace: Drop a VLA in module_exists()
  tracing: Mention trace_clock=global when warning about unstable clocks
  tracing: Default to using trace_global_clock if sched_clock is unstable
  ...
Linus Torvalds 2018-04-10 11:27:30 -07:00
commit 2a56bb596b
30 changed files with 8411 additions and 3307 deletions

File diff suppressed because it is too large.


@ -543,6 +543,30 @@ of ftrace. Here is a list of some of the key files:
See events.txt for more information.

  timestamp_mode:

        Certain tracers may change the timestamp mode used when
        logging trace events into the event buffer. Events with
        different modes can coexist within a buffer but the mode in
        effect when an event is logged determines which timestamp mode
        is used for that event. The default timestamp mode is
        'delta'.

        Usual timestamp modes for tracing:

          # cat timestamp_mode
          [delta] absolute

          The timestamp mode with the square brackets around it is the
          one in effect.

          delta: Default timestamp mode - timestamp is a delta against
                 a per-buffer timestamp.

          absolute: The timestamp is a full timestamp, not a delta
                 against some other value. As such it takes up more
                 space and is less efficient.

  hwlat_detector:

        Directory for the Hardware Latency Detector.
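
The bracket convention described for timestamp_mode (the active mode is the one in square brackets) can be read back programmatically. The following is a minimal user-space sketch, not kernel code; the helper name `active_timestamp_mode` is hypothetical and only the `[delta] absolute` line format comes from the documentation text above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (active) mode from a timestamp_mode line such as
 * "[delta] absolute" into out. Hypothetical helper for illustration.
 * Returns 0 on success, -1 if no bracketed entry is found or if out
 * is too small to hold the mode name plus the terminating NUL.
 */
int active_timestamp_mode(const char *line, char *out, size_t outlen)
{
	const char *open = strchr(line, '[');
	const char *close;
	size_t len;

	if (!open)
		return -1;
	close = strchr(open + 1, ']');
	if (!close)
		return -1;

	len = (size_t)(close - open - 1);
	if (len + 1 > outlen)
		return -1;

	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}
```

A caller would read the single line from the timestamp_mode file and pass it in; the tracefs path is left to the caller because it depends on where tracefs is mounted.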

File diff suppressed because it is too large.


@ -34,10 +34,12 @@ struct ring_buffer_event {
* array[0] = time delta (28 .. 59) * array[0] = time delta (28 .. 59)
* size = 8 bytes * size = 8 bytes
* *
* @RINGBUF_TYPE_TIME_STAMP: Sync time stamp with external clock * @RINGBUF_TYPE_TIME_STAMP: Absolute timestamp
* array[0] = tv_nsec * Same format as TIME_EXTEND except that the
* array[1..2] = tv_sec * value is an absolute timestamp, not a delta
* size = 16 bytes * event.time_delta contains bottom 27 bits
* array[0] = top (28 .. 59) bits
* size = 8 bytes
* *
* <= @RINGBUF_TYPE_DATA_TYPE_LEN_MAX: * <= @RINGBUF_TYPE_DATA_TYPE_LEN_MAX:
* Data record * Data record
@ -54,12 +56,12 @@ enum ring_buffer_type {
RINGBUF_TYPE_DATA_TYPE_LEN_MAX = 28, RINGBUF_TYPE_DATA_TYPE_LEN_MAX = 28,
RINGBUF_TYPE_PADDING, RINGBUF_TYPE_PADDING,
RINGBUF_TYPE_TIME_EXTEND, RINGBUF_TYPE_TIME_EXTEND,
/* FIXME: RINGBUF_TYPE_TIME_STAMP not implemented */
RINGBUF_TYPE_TIME_STAMP, RINGBUF_TYPE_TIME_STAMP,
}; };
unsigned ring_buffer_event_length(struct ring_buffer_event *event); unsigned ring_buffer_event_length(struct ring_buffer_event *event);
void *ring_buffer_event_data(struct ring_buffer_event *event); void *ring_buffer_event_data(struct ring_buffer_event *event);
u64 ring_buffer_event_time_stamp(struct ring_buffer_event *event);
/* /*
* ring_buffer_discard_commit will remove an event that has not * ring_buffer_discard_commit will remove an event that has not
@ -115,6 +117,9 @@ int ring_buffer_unlock_commit(struct ring_buffer *buffer,
int ring_buffer_write(struct ring_buffer *buffer, int ring_buffer_write(struct ring_buffer *buffer,
unsigned long length, void *data); unsigned long length, void *data);
void ring_buffer_nest_start(struct ring_buffer *buffer);
void ring_buffer_nest_end(struct ring_buffer *buffer);
struct ring_buffer_event * struct ring_buffer_event *
ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts, ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts,
unsigned long *lost_events); unsigned long *lost_events);
@ -178,6 +183,8 @@ void ring_buffer_normalize_time_stamp(struct ring_buffer *buffer,
int cpu, u64 *ts); int cpu, u64 *ts);
void ring_buffer_set_clock(struct ring_buffer *buffer, void ring_buffer_set_clock(struct ring_buffer *buffer,
u64 (*clock)(void)); u64 (*clock)(void));
void ring_buffer_set_time_stamp_abs(struct ring_buffer *buffer, bool abs);
bool ring_buffer_time_stamp_abs(struct ring_buffer *buffer);
size_t ring_buffer_page_len(void *page); size_t ring_buffer_page_len(void *page);
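
The new RINGBUF_TYPE_TIME_STAMP layout (bottom 27 bits in time_delta, the remaining bits in array[0]) can be modelled in user space. This is a sketch under stated assumptions: `struct ts_event`, `ts_encode`, `ts_decode` and `apply_stamp` are hypothetical names, and TS_SHIFT is taken to be 27 as implied by the header comment. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TS_SHIFT 27
#define TS_MASK  ((1ULL << TS_SHIFT) - 1)

/* Hypothetical stand-in for the two fields used by a time stamp event. */
struct ts_event {
	uint32_t time_delta;	/* bottom 27 bits of the value */
	uint32_t array0;	/* bits 27 and up (32 more bits) */
};

/* Split a timestamp of up to 59 bits across the two fields. */
struct ts_event ts_encode(uint64_t ts)
{
	struct ts_event e;

	assert(ts < (1ULL << (TS_SHIFT + 32)));
	e.time_delta = (uint32_t)(ts & TS_MASK);
	e.array0 = (uint32_t)(ts >> TS_SHIFT);
	return e;
}

/* Reassemble the value, mirroring what the decode helper in the diff does. */
uint64_t ts_decode(struct ts_event e)
{
	uint64_t ts = e.array0;

	ts <<= TS_SHIFT;
	ts += e.time_delta;
	return ts;
}

/* A TIME_EXTEND value is added to the running stamp; a TIME_STAMP replaces it. */
void apply_stamp(uint64_t *stamp, bool absolute, uint64_t value)
{
	if (absolute)
		*stamp = value;
	else
		*stamp += value;
}
```

The same 8-byte encoding serves both event types; only the interpretation (delta versus absolute) differs, which is why the diff reduces RB_LEN_TIME_STAMP from 16 to 8.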


@ -430,11 +430,13 @@ enum event_trigger_type {
extern int filter_match_preds(struct event_filter *filter, void *rec); extern int filter_match_preds(struct event_filter *filter, void *rec);
extern enum event_trigger_type event_triggers_call(struct trace_event_file *file, extern enum event_trigger_type
void *rec); event_triggers_call(struct trace_event_file *file, void *rec,
extern void event_triggers_post_call(struct trace_event_file *file, struct ring_buffer_event *event);
enum event_trigger_type tt, extern void
void *rec); event_triggers_post_call(struct trace_event_file *file,
enum event_trigger_type tt,
void *rec, struct ring_buffer_event *event);
bool trace_event_ignore_this_pid(struct trace_event_file *trace_file); bool trace_event_ignore_this_pid(struct trace_event_file *trace_file);
@ -454,7 +456,7 @@ trace_trigger_soft_disabled(struct trace_event_file *file)
if (!(eflags & EVENT_FILE_FL_TRIGGER_COND)) { if (!(eflags & EVENT_FILE_FL_TRIGGER_COND)) {
if (eflags & EVENT_FILE_FL_TRIGGER_MODE) if (eflags & EVENT_FILE_FL_TRIGGER_MODE)
event_triggers_call(file, NULL); event_triggers_call(file, NULL, NULL);
if (eflags & EVENT_FILE_FL_SOFT_DISABLED) if (eflags & EVENT_FILE_FL_SOFT_DISABLED)
return true; return true;
if (eflags & EVENT_FILE_FL_PID_FILTER) if (eflags & EVENT_FILE_FL_PID_FILTER)


@ -0,0 +1,66 @@
/* SPDX-License-Identifier: GPL-2.0 */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM initcall
#if !defined(_TRACE_INITCALL_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_INITCALL_H
#include <linux/tracepoint.h>
TRACE_EVENT(initcall_level,
TP_PROTO(const char *level),
TP_ARGS(level),
TP_STRUCT__entry(
__string(level, level)
),
TP_fast_assign(
__assign_str(level, level);
),
TP_printk("level=%s", __get_str(level))
);
TRACE_EVENT(initcall_start,
TP_PROTO(initcall_t func),
TP_ARGS(func),
TP_STRUCT__entry(
__field(initcall_t, func)
),
TP_fast_assign(
__entry->func = func;
),
TP_printk("func=%pS", __entry->func)
);
TRACE_EVENT(initcall_finish,
TP_PROTO(initcall_t func, int ret),
TP_ARGS(func, ret),
TP_STRUCT__entry(
__field(initcall_t, func)
__field(int, ret)
),
TP_fast_assign(
__entry->func = func;
__entry->ret = ret;
),
TP_printk("func=%pS ret=%d", __entry->func, __entry->ret)
);
#endif /* if !defined(_TRACE_INITCALL_H) || defined(TRACE_HEADER_MULTI_READ) */
/* This part must be outside protection */
#include <trace/define_trace.h>


@ -97,6 +97,9 @@
#include <asm/sections.h> #include <asm/sections.h>
#include <asm/cacheflush.h> #include <asm/cacheflush.h>
#define CREATE_TRACE_POINTS
#include <trace/events/initcall.h>
static int kernel_init(void *); static int kernel_init(void *);
extern void init_IRQ(void); extern void init_IRQ(void);
@ -491,6 +494,17 @@ void __init __weak thread_stack_cache_init(void)
void __init __weak mem_encrypt_init(void) { } void __init __weak mem_encrypt_init(void) { }
bool initcall_debug;
core_param(initcall_debug, initcall_debug, bool, 0644);
#ifdef TRACEPOINTS_ENABLED
static void __init initcall_debug_enable(void);
#else
static inline void initcall_debug_enable(void)
{
}
#endif
/* /*
* Set up kernel memory allocators * Set up kernel memory allocators
*/ */
@ -612,6 +626,9 @@ asmlinkage __visible void __init start_kernel(void)
/* Trace events are available after this */ /* Trace events are available after this */
trace_init(); trace_init();
if (initcall_debug)
initcall_debug_enable();
context_tracking_init(); context_tracking_init();
/* init some links before init_ISA_irqs() */ /* init some links before init_ISA_irqs() */
early_irq_init(); early_irq_init();
@ -728,9 +745,6 @@ static void __init do_ctors(void)
#endif #endif
} }
bool initcall_debug;
core_param(initcall_debug, initcall_debug, bool, 0644);
#ifdef CONFIG_KALLSYMS #ifdef CONFIG_KALLSYMS
struct blacklist_entry { struct blacklist_entry {
struct list_head next; struct list_head next;
@ -800,37 +814,71 @@ static bool __init_or_module initcall_blacklisted(initcall_t fn)
#endif #endif
__setup("initcall_blacklist=", initcall_blacklist); __setup("initcall_blacklist=", initcall_blacklist);
static int __init_or_module do_one_initcall_debug(initcall_t fn) static __init_or_module void
trace_initcall_start_cb(void *data, initcall_t fn)
{ {
ktime_t calltime, delta, rettime; ktime_t *calltime = (ktime_t *)data;
unsigned long long duration;
int ret;
printk(KERN_DEBUG "calling %pF @ %i\n", fn, task_pid_nr(current)); printk(KERN_DEBUG "calling %pF @ %i\n", fn, task_pid_nr(current));
calltime = ktime_get(); *calltime = ktime_get();
ret = fn(); }
static __init_or_module void
trace_initcall_finish_cb(void *data, initcall_t fn, int ret)
{
ktime_t *calltime = (ktime_t *)data;
ktime_t delta, rettime;
unsigned long long duration;
rettime = ktime_get(); rettime = ktime_get();
delta = ktime_sub(rettime, calltime); delta = ktime_sub(rettime, *calltime);
duration = (unsigned long long) ktime_to_ns(delta) >> 10; duration = (unsigned long long) ktime_to_ns(delta) >> 10;
printk(KERN_DEBUG "initcall %pF returned %d after %lld usecs\n", printk(KERN_DEBUG "initcall %pF returned %d after %lld usecs\n",
fn, ret, duration); fn, ret, duration);
return ret;
} }
static ktime_t initcall_calltime;
#ifdef TRACEPOINTS_ENABLED
static void __init initcall_debug_enable(void)
{
int ret;
ret = register_trace_initcall_start(trace_initcall_start_cb,
&initcall_calltime);
ret |= register_trace_initcall_finish(trace_initcall_finish_cb,
&initcall_calltime);
WARN(ret, "Failed to register initcall tracepoints\n");
}
# define do_trace_initcall_start trace_initcall_start
# define do_trace_initcall_finish trace_initcall_finish
#else
static inline void do_trace_initcall_start(initcall_t fn)
{
if (!initcall_debug)
return;
trace_initcall_start_cb(&initcall_calltime, fn);
}
static inline void do_trace_initcall_finish(initcall_t fn, int ret)
{
if (!initcall_debug)
return;
trace_initcall_finish_cb(&initcall_calltime, fn, ret);
}
#endif /* !TRACEPOINTS_ENABLED */
int __init_or_module do_one_initcall(initcall_t fn) int __init_or_module do_one_initcall(initcall_t fn)
{ {
int count = preempt_count(); int count = preempt_count();
int ret;
char msgbuf[64]; char msgbuf[64];
int ret;
if (initcall_blacklisted(fn)) if (initcall_blacklisted(fn))
return -EPERM; return -EPERM;
if (initcall_debug) do_trace_initcall_start(fn);
ret = do_one_initcall_debug(fn); ret = fn();
else do_trace_initcall_finish(fn, ret);
ret = fn();
msgbuf[0] = 0; msgbuf[0] = 0;
@ -874,7 +922,7 @@ static initcall_t *initcall_levels[] __initdata = {
/* Keep these in sync with initcalls in include/linux/init.h */ /* Keep these in sync with initcalls in include/linux/init.h */
static char *initcall_level_names[] __initdata = { static char *initcall_level_names[] __initdata = {
"early", "pure",
"core", "core",
"postcore", "postcore",
"arch", "arch",
@ -895,6 +943,7 @@ static void __init do_initcall_level(int level)
level, level, level, level,
NULL, &repair_env_string); NULL, &repair_env_string);
trace_initcall_level(initcall_level_names[level]);
for (fn = initcall_levels[level]; fn < initcall_levels[level+1]; fn++) for (fn = initcall_levels[level]; fn < initcall_levels[level+1]; fn++)
do_one_initcall(*fn); do_one_initcall(*fn);
} }
@ -929,6 +978,7 @@ static void __init do_pre_smp_initcalls(void)
{ {
initcall_t *fn; initcall_t *fn;
trace_initcall_level("early");
for (fn = __initcall_start; fn < __initcall0_start; fn++) for (fn = __initcall_start; fn < __initcall0_start; fn++)
do_one_initcall(*fn); do_one_initcall(*fn);
} }
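
The initcall timing callback reports "usecs" by shifting the nanosecond delta right by 10, which divides by 1024 rather than 1000. The sketch below compares the two conversions in user space; the function names are hypothetical and only the `>> 10` expression comes from the diff.

```c
#include <stdint.h>

/* Approximation used in the initcall_debug printk: ns / 1024. */
uint64_t ns_to_usecs_shift(uint64_t ns)
{
	return ns >> 10;
}

/* Exact integer conversion, shown for comparison. */
uint64_t ns_to_usecs_exact(uint64_t ns)
{
	return ns / 1000;
}
```

The shift avoids a 64-bit division and under-reports by roughly 2.3 percent, so a one-millisecond initcall is printed as 976 usecs rather than 1000.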


@ -554,6 +554,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
else else
dump_stack(); dump_stack();
print_irqtrace_events(current);
print_oops_end_marker(); print_oops_end_marker();
/* Just a warning, don't kill lockdep. */ /* Just a warning, don't kill lockdep. */


@ -51,6 +51,7 @@
#include <linux/uaccess.h> #include <linux/uaccess.h>
#include <asm/sections.h> #include <asm/sections.h>
#include <trace/events/initcall.h>
#define CREATE_TRACE_POINTS #define CREATE_TRACE_POINTS
#include <trace/events/printk.h> #include <trace/events/printk.h>
@ -2780,6 +2781,7 @@ EXPORT_SYMBOL(unregister_console);
*/ */
void __init console_init(void) void __init console_init(void)
{ {
int ret;
initcall_t *call; initcall_t *call;
/* Setup the default TTY line discipline. */ /* Setup the default TTY line discipline. */
@ -2790,8 +2792,11 @@ void __init console_init(void)
* inform about problems etc.. * inform about problems etc..
*/ */
call = __con_initcall_start; call = __con_initcall_start;
trace_initcall_level("console");
while (call < __con_initcall_end) { while (call < __con_initcall_end) {
(*call)(); trace_initcall_start((*call));
ret = (*call)();
trace_initcall_finish((*call), ret);
call++; call++;
} }
} }


@ -606,7 +606,10 @@ config HIST_TRIGGERS
event activity as an initial guide for further investigation event activity as an initial guide for further investigation
using more advanced tools. using more advanced tools.
See Documentation/trace/events.txt. Inter-event tracing of quantities such as latencies is also
supported using hist triggers under this option.
See Documentation/trace/histogram.txt.
If in doubt, say N. If in doubt, say N.
config MMIOTRACE_TEST config MMIOTRACE_TEST


@ -3902,14 +3902,13 @@ static bool module_exists(const char *module)
{ {
/* All modules have the symbol __this_module */ /* All modules have the symbol __this_module */
const char this_mod[] = "__this_module"; const char this_mod[] = "__this_module";
const int modname_size = MAX_PARAM_PREFIX_LEN + sizeof(this_mod) + 1; char modname[MAX_PARAM_PREFIX_LEN + sizeof(this_mod) + 2];
char modname[modname_size + 1];
unsigned long val; unsigned long val;
int n; int n;
n = snprintf(modname, modname_size + 1, "%s:%s", module, this_mod); n = snprintf(modname, sizeof(modname), "%s:%s", module, this_mod);
if (n > modname_size) if (n > sizeof(modname) - 1)
return false; return false;
val = module_kallsyms_lookup_name(modname); val = module_kallsyms_lookup_name(modname);
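
The module_exists() change swaps a variable-length array for a fixed-size buffer and detects truncation from the snprintf() return value. A minimal user-space sketch of that pattern follows; `build_modname` is a hypothetical helper and the caller-supplied buffer size stands in for the kernel's MAX_PARAM_PREFIX_LEN-based bound.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format "<module>:__this_module" into a fixed-size buffer.
 * snprintf() returns the length the full string would have needed,
 * so a result >= outlen means the output was truncated.
 */
bool build_modname(const char *module, char *out, size_t outlen)
{
	int n = snprintf(out, outlen, "%s:%s", module, "__this_module");

	/* Reject encoding errors (negative n) and truncated output. */
	return n >= 0 && (size_t)n < outlen;
}
```

In the diff the check is written as `n > sizeof(modname) - 1`, which compares an int against a size_t; a negative n converts to a very large unsigned value and is therefore rejected as well. The sketch spells out both conditions for readability.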


@ -22,6 +22,7 @@
#include <linux/hash.h> #include <linux/hash.h>
#include <linux/list.h> #include <linux/list.h>
#include <linux/cpu.h> #include <linux/cpu.h>
#include <linux/oom.h>
#include <asm/local.h> #include <asm/local.h>
@ -41,6 +42,8 @@ int ring_buffer_print_entry_header(struct trace_seq *s)
RINGBUF_TYPE_PADDING); RINGBUF_TYPE_PADDING);
trace_seq_printf(s, "\ttime_extend : type == %d\n", trace_seq_printf(s, "\ttime_extend : type == %d\n",
RINGBUF_TYPE_TIME_EXTEND); RINGBUF_TYPE_TIME_EXTEND);
trace_seq_printf(s, "\ttime_stamp : type == %d\n",
RINGBUF_TYPE_TIME_STAMP);
trace_seq_printf(s, "\tdata max type_len == %d\n", trace_seq_printf(s, "\tdata max type_len == %d\n",
RINGBUF_TYPE_DATA_TYPE_LEN_MAX); RINGBUF_TYPE_DATA_TYPE_LEN_MAX);
@ -140,12 +143,15 @@ int ring_buffer_print_entry_header(struct trace_seq *s)
enum { enum {
RB_LEN_TIME_EXTEND = 8, RB_LEN_TIME_EXTEND = 8,
RB_LEN_TIME_STAMP = 16, RB_LEN_TIME_STAMP = 8,
}; };
#define skip_time_extend(event) \ #define skip_time_extend(event) \
((struct ring_buffer_event *)((char *)event + RB_LEN_TIME_EXTEND)) ((struct ring_buffer_event *)((char *)event + RB_LEN_TIME_EXTEND))
#define extended_time(event) \
(event->type_len >= RINGBUF_TYPE_TIME_EXTEND)
static inline int rb_null_event(struct ring_buffer_event *event) static inline int rb_null_event(struct ring_buffer_event *event)
{ {
return event->type_len == RINGBUF_TYPE_PADDING && !event->time_delta; return event->type_len == RINGBUF_TYPE_PADDING && !event->time_delta;
@ -209,7 +215,7 @@ rb_event_ts_length(struct ring_buffer_event *event)
{ {
unsigned len = 0; unsigned len = 0;
if (event->type_len == RINGBUF_TYPE_TIME_EXTEND) { if (extended_time(event)) {
/* time extends include the data event after it */ /* time extends include the data event after it */
len = RB_LEN_TIME_EXTEND; len = RB_LEN_TIME_EXTEND;
event = skip_time_extend(event); event = skip_time_extend(event);
@ -231,7 +237,7 @@ unsigned ring_buffer_event_length(struct ring_buffer_event *event)
{ {
unsigned length; unsigned length;
if (event->type_len == RINGBUF_TYPE_TIME_EXTEND) if (extended_time(event))
event = skip_time_extend(event); event = skip_time_extend(event);
length = rb_event_length(event); length = rb_event_length(event);
@ -248,7 +254,7 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_length);
static __always_inline void * static __always_inline void *
rb_event_data(struct ring_buffer_event *event) rb_event_data(struct ring_buffer_event *event)
{ {
if (event->type_len == RINGBUF_TYPE_TIME_EXTEND) if (extended_time(event))
event = skip_time_extend(event); event = skip_time_extend(event);
BUG_ON(event->type_len > RINGBUF_TYPE_DATA_TYPE_LEN_MAX); BUG_ON(event->type_len > RINGBUF_TYPE_DATA_TYPE_LEN_MAX);
/* If length is in len field, then array[0] has the data */ /* If length is in len field, then array[0] has the data */
@ -275,6 +281,27 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data);
#define TS_MASK ((1ULL << TS_SHIFT) - 1) #define TS_MASK ((1ULL << TS_SHIFT) - 1)
#define TS_DELTA_TEST (~TS_MASK) #define TS_DELTA_TEST (~TS_MASK)
/**
* ring_buffer_event_time_stamp - return the event's extended timestamp
* @event: the event to get the timestamp of
*
* Returns the extended timestamp associated with a data event.
* An extended time_stamp is a 64-bit timestamp represented
* internally in a special way that makes the best use of space
* contained within a ring buffer event. This function decodes
* it and maps it to a straight u64 value.
*/
u64 ring_buffer_event_time_stamp(struct ring_buffer_event *event)
{
u64 ts;
ts = event->array[0];
ts <<= TS_SHIFT;
ts += event->time_delta;
return ts;
}
/* Flag when events were overwritten */ /* Flag when events were overwritten */
#define RB_MISSED_EVENTS (1 << 31) #define RB_MISSED_EVENTS (1 << 31)
/* Missed count stored at end */ /* Missed count stored at end */
@ -451,6 +478,7 @@ struct ring_buffer_per_cpu {
struct buffer_page *reader_page; struct buffer_page *reader_page;
unsigned long lost_events; unsigned long lost_events;
unsigned long last_overrun; unsigned long last_overrun;
unsigned long nest;
local_t entries_bytes; local_t entries_bytes;
local_t entries; local_t entries;
local_t overrun; local_t overrun;
@ -488,6 +516,7 @@ struct ring_buffer {
u64 (*clock)(void); u64 (*clock)(void);
struct rb_irq_work irq_work; struct rb_irq_work irq_work;
bool time_stamp_abs;
}; };
struct ring_buffer_iter { struct ring_buffer_iter {
@ -1134,30 +1163,60 @@ static int rb_check_pages(struct ring_buffer_per_cpu *cpu_buffer)
static int __rb_allocate_pages(long nr_pages, struct list_head *pages, int cpu) static int __rb_allocate_pages(long nr_pages, struct list_head *pages, int cpu)
{ {
struct buffer_page *bpage, *tmp; struct buffer_page *bpage, *tmp;
bool user_thread = current->mm != NULL;
gfp_t mflags;
long i; long i;
/*
* Check if the available memory is there first.
* Note, si_mem_available() only gives us a rough estimate of available
* memory. It may not be accurate. But we don't care, we just want
* to prevent doing any allocation when it is obvious that it is
* not going to succeed.
*/
i = si_mem_available();
if (i < nr_pages)
return -ENOMEM;
/*
* __GFP_RETRY_MAYFAIL flag makes sure that the allocation fails
* gracefully without invoking oom-killer and the system is not
* destabilized.
*/
mflags = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
/*
* If a user thread allocates too much and si_mem_available()
* reports that there is enough memory even though there is not,
* make sure the OOM killer kills this thread. This can happen
* even with RETRY_MAYFAIL because another task may be doing
* an allocation after this task has taken all memory.
* This is the task the OOM killer needs to take out during this
* loop, even if it was triggered by an allocation somewhere else.
*/
if (user_thread)
set_current_oom_origin();
for (i = 0; i < nr_pages; i++) { for (i = 0; i < nr_pages; i++) {
struct page *page; struct page *page;
/*
* __GFP_RETRY_MAYFAIL flag makes sure that the allocation fails
* gracefully without invoking oom-killer and the system is not
* destabilized.
*/
bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()), bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()),
GFP_KERNEL | __GFP_RETRY_MAYFAIL, mflags, cpu_to_node(cpu));
cpu_to_node(cpu));
if (!bpage) if (!bpage)
goto free_pages; goto free_pages;
list_add(&bpage->list, pages); list_add(&bpage->list, pages);
page = alloc_pages_node(cpu_to_node(cpu), page = alloc_pages_node(cpu_to_node(cpu), mflags, 0);
GFP_KERNEL | __GFP_RETRY_MAYFAIL, 0);
if (!page) if (!page)
goto free_pages; goto free_pages;
bpage->page = page_address(page); bpage->page = page_address(page);
rb_init_page(bpage->page); rb_init_page(bpage->page);
if (user_thread && fatal_signal_pending(current))
goto free_pages;
} }
if (user_thread)
clear_current_oom_origin();
return 0; return 0;
@ -1166,6 +1225,8 @@ free_pages:
list_del_init(&bpage->list); list_del_init(&bpage->list);
free_buffer_page(bpage); free_buffer_page(bpage);
} }
if (user_thread)
clear_current_oom_origin();
return -ENOMEM; return -ENOMEM;
} }
@ -1382,6 +1443,16 @@ void ring_buffer_set_clock(struct ring_buffer *buffer,
buffer->clock = clock; buffer->clock = clock;
} }
void ring_buffer_set_time_stamp_abs(struct ring_buffer *buffer, bool abs)
{
buffer->time_stamp_abs = abs;
}
bool ring_buffer_time_stamp_abs(struct ring_buffer *buffer)
{
return buffer->time_stamp_abs;
}
static void rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer); static void rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer);
static inline unsigned long rb_page_entries(struct buffer_page *bpage) static inline unsigned long rb_page_entries(struct buffer_page *bpage)
@ -2206,12 +2277,15 @@ rb_move_tail(struct ring_buffer_per_cpu *cpu_buffer,
/* Slow path, do not inline */ /* Slow path, do not inline */
static noinline struct ring_buffer_event * static noinline struct ring_buffer_event *
rb_add_time_stamp(struct ring_buffer_event *event, u64 delta) rb_add_time_stamp(struct ring_buffer_event *event, u64 delta, bool abs)
{ {
event->type_len = RINGBUF_TYPE_TIME_EXTEND; if (abs)
event->type_len = RINGBUF_TYPE_TIME_STAMP;
else
event->type_len = RINGBUF_TYPE_TIME_EXTEND;
/* Not the first event on the page? */ /* Not the first event on the page, or not delta? */
if (rb_event_index(event)) { if (abs || rb_event_index(event)) {
event->time_delta = delta & TS_MASK; event->time_delta = delta & TS_MASK;
event->array[0] = delta >> TS_SHIFT; event->array[0] = delta >> TS_SHIFT;
} else { } else {
@ -2254,7 +2328,9 @@ rb_update_event(struct ring_buffer_per_cpu *cpu_buffer,
* add it to the start of the reserved space. * add it to the start of the reserved space.
*/ */
if (unlikely(info->add_timestamp)) { if (unlikely(info->add_timestamp)) {
event = rb_add_time_stamp(event, delta); bool abs = ring_buffer_time_stamp_abs(cpu_buffer->buffer);
event = rb_add_time_stamp(event, info->delta, abs);
length -= RB_LEN_TIME_EXTEND; length -= RB_LEN_TIME_EXTEND;
delta = 0; delta = 0;
} }
@ -2442,7 +2518,7 @@ static __always_inline void rb_end_commit(struct ring_buffer_per_cpu *cpu_buffer
static inline void rb_event_discard(struct ring_buffer_event *event) static inline void rb_event_discard(struct ring_buffer_event *event)
{ {
if (event->type_len == RINGBUF_TYPE_TIME_EXTEND) if (extended_time(event))
event = skip_time_extend(event); event = skip_time_extend(event);
/* array[0] holds the actual length for the discarded event */ /* array[0] holds the actual length for the discarded event */
@ -2486,10 +2562,11 @@ rb_update_write_stamp(struct ring_buffer_per_cpu *cpu_buffer,
cpu_buffer->write_stamp = cpu_buffer->write_stamp =
cpu_buffer->commit_page->page->time_stamp; cpu_buffer->commit_page->page->time_stamp;
else if (event->type_len == RINGBUF_TYPE_TIME_EXTEND) { else if (event->type_len == RINGBUF_TYPE_TIME_EXTEND) {
delta = event->array[0]; delta = ring_buffer_event_time_stamp(event);
delta <<= TS_SHIFT;
delta += event->time_delta;
cpu_buffer->write_stamp += delta; cpu_buffer->write_stamp += delta;
} else if (event->type_len == RINGBUF_TYPE_TIME_STAMP) {
delta = ring_buffer_event_time_stamp(event);
cpu_buffer->write_stamp = delta;
} else } else
cpu_buffer->write_stamp += event->time_delta; cpu_buffer->write_stamp += event->time_delta;
} }
@ -2581,10 +2658,10 @@ trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
bit = pc & NMI_MASK ? RB_CTX_NMI : bit = pc & NMI_MASK ? RB_CTX_NMI :
pc & HARDIRQ_MASK ? RB_CTX_IRQ : RB_CTX_SOFTIRQ; pc & HARDIRQ_MASK ? RB_CTX_IRQ : RB_CTX_SOFTIRQ;
if (unlikely(val & (1 << bit))) if (unlikely(val & (1 << (bit + cpu_buffer->nest))))
return 1; return 1;
val |= (1 << bit); val |= (1 << (bit + cpu_buffer->nest));
cpu_buffer->current_context = val; cpu_buffer->current_context = val;
return 0; return 0;
@ -2593,7 +2670,57 @@ trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
static __always_inline void static __always_inline void
trace_recursive_unlock(struct ring_buffer_per_cpu *cpu_buffer) trace_recursive_unlock(struct ring_buffer_per_cpu *cpu_buffer)
{ {
cpu_buffer->current_context &= cpu_buffer->current_context - 1; cpu_buffer->current_context &=
cpu_buffer->current_context - (1 << cpu_buffer->nest);
}
/* The recursive locking above uses 4 bits */
#define NESTED_BITS 4
/**
* ring_buffer_nest_start - Allow to trace while nested
* @buffer: The ring buffer to modify
*
* The ring buffer has a safety mechanism to prevent recursion.
* But there may be a case where a trace needs to be done while
* tracing something else. In this case, calling this function
* will allow the caller to nest within a currently active
* ring_buffer_lock_reserve().
*
* Call this function before calling another ring_buffer_lock_reserve() and
* call ring_buffer_nest_end() after the nested ring_buffer_unlock_commit().
*/
void ring_buffer_nest_start(struct ring_buffer *buffer)
{
struct ring_buffer_per_cpu *cpu_buffer;
int cpu;
/* Preemption is re-enabled by ring_buffer_nest_end() */
preempt_disable_notrace();
cpu = raw_smp_processor_id();
cpu_buffer = buffer->buffers[cpu];
/* This is the shift value for the above recursive locking */
cpu_buffer->nest += NESTED_BITS;
}
/**
* ring_buffer_nest_end - Stop allowing tracing while nested
* @buffer: The ring buffer to modify
*
* Must be called after ring_buffer_nest_start() and after the
* ring_buffer_unlock_commit().
*/
void ring_buffer_nest_end(struct ring_buffer *buffer)
{
struct ring_buffer_per_cpu *cpu_buffer;
int cpu;
/* Preemption was disabled by ring_buffer_nest_start() */
cpu = raw_smp_processor_id();
cpu_buffer = buffer->buffers[cpu];
/* This is the shift value for the above recursive locking */
cpu_buffer->nest -= NESTED_BITS;
preempt_enable_notrace();
} }
/** /**
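
The nesting scheme above shifts the per-context recursion bit by a multiple of NESTED_BITS, so a nested reserve uses a separate group of four bits. The following user-space model illustrates the bit arithmetic only; `struct ctx_state`, `recursive_lock` and `recursive_unlock` are hypothetical names, and the context ordering (NMI lowest, normal highest) is an assumption based on how the unlock clears the lowest set bit.

```c
#include <stdbool.h>

#define NESTED_BITS 4

/* Assumed ordering: more deeply interrupting contexts use lower bits. */
enum { CTX_NMI, CTX_IRQ, CTX_SOFTIRQ, CTX_NORMAL };

struct ctx_state {
	unsigned int current_context;	/* one bit per active context */
	unsigned int nest;		/* 0, or a multiple of NESTED_BITS */
};

/* Returns false if this context is already active at this nest level. */
bool recursive_lock(struct ctx_state *s, int bit)
{
	unsigned int mask = 1U << (bit + s->nest);

	if (s->current_context & mask)
		return false;

	s->current_context |= mask;
	return true;
}

/* Clear the lowest set bit at or above the current nest shift. */
void recursive_unlock(struct ctx_state *s)
{
	s->current_context &= s->current_context - (1U << s->nest);
}
```

With nest at 0 the unlock reduces to the familiar `val & (val - 1)`. With nest at 4, subtracting `1 << 4` borrows only from the upper group, so the bits recorded by the outer reserve are left intact.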
@ -2637,7 +2764,8 @@ rb_handle_timestamp(struct ring_buffer_per_cpu *cpu_buffer,
sched_clock_stable() ? "" : sched_clock_stable() ? "" :
"If you just came from a suspend/resume,\n" "If you just came from a suspend/resume,\n"
"please switch to the trace global clock:\n" "please switch to the trace global clock:\n"
" echo global > /sys/kernel/debug/tracing/trace_clock\n"); " echo global > /sys/kernel/debug/tracing/trace_clock\n"
"or add trace_clock=global to the kernel command line\n");
info->add_timestamp = 1; info->add_timestamp = 1;
} }
@ -2669,7 +2797,7 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
* If this is the first commit on the page, then it has the same * If this is the first commit on the page, then it has the same
* timestamp as the page itself. * timestamp as the page itself.
*/ */
if (!tail) if (!tail && !ring_buffer_time_stamp_abs(cpu_buffer->buffer))
info->delta = 0; info->delta = 0;
/* See if we shot past the end of this buffer page */ /* See if we shot past the end of this buffer page */
@ -2746,8 +2874,11 @@ rb_reserve_next_event(struct ring_buffer *buffer,
/* make sure this diff is calculated here */ /* make sure this diff is calculated here */
barrier(); barrier();
/* Did the write stamp get updated already? */ if (ring_buffer_time_stamp_abs(buffer)) {
if (likely(info.ts >= cpu_buffer->write_stamp)) { info.delta = info.ts;
rb_handle_timestamp(cpu_buffer, &info);
} else /* Did the write stamp get updated already? */
if (likely(info.ts >= cpu_buffer->write_stamp)) {
info.delta = diff; info.delta = diff;
if (unlikely(test_time_stamp(info.delta))) if (unlikely(test_time_stamp(info.delta)))
rb_handle_timestamp(cpu_buffer, &info); rb_handle_timestamp(cpu_buffer, &info);
@ -3429,14 +3560,13 @@ rb_update_read_stamp(struct ring_buffer_per_cpu *cpu_buffer,
return; return;
case RINGBUF_TYPE_TIME_EXTEND: case RINGBUF_TYPE_TIME_EXTEND:
delta = event->array[0]; delta = ring_buffer_event_time_stamp(event);
delta <<= TS_SHIFT;
delta += event->time_delta;
cpu_buffer->read_stamp += delta; cpu_buffer->read_stamp += delta;
return; return;
case RINGBUF_TYPE_TIME_STAMP: case RINGBUF_TYPE_TIME_STAMP:
/* FIXME: not implemented */ delta = ring_buffer_event_time_stamp(event);
cpu_buffer->read_stamp = delta;
return; return;
case RINGBUF_TYPE_DATA: case RINGBUF_TYPE_DATA:
@ -3460,14 +3590,13 @@ rb_update_iter_read_stamp(struct ring_buffer_iter *iter,
return; return;
case RINGBUF_TYPE_TIME_EXTEND: case RINGBUF_TYPE_TIME_EXTEND:
delta = event->array[0]; delta = ring_buffer_event_time_stamp(event);
delta <<= TS_SHIFT;
delta += event->time_delta;
iter->read_stamp += delta; iter->read_stamp += delta;
return; return;
case RINGBUF_TYPE_TIME_STAMP: case RINGBUF_TYPE_TIME_STAMP:
/* FIXME: not implemented */ delta = ring_buffer_event_time_stamp(event);
iter->read_stamp = delta;
return; return;
case RINGBUF_TYPE_DATA: case RINGBUF_TYPE_DATA:
@ -3691,6 +3820,8 @@ rb_buffer_peek(struct ring_buffer_per_cpu *cpu_buffer, u64 *ts,
struct buffer_page *reader; struct buffer_page *reader;
int nr_loops = 0; int nr_loops = 0;
if (ts)
*ts = 0;
again: again:
/* /*
* We repeat when a time extend is encountered. * We repeat when a time extend is encountered.
@ -3727,12 +3858,17 @@ rb_buffer_peek(struct ring_buffer_per_cpu *cpu_buffer, u64 *ts,
goto again; goto again;
case RINGBUF_TYPE_TIME_STAMP: case RINGBUF_TYPE_TIME_STAMP:
/* FIXME: not implemented */ if (ts) {
*ts = ring_buffer_event_time_stamp(event);
ring_buffer_normalize_time_stamp(cpu_buffer->buffer,
cpu_buffer->cpu, ts);
}
/* Internal data, OK to advance */
rb_advance_reader(cpu_buffer); rb_advance_reader(cpu_buffer);
goto again; goto again;
case RINGBUF_TYPE_DATA: case RINGBUF_TYPE_DATA:
if (ts) { if (ts && !(*ts)) {
*ts = cpu_buffer->read_stamp + event->time_delta; *ts = cpu_buffer->read_stamp + event->time_delta;
ring_buffer_normalize_time_stamp(cpu_buffer->buffer, ring_buffer_normalize_time_stamp(cpu_buffer->buffer,
cpu_buffer->cpu, ts); cpu_buffer->cpu, ts);
@ -3757,6 +3893,9 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
struct ring_buffer_event *event; struct ring_buffer_event *event;
int nr_loops = 0; int nr_loops = 0;
if (ts)
*ts = 0;
cpu_buffer = iter->cpu_buffer; cpu_buffer = iter->cpu_buffer;
buffer = cpu_buffer->buffer; buffer = cpu_buffer->buffer;
@ -3809,12 +3948,17 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
goto again; goto again;
case RINGBUF_TYPE_TIME_STAMP: case RINGBUF_TYPE_TIME_STAMP:
/* FIXME: not implemented */ if (ts) {
*ts = ring_buffer_event_time_stamp(event);
ring_buffer_normalize_time_stamp(cpu_buffer->buffer,
cpu_buffer->cpu, ts);
}
/* Internal data, OK to advance */
rb_advance_iter(iter); rb_advance_iter(iter);
goto again; goto again;
case RINGBUF_TYPE_DATA: case RINGBUF_TYPE_DATA:
if (ts) { if (ts && !(*ts)) {
*ts = iter->read_stamp + event->time_delta; *ts = iter->read_stamp + event->time_delta;
ring_buffer_normalize_time_stamp(buffer, ring_buffer_normalize_time_stamp(buffer,
cpu_buffer->cpu, ts); cpu_buffer->cpu, ts);
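
The two time-event cases in the ring buffer hunks above differ only in how the decoded value is applied: a TIME_EXTEND adds to the running read stamp, while a TIME_STAMP replaces it. A minimal user-space sketch of that rule follows. The `sketch_*` names and the simplified event struct are hypothetical, the value 27 for `TS_SHIFT` is an assumption about the kernel's definition, and the DATA case is assumed from context because the hunk ends before it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: matches the kernel's TS_SHIFT (width of the in-event delta). */
#define TS_SHIFT 27

enum sketch_type { SKETCH_DATA, SKETCH_TIME_EXTEND, SKETCH_TIME_STAMP };

/* Hypothetical, simplified stand-in for struct ring_buffer_event. */
struct sketch_event {
	enum sketch_type type;
	uint32_t time_delta;	/* low TS_SHIFT bits of the value */
	uint32_t array0;	/* upper bits, used by time events */
};

/* Same arithmetic as the removed open-coded lines in the diff:
 * array[0] shifted up by TS_SHIFT, plus time_delta. */
static uint64_t sketch_event_time_stamp(const struct sketch_event *e)
{
	uint64_t ts = e->array0;

	ts <<= TS_SHIFT;
	ts += e->time_delta;
	return ts;
}

/* Apply one event to a reader's running time stamp. */
static void sketch_update_read_stamp(uint64_t *read_stamp,
				     const struct sketch_event *e)
{
	assert(read_stamp != NULL && e != NULL);

	switch (e->type) {
	case SKETCH_TIME_EXTEND:
		/* Relative: extend the running stamp by a large delta. */
		*read_stamp += sketch_event_time_stamp(e);
		break;
	case SKETCH_TIME_STAMP:
		/* Absolute: the event carries the full stamp. */
		*read_stamp = sketch_event_time_stamp(e);
		break;
	case SKETCH_DATA:
		/* Assumed: data events carry a small delta only. */
		*read_stamp += e->time_delta;
		break;
	}
}
```

Starting from a stamp of 100, an extend event with `array0 = 1` and `time_delta = 5` would advance the stamp by `(1 << 27) + 5`, whereas a stamp event with the same fields would set it to exactly that value.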

@ -41,6 +41,7 @@
#include <linux/nmi.h> #include <linux/nmi.h>
#include <linux/fs.h> #include <linux/fs.h>
#include <linux/trace.h> #include <linux/trace.h>
#include <linux/sched/clock.h>
#include <linux/sched/rt.h> #include <linux/sched/rt.h>
#include "trace.h" #include "trace.h"
@ -1168,6 +1169,14 @@ static struct {
ARCH_TRACE_CLOCKS ARCH_TRACE_CLOCKS
}; };
bool trace_clock_in_ns(struct trace_array *tr)
{
if (trace_clocks[tr->clock_id].in_ns)
return true;
return false;
}
/* /*
* trace_parser_get_init - gets the buffer for trace parser * trace_parser_get_init - gets the buffer for trace parser
*/ */
@ -2269,7 +2278,7 @@ trace_event_buffer_lock_reserve(struct ring_buffer **current_rb,
*current_rb = trace_file->tr->trace_buffer.buffer; *current_rb = trace_file->tr->trace_buffer.buffer;
if ((trace_file->flags & if (!ring_buffer_time_stamp_abs(*current_rb) && (trace_file->flags &
(EVENT_FILE_FL_SOFT_DISABLED | EVENT_FILE_FL_FILTERED)) && (EVENT_FILE_FL_SOFT_DISABLED | EVENT_FILE_FL_FILTERED)) &&
(entry = this_cpu_read(trace_buffered_event))) { (entry = this_cpu_read(trace_buffered_event))) {
/* Try to use the per cpu buffer first */ /* Try to use the per cpu buffer first */
@ -4515,6 +4524,9 @@ static const char readme_msg[] =
#ifdef CONFIG_X86_64 #ifdef CONFIG_X86_64
" x86-tsc: TSC cycle counter\n" " x86-tsc: TSC cycle counter\n"
#endif #endif
"\n timestamp_mode\t-view the mode used to timestamp events\n"
" delta: Delta difference against a buffer-wide timestamp\n"
" absolute: Absolute (standalone) timestamp\n"
"\n trace_marker\t\t- Writes into this file writes into the kernel buffer\n" "\n trace_marker\t\t- Writes into this file writes into the kernel buffer\n"
"\n trace_marker_raw\t\t- Writes into this file writes binary data into the kernel buffer\n" "\n trace_marker_raw\t\t- Writes into this file writes binary data into the kernel buffer\n"
" tracing_cpumask\t- Limit which CPUs to trace\n" " tracing_cpumask\t- Limit which CPUs to trace\n"
@ -4691,8 +4703,9 @@ static const char readme_msg[] =
"\t .sym display an address as a symbol\n" "\t .sym display an address as a symbol\n"
"\t .sym-offset display an address as a symbol and offset\n" "\t .sym-offset display an address as a symbol and offset\n"
"\t .execname display a common_pid as a program name\n" "\t .execname display a common_pid as a program name\n"
"\t .syscall display a syscall id as a syscall name\n\n" "\t .syscall display a syscall id as a syscall name\n"
"\t .log2 display log2 value rather than raw number\n\n" "\t .log2 display log2 value rather than raw number\n"
"\t .usecs display a common_timestamp in microseconds\n\n"
"\t The 'pause' parameter can be used to pause an existing hist\n" "\t The 'pause' parameter can be used to pause an existing hist\n"
"\t trigger or to start a hist trigger but not log any events\n" "\t trigger or to start a hist trigger but not log any events\n"
"\t until told to do so. 'continue' can be used to start or\n" "\t until told to do so. 'continue' can be used to start or\n"
@ -6202,7 +6215,7 @@ static int tracing_clock_show(struct seq_file *m, void *v)
return 0; return 0;
} }
static int tracing_set_clock(struct trace_array *tr, const char *clockstr) int tracing_set_clock(struct trace_array *tr, const char *clockstr)
{ {
int i; int i;
@ -6282,6 +6295,71 @@ static int tracing_clock_open(struct inode *inode, struct file *file)
return ret; return ret;
} }
static int tracing_time_stamp_mode_show(struct seq_file *m, void *v)
{
struct trace_array *tr = m->private;
mutex_lock(&trace_types_lock);
if (ring_buffer_time_stamp_abs(tr->trace_buffer.buffer))
seq_puts(m, "delta [absolute]\n");
else
seq_puts(m, "[delta] absolute\n");
mutex_unlock(&trace_types_lock);
return 0;
}
static int tracing_time_stamp_mode_open(struct inode *inode, struct file *file)
{
struct trace_array *tr = inode->i_private;
int ret;
if (tracing_disabled)
return -ENODEV;
if (trace_array_get(tr))
return -ENODEV;
ret = single_open(file, tracing_time_stamp_mode_show, inode->i_private);
if (ret < 0)
trace_array_put(tr);
return ret;
}
int tracing_set_time_stamp_abs(struct trace_array *tr, bool abs)
{
int ret = 0;
mutex_lock(&trace_types_lock);
if (abs && tr->time_stamp_abs_ref++)
goto out;
if (!abs) {
if (WARN_ON_ONCE(!tr->time_stamp_abs_ref)) {
ret = -EINVAL;
goto out;
}
if (--tr->time_stamp_abs_ref)
goto out;
}
ring_buffer_set_time_stamp_abs(tr->trace_buffer.buffer, abs);
#ifdef CONFIG_TRACER_MAX_TRACE
if (tr->max_buffer.buffer)
ring_buffer_set_time_stamp_abs(tr->max_buffer.buffer, abs);
#endif
out:
mutex_unlock(&trace_types_lock);
return ret;
}
struct ftrace_buffer_info { struct ftrace_buffer_info {
struct trace_iterator iter; struct trace_iterator iter;
void *spare; void *spare;
@ -6529,6 +6607,13 @@ static const struct file_operations trace_clock_fops = {
.write = tracing_clock_write, .write = tracing_clock_write,
}; };
static const struct file_operations trace_time_stamp_mode_fops = {
.open = tracing_time_stamp_mode_open,
.read = seq_read,
.llseek = seq_lseek,
.release = tracing_single_release_tr,
};
#ifdef CONFIG_TRACER_SNAPSHOT #ifdef CONFIG_TRACER_SNAPSHOT
static const struct file_operations snapshot_fops = { static const struct file_operations snapshot_fops = {
.open = tracing_snapshot_open, .open = tracing_snapshot_open,
@ -7699,6 +7784,7 @@ static int instance_mkdir(const char *name)
INIT_LIST_HEAD(&tr->systems); INIT_LIST_HEAD(&tr->systems);
INIT_LIST_HEAD(&tr->events); INIT_LIST_HEAD(&tr->events);
INIT_LIST_HEAD(&tr->hist_vars);
if (allocate_trace_buffers(tr, trace_buf_size) < 0) if (allocate_trace_buffers(tr, trace_buf_size) < 0)
goto out_free_tr; goto out_free_tr;
@ -7851,6 +7937,9 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer)
trace_create_file("tracing_on", 0644, d_tracer, trace_create_file("tracing_on", 0644, d_tracer,
tr, &rb_simple_fops); tr, &rb_simple_fops);
trace_create_file("timestamp_mode", 0444, d_tracer, tr,
&trace_time_stamp_mode_fops);
create_trace_options_dir(tr); create_trace_options_dir(tr);
#if defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER) #if defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER)
@ -8446,6 +8535,7 @@ __init static int tracer_alloc_buffers(void)
INIT_LIST_HEAD(&global_trace.systems); INIT_LIST_HEAD(&global_trace.systems);
INIT_LIST_HEAD(&global_trace.events); INIT_LIST_HEAD(&global_trace.events);
INIT_LIST_HEAD(&global_trace.hist_vars);
list_add(&global_trace.list, &ftrace_trace_arrays); list_add(&global_trace.list, &ftrace_trace_arrays);
apply_trace_boot_options(); apply_trace_boot_options();
@ -8507,3 +8597,21 @@ __init static int clear_boot_tracer(void)
fs_initcall(tracer_init_tracefs); fs_initcall(tracer_init_tracefs);
late_initcall_sync(clear_boot_tracer); late_initcall_sync(clear_boot_tracer);
#ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
__init static int tracing_set_default_clock(void)
{
/* sched_clock_stable() is determined in late_initcall */
if (!trace_boot_clock && !sched_clock_stable()) {
printk(KERN_WARNING
"Unstable clock detected, switching default tracing clock to \"global\"\n"
"If you want to keep using the local clock, then add:\n"
" \"trace_clock=local\"\n"
"on the kernel command line\n");
tracing_set_clock(&global_trace, "global");
}
return 0;
}
late_initcall_sync(tracing_set_default_clock);
#endif
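
The reference counting in `tracing_set_time_stamp_abs()` above can be hard to follow because the increment and decrement sit inside the conditions. The following user-space sketch models only that counting logic, with no locking and no max buffer. The struct and function names are hypothetical, and a plain boolean stands in for the ring buffer's mode.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant parts of struct trace_array. */
struct sketch_trace_array {
	int time_stamp_abs_ref;	/* number of users requesting absolute mode */
	bool buffer_abs;	/* models the ring buffer's current mode */
};

/* Mirrors the control flow of the function in the diff, minus the mutex. */
static int sketch_set_time_stamp_abs(struct sketch_trace_array *tr,
				     bool want_abs)
{
	assert(tr != NULL);

	/* Not the first user: just take another reference. */
	if (want_abs && tr->time_stamp_abs_ref++)
		return 0;

	if (!want_abs) {
		/* Unbalanced release: nothing to drop. */
		if (!tr->time_stamp_abs_ref)
			return -EINVAL;
		/* Other users remain: keep absolute mode on. */
		if (--tr->time_stamp_abs_ref)
			return 0;
	}

	/* First user enabling, or last user disabling. */
	tr->buffer_abs = want_abs;
	return 0;
}
```

In this model the buffer mode only changes on the first enable and the last disable, so two users can request absolute time stamps independently and the mode stays on until both have released it.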

@ -273,6 +273,8 @@ struct trace_array {
/* function tracing enabled */ /* function tracing enabled */
int function_enabled; int function_enabled;
#endif #endif
int time_stamp_abs_ref;
struct list_head hist_vars;
}; };
enum { enum {
@ -286,6 +288,11 @@ extern struct mutex trace_types_lock;
extern int trace_array_get(struct trace_array *tr); extern int trace_array_get(struct trace_array *tr);
extern void trace_array_put(struct trace_array *tr); extern void trace_array_put(struct trace_array *tr);
extern int tracing_set_time_stamp_abs(struct trace_array *tr, bool abs);
extern int tracing_set_clock(struct trace_array *tr, const char *clockstr);
extern bool trace_clock_in_ns(struct trace_array *tr);
/* /*
* The global tracer (top) should be the first trace array added, * The global tracer (top) should be the first trace array added,
* but we check the flag anyway. * but we check the flag anyway.
@ -1209,12 +1216,11 @@ struct ftrace_event_field {
int is_signed; int is_signed;
}; };
struct prog_entry;
struct event_filter { struct event_filter {
int n_preds; /* Number assigned */ struct prog_entry __rcu *prog;
int a_preds; /* allocated */ char *filter_string;
struct filter_pred __rcu *preds;
struct filter_pred __rcu *root;
char *filter_string;
}; };
struct event_subsystem { struct event_subsystem {
@ -1291,7 +1297,7 @@ __event_trigger_test_discard(struct trace_event_file *file,
unsigned long eflags = file->flags; unsigned long eflags = file->flags;
if (eflags & EVENT_FILE_FL_TRIGGER_COND) if (eflags & EVENT_FILE_FL_TRIGGER_COND)
*tt = event_triggers_call(file, entry); *tt = event_triggers_call(file, entry, event);
if (test_bit(EVENT_FILE_FL_SOFT_DISABLED_BIT, &file->flags) || if (test_bit(EVENT_FILE_FL_SOFT_DISABLED_BIT, &file->flags) ||
(unlikely(file->flags & EVENT_FILE_FL_FILTERED) && (unlikely(file->flags & EVENT_FILE_FL_FILTERED) &&
@ -1328,7 +1334,7 @@ event_trigger_unlock_commit(struct trace_event_file *file,
trace_buffer_unlock_commit(file->tr, buffer, event, irq_flags, pc); trace_buffer_unlock_commit(file->tr, buffer, event, irq_flags, pc);
if (tt) if (tt)
event_triggers_post_call(file, tt, entry); event_triggers_post_call(file, tt, entry, event);
} }
/** /**
@ -1361,7 +1367,7 @@ event_trigger_unlock_commit_regs(struct trace_event_file *file,
irq_flags, pc, regs); irq_flags, pc, regs);
if (tt) if (tt)
event_triggers_post_call(file, tt, entry); event_triggers_post_call(file, tt, entry, event);
} }
#define FILTER_PRED_INVALID ((unsigned short)-1) #define FILTER_PRED_INVALID ((unsigned short)-1)
@ -1406,12 +1412,8 @@ struct filter_pred {
unsigned short *ops; unsigned short *ops;
struct ftrace_event_field *field; struct ftrace_event_field *field;
int offset; int offset;
int not; int not;
int op; int op;
unsigned short index;
unsigned short parent;
unsigned short left;
unsigned short right;
}; };
static inline bool is_string_field(struct ftrace_event_field *field) static inline bool is_string_field(struct ftrace_event_field *field)
@ -1543,6 +1545,8 @@ extern void pause_named_trigger(struct event_trigger_data *data);
extern void unpause_named_trigger(struct event_trigger_data *data); extern void unpause_named_trigger(struct event_trigger_data *data);
extern void set_named_trigger_data(struct event_trigger_data *data, extern void set_named_trigger_data(struct event_trigger_data *data,
struct event_trigger_data *named_data); struct event_trigger_data *named_data);
extern struct event_trigger_data *
get_named_trigger_data(struct event_trigger_data *data);
extern int register_event_command(struct event_command *cmd); extern int register_event_command(struct event_command *cmd);
extern int unregister_event_command(struct event_command *cmd); extern int unregister_event_command(struct event_command *cmd);
extern int register_trigger_hist_enable_disable_cmds(void); extern int register_trigger_hist_enable_disable_cmds(void);
@ -1586,7 +1590,8 @@ extern int register_trigger_hist_enable_disable_cmds(void);
*/ */
struct event_trigger_ops { struct event_trigger_ops {
void (*func)(struct event_trigger_data *data, void (*func)(struct event_trigger_data *data,
void *rec); void *rec,
struct ring_buffer_event *rbe);
int (*init)(struct event_trigger_ops *ops, int (*init)(struct event_trigger_ops *ops,
struct event_trigger_data *data); struct event_trigger_data *data);
void (*free)(struct event_trigger_ops *ops, void (*free)(struct event_trigger_ops *ops,

@ -96,7 +96,7 @@ u64 notrace trace_clock_global(void)
int this_cpu; int this_cpu;
u64 now; u64 now;
local_irq_save(flags); raw_local_irq_save(flags);
this_cpu = raw_smp_processor_id(); this_cpu = raw_smp_processor_id();
now = sched_clock_cpu(this_cpu); now = sched_clock_cpu(this_cpu);
@ -122,7 +122,7 @@ u64 notrace trace_clock_global(void)
arch_spin_unlock(&trace_clock_struct.lock); arch_spin_unlock(&trace_clock_struct.lock);
out: out:
local_irq_restore(flags); raw_local_irq_restore(flags);
return now; return now;
} }

File diff suppressed because it is too large

File diff suppressed because it is too large

@ -63,7 +63,8 @@ void trigger_data_free(struct event_trigger_data *data)
* any trigger that should be deferred, ETT_NONE if nothing to defer. * any trigger that should be deferred, ETT_NONE if nothing to defer.
*/ */
enum event_trigger_type enum event_trigger_type
event_triggers_call(struct trace_event_file *file, void *rec) event_triggers_call(struct trace_event_file *file, void *rec,
struct ring_buffer_event *event)
{ {
struct event_trigger_data *data; struct event_trigger_data *data;
enum event_trigger_type tt = ETT_NONE; enum event_trigger_type tt = ETT_NONE;
@ -76,7 +77,7 @@ event_triggers_call(struct trace_event_file *file, void *rec)
if (data->paused) if (data->paused)
continue; continue;
if (!rec) { if (!rec) {
data->ops->func(data, rec); data->ops->func(data, rec, event);
continue; continue;
} }
filter = rcu_dereference_sched(data->filter); filter = rcu_dereference_sched(data->filter);
@ -86,7 +87,7 @@ event_triggers_call(struct trace_event_file *file, void *rec)
tt |= data->cmd_ops->trigger_type; tt |= data->cmd_ops->trigger_type;
continue; continue;
} }
data->ops->func(data, rec); data->ops->func(data, rec, event);
} }
return tt; return tt;
} }
@ -108,7 +109,7 @@ EXPORT_SYMBOL_GPL(event_triggers_call);
void void
event_triggers_post_call(struct trace_event_file *file, event_triggers_post_call(struct trace_event_file *file,
enum event_trigger_type tt, enum event_trigger_type tt,
void *rec) void *rec, struct ring_buffer_event *event)
{ {
struct event_trigger_data *data; struct event_trigger_data *data;
@ -116,7 +117,7 @@ event_triggers_post_call(struct trace_event_file *file,
if (data->paused) if (data->paused)
continue; continue;
if (data->cmd_ops->trigger_type & tt) if (data->cmd_ops->trigger_type & tt)
data->ops->func(data, rec); data->ops->func(data, rec, event);
} }
} }
EXPORT_SYMBOL_GPL(event_triggers_post_call); EXPORT_SYMBOL_GPL(event_triggers_post_call);
@ -908,8 +909,15 @@ void set_named_trigger_data(struct event_trigger_data *data,
data->named_data = named_data; data->named_data = named_data;
} }
struct event_trigger_data *
get_named_trigger_data(struct event_trigger_data *data)
{
return data->named_data;
}
static void static void
traceon_trigger(struct event_trigger_data *data, void *rec) traceon_trigger(struct event_trigger_data *data, void *rec,
struct ring_buffer_event *event)
{ {
if (tracing_is_on()) if (tracing_is_on())
return; return;
@ -918,7 +926,8 @@ traceon_trigger(struct event_trigger_data *data, void *rec)
} }
static void static void
traceon_count_trigger(struct event_trigger_data *data, void *rec) traceon_count_trigger(struct event_trigger_data *data, void *rec,
struct ring_buffer_event *event)
{ {
if (tracing_is_on()) if (tracing_is_on())
return; return;
@ -933,7 +942,8 @@ traceon_count_trigger(struct event_trigger_data *data, void *rec)
} }
static void static void
traceoff_trigger(struct event_trigger_data *data, void *rec) traceoff_trigger(struct event_trigger_data *data, void *rec,
struct ring_buffer_event *event)
{ {
if (!tracing_is_on()) if (!tracing_is_on())
return; return;
@ -942,7 +952,8 @@ traceoff_trigger(struct event_trigger_data *data, void *rec)
} }
static void static void
traceoff_count_trigger(struct event_trigger_data *data, void *rec) traceoff_count_trigger(struct event_trigger_data *data, void *rec,
struct ring_buffer_event *event)
{ {
if (!tracing_is_on()) if (!tracing_is_on())
return; return;
@ -1039,13 +1050,15 @@ static struct event_command trigger_traceoff_cmd = {
#ifdef CONFIG_TRACER_SNAPSHOT #ifdef CONFIG_TRACER_SNAPSHOT
static void static void
snapshot_trigger(struct event_trigger_data *data, void *rec) snapshot_trigger(struct event_trigger_data *data, void *rec,
struct ring_buffer_event *event)
{ {
tracing_snapshot(); tracing_snapshot();
} }
static void static void
snapshot_count_trigger(struct event_trigger_data *data, void *rec) snapshot_count_trigger(struct event_trigger_data *data, void *rec,
struct ring_buffer_event *event)
{ {
if (!data->count) if (!data->count)
return; return;
@ -1053,7 +1066,7 @@ snapshot_count_trigger(struct event_trigger_data *data, void *rec)
if (data->count != -1) if (data->count != -1)
(data->count)--; (data->count)--;
snapshot_trigger(data, rec); snapshot_trigger(data, rec, event);
} }
static int static int
@ -1141,13 +1154,15 @@ static __init int register_trigger_snapshot_cmd(void) { return 0; }
#endif #endif
static void static void
stacktrace_trigger(struct event_trigger_data *data, void *rec) stacktrace_trigger(struct event_trigger_data *data, void *rec,
struct ring_buffer_event *event)
{ {
trace_dump_stack(STACK_SKIP); trace_dump_stack(STACK_SKIP);
} }
static void static void
stacktrace_count_trigger(struct event_trigger_data *data, void *rec) stacktrace_count_trigger(struct event_trigger_data *data, void *rec,
struct ring_buffer_event *event)
{ {
if (!data->count) if (!data->count)
return; return;
@ -1155,7 +1170,7 @@ stacktrace_count_trigger(struct event_trigger_data *data, void *rec)
if (data->count != -1) if (data->count != -1)
(data->count)--; (data->count)--;
stacktrace_trigger(data, rec); stacktrace_trigger(data, rec, event);
} }
static int static int
@ -1217,7 +1232,8 @@ static __init void unregister_trigger_traceon_traceoff_cmds(void)
} }
static void static void
event_enable_trigger(struct event_trigger_data *data, void *rec) event_enable_trigger(struct event_trigger_data *data, void *rec,
struct ring_buffer_event *event)
{ {
struct enable_trigger_data *enable_data = data->private_data; struct enable_trigger_data *enable_data = data->private_data;
@ -1228,7 +1244,8 @@ event_enable_trigger(struct event_trigger_data *data, void *rec)
} }
static void static void
event_enable_count_trigger(struct event_trigger_data *data, void *rec) event_enable_count_trigger(struct event_trigger_data *data, void *rec,
struct ring_buffer_event *event)
{ {
struct enable_trigger_data *enable_data = data->private_data; struct enable_trigger_data *enable_data = data->private_data;
@ -1242,7 +1259,7 @@ event_enable_count_trigger(struct event_trigger_data *data, void *rec)
if (data->count != -1) if (data->count != -1)
(data->count)--; (data->count)--;
event_enable_trigger(data, rec); event_enable_trigger(data, rec, event);
} }
int event_enable_trigger_print(struct seq_file *m, int event_enable_trigger_print(struct seq_file *m,
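
Every trigger callback in the hunks above gains a third argument, and each counted variant forwards it unchanged to its uncounted counterpart. A minimal user-space sketch of that counted-wrapper pattern follows. The `sketch_*` names are hypothetical, `void *` stands in for `struct ring_buffer_event *`, and the `fired` counter exists only so the effect can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct event_trigger_data. */
struct sketch_trigger_data {
	long count;	/* remaining firings; -1 means unlimited */
	int fired;	/* illustration only: how often the trigger ran */
};

/* Uncounted trigger: accepts the ring buffer event even if unused. */
static void sketch_trigger(struct sketch_trigger_data *data, void *rec,
			   void *event)
{
	(void)rec;
	(void)event;
	data->fired++;
}

/* Counted trigger: same shape as the *_count_trigger functions above. */
static void sketch_count_trigger(struct sketch_trigger_data *data, void *rec,
				 void *event)
{
	assert(data != NULL);

	if (!data->count)
		return;			/* budget exhausted */

	if (data->count != -1)
		(data->count)--;	/* consume one firing */

	/* Forward the event so the wrapped trigger sees the same arguments. */
	sketch_trigger(data, rec, event);
}
```

Under this sketch a trigger set up with `count = 2` runs twice and then becomes a no-op, while one set up with `count = -1` runs on every call.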

@ -66,6 +66,73 @@ u64 tracing_map_read_sum(struct tracing_map_elt *elt, unsigned int i)
return (u64)atomic64_read(&elt->fields[i].sum); return (u64)atomic64_read(&elt->fields[i].sum);
} }
/**
* tracing_map_set_var - Assign a tracing_map_elt's variable field
* @elt: The tracing_map_elt
* @i: The index of the given variable associated with the tracing_map_elt
* @n: The value to assign
*
* Assign n to variable i associated with the specified tracing_map_elt
* instance. The index i is the index returned by the call to
* tracing_map_add_var() when the tracing map was set up.
*/
void tracing_map_set_var(struct tracing_map_elt *elt, unsigned int i, u64 n)
{
atomic64_set(&elt->vars[i], n);
elt->var_set[i] = true;
}
/**
* tracing_map_var_set - Return whether or not a variable has been set
* @elt: The tracing_map_elt
* @i: The index of the given variable associated with the tracing_map_elt
*
* Return true if the variable has been set, false otherwise. The
* index i is the index returned by the call to tracing_map_add_var()
* when the tracing map was set up.
*/
bool tracing_map_var_set(struct tracing_map_elt *elt, unsigned int i)
{
return elt->var_set[i];
}
/**
* tracing_map_read_var - Return the value of a tracing_map_elt's variable field
* @elt: The tracing_map_elt
* @i: The index of the given variable associated with the tracing_map_elt
*
* Retrieve the value of the variable i associated with the specified
* tracing_map_elt instance. The index i is the index returned by the
* call to tracing_map_add_var() when the tracing map was set
* up.
*
* Return: The variable value associated with field i for elt.
*/
u64 tracing_map_read_var(struct tracing_map_elt *elt, unsigned int i)
{
return (u64)atomic64_read(&elt->vars[i]);
}
/**
* tracing_map_read_var_once - Return and reset a tracing_map_elt's variable field
* @elt: The tracing_map_elt
* @i: The index of the given variable associated with the tracing_map_elt
*
* Retrieve the value of the variable i associated with the specified
* tracing_map_elt instance, and reset the variable to the 'not set'
* state. The index i is the index returned by the call to
* tracing_map_add_var() when the tracing map was set up. The reset
* essentially makes the variable a read-once variable if it's only
* accessed using this function.
*
* Return: The variable value associated with field i for elt.
*/
u64 tracing_map_read_var_once(struct tracing_map_elt *elt, unsigned int i)
{
elt->var_set[i] = false;
return (u64)atomic64_read(&elt->vars[i]);
}
int tracing_map_cmp_string(void *val_a, void *val_b) int tracing_map_cmp_string(void *val_a, void *val_b)
{ {
char *a = val_a; char *a = val_a;
@ -170,6 +237,28 @@ int tracing_map_add_sum_field(struct tracing_map *map)
return tracing_map_add_field(map, tracing_map_cmp_atomic64); return tracing_map_add_field(map, tracing_map_cmp_atomic64);
} }
/**
* tracing_map_add_var - Add a field describing a tracing_map var
* @map: The tracing_map
*
* Add a var to the map and return the index identifying it in the map
* and associated tracing_map_elts. This is the index used for
* instance to update a var for a particular tracing_map_elt using
* tracing_map_update_var() or reading it via tracing_map_read_var().
*
* Return: The index identifying the var in the map and associated
* tracing_map_elts, or -EINVAL on error.
*/
int tracing_map_add_var(struct tracing_map *map)
{
int ret = -EINVAL;
if (map->n_vars < TRACING_MAP_VARS_MAX)
ret = map->n_vars++;
return ret;
}
/** /**
* tracing_map_add_key_field - Add a field describing a tracing_map key * tracing_map_add_key_field - Add a field describing a tracing_map key
* @map: The tracing_map * @map: The tracing_map
@ -280,6 +369,11 @@ static void tracing_map_elt_clear(struct tracing_map_elt *elt)
if (elt->fields[i].cmp_fn == tracing_map_cmp_atomic64) if (elt->fields[i].cmp_fn == tracing_map_cmp_atomic64)
atomic64_set(&elt->fields[i].sum, 0); atomic64_set(&elt->fields[i].sum, 0);
for (i = 0; i < elt->map->n_vars; i++) {
atomic64_set(&elt->vars[i], 0);
elt->var_set[i] = false;
}
if (elt->map->ops && elt->map->ops->elt_clear) if (elt->map->ops && elt->map->ops->elt_clear)
elt->map->ops->elt_clear(elt); elt->map->ops->elt_clear(elt);
} }
@ -306,6 +400,8 @@ static void tracing_map_elt_free(struct tracing_map_elt *elt)
if (elt->map->ops && elt->map->ops->elt_free) if (elt->map->ops && elt->map->ops->elt_free)
elt->map->ops->elt_free(elt); elt->map->ops->elt_free(elt);
kfree(elt->fields); kfree(elt->fields);
kfree(elt->vars);
kfree(elt->var_set);
kfree(elt->key); kfree(elt->key);
kfree(elt); kfree(elt);
} }
@ -333,6 +429,18 @@ static struct tracing_map_elt *tracing_map_elt_alloc(struct tracing_map *map)
goto free; goto free;
} }
elt->vars = kcalloc(map->n_vars, sizeof(*elt->vars), GFP_KERNEL);
if (!elt->vars) {
err = -ENOMEM;
goto free;
}
elt->var_set = kcalloc(map->n_vars, sizeof(*elt->var_set), GFP_KERNEL);
if (!elt->var_set) {
err = -ENOMEM;
goto free;
}
tracing_map_elt_init_fields(elt); tracing_map_elt_init_fields(elt);
if (map->ops && map->ops->elt_alloc) { if (map->ops && map->ops->elt_alloc) {
@ -414,7 +522,9 @@ static inline struct tracing_map_elt *
__tracing_map_insert(struct tracing_map *map, void *key, bool lookup_only) __tracing_map_insert(struct tracing_map *map, void *key, bool lookup_only)
{ {
u32 idx, key_hash, test_key; u32 idx, key_hash, test_key;
int dup_try = 0;
struct tracing_map_entry *entry; struct tracing_map_entry *entry;
struct tracing_map_elt *val;
key_hash = jhash(key, map->key_size, 0); key_hash = jhash(key, map->key_size, 0);
if (key_hash == 0) if (key_hash == 0)
@ -426,11 +536,33 @@ __tracing_map_insert(struct tracing_map *map, void *key, bool lookup_only)
entry = TRACING_MAP_ENTRY(map->map, idx); entry = TRACING_MAP_ENTRY(map->map, idx);
test_key = entry->key; test_key = entry->key;
if (test_key && test_key == key_hash && entry->val && if (test_key && test_key == key_hash) {
keys_match(key, entry->val->key, map->key_size)) { val = READ_ONCE(entry->val);
if (!lookup_only) if (val &&
atomic64_inc(&map->hits); keys_match(key, val->key, map->key_size)) {
return entry->val; if (!lookup_only)
atomic64_inc(&map->hits);
return val;
} else if (unlikely(!val)) {
/*
* The key is present. But, val (pointer to elt
* struct) is still NULL. which means some other
* thread is in the process of inserting an
* element.
*
* On top of that, it's key_hash is same as the
* one being inserted right now. So, it's
* possible that the element has the same
* key as well.
*/
dup_try++;
if (dup_try > map->map_size) {
atomic64_inc(&map->drops);
break;
}
continue;
}
} }
if (!test_key) { if (!test_key) {
@ -452,6 +584,13 @@ __tracing_map_insert(struct tracing_map *map, void *key, bool lookup_only)
atomic64_inc(&map->hits); atomic64_inc(&map->hits);
return entry->val; return entry->val;
} else {
/*
* cmpxchg() failed. Loop around once
* more to check what key was inserted.
*/
dup_try++;
continue;
} }
} }
@ -816,67 +955,15 @@ create_sort_entry(void *key, struct tracing_map_elt *elt)
return sort_entry; return sort_entry;
} }
static struct tracing_map_elt *copy_elt(struct tracing_map_elt *elt) static void detect_dups(struct tracing_map_sort_entry **sort_entries,
{
struct tracing_map_elt *dup_elt;
unsigned int i;
dup_elt = tracing_map_elt_alloc(elt->map);
if (IS_ERR(dup_elt))
return NULL;
if (elt->map->ops && elt->map->ops->elt_copy)
elt->map->ops->elt_copy(dup_elt, elt);
dup_elt->private_data = elt->private_data;
memcpy(dup_elt->key, elt->key, elt->map->key_size);
for (i = 0; i < elt->map->n_fields; i++) {
atomic64_set(&dup_elt->fields[i].sum,
atomic64_read(&elt->fields[i].sum));
dup_elt->fields[i].cmp_fn = elt->fields[i].cmp_fn;
}
return dup_elt;
}
static int merge_dup(struct tracing_map_sort_entry **sort_entries,
unsigned int target, unsigned int dup)
{
struct tracing_map_elt *target_elt, *elt;
bool first_dup = (target - dup) == 1;
int i;
if (first_dup) {
elt = sort_entries[target]->elt;
target_elt = copy_elt(elt);
if (!target_elt)
return -ENOMEM;
sort_entries[target]->elt = target_elt;
sort_entries[target]->elt_copied = true;
} else
target_elt = sort_entries[target]->elt;
elt = sort_entries[dup]->elt;
for (i = 0; i < elt->map->n_fields; i++)
atomic64_add(atomic64_read(&elt->fields[i].sum),
&target_elt->fields[i].sum);
sort_entries[dup]->dup = true;
return 0;
}
static int merge_dups(struct tracing_map_sort_entry **sort_entries,
int n_entries, unsigned int key_size) int n_entries, unsigned int key_size)
{ {
unsigned int dups = 0, total_dups = 0; unsigned int dups = 0, total_dups = 0;
int err, i, j; int i;
void *key; void *key;
if (n_entries < 2) if (n_entries < 2)
return total_dups; return;
sort(sort_entries, n_entries, sizeof(struct tracing_map_sort_entry *), sort(sort_entries, n_entries, sizeof(struct tracing_map_sort_entry *),
(int (*)(const void *, const void *))cmp_entries_dup, NULL); (int (*)(const void *, const void *))cmp_entries_dup, NULL);
@ -885,30 +972,14 @@ static int merge_dups(struct tracing_map_sort_entry **sort_entries,
for (i = 1; i < n_entries; i++) { for (i = 1; i < n_entries; i++) {
if (!memcmp(sort_entries[i]->key, key, key_size)) { if (!memcmp(sort_entries[i]->key, key, key_size)) {
dups++; total_dups++; dups++; total_dups++;
err = merge_dup(sort_entries, i - dups, i);
if (err)
return err;
continue; continue;
} }
key = sort_entries[i]->key; key = sort_entries[i]->key;
dups = 0; dups = 0;
} }
if (!total_dups) WARN_ONCE(total_dups > 0,
return total_dups; "Duplicates detected: %d\n", total_dups);
for (i = 0, j = 0; i < n_entries; i++) {
if (!sort_entries[i]->dup) {
sort_entries[j] = sort_entries[i];
if (j++ != i)
sort_entries[i] = NULL;
} else {
destroy_sort_entry(sort_entries[i]);
sort_entries[i] = NULL;
}
}
return total_dups;
} }
static bool is_key(struct tracing_map *map, unsigned int field_idx) static bool is_key(struct tracing_map *map, unsigned int field_idx)
@ -1034,10 +1105,7 @@ int tracing_map_sort_entries(struct tracing_map *map,
return 1; return 1;
} }
ret = merge_dups(entries, n_entries, map->key_size); detect_dups(entries, n_entries, map->key_size);
if (ret < 0)
goto free;
n_entries -= ret;
if (is_key(map, sort_keys[0].field_idx)) if (is_key(map, sort_keys[0].field_idx))
cmp_entries_fn = cmp_entries_key; cmp_entries_fn = cmp_entries_key;
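
The variable helpers added to the tracing map earlier in this file pair each value with a "set" flag, and the read-once accessor clears that flag as it reads. The sketch below models that behaviour in user space, assuming C11 atomics in place of the kernel's `atomic64_t` and fixed-size arrays in place of the `kcalloc()` allocations. All `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_VARS_MAX 16	/* same limit as TRACING_MAP_VARS_MAX in the diff */

/* Hypothetical stand-ins for struct tracing_map and struct tracing_map_elt. */
struct sketch_map {
	unsigned int n_vars;
};

struct sketch_elt {
	_Atomic uint64_t vars[SKETCH_VARS_MAX];
	bool var_set[SKETCH_VARS_MAX];
};

/* Reserve the next variable slot; -EINVAL once the limit is reached. */
static int sketch_add_var(struct sketch_map *map)
{
	int ret = -EINVAL;

	if (map->n_vars < SKETCH_VARS_MAX)
		ret = (int)map->n_vars++;
	return ret;
}

/* Reset every reserved variable to zero and "not set". */
static void sketch_elt_clear(struct sketch_elt *elt,
			     const struct sketch_map *map)
{
	for (unsigned int i = 0; i < map->n_vars; i++) {
		atomic_store(&elt->vars[i], 0);
		elt->var_set[i] = false;
	}
}

static void sketch_set_var(struct sketch_elt *elt, unsigned int i, uint64_t n)
{
	assert(i < SKETCH_VARS_MAX);
	atomic_store(&elt->vars[i], n);
	elt->var_set[i] = true;
}

static bool sketch_var_set(const struct sketch_elt *elt, unsigned int i)
{
	assert(i < SKETCH_VARS_MAX);
	return elt->var_set[i];
}

/* Read the value and mark the variable as "not set" again. */
static uint64_t sketch_read_var_once(struct sketch_elt *elt, unsigned int i)
{
	assert(i < SKETCH_VARS_MAX);
	elt->var_set[i] = false;
	return atomic_load(&elt->vars[i]);
}
```

As in the diff, the read-once accessor only clears the flag and leaves the stored value in place, so a caller that wants read-once behaviour has to check the flag before reading.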

@ -10,6 +10,7 @@
 #define TRACING_MAP_VALS_MAX 3
 #define TRACING_MAP_FIELDS_MAX (TRACING_MAP_KEYS_MAX + \
 TRACING_MAP_VALS_MAX)
+#define TRACING_MAP_VARS_MAX 16
 #define TRACING_MAP_SORT_KEYS_MAX 2
 typedef int (*tracing_map_cmp_fn_t) (void *val_a, void *val_b);
@ -137,6 +138,8 @@ struct tracing_map_field {
 struct tracing_map_elt {
 struct tracing_map *map;
 struct tracing_map_field *fields;
+atomic64_t *vars;
+bool *var_set;
 void *key;
 void *private_data;
 };
@ -192,6 +195,7 @@ struct tracing_map {
 int key_idx[TRACING_MAP_KEYS_MAX];
 unsigned int n_keys;
 struct tracing_map_sort_key sort_key;
+unsigned int n_vars;
 atomic64_t hits;
 atomic64_t drops;
 };
@ -215,11 +219,6 @@ struct tracing_map {
 * Element allocation occurs before tracing begins, when the
 * tracing_map_init() call is made by client code.
 *
-* @elt_copy: At certain points in the lifetime of an element, it may
-* need to be copied. The copy should include a copy of the
-* client-allocated data, which can be copied into the 'to'
-* element from the 'from' element.
-*
 * @elt_free: When a tracing_map_elt is freed, this function is called
 * and allows client-allocated per-element data to be freed.
 *
@ -233,8 +232,6 @@ struct tracing_map {
 */
 struct tracing_map_ops {
 int (*elt_alloc)(struct tracing_map_elt *elt);
-void (*elt_copy)(struct tracing_map_elt *to,
-struct tracing_map_elt *from);
 void (*elt_free)(struct tracing_map_elt *elt);
 void (*elt_clear)(struct tracing_map_elt *elt);
 void (*elt_init)(struct tracing_map_elt *elt);
@ -248,6 +245,7 @@ tracing_map_create(unsigned int map_bits,
 extern int tracing_map_init(struct tracing_map *map);
 extern int tracing_map_add_sum_field(struct tracing_map *map);
+extern int tracing_map_add_var(struct tracing_map *map);
 extern int tracing_map_add_key_field(struct tracing_map *map,
 unsigned int offset,
 tracing_map_cmp_fn_t cmp_fn);
@ -267,7 +265,13 @@ extern int tracing_map_cmp_none(void *val_a, void *val_b);
 extern void tracing_map_update_sum(struct tracing_map_elt *elt,
 unsigned int i, u64 n);
+extern void tracing_map_set_var(struct tracing_map_elt *elt,
+unsigned int i, u64 n);
+extern bool tracing_map_var_set(struct tracing_map_elt *elt, unsigned int i);
 extern u64 tracing_map_read_sum(struct tracing_map_elt *elt, unsigned int i);
+extern u64 tracing_map_read_var(struct tracing_map_elt *elt, unsigned int i);
+extern u64 tracing_map_read_var_once(struct tracing_map_elt *elt, unsigned int i);
 extern void tracing_map_set_field_descr(struct tracing_map *map,
 unsigned int i,
 unsigned int key_offset,
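
The header changes above add a `vars` array and a parallel `var_set` flag array to each map element, plus set/test/read accessors. A minimal userspace sketch of how such a pair could behave follows. The struct `elt_vars` and the four helpers are hypothetical names, C11 atomics stand in for `atomic64_t`, and the "read once clears the set flag" behaviour is an assumption drawn from the function name, not something this header confirms.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define VARS_MAX 16	/* mirrors TRACING_MAP_VARS_MAX from the diff */

/* Hypothetical userspace stand-in for the vars/var_set pair in tracing_map_elt. */
struct elt_vars {
	_Atomic uint64_t vars[VARS_MAX];
	bool var_set[VARS_MAX];
};

/* Store a value in slot i and mark the slot as set. */
static void set_var(struct elt_vars *e, unsigned int i, uint64_t n)
{
	assert(i < VARS_MAX);
	atomic_store(&e->vars[i], n);
	e->var_set[i] = true;
}

/* Report whether slot i currently holds a value that has not been consumed. */
static bool var_is_set(const struct elt_vars *e, unsigned int i)
{
	return e->var_set[i];
}

/* Read slot i without changing its set flag. */
static uint64_t read_var(struct elt_vars *e, unsigned int i)
{
	return atomic_load(&e->vars[i]);
}

/* ASSUMPTION: "read once" consumes the variable by clearing the set flag. */
static uint64_t read_var_once(struct elt_vars *e, unsigned int i)
{
	e->var_set[i] = false;
	return atomic_load(&e->vars[i]);
}
```

Under that assumption, one event can record a timestamp with `set_var()`, and a later matching event can check `var_is_set()` and consume the value with `read_var_once()`, so a stale value is not reused by a second match.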

@ -2591,6 +2591,8 @@ int vbin_printf(u32 *bin_buf, size_t size, const char *fmt, va_list args)
 case 's':
 case 'F':
 case 'f':
+case 'x':
+case 'K':
 save_arg(void *);
 break;
 default:
@ -2765,6 +2767,8 @@ int bstr_printf(char *buf, size_t size, const char *fmt, const u32 *bin_buf)
 case 's':
 case 'F':
 case 'f':
+case 'x':
+case 'K':
 process = true;
 break;
 default:

@ -30,6 +30,8 @@
 #include <linux/string.h>
 #include <net/flow.h>
+#include <trace/events/initcall.h>
 #define MAX_LSM_EVM_XATTR 2
 /* Maximum number of letters for an LSM name string */
@ -45,10 +47,14 @@ static __initdata char chosen_lsm[SECURITY_NAME_MAX + 1] =
 static void __init do_security_initcalls(void)
 {
+int ret;
 initcall_t *call;
 call = __security_initcall_start;
+trace_initcall_level("security");
 while (call < __security_initcall_end) {
-(*call) ();
+trace_initcall_start((*call));
+ret = (*call) ();
+trace_initcall_finish((*call), ret);
 call++;
 }
 }
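
The change above wraps each security initcall in a start tracepoint and a finish tracepoint that records the return value. A minimal userspace sketch of that wrapping pattern follows. The hooks `hook_start()` and `hook_finish()`, the counters and the two sample init functions are hypothetical; they only stand in for `trace_initcall_start()` and `trace_initcall_finish()`, which exist only inside the kernel.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*initcall_t)(void);

/* Hypothetical bookkeeping that replaces the real tracepoints in this sketch. */
static int n_started, n_finished, last_ret;

/* Stand-in for trace_initcall_start(): note that a call is about to run. */
static void hook_start(initcall_t fn)
{
	(void)fn;
	n_started++;
}

/* Stand-in for trace_initcall_finish(): record the call's return value. */
static void hook_finish(initcall_t fn, int ret)
{
	(void)fn;
	n_finished++;
	last_ret = ret;
}

/* Walk the half-open table [start, end) and run every call between the hooks. */
static void run_initcalls(initcall_t *start, initcall_t *end)
{
	initcall_t *call;
	int ret;

	assert(start <= end);
	for (call = start; call < end; call++) {
		hook_start(*call);
		ret = (*call)();
		hook_finish(*call, ret);
	}
}

/* Two sample init functions: one succeeds, one returns an error code. */
static int init_a(void) { return 0; }
static int init_b(void) { return -22; }
```

Capturing the return value in a local before calling the finish hook is what lets the tracepoint report each initcall's result, which the original loop discarded.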

@ -59,6 +59,13 @@ disable_events() {
 echo 0 > events/enable
 }
+clear_synthetic_events() { # reset all current synthetic events
+grep -v ^# synthetic_events |
+while read line; do
+echo "!$line" >> synthetic_events
+done
+}
 initialize_ftrace() { # Reset ftrace to initial-state
 # As the initial state, ftrace will be set to nop tracer,
 # no events, no triggers, no filters, no function filters,

@ -0,0 +1,39 @@
#!/bin/sh
# description: event trigger - test extended error support
do_reset() {
reset_trigger
echo > set_event
clear_trace
}
fail() { #msg
do_reset
echo $1
exit_fail
}
if [ ! -f set_event ]; then
echo "event tracing is not supported"
exit_unsupported
fi
if [ ! -f synthetic_events ]; then
echo "synthetic event is not supported"
exit_unsupported
fi
reset_tracer
do_reset
echo "Test extended error support"
echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="ping"' > events/sched/sched_wakeup/trigger
! echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="ping"' >> events/sched/sched_wakeup/trigger 2> /dev/null
if ! grep -q "ERROR:" events/sched/sched_wakeup/hist; then
fail "Failed to generate extended error in histogram"
fi
do_reset
exit 0

@ -0,0 +1,54 @@
#!/bin/sh
# description: event trigger - test field variable support
do_reset() {
reset_trigger
echo > set_event
clear_trace
}
fail() { #msg
do_reset
echo $1
exit_fail
}
if [ ! -f set_event ]; then
echo "event tracing is not supported"
exit_unsupported
fi
if [ ! -f synthetic_events ]; then
echo "synthetic event is not supported"
exit_unsupported
fi
clear_synthetic_events
reset_tracer
do_reset
echo "Test field variable support"
echo 'wakeup_latency u64 lat; pid_t pid; int prio; char comm[16]' > synthetic_events
echo 'hist:keys=comm:ts0=common_timestamp.usecs if comm=="ping"' > events/sched/sched_waking/trigger
echo 'hist:keys=next_comm:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,sched.sched_waking.prio,next_comm) if next_comm=="ping"' > events/sched/sched_switch/trigger
echo 'hist:keys=pid,prio,comm:vals=lat:sort=pid,prio' > events/synthetic/wakeup_latency/trigger
ping localhost -c 3
if ! grep -q "ping" events/synthetic/wakeup_latency/hist; then
fail "Failed to create inter-event histogram"
fi
if ! grep -q "synthetic_prio=prio" events/sched/sched_waking/hist; then
fail "Failed to create histogram with field variable"
fi
echo '!hist:keys=next_comm:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,sched.sched_waking.prio,next_comm) if next_comm=="ping"' >> events/sched/sched_switch/trigger
if grep -q "synthetic_prio=prio" events/sched/sched_waking/hist; then
fail "Failed to remove histogram with field variable"
fi
do_reset
exit 0

@ -0,0 +1,58 @@
#!/bin/sh
# description: event trigger - test inter-event combined histogram trigger
do_reset() {
reset_trigger
echo > set_event
clear_trace
}
fail() { #msg
do_reset
echo $1
exit_fail
}
if [ ! -f set_event ]; then
echo "event tracing is not supported"
exit_unsupported
fi
if [ ! -f synthetic_events ]; then
echo "synthetic event is not supported"
exit_unsupported
fi
reset_tracer
do_reset
clear_synthetic_events
echo "Test create synthetic event"
echo 'waking_latency u64 lat pid_t pid' > synthetic_events
if [ ! -d events/synthetic/waking_latency ]; then
fail "Failed to create waking_latency synthetic event"
fi
echo "Test combined histogram"
echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="ping"' > events/sched/sched_waking/trigger
echo 'hist:keys=pid:waking_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).waking_latency($waking_lat,pid) if comm=="ping"' > events/sched/sched_wakeup/trigger
echo 'hist:keys=pid,lat:sort=pid,lat' > events/synthetic/waking_latency/trigger
echo 'wakeup_latency u64 lat pid_t pid' >> synthetic_events
echo 'hist:keys=pid:ts1=common_timestamp.usecs if comm=="ping"' >> events/sched/sched_wakeup/trigger
echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts1:onmatch(sched.sched_wakeup).wakeup_latency($wakeup_lat,next_pid) if next_comm=="ping"' > events/sched/sched_switch/trigger
echo 'waking+wakeup_latency u64 lat; pid_t pid' >> synthetic_events
echo 'hist:keys=pid,lat:sort=pid,lat:ww_lat=$waking_lat+$wakeup_lat:onmatch(synthetic.wakeup_latency).waking+wakeup_latency($ww_lat,pid)' >> events/synthetic/wakeup_latency/trigger
echo 'hist:keys=pid,lat:sort=pid,lat' >> events/synthetic/waking+wakeup_latency/trigger
ping localhost -c 3
if ! grep -q "pid:" events/synthetic/waking+wakeup_latency/hist; then
fail "Failed to create combined histogram"
fi
do_reset
exit 0

@ -0,0 +1,50 @@
#!/bin/sh
# description: event trigger - test inter-event histogram trigger onmatch action
do_reset() {
reset_trigger
echo > set_event
clear_trace
}
fail() { #msg
do_reset
echo $1
exit_fail
}
if [ ! -f set_event ]; then
echo "event tracing is not supported"
exit_unsupported
fi
if [ ! -f synthetic_events ]; then
echo "synthetic event is not supported"
exit_unsupported
fi
clear_synthetic_events
reset_tracer
do_reset
echo "Test create synthetic event"
echo 'wakeup_latency u64 lat pid_t pid char comm[16]' > synthetic_events
if [ ! -d events/synthetic/wakeup_latency ]; then
fail "Failed to create wakeup_latency synthetic event"
fi
echo "Test create histogram for synthetic event"
echo "Test histogram variables, simple expression support and onmatch action"
echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="ping"' > events/sched/sched_wakeup/trigger
echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_wakeup).wakeup_latency($wakeup_lat,next_pid,next_comm) if next_comm=="ping"' > events/sched/sched_switch/trigger
echo 'hist:keys=comm,pid,lat:wakeup_lat=lat:sort=lat' > events/synthetic/wakeup_latency/trigger
ping localhost -c 5
if ! grep -q "ping" events/synthetic/wakeup_latency/hist; then
fail "Failed to create onmatch action inter-event histogram"
fi
do_reset
exit 0

@ -0,0 +1,50 @@
#!/bin/sh
# description: event trigger - test inter-event histogram trigger onmatch-onmax action
do_reset() {
reset_trigger
echo > set_event
clear_trace
}
fail() { #msg
do_reset
echo $1
exit_fail
}
if [ ! -f set_event ]; then
echo "event tracing is not supported"
exit_unsupported
fi
if [ ! -f synthetic_events ]; then
echo "synthetic event is not supported"
exit_unsupported
fi
clear_synthetic_events
reset_tracer
do_reset
echo "Test create synthetic event"
echo 'wakeup_latency u64 lat pid_t pid char comm[16]' > synthetic_events
if [ ! -d events/synthetic/wakeup_latency ]; then
fail "Failed to create wakeup_latency synthetic event"
fi
echo "Test create histogram for synthetic event"
echo "Test histogram variables, simple expression support and onmatch-onmax action"
echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="ping"' > events/sched/sched_wakeup/trigger
echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_wakeup).wakeup_latency($wakeup_lat,next_pid,next_comm):onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm) if next_comm=="ping"' >> events/sched/sched_switch/trigger
echo 'hist:keys=comm,pid,lat:wakeup_lat=lat:sort=lat' > events/synthetic/wakeup_latency/trigger
ping localhost -c 5
if ! grep -q "ping" events/synthetic/wakeup_latency/hist || ! grep -q "max:" events/sched/sched_switch/hist; then
fail "Failed to create onmatch-onmax action inter-event histogram"
fi
do_reset
exit 0

@ -0,0 +1,48 @@
#!/bin/sh
# description: event trigger - test inter-event histogram trigger onmax action
do_reset() {
reset_trigger
echo > set_event
clear_trace
}
fail() { #msg
do_reset
echo $1
exit_fail
}
if [ ! -f set_event ]; then
echo "event tracing is not supported"
exit_unsupported
fi
if [ ! -f synthetic_events ]; then
echo "synthetic event is not supported"
exit_unsupported
fi
clear_synthetic_events
reset_tracer
do_reset
echo "Test create synthetic event"
echo 'wakeup_latency u64 lat pid_t pid char comm[16]' > synthetic_events
if [ ! -d events/synthetic/wakeup_latency ]; then
fail "Failed to create wakeup_latency synthetic event"
fi
echo "Test onmax action"
echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="ping"' >> events/sched/sched_waking/trigger
echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm) if next_comm=="ping"' >> events/sched/sched_switch/trigger
ping localhost -c 3
if ! grep -q "max:" events/sched/sched_switch/hist; then
fail "Failed to create onmax action inter-event histogram"
fi
do_reset
exit 0

@ -0,0 +1,54 @@
#!/bin/sh
# description: event trigger - test synthetic event create remove
do_reset() {
reset_trigger
echo > set_event
clear_trace
}
fail() { #msg
do_reset
echo $1
exit_fail
}
if [ ! -f set_event ]; then
echo "event tracing is not supported"
exit_unsupported
fi
if [ ! -f synthetic_events ]; then
echo "synthetic event is not supported"
exit_unsupported
fi
clear_synthetic_events
reset_tracer
do_reset
echo "Test create synthetic event"
echo 'wakeup_latency u64 lat pid_t pid char comm[16]' > synthetic_events
if [ ! -d events/synthetic/wakeup_latency ]; then
fail "Failed to create wakeup_latency synthetic event"
fi
reset_trigger
echo "Test create synthetic event with an error"
! echo 'wakeup_latency u64 lat pid_t pid char' > synthetic_events 2> /dev/null
if [ -d events/synthetic/wakeup_latency ]; then
fail "Created wakeup_latency synthetic event with an invalid format"
fi
reset_trigger
echo "Test remove synthetic event"
echo '!wakeup_latency u64 lat pid_t pid char comm[16]' > synthetic_events
if [ -d events/synthetic/wakeup_latency ]; then
fail "Failed to delete wakeup_latency synthetic event"
fi
do_reset
exit 0