License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX license identifier should be applied
to a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files, created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
should be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                             # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                         270
GPL-2.0+ WITH Linux-syscall-note                        169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)      21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)      17
LGPL-2.1+ WITH Linux-syscall-note                        15
GPL-1.0+ WITH Linux-syscall-note                         14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)      5
LGPL-2.0+ WITH Linux-syscall-note                         4
LGPL-2.1 WITH Linux-syscall-note                          3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)               1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet by Kate, Philippe and Thomas to determine the SPDX license
identifiers to apply to the source files, with confirmation in some
cases by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types). Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
/* SPDX-License-Identifier: GPL-2.0 */
/*
 * NOTE:
 *
 * This header combines a lot of stuff unrelated to each other.
 * The process of splitting its content is in progress while keeping
 * backward compatibility. That's why it's highly recommended NOT to
 * include this header inside another header file, especially under
 * generic or architectural include/ directory.
 */
#ifndef _LINUX_KERNEL_H
#define _LINUX_KERNEL_H

#include <linux/stdarg.h>
#include <linux/align.h>
#include <linux/limits.h>
#include <linux/linkage.h>
#include <linux/stddef.h>
#include <linux/types.h>
#include <linux/compiler.h>
#include <linux/container_of.h>
#include <linux/bitops.h>
#include <linux/kstrtox.h>
#include <linux/log2.h>
#include <linux/math.h>
#include <linux/minmax.h>
#include <linux/typecheck.h>
#include <linux/panic.h>
#include <linux/printk.h>
kernel.h: handle pointers to arrays better in container_of()
If the first parameter of container_of() is a pointer to a
non-const-qualified array type (and the third parameter names a
non-const-qualified array member), the local variable __mptr will be
defined with a const-qualified array type. In ISO C, these types are
incompatible. They work as expected in GNU C, but some versions will
issue warnings. For example, GCC 4.9 produces the warning
"initialization from incompatible pointer type".
Here is an example of where the problem occurs:
-------------------------------------------------------
#include <linux/kernel.h>
#include <linux/module.h>
MODULE_LICENSE("GPL");
struct st {
        int a;
        char b[16];
};

static int __init example_init(void)
{
        struct st t = { .a = 101, .b = "hello" };
        char (*p)[16] = &t.b;
        struct st *x = container_of(p, struct st, b);

        printk(KERN_DEBUG "%p %p\n", (void *)&t, (void *)x);
        return 0;
}

static void __exit example_exit(void)
{
}

module_init(example_init);
module_exit(example_exit);
-------------------------------------------------------
Building the module with gcc-4.9 results in these warnings (where '{m}'
is the module source and '{k}' is the kernel source):
-------------------------------------------------------
In file included from {m}/example.c:1:0:
{m}/example.c: In function `example_init':
{k}/include/linux/kernel.h:854:48: warning: initialization from incompatible pointer type
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
^
{m}/example.c:14:17: note: in expansion of macro `container_of'
struct st *x = container_of(p, struct st, b);
^
{k}/include/linux/kernel.h:854:48: warning: (near initialization for `x')
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
^
{m}/example.c:14:17: note: in expansion of macro `container_of'
struct st *x = container_of(p, struct st, b);
^
-------------------------------------------------------
Replace the type checking performed by the macro to avoid these
warnings. Make sure `*(ptr)` either has type compatible with the
member, or has type compatible with `void`, ignoring qualifiers. Raise
compiler errors if this is not true. This is stronger than the previous
behaviour, which only resulted in compiler warnings for a type mismatch.
[arnd@arndb.de: fix new warnings for container_of()]
Link: http://lkml.kernel.org/r/20170620200940.90557-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/20170525120316.24473-7-abbotti@mev.co.uk
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-13 05:33:04 +08:00
#include <linux/build_bug.h>
#include <linux/static_call_types.h>
#include <linux/instruction_pointer.h>
#include <asm/byteorder.h>

#include <uapi/linux/kernel.h>

#define STACK_MAGIC     0xdeadbeef
/**
 * REPEAT_BYTE - repeat the value @x multiple times as an unsigned long value
 * @x: value to repeat
 *
 * NOTE: @x is not checked for > 0xff; larger values produce odd results.
 */
#define REPEAT_BYTE(x)  ((~0ul / 0xff) * (x))

/* generic data direction definitions */
#define READ                    0
#define WRITE                   1
/**
 * ARRAY_SIZE - get the number of elements in array @arr
 * @arr: array to be sized
 */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
#define PTR_IF(cond, ptr)       ((cond) ? (ptr) : NULL)

#define u64_to_user_ptr(x) (            \
{                                       \
        typecheck(u64, (x));            \
        (void __user *)(uintptr_t)(x);  \
}                                       \
)
/**
 * upper_32_bits - return bits 32-63 of a number
 * @n: the number we're accessing
 *
 * A basic shift-right of a 64- or 32-bit quantity. Use this to suppress
 * the "right shift count >= width of type" warning when that quantity is
 * 32-bits.
 */
#define upper_32_bits(n) ((u32)(((n) >> 16) >> 16))

/**
 * lower_32_bits - return bits 0-31 of a number
 * @n: the number we're accessing
 */
#define lower_32_bits(n) ((u32)((n) & 0xffffffff))
/**
 * upper_16_bits - return bits 16-31 of a number
 * @n: the number we're accessing
 */
#define upper_16_bits(n) ((u16)((n) >> 16))

/**
 * lower_16_bits - return bits 0-15 of a number
 * @n: the number we're accessing
 */
#define lower_16_bits(n) ((u16)((n) & 0xffff))
struct completion;
struct user;

#ifdef CONFIG_PREEMPT_VOLUNTARY_BUILD

extern int __cond_resched(void);
# define might_resched() __cond_resched()
sched/preempt: Add PREEMPT_DYNAMIC using static keys
Where an architecture selects HAVE_STATIC_CALL but not
HAVE_STATIC_CALL_INLINE, each static call has an out-of-line trampoline
which will either branch to a callee or return to the caller.
On such architectures, a number of constraints can conspire to make
those trampolines more complicated and potentially less useful than we'd
like. For example:
* Hardware and software control flow integrity schemes can require the
addition of "landing pad" instructions (e.g. `BTI` for arm64), which
will also be present at the "real" callee.
* Limited branch ranges can require that trampolines generate or load an
address into a register and perform an indirect branch (or at least
have a slow path that does so). This loses some of the benefits of
having a direct branch.
* Interaction with SW CFI schemes can be complicated and fragile, e.g.
requiring that we can recognise idiomatic codegen and remove
indirections, at least until clang provides more helpful mechanisms
for dealing with this.
For PREEMPT_DYNAMIC, we don't need the full power of static calls, as we
really only need to enable/disable specific preemption functions. We can
achieve the same effect without a number of the pain points above by
using static keys to fold early returns into the preemption functions
themselves rather than in an out-of-line trampoline, effectively
inlining the trampoline into the start of the function.
For arm64, this results in good code generation. For example, the
dynamic_cond_resched() wrapper looks as follows when enabled. When
disabled, the first `B` is replaced with a `NOP`, resulting in an early
return.
| <dynamic_cond_resched>:
| bti c
| b <dynamic_cond_resched+0x10> // or `nop`
| mov w0, #0x0
| ret
| mrs x0, sp_el0
| ldr x0, [x0, #8]
| cbnz x0, <dynamic_cond_resched+0x8>
| paciasp
| stp x29, x30, [sp, #-16]!
| mov x29, sp
| bl <preempt_schedule_common>
| mov w0, #0x1
| ldp x29, x30, [sp], #16
| autiasp
| ret
... compared to the regular form of the function:
| <__cond_resched>:
| bti c
| mrs x0, sp_el0
| ldr x1, [x0, #8]
| cbz x1, <__cond_resched+0x18>
| mov w0, #0x0
| ret
| paciasp
| stp x29, x30, [sp, #-16]!
| mov x29, sp
| bl <preempt_schedule_common>
| mov w0, #0x1
| ldp x29, x30, [sp], #16
| autiasp
| ret
Any architecture which implements static keys should be able to use this
to implement PREEMPT_DYNAMIC with similar cost to non-inlined static
calls. Since this is likely to have greater overhead than (inlined)
static calls, PREEMPT_DYNAMIC is only defaulted to enabled when
HAVE_PREEMPT_DYNAMIC_CALL is selected.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20220214165216.2231574-6-mark.rutland@arm.com
2022-02-15 00:52:14 +08:00
#elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)

extern int __cond_resched(void);

DECLARE_STATIC_CALL(might_resched, __cond_resched);

static __always_inline void might_resched(void)
{
        static_call_mod(might_resched)();
}
#elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)

extern int dynamic_might_resched(void);
# define might_resched() dynamic_might_resched()

#else

# define might_resched() do { } while (0)

#endif /* CONFIG_PREEMPT_* */

#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
extern void __might_resched(const char *file, int line, unsigned int offsets);
extern void __might_sleep(const char *file, int line);
extern void __cant_sleep(const char *file, int line, int preempt_offset);
extern void __cant_migrate(const char *file, int line);

/**
 * might_sleep - annotation for functions that can sleep
 *
 * this macro will print a stack trace if it is executed in an atomic
 * context (spinlock, irq-handler, ...). Additional sections where blocking is
 * not allowed can be annotated with non_block_start() and non_block_end()
 * pairs.
 *
 * This is a useful debugging help to be able to catch problems early and not
 * be bitten later when the calling function happens to sleep when it is not
 * supposed to.
 */
# define might_sleep() \
        do { __might_sleep(__FILE__, __LINE__); might_resched(); } while (0)

/**
 * cant_sleep - annotation for functions that cannot sleep
 *
 * this macro will print a stack trace if it is executed with preemption enabled
 */
# define cant_sleep() \
        do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
sched: don't cause task state changes in nested sleep debugging
Commit 8eb23b9f35aa ("sched: Debug nested sleeps") added code to report
on nested sleep conditions, which we generally want to avoid because the
inner sleeping operation can re-set the thread state to TASK_RUNNING,
but that will then cause the outer sleep loop not actually sleep when it
calls schedule.
However, that's actually valid traditional behavior, with the inner
sleep being some fairly rare case (like taking a sleeping lock that
normally doesn't actually need to sleep).
And the debug code would actually change the state of the task to
TASK_RUNNING internally, which makes that kind of traditional and
working code not work at all, because now the nested sleep doesn't just
sometimes cause the outer one to not block, but will cause it to happen
every time.
In particular, it will cause the cardbus kernel daemon (pccardd) to
basically busy-loop doing scheduling, converting a laptop into a heater,
as reported by Bruno Prémont. But there may be other legacy uses of
that nested sleep model in other drivers that are also likely to never
get converted to the new model.
This fixes both cases:
- don't set TASK_RUNNING when the nested condition happens (note: even
if WARN_ONCE() only _warns_ once, the return value isn't whether the
warning happened, but whether the condition for the warning was true.
So despite the warning only happening once, the "if (WARN_ON(..))"
would trigger for every nested sleep.)
- in the cases where we knowingly disable the warning by using
"sched_annotate_sleep()", don't change the task state (that is used
for all core scheduling decisions), instead use '->task_state_change'
that is used for the debugging decision itself.
(Credit for the second part of the fix goes to Oleg Nesterov: "Can't we
avoid this subtle change in behaviour DEBUG_ATOMIC_SLEEP adds?" with the
suggested change to use 'task_state_change' as part of the test)
Reported-and-bisected-by: Bruno Prémont <bonbons@linux-vserver.org>
Tested-by: Rafael J Wysocki <rjw@rjwysocki.net>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ilya Dryomov <ilya.dryomov@inktank.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-02 04:23:32 +08:00
# define sched_annotate_sleep() (current->task_state_change = 0)

/**
 * cant_migrate - annotation for functions that cannot migrate
 *
 * Will print a stack trace if executed in code which is migratable
 */
# define cant_migrate()                                                 \
        do {                                                            \
                if (IS_ENABLED(CONFIG_SMP))                             \
                        __cant_migrate(__FILE__, __LINE__);             \
        } while (0)
/**
 * non_block_start - annotate the start of section where sleeping is prohibited
 *
 * This is on behalf of the oom reaper, specifically when it is calling the mmu
 * notifiers. The problem is that if the notifier were to block on, for example,
 * mutex_lock() and if the process which holds that mutex were to perform a
 * sleeping memory allocation, the oom reaper is now blocked on completion of
 * that memory allocation. Other blocking calls like wait_event() pose similar
 * issues.
 */
# define non_block_start() (current->non_block_count++)
/**
 * non_block_end - annotate the end of section where sleeping is prohibited
 *
 * Closes a section opened by non_block_start().
 */
# define non_block_end() WARN_ON(current->non_block_count-- == 0)
#else
static inline void __might_resched(const char *file, int line,
                                   unsigned int offsets) { }
static inline void __might_sleep(const char *file, int line) { }
# define might_sleep() do { might_resched(); } while (0)
# define cant_sleep() do { } while (0)
# define cant_migrate() do { } while (0)
# define sched_annotate_sleep() do { } while (0)
# define non_block_start() do { } while (0)
# define non_block_end() do { } while (0)
#endif

#define might_sleep_if(cond) do { if (cond) might_sleep(); } while (0)

#if defined(CONFIG_MMU) && \
        (defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP))
sched/preempt, mm/fault: Trigger might_sleep() in might_fault() with disabled pagefaults
Commit 662bbcb2747c ("mm, sched: Allow uaccess in atomic with
pagefault_disable()") removed might_sleep() checks for all user access
code (that uses might_fault()).
The reason was to disable wrong "sleep in atomic" warnings in the
following scenario:
pagefault_disable()
rc = copy_to_user(...)
pagefault_enable()
Which is valid, as pagefault_disable() increments the preempt counter
and therefore disables the pagefault handler. copy_to_user() will not
sleep and return an error code if a page is not available.
However, as all might_sleep() checks are removed,
CONFIG_DEBUG_ATOMIC_SLEEP would no longer detect the following scenario:
spin_lock(&lock);
rc = copy_to_user(...)
spin_unlock(&lock)
If the kernel is compiled with preemption turned on, preempt_disable()
will make in_atomic() detect disabled preemption. The fault handler would
correctly never sleep on user access.
However, with preemption turned off, preempt_disable() is usually a NOP
(with !CONFIG_PREEMPT_COUNT), therefore in_atomic() will not be able to
detect disabled preemption nor disabled pagefaults. The fault handler
could sleep.
We really want to enable CONFIG_DEBUG_ATOMIC_SLEEP checks for user access
functions again, otherwise we can end up with horrible deadlocks.
Root of all evil is that pagefault_disable() acts almost as
preempt_disable(), depending on preemption being turned on/off.
As we now have pagefault_disabled(), we can use it to distinguish
whether user access functions might sleep.
Convert might_fault() into a macro that calls __might_fault(), to
allow proper file + line messages in case of a might_sleep() warning.
Reviewed-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: airlied@linux.ie
Cc: akpm@linux-foundation.org
Cc: benh@kernel.crashing.org
Cc: bigeasy@linutronix.de
Cc: borntraeger@de.ibm.com
Cc: daniel.vetter@intel.com
Cc: heiko.carstens@de.ibm.com
Cc: herbert@gondor.apana.org.au
Cc: hocko@suse.cz
Cc: hughd@google.com
Cc: mst@redhat.com
Cc: paulus@samba.org
Cc: ralf@linux-mips.org
Cc: schwidefsky@de.ibm.com
Cc: yang.shi@windriver.com
Link: http://lkml.kernel.org/r/1431359540-32227-3-git-send-email-dahi@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-11 23:52:07 +08:00
#define might_fault() __might_fault(__FILE__, __LINE__)
void __might_fault(const char *file, int line);
#else
static inline void might_fault(void) { }
#endif

void do_exit(long error_code) __noreturn;
|
2011-03-23 07:34:40 +08:00
|
|
|
|
2018-04-11 07:31:16 +08:00
|
|
|
extern int num_to_str(char *buf, int size,
|
|
|
|
unsigned long long num, unsigned int width);

/* lib/printf utilities */

extern __printf(2, 3) int sprintf(char *buf, const char * fmt, ...);
extern __printf(2, 0) int vsprintf(char *buf, const char *, va_list);
extern __printf(3, 4)
int snprintf(char *buf, size_t size, const char *fmt, ...);
extern __printf(3, 0)
int vsnprintf(char *buf, size_t size, const char *fmt, va_list args);
extern __printf(3, 4)
int scnprintf(char *buf, size_t size, const char *fmt, ...);
extern __printf(3, 0)
int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
extern __printf(2, 3) __malloc
char *kasprintf(gfp_t gfp, const char *fmt, ...);
extern __printf(2, 0) __malloc
char *kvasprintf(gfp_t gfp, const char *fmt, va_list args);
extern __printf(2, 0)
const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);

extern __scanf(2, 3)
int sscanf(const char *, const char *, ...);
extern __scanf(2, 0)
int vsscanf(const char *, const char *, va_list);

extern int no_hash_pointers_enable(char *str);

extern int get_option(char **str, int *pint);
extern char *get_options(const char *str, int nints, int *ints);
extern unsigned long long memparse(const char *ptr, char **retptr);
extern bool parse_option_str(const char *str, const char *option);
extern char *next_arg(char *args, char **param, char **val);
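
/*
 * Example (illustrative only): memparse() understands the usual size
 * suffixes, so parsing the string "16K" yields 16384 and "2M" yields
 * 2097152, with *retptr left pointing past the suffix. get_option()
 * and get_options() similarly pull integers (and integer ranges such
 * as "1-3") out of comma-separated boot parameter strings.
 */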

extern int core_kernel_text(unsigned long addr);
extern int __kernel_text_address(unsigned long addr);
extern int kernel_text_address(unsigned long addr);
extern int func_ptr_is_kernel_text(void *ptr);

extern void bust_spinlocks(int yes);

extern int root_mountflags;

extern bool early_boot_irqs_disabled;

/*
 * Values used for system_state. Ordering of the states must not be changed
 * as code checks for <, <=, >, >= STATE.
 */
extern enum system_states {
	SYSTEM_BOOTING,
	SYSTEM_SCHEDULING,
	SYSTEM_FREEING_INITMEM,
	SYSTEM_RUNNING,
	SYSTEM_HALT,
	SYSTEM_POWER_OFF,
	SYSTEM_RESTART,
	SYSTEM_SUSPEND,
} system_state;
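
/*
 * Example (illustrative only): because the enumerators above are
 * ordered, callers rely on relational comparisons such as
 *
 *	if (system_state < SYSTEM_RUNNING)
 *		return;		// still booting, defer the work
 *
 * which is why any new state must be inserted at the right position
 * rather than appended.
 */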

extern const char hex_asc[];
#define hex_asc_lo(x)	hex_asc[((x) & 0x0f)]
#define hex_asc_hi(x)	hex_asc[((x) & 0xf0) >> 4]

static inline char *hex_byte_pack(char *buf, u8 byte)
{
	*buf++ = hex_asc_hi(byte);
	*buf++ = hex_asc_lo(byte);
	return buf;
}
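
/*
 * Example (illustrative only): packing the byte 0x1f writes the two
 * ASCII hex digits and returns a pointer just past them, so bytes can
 * be packed back to back:
 *
 *	char buf[2];
 *	char *p = hex_byte_pack(buf, 0x1f);	// buf holds "1f", p == buf + 2
 */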

extern const char hex_asc_upper[];
#define hex_asc_upper_lo(x)	hex_asc_upper[((x) & 0x0f)]
#define hex_asc_upper_hi(x)	hex_asc_upper[((x) & 0xf0) >> 4]

static inline char *hex_byte_pack_upper(char *buf, u8 byte)
{
	*buf++ = hex_asc_upper_hi(byte);
	*buf++ = hex_asc_upper_lo(byte);
	return buf;
}

extern int hex_to_bin(unsigned char ch);
extern int __must_check hex2bin(u8 *dst, const char *src, size_t count);
extern char *bin2hex(char *dst, const void *src, size_t count);

bool mac_pton(const char *s, u8 *mac);

/*
 * General tracing related utility functions - trace_printk(),
 * tracing_on/tracing_off and tracing_start()/tracing_stop
 *
 * Use tracing_on/tracing_off when you want to quickly turn on or off
 * tracing. It simply enables or disables the recording of the trace events.
 * This also corresponds to the user space /sys/kernel/debug/tracing/tracing_on
 * file, which gives a means for the kernel and userspace to interact.
 * Place a tracing_off() in the kernel where you want tracing to end.
 * From user space, examine the trace, and then echo 1 > tracing_on
 * to continue tracing.
 *
 * tracing_stop/tracing_start has slightly more overhead. It is used
 * by things like suspend to ram where disabling the recording of the
 * trace is not enough, but tracing must actually stop because things
 * like calling smp_processor_id() may crash the system.
 *
 * Most likely, you want to use tracing_on/tracing_off.
 */

enum ftrace_dump_mode {
	DUMP_NONE,
	DUMP_ALL,
	DUMP_ORIG,
};

#ifdef CONFIG_TRACING
void tracing_on(void);
void tracing_off(void);
int tracing_is_on(void);
void tracing_snapshot(void);
void tracing_snapshot_alloc(void);

extern void tracing_start(void);
extern void tracing_stop(void);

static inline __printf(1, 2)
void ____trace_printk_check_format(const char *fmt, ...)
{
}
#define __trace_printk_check_format(fmt, args...)		\
do {								\
	if (0)							\
		____trace_printk_check_format(fmt, ##args);	\
} while (0)

/**
 * trace_printk - printf formatting in the ftrace buffer
 * @fmt: the printf format for printing
 *
 * Note: __trace_printk is an internal function for trace_printk() and
 *       the @ip is passed in via the trace_printk() macro.
 *
 * This function allows a kernel developer to debug fast path sections
 * that printk is not appropriate for. By scattering in various
 * printk like tracing in the code, a developer can quickly see
 * where problems are occurring.
 *
 * This is intended as a debugging tool for the developer only.
 * Please refrain from leaving trace_printks scattered around in
 * your code. (Extra memory is used for special buffers that are
 * allocated when trace_printk() is used.)
 *
 * A little optimization trick is done here. If there's only one
 * argument, there's no need to scan the string for printf formats.
 * The trace_puts() will suffice. But how can we take advantage of
 * using trace_puts() when trace_printk() has only one argument?
 * By stringifying the args and checking the size we can tell
 * whether or not there are args. __stringify((__VA_ARGS__)) will
 * turn into "()\0" with a size of 3 when there are no args, anything
 * else will be bigger. All we need to do is define a string to this,
 * and then take its size and compare to 3. If it's bigger, use
 * do_trace_printk() otherwise, optimize it to trace_puts(). Then just
 * let gcc optimize the rest.
 */

#define trace_printk(fmt, ...)				\
do {							\
	char _______STR[] = __stringify((__VA_ARGS__));	\
	if (sizeof(_______STR) > 3)			\
		do_trace_printk(fmt, ##__VA_ARGS__);	\
	else						\
		trace_puts(fmt);			\
} while (0)
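
/*
 * Example (illustrative only): with no extra arguments,
 * __stringify((__VA_ARGS__)) expands to "()" (sizeof == 3), so
 *
 *	trace_printk("hello\n");	// compiles to trace_puts("hello\n")
 *	trace_printk("x=%d\n", x);	// compiles to do_trace_printk(...)
 *
 * and the format scan is skipped entirely in the single-argument case.
 */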

#define do_trace_printk(fmt, args...)					\
do {									\
	static const char *trace_printk_fmt __used			\
		__section("__trace_printk_fmt") =			\
		__builtin_constant_p(fmt) ? fmt : NULL;			\
									\
	__trace_printk_check_format(fmt, ##args);			\
									\
	if (__builtin_constant_p(fmt))					\
		__trace_bprintk(_THIS_IP_, trace_printk_fmt, ##args);	\
	else								\
		__trace_printk(_THIS_IP_, fmt, ##args);			\
} while (0)

extern __printf(2, 3)
int __trace_bprintk(unsigned long ip, const char *fmt, ...);

extern __printf(2, 3)
int __trace_printk(unsigned long ip, const char *fmt, ...);

/**
 * trace_puts - write a string into the ftrace buffer
 * @str: the string to record
 *
 * Note: __trace_bputs is an internal function for trace_puts and
 *       the @ip is passed in via the trace_puts macro.
 *
 * This is similar to trace_printk() but is made for those really fast
 * paths that a developer wants the least amount of "Heisenbug" effects,
 * where the processing of the print format is still too much.
 *
 * This function allows a kernel developer to debug fast path sections
 * that printk is not appropriate for. By scattering in various
 * printk like tracing in the code, a developer can quickly see
 * where problems are occurring.
 *
 * This is intended as a debugging tool for the developer only.
 * Please refrain from leaving trace_puts scattered around in
 * your code. (Extra memory is used for special buffers that are
 * allocated when trace_puts() is used.)
 *
 * Returns: 0 if nothing was written, positive # if string was.
 *  (1 when __trace_bputs is used, strlen(str) when __trace_puts is used)
 */

#define trace_puts(str) ({					\
	static const char *trace_printk_fmt __used		\
		__section("__trace_printk_fmt") =		\
		__builtin_constant_p(str) ? str : NULL;		\
								\
	if (__builtin_constant_p(str))				\
		__trace_bputs(_THIS_IP_, trace_printk_fmt);	\
	else							\
		__trace_puts(_THIS_IP_, str, strlen(str));	\
})
extern int __trace_bputs(unsigned long ip, const char *str);
extern int __trace_puts(unsigned long ip, const char *str, int size);

extern void trace_dump_stack(int skip);

/*
 * The double __builtin_constant_p is because gcc will give us an error
 * if we try to allocate the static variable to fmt if it is not a
 * constant. Even with the outer if statement.
 */
#define ftrace_vprintk(fmt, vargs)					\
do {									\
	if (__builtin_constant_p(fmt)) {				\
		static const char *trace_printk_fmt __used		\
			__section("__trace_printk_fmt") =		\
			__builtin_constant_p(fmt) ? fmt : NULL;		\
									\
		__ftrace_vbprintk(_THIS_IP_, trace_printk_fmt, vargs);	\
	} else								\
		__ftrace_vprintk(_THIS_IP_, fmt, vargs);		\
} while (0)

extern __printf(2, 0) int
__ftrace_vbprintk(unsigned long ip, const char *fmt, va_list ap);

extern __printf(2, 0) int
__ftrace_vprintk(unsigned long ip, const char *fmt, va_list ap);

extern void ftrace_dump(enum ftrace_dump_mode oops_dump_mode);
#else
static inline void tracing_start(void) { }
static inline void tracing_stop(void) { }
static inline void trace_dump_stack(int skip) { }

static inline void tracing_on(void) { }
static inline void tracing_off(void) { }
static inline int tracing_is_on(void) { return 0; }
static inline void tracing_snapshot(void) { }
static inline void tracing_snapshot_alloc(void) { }

static inline __printf(1, 2)
int trace_printk(const char *fmt, ...)
{
	return 0;
}
static __printf(1, 0) inline int
ftrace_vprintk(const char *fmt, va_list ap)
{
	return 0;
}
static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
#endif /* CONFIG_TRACING */

/* This counts to 12. Any more, it will return 13th argument. */
#define __COUNT_ARGS(_0, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _n, X...) _n
#define COUNT_ARGS(X...) __COUNT_ARGS(, ##X, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)

#define __CONCAT(a, b) a ## b
#define CONCATENATE(a, b) __CONCAT(a, b)
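
/*
 * Example (illustrative only): the argument list is shifted against the
 * descending number sequence, so the count lands in _n:
 *
 *	COUNT_ARGS(a, b, c)	// expands to 3
 *	COUNT_ARGS()		// expands to 0
 *	CONCATENATE(foo_, 2)	// expands to the single token foo_2
 */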

/* Rebuild everything on CONFIG_FTRACE_MCOUNT_RECORD */
#ifdef CONFIG_FTRACE_MCOUNT_RECORD
# define REBUILD_DUE_TO_FTRACE_MCOUNT_RECORD
#endif

/* Permissions on a sysfs file: you didn't miss the 0 prefix did you? */
#define VERIFY_OCTAL_PERMISSIONS(perms)						\
	(BUILD_BUG_ON_ZERO((perms) < 0) +					\
	 BUILD_BUG_ON_ZERO((perms) > 0777) +					\
	 /* USER_READABLE >= GROUP_READABLE >= OTHER_READABLE */		\
	 BUILD_BUG_ON_ZERO((((perms) >> 6) & 4) < (((perms) >> 3) & 4)) +	\
	 BUILD_BUG_ON_ZERO((((perms) >> 3) & 4) < ((perms) & 4)) +		\
	 /* USER_WRITABLE >= GROUP_WRITABLE */					\
	 BUILD_BUG_ON_ZERO((((perms) >> 6) & 2) < (((perms) >> 3) & 2)) +	\
	 /* OTHER_WRITABLE? Generally considered a bad idea. */			\
	 BUILD_BUG_ON_ZERO((perms) & 2) +					\
	 (perms))
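
/*
 * Example (illustrative only): VERIFY_OCTAL_PERMISSIONS(0644) evaluates
 * to 0644 at compile time, while VERIFY_OCTAL_PERMISSIONS(644) (missing
 * the 0 prefix, so 644 decimal > 0777) or VERIFY_OCTAL_PERMISSIONS(0666)
 * (other-writable) breaks the build via BUILD_BUG_ON_ZERO().
 */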

#endif