Pull livepatching updates from Jiri Kosina:
- support for something we call 'atomic replace', and allows for much
better handling of cumulative patches (which is something very useful
for distros), from Jason Baron with help of Petr Mladek and Joe
Lawrence
- improvement of handling of tasks blocking finalization, from Miroslav
Benes
- update of MAINTAINERS file to reflect move towards group
maintainership
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: (22 commits)
livepatch/selftests: use "$@" to preserve argument list
livepatch: Module coming and going callbacks can proceed with all listed patches
livepatch: Proper error handling in the shadow variables selftest
livepatch: return -ENOMEM on ptr_id() allocation failure
livepatch: Introduce klp_for_each_patch macro
livepatch: core: Return EOPNOTSUPP instead of ENOSYS
selftests/livepatch: add DYNAMIC_DEBUG config dependency
livepatch: samples: non static warnings fix
livepatch: update MAINTAINERS
livepatch: Remove signal sysfs attribute
livepatch: Send a fake signal periodically
selftests/livepatch: introduce tests
livepatch: Remove ordering (stacking) of the livepatches
livepatch: Atomic replace and cumulative patches documentation
livepatch: Remove Nop structures when unused
livepatch: Add atomic replace
livepatch: Use lists to manage patches, objects and functions
livepatch: Simplify API by removing registration step
livepatch: Don't block the removal of patches loaded after a forced transition
livepatch: Consolidate klp_free functions
...
Merge more updates from Andrew Morton:
- some of the rest of MM
- various misc things
- dynamic-debug updates
- checkpatch
- some epoll speedups
- autofs
- rapidio
- lib/, lib/lzo/ updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits)
samples/mic/mpssd/mpssd.h: remove duplicate header
kernel/fork.c: remove duplicated include
include/linux/relay.h: fix percpu annotation in struct rchan
arch/nios2/mm/fault.c: remove duplicate include
unicore32: stop printing the virtual memory layout
MAINTAINERS: fix GTA02 entry and mark as orphan
mm: create the new vm_fault_t type
arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()
arch: simplify several early memory allocations
openrisc: simplify pte_alloc_one_kernel()
sh: prefer memblock APIs returning virtual address
microblaze: prefer memblock API returning virtual address
powerpc: prefer memblock APIs returning virtual address
lib/lzo: separate lzo-rle from lzo
lib/lzo: implement run-length encoding
lib/lzo: fast 8-byte copy on arm64
lib/lzo: 64-bit CTZ on arm64
lib/lzo: tidy-up ifdefs
ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
ipc: annotate implicit fall through
...
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided refcount_t
type and API that prevents accidental counter overflows and underflows.
This is important since overflows and underflows can lead to
use-after-free situation and be exploitable.
The variable kcov.refcount is used as pure reference counter. Convert
it to refcount_t and fix up the operations.
**Important note for maintainers:
Some functions from refcount_t API defined in lib/refcount.c have
different memory ordering guarantees than their atomic counterparts.
The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57
and it is hopefully soon in state to be merged to the documentation
tree. Normally the differences should not matter since refcount_t
provides enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter. Please double check that you don't
have some undocumented memory guarantees for this variable usage.
For the kcov.refcount it might make a difference
in following places:
- kcov_put(): decrement in refcount_dec_and_test() only
provides RELEASE ordering and control dependency on success
vs. fully ordered atomic counterpart
Link: http://lkml.kernel.org/r/1547634429-772-1-git-send-email-elena.reshetova@intel.com
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Link: http://lkml.kernel.org/r/20190122152151.16139-46-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This slightly optimizes the kernel/configs.c build.
bin2c is not very efficient because it converts a data file into a huge
array to embed it into a *.c file.
Instead, we can use the .incbin directive.
Also, this simplifies the code; Makefile is cleaner, and the way to get
the offset/size of the config_data.gz is more straightforward.
I used the "asm" statement in *.c instead of splitting it into *.S
because MODULE_* tags are not supported in *.S files.
I also cleaned up kernel/.gitignore; "config_data.gz" is unneeded
because the top-level .gitignore takes care of the "*.gz" pattern.
[yamada.masahiro@socionext.com: v2]
Link: http://lkml.kernel.org/r/1550108893-21226-1-git-send-email-yamada.masahiro@socionext.com
Link: http://lkml.kernel.org/r/1549941160-8084-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct foo {
int stuff;
void *entry[];
};
instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
This code was detected with the help of Coccinelle.
Link: http://lkml.kernel.org/r/20190109172445.GA15908@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, when writing
echo 18446744073709551616 > /proc/sys/fs/file-max
/proc/sys/fs/file-max will overflow and be set to 0. That quickly
crashes the system.
This commit sets the max and min value for file-max. The max value is
set to long int. Any higher value cannot currently be used as the
percpu counters are long ints and not unsigned integers.
Note that the file-max value is ultimately parsed via
__do_proc_doulongvec_minmax(). This function does not report error when
min or max are exceeded. Which means if a value largen that long int is
written userspace will not receive an error instead the old value will be
kept. There is an argument to be made that this should be changed and
__do_proc_doulongvec_minmax() should return an error when a dedicated min
or max value are exceeded. However this has the potential to break
userspace so let's defer this to an RFC patch.
Link: http://lkml.kernel.org/r/20190107222700.15954-3-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Waiman Long <longman@redhat.com>
[christian@brauner.io: v4]
Link: http://lkml.kernel.org/r/20190210203943.8227-3-christian@brauner.io
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
proc_get_long() is a funny function. It uses simple_strtoul() and for a
good reason. proc_get_long() wants to always succeed the parse and
return the maybe incorrect value and the trailing characters to check
against a pre-defined list of acceptable trailing values. However,
simple_strtoul() explicitly ignores overflows which can cause funny
things like the following to happen:
echo 18446744073709551616 > /proc/sys/fs/file-max
cat /proc/sys/fs/file-max
0
(Which will cause your system to silently die behind your back.)
On the other hand kstrtoul() does do overflow detection but does not
return the trailing characters, and also fails the parse when anything
other than '\n' is a trailing character whereas proc_get_long() wants to
be more lenient.
Now, before adding another kstrtoul() function let's simply add a static
parse strtoul_lenient() which:
- fails on overflow with -ERANGE
- returns the trailing characters to the caller
The reason why we should fail on ERANGE is that we already do a partial
fail on overflow right now. Namely, when the TMPBUFLEN is exceeded. So
we already reject values such as 184467440737095516160 (21 chars) but
accept values such as 18446744073709551616 (20 chars) but both are
overflows. So we should just always reject 64bit overflows and not
special-case this based on the number of chars.
Link: http://lkml.kernel.org/r/20190107222700.15954-2-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This function can only be called safely from very specific scheduler
contexts. Document those.
Link: http://lkml.kernel.org/r/20190206150528.31198-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For symmetry with ddebug_remove_module, and to avoid a bit of ifdeffery
in module.c, move the declaration of ddebug_add_module inside #if
defined(CONFIG_DYNAMIC_DEBUG) and add a corresponding no-op stub in the
#else branch.
Link: http://lkml.kernel.org/r/20190212214150.4807-10-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Jason Baron <jbaron@akamai.com>
Cc: David Sterba <dsterba@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: "Rafael J . Wysocki" <rafael.j.wysocki@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This serves two purposes: First, we get a diagnostic if (though
extremely unlikely), any of the calls of ddebug_add_module for built-in
code fails, effectively disabling dynamic_debug. Second, I want to make
struct _ddebug opaque, and avoid accessing any of its members outside
dynamic_debug.[ch].
Link: http://lkml.kernel.org/r/20190212214150.4807-9-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Jason Baron <jbaron@akamai.com>
Cc: David Sterba <dsterba@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: "Rafael J . Wysocki" <rafael.j.wysocki@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a plan to build the kernel with -Wimplicit-fallthrough and this
place in the code produced a warning (W=1).
This commit remove the following warning:
kernel/sys.c:1748:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
Link: http://lkml.kernel.org/r/20190114203347.17530-1-malat@debian.org
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since commit a2e5144538 ("kernel/hung_task.c: allow to set checking
interval separately from timeout") added hung_task_check_interval_secs,
setting a value different from hung_task_timeout_secs
echo 0 > /proc/sys/kernel/hung_task_panic
echo 120 > /proc/sys/kernel/hung_task_timeout_secs
echo 5 > /proc/sys/kernel/hung_task_check_interval_secs
causes confusing output as if the task was blocked for
hung_task_timeout_secs seconds from the previous report.
[ 399.395930] INFO: task kswapd0:75 blocked for more than 120 seconds.
[ 405.027637] INFO: task kswapd0:75 blocked for more than 120 seconds.
[ 410.659725] INFO: task kswapd0:75 blocked for more than 120 seconds.
[ 416.292860] INFO: task kswapd0:75 blocked for more than 120 seconds.
[ 421.932305] INFO: task kswapd0:75 blocked for more than 120 seconds.
Although we could update t->last_switch_time after sched_show_task(t) if
we want to report only every 120 seconds, reporting every 5 seconds
might not be very bad for monitoring after a problematic situation has
started. Thus, let's use continuously blocked time instead of updating
previously reported time.
[ 677.985011] INFO: task kswapd0:80 blocked for more than 122 seconds.
[ 693.856126] INFO: task kswapd0:80 blocked for more than 138 seconds.
[ 709.728075] INFO: task kswapd0:80 blocked for more than 154 seconds.
[ 725.600018] INFO: task kswapd0:80 blocked for more than 170 seconds.
[ 741.473133] INFO: task kswapd0:80 blocked for more than 186 seconds.
Link: http://lkml.kernel.org/r/1551175083-10669-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sparse complains:
CHECK kernel/hung_task.c
kernel/hung_task.c:28:19: warning: symbol 'sysctl_hung_task_check_count' was not declared. Should it be static?
kernel/hung_task.c:42:29: warning: symbol 'sysctl_hung_task_timeout_secs' was not declared. Should it be static?
kernel/hung_task.c:47:29: warning: symbol 'sysctl_hung_task_check_interval_secs' was not declared. Should it be static?
kernel/hung_task.c:49:19: warning: symbol 'sysctl_hung_task_warnings' was not declared. Should it be static?
kernel/hung_task.c:61:28: warning: symbol 'sysctl_hung_task_panic' was not declared. Should it be static?
kernel/hung_task.c:219:5: warning: symbol 'proc_dohung_task_timeout_secs' was not declared. Should it be static?
Add the appropriate header file to provide declarations.
Link: http://lkml.kernel.org/r/467.1548649525@turing-police.cc.vt.edu
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Cc: "Paul E. McKenney" <paulmck@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE for
debugfs files.
Semantic patch information:
Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file()
imposes some significant overhead as compared to
DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe().
Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci
The _unsafe() part suggests that some of them "safeness
responsibilities" are now panic.c responsibilities. The patch is OK
since panic's clear_warn_once_fops struct file_operations is safe
against removal, so we don't have to use otherwise necessary
debugfs_file_get()/debugfs_file_put().
[sergey.senozhatsky.work@gmail.com: changelog addition]
Link: http://lkml.kernel.org/r/1545990861-158097-1-git-send-email-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Petr Mladek <pmladek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Notable changes:
- Enable THREAD_INFO_IN_TASK to move thread_info off the stack.
- A big series from Christoph reworking our DMA code to use more of the generic
infrastructure, as he said:
"This series switches the powerpc port to use the generic swiotlb and
noncoherent dma ops, and to use more generic code for the coherent direct
mapping, as well as removing a lot of dead code."
- Increase our vmalloc space to 512T with the Hash MMU on modern CPUs, allowing
us to support machines with larger amounts of total RAM or distance between
nodes.
- Two series from Christophe, one to optimise TLB miss handlers on 6xx, and
another to optimise the way STRICT_KERNEL_RWX is implemented on some 32-bit
CPUs.
- Support for KCOV coverage instrumentation which means we can run syzkaller
and discover even more bugs in our code.
And as always many clean-ups, reworks and minor fixes etc.
Thanks to:
Alan Modra, Alexey Kardashevskiy, Alistair Popple, Andrea Arcangeli, Andrew
Donnellan, Aneesh Kumar K.V, Aravinda Prasad, Balbir Singh, Brajeswar Ghosh,
Breno Leitao, Christian Lamparter, Christian Zigotzky, Christophe Leroy,
Christoph Hellwig, Corentin Labbe, Daniel Axtens, David Gibson, Diana Craciun,
Firoz Khan, Gustavo A. R. Silva, Igor Stoppa, Joe Lawrence, Joel Stanley,
Jonathan Neuschäfer, Jordan Niethe, Laurent Dufour, Madhavan Srinivasan, Mahesh
Salgaonkar, Mark Cave-Ayland, Masahiro Yamada, Mathieu Malaterre, Matteo Croce,
Meelis Roos, Michael W. Bringmann, Nathan Chancellor, Nathan Fontenot, Nicholas
Piggin, Nick Desaulniers, Nicolai Stange, Oliver O'Halloran, Paul Mackerras,
Peter Xu, PrasannaKumar Muralidharan, Qian Cai, Rashmica Gupta, Reza Arbab,
Robert P. J. Day, Russell Currey, Sabyasachi Gupta, Sam Bobroff, Sandipan Das,
Sergey Senozhatsky, Souptick Joarder, Stewart Smith, Tyrel Datwyler, Vaibhav
Jain, YueHaibing.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJcgRJlAAoJEFHr6jzI4aWAL9oP+gPlrZgyaAg/51lmubLtlbtk
QuGU8EiuJZoJD1OHrMPtppBOY7rQZOxJe58AoPig8wTvs+j/TxJ25fmiZncnf5U2
PC8QAjbj0UmQHgy+K30sUeOnDg9tdkHKHJ5/ecjJcvykkqsjyMnV7biFQ1cOA0HT
LflXHEEtiG9P9u7jZoAhtnfpgn1/l9mhTYMe26J1fqvC0164qMDFaXDTQXyDfyvG
gmuqccGMawSk7IdagmQxwXtwyfwOnarmGn+n31XKRejApGZ/pjiEA23JOJOaJcia
m76Jy3roao6sEtCUNpBFXEtwOy9POy3OiGy6yg/9896tDMvG84OuO6ltV1nFGawL
PmwE+ug63L4g/HWxZyAeb26T2oTTp/YIaKQPtsq4d286pvg/qr2KPNzFoAEhmJqU
yLrebv276pVeiLpLmCLPvcPj9t76vWKZaUm0FoE+zUDg7Rl7Alow8A/c4tdjOI6y
QwpbCiYseyiJ32lCZZdbN7Cy6+iM6vb3i1oNKc8MVqhBGTwLJnTU0ruPBSvCaRvD
NoQWO1RWpNu/BuivuLEKS9q3AoxenGwiqowxGhdVmI3Oc9jGWcEYlduR00VDYPVp
/RCfwtTY5NyC++h5cnbz8aLJ1hBXG5m79CXfprV+zPWeiLPCaMT6w9Y5QUS2wqA+
EZ734NknDJOjaHc4cGdZ
=Z9bb
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Notable changes:
- Enable THREAD_INFO_IN_TASK to move thread_info off the stack.
- A big series from Christoph reworking our DMA code to use more of
the generic infrastructure, as he said:
"This series switches the powerpc port to use the generic swiotlb
and noncoherent dma ops, and to use more generic code for the
coherent direct mapping, as well as removing a lot of dead
code."
- Increase our vmalloc space to 512T with the Hash MMU on modern
CPUs, allowing us to support machines with larger amounts of total
RAM or distance between nodes.
- Two series from Christophe, one to optimise TLB miss handlers on
6xx, and another to optimise the way STRICT_KERNEL_RWX is
implemented on some 32-bit CPUs.
- Support for KCOV coverage instrumentation which means we can run
syzkaller and discover even more bugs in our code.
And as always many clean-ups, reworks and minor fixes etc.
Thanks to: Alan Modra, Alexey Kardashevskiy, Alistair Popple, Andrea
Arcangeli, Andrew Donnellan, Aneesh Kumar K.V, Aravinda Prasad, Balbir
Singh, Brajeswar Ghosh, Breno Leitao, Christian Lamparter, Christian
Zigotzky, Christophe Leroy, Christoph Hellwig, Corentin Labbe, Daniel
Axtens, David Gibson, Diana Craciun, Firoz Khan, Gustavo A. R. Silva,
Igor Stoppa, Joe Lawrence, Joel Stanley, Jonathan Neuschäfer, Jordan
Niethe, Laurent Dufour, Madhavan Srinivasan, Mahesh Salgaonkar, Mark
Cave-Ayland, Masahiro Yamada, Mathieu Malaterre, Matteo Croce, Meelis
Roos, Michael W. Bringmann, Nathan Chancellor, Nathan Fontenot,
Nicholas Piggin, Nick Desaulniers, Nicolai Stange, Oliver O'Halloran,
Paul Mackerras, Peter Xu, PrasannaKumar Muralidharan, Qian Cai,
Rashmica Gupta, Reza Arbab, Robert P. J. Day, Russell Currey,
Sabyasachi Gupta, Sam Bobroff, Sandipan Das, Sergey Senozhatsky,
Souptick Joarder, Stewart Smith, Tyrel Datwyler, Vaibhav Jain,
YueHaibing"
* tag 'powerpc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits)
powerpc/32: Clear on-stack exception marker upon exception return
powerpc: Remove export of save_stack_trace_tsk_reliable()
powerpc/mm: fix "section_base" set but not used
powerpc/mm: Fix "sz" set but not used warning
powerpc/mm: Check secondary hash page table
powerpc: remove nargs from __SYSCALL
powerpc/64s: Fix unrelocated interrupt trampoline address test
powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables
powerpc/fsl: Fix the flush of branch predictor.
powerpc/powernv: Make opal log only readable by root
powerpc/xmon: Fix opcode being uninitialized in print_insn_powerpc
powerpc/powernv: move OPAL call wrapper tracing and interrupt handling to C
powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy
powerpc/64s: system reset interrupt preserve HSRRs
powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
powerpc/mm/hash: Handle mmap_min_addr correctly in get_unmapped_area topdown search
powerpc/hugetlb: Handle mmap_min_addr correctly in get_unmapped_area callback
selftests/powerpc: Remove duplicate header
powerpc sstep: Add support for modsd, modud instructions
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAlx+8ZgUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOlDhAAiGlirQ9syyG2fYzaARZZ2QoU/GGD
PSAeiNmP3jvJzXArCvugRCw+YSNDdQOBM3SrLQC+cM0MAIDRYXN0NdcrsbTchlMA
51Fx1egZ9Fyj+Ehgida3muh2lRUy7DQwMCL6tAVqwz7vYkSTGDUf+MlYqOqXDka5
74pEExOS3Jdi7560BsE8b6QoW9JIJqEJnirXGkG9o2qC0oFHCR6PKxIyQ7TJrLR1
F23aFTqLTH1nbPUQjnox2PTf13iQVh4j2gwzd+9c9KBfxoGSge3dmxId7BJHy2aG
M27fPdCYTNZAGWpPVujsCPAh1WPQ9NQqg3mA9+g14PEbiLqPcqU+kWmnDU7T7bEw
Qx0kt6Y8GiknwCqq8pDbKYclgRmOjSGdfutzd0z8uDpbaeunS4/NqnDb/FUaDVcr
jA4d6ep7qEgHpYbL8KgOeZCexfaTfz6mcwRWNq3Uu9cLZbZqSSQ7PXolMADHvoRs
LS7VH2jcP7q4p4GWmdfjv67xyUUo9HG5HHX74h5pLfQSYXiBWo4ht0UOAzX/6EcE
CJNHAFHv+OanI5Rg/6JQ8b3/bJYxzAJVyLZpCuMtlKk6lYBGNeADk9BezEDIYsm8
tSe4/GqqyR9+Qz8rSdpAZ0KKkfqS535IcHUPUJau7Bzg1xqSEP5gzZN6QsjdXg0+
5wFFfdFICTfJFXo=
=57/1
-----END PGP SIGNATURE-----
Merge tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
"A lucky 13 audit patches for v5.1.
Despite the rather large diffstat, most of the changes are from two
bug fix patches that move code from one Kconfig option to another.
Beyond that bit of churn, the remaining changes are largely cleanups
and bug-fixes as we slowly march towards container auditing. It isn't
all boring though, we do have a couple of new things: file
capabilities v3 support, and expanded support for filtering on
filesystems to solve problems with remote filesystems.
All changes pass the audit-testsuite. Please merge for v5.1"
* tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: mark expected switch fall-through
audit: hide auditsc_get_stamp and audit_serial prototypes
audit: join tty records to their syscall
audit: remove audit_context when CONFIG_ AUDIT and not AUDITSYSCALL
audit: remove unused actx param from audit_rule_match
audit: ignore fcaps on umount
audit: clean up AUDITSYSCALL prototypes and stubs
audit: more filter PATH records keyed on filesystem magic
audit: add support for fcaps v3
audit: move loginuid and sessionid from CONFIG_AUDITSYSCALL to CONFIG_AUDIT
audit: add syscall information to CONFIG_CHANGE records
audit: hand taken context to audit_kill_trees for syscall logging
audit: give a clue what CONFIG_CHANGE op was involved
Pull security subsystem updates from James Morris:
- Extend LSM stacking to allow sharing of cred, file, ipc, inode, and
task blobs. This paves the way for more full-featured LSMs to be
merged, and is specifically aimed at LandLock and SARA LSMs. This
work is from Casey and Kees.
- There's a new LSM from Micah Morton: "SafeSetID gates the setid
family of syscalls to restrict UID/GID transitions from a given
UID/GID to only those approved by a system-wide whitelist." This
feature is currently shipping in ChromeOS.
* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (62 commits)
keys: fix missing __user in KEYCTL_PKEY_QUERY
LSM: Update list of SECURITYFS users in Kconfig
LSM: Ignore "security=" when "lsm=" is specified
LSM: Update function documentation for cap_capable
security: mark expected switch fall-throughs and add a missing break
tomoyo: Bump version.
LSM: fix return value check in safesetid_init_securityfs()
LSM: SafeSetID: add selftest
LSM: SafeSetID: remove unused include
LSM: SafeSetID: 'depend' on CONFIG_SECURITY
LSM: Add 'name' field for SafeSetID in DEFINE_LSM
LSM: add SafeSetID module that gates setid calls
LSM: add SafeSetID module that gates setid calls
tomoyo: Allow multiple use_group lines.
tomoyo: Coding style fix.
tomoyo: Swicth from cred->security to task_struct->security.
security: keys: annotate implicit fall throughs
security: keys: annotate implicit fall throughs
security: keys: annotate implicit fall through
capabilities:: annotate implicit fall through
...
Pull workqueue updates from Tejun Heo:
"All trivial. Two comment updates and one more initialization sanity
check in flush_work()"
* 'for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Fix spelling in source code comments
workqueue: fix typo in comment
workqueue: Try to catch flush_work() without INIT_WORK().
A small fix Pavel sent me back in august was accidentally lost due to it
being placed with some other patches that failed some tests, and was rebased
out of my local tree. Which was a regression that caused event filters
not to handle negative numbers.
The clean up is from Masami that realized that the code in kprobes that
calls probe_mem_read() wrapper, which is to be used in code used by both
kprobes and uprobes, was only in code for kprobes. It should not use the
wrapper there, but instead call probe_kernel_read() directly.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXH2gZRQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qrTyAQC+dfxfyS32BWpGOwZbUIHGYYHtnpEB
7bIQzXy2q8r4YQD/UGJ0qoHur1gSMEIVjhIM0Qow4VepqqhthIAV/1iVbgU=
=F9yc
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.0-pre' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix/cleanup from Steven Rostedt:
"This is a "pre-pull". It's only one small fix and one small clean up.
I'm testing a few small patches for my real pull request which will
come at a later time. The second patch depends on your tree anyway so
I included it along with the urgent fix.
A small fix Pavel sent me back in august was accidentally lost due to
it being placed with some other patches that failed some tests, and
was rebased out of my local tree. Which was a regression that caused
event filters not to handle negative numbers.
The clean up is from Masami that realized that the code in kprobes
that calls probe_mem_read() wrapper, which is to be used in code used
by both kprobes and uprobes, was only in code for kprobes. It should
not use the wrapper there, but instead call probe_kernel_read()
directly"
* tag 'trace-v5.0-pre' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Use probe_kernel_read instead of probe_mem_read
tracing: Fix event filters and triggers to handle negative numbers
Here is the big driver core patchset for 5.1-rc1
More patches than "normal" here this merge window, due to some work in
the driver core by Alexander Duyck to rework the async probe
functionality to work better for a number of devices, and independant
work from Rafael for the device link functionality to make it work
"correctly".
Also in here is:
- lots of BUS_ATTR() removals, the macro is about to go away
- firmware test fixups
- ihex fixups and simplification
- component additions (also includes i915 patches)
- lots of minor coding style fixups and cleanups.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXH+euQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynyTgCfbV8CLums843sBnT8NnWrTMTdTCcAn1K4re0m
ep8g+6oRLxJy414hogxQ
=bLs2
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big driver core patchset for 5.1-rc1
More patches than "normal" here this merge window, due to some work in
the driver core by Alexander Duyck to rework the async probe
functionality to work better for a number of devices, and independant
work from Rafael for the device link functionality to make it work
"correctly".
Also in here is:
- lots of BUS_ATTR() removals, the macro is about to go away
- firmware test fixups
- ihex fixups and simplification
- component additions (also includes i915 patches)
- lots of minor coding style fixups and cleanups.
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (65 commits)
driver core: platform: remove misleading err_alloc label
platform: set of_node in platform_device_register_full()
firmware: hardcode the debug message for -ENOENT
driver core: Add missing description of new struct device_link field
driver core: Fix PM-runtime for links added during consumer probe
drivers/component: kerneldoc polish
async: Add cmdline option to specify drivers to be async probed
driver core: Fix possible supplier PM-usage counter imbalance
PM-runtime: Fix __pm_runtime_set_status() race with runtime resume
driver: platform: Support parsing GpioInt 0 in platform_get_irq()
selftests: firmware: fix verify_reqs() return value
Revert "selftests: firmware: remove use of non-standard diff -Z option"
Revert "selftests: firmware: add CONFIG_FW_LOADER_USER_HELPER_FALLBACK to config"
device: Fix comment for driver_data in struct device
kernfs: Allocating memory for kernfs_iattrs with kmem_cache.
sysfs: remove unused include of kernfs-internal.h
driver core: Postpone DMA tear-down until after devres release
driver core: Document limitation related to DL_FLAG_RPM_ACTIVE
PM-runtime: Take suppliers into account in __pm_runtime_set_status()
device.h: Add __cold to dev_<level> logging functions
...
- Update the PM-runtime framework to use ktime instead of
jiffies for accounting (Thara Gopinath, Vincent Guittot).
- Optimize the autosuspend code in the PM-runtime framework
somewhat (Ladislav Michl).
- Add a PM core flag to mark devices that don't need any form of
power management (Sudeep Holla).
- Introduce driver API documentation for cpuidle and add a new
cpuidle governor for tickless systems (Rafael Wysocki).
- Add Jacobsville support to the intel_idle driver (Zhang Rui).
- Clean up a cpuidle core header file and the cpuidle-dt and ACPI
processor-idle drivers (Yangtao Li, Joseph Lo, Yazen Ghannam).
- Add new cpufreq driver for Armada 8K (Gregory Clement).
- Fix and clean up cpufreq core (Rafael Wysocki, Viresh Kumar,
Amit Kucheria).
- Add support for light-weight tear-down and bring-up of CPUs to the
cpufreq core and use it in the cpufreq-dt driver (Viresh Kumar).
- Fix cpu_cooling Kconfig dependencies, add support for CPU cooling
auto-registration to the cpufreq core and use it in multiple
cpufreq drivers (Amit Kucheria).
- Fix some minor issues and do some cleanups in the davinci,
e_powersaver, ap806, s5pv210, qcom and kryo cpufreq drivers
(Bartosz Golaszewski, Gustavo Silva, Julia Lawall, Paweł Chmiel,
Taniya Das, Viresh Kumar).
- Add a Hisilicon CPPC quirk to the cppc_cpufreq driver (Xiongfeng
Wang).
- Clean up the intel_pstate and acpi-cpufreq drivers (Erwan Velu,
Rafael Wysocki).
- Clean up multiple cpufreq drivers (Yangtao Li).
- Update cpufreq-related MAINTAINERS entries (Baruch Siach, Lukas
Bulwahn).
- Add support for exposing the Energy Model via debugfs and make
multiple cpufreq drivers register an Energy Model to support
energy-aware scheduling (Quentin Perret, Dietmar Eggemann,
Matthias Kaehlcke).
- Add Ice Lake mobile and Jacobsville support to the Intel RAPL
power-capping driver (Gayatri Kammela, Zhang Rui).
- Add a power estimation helper to the operating performance points
(OPP) framework and clean up a core function in it (Quentin Perret,
Viresh Kumar).
- Make minor improvements in the generic power domains (genpd), OPP
and system suspend frameworks and in the PM core (Aditya Pakki,
Douglas Anderson, Greg Kroah-Hartman, Rafael Wysocki, Yangtao Li).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJcfSGlAAoJEILEb/54YlRxikwP/1rQ9+HqDmDUvO2QeYREGO/m
R4kK+iUQW7O4ZJzsSvoGyuKCl7c2ANPlJWmbsEZKbevpKZ4XuUcv/CJDqKD1izV7
hfsQyum34ePSCUEMf6CpMAGAkdmK//NVysHiLXZ4j1hhzi6gA6Cm50qyNZ8xX6kF
Ri6zYG5x7nhn/o/l569FDe+K5W/LDDaZUmvr858pPsrZZR5c4p3ylq+HBrZt0FPQ
70D+u7RcT5v3DQLTghNrgHHiOJ0/DQM43I7aZvkKM3JA8BCDou/Nvq+gH0C0YUP0
QE+oFK9C8CBPEz9N9cSMTb0+S78GQNB0GntJPDN3QQFCHRe6EYKUtu6CvllIE1v9
5pFfagXGVi9UmShu80v+qGGUILVK1ZJ5fjSyxx4UcneTsarNJZg7Y7d72mrX+0zi
J3KodcqQi295jNq9P55K/9XtAiRdpRR6bQzXBtrprpw8PA94yqBHPpxbD32Wl05/
U2+ss/SNyMAzhsP9kqzxSxPBlTFek/ArxZm0Uk4kHt75gkl09CG64r+6OG8gLtwD
Skkr02AeYvx6fx0kFnKIS4sc2c2/8xW3FUtHlv+TDPvuzCEaL0ooqsWgt7rcwlmg
Xz5ufXbEIiVSlLlH/YGZxbgy+WfIzYA5WMpYrA1Givn8s5jI9Sm+ROD2qhOKA2n4
aekEDkum/bxVVeykZaXy
=TSKG
-----END PGP SIGNATURE-----
Merge tag 'pm-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These are PM-runtime framework changes to use ktime instead of jiffies
for accounting, new PM core flag to mark devices that don't need any
form of power management, cpuidle updates including driver API
documentation and a new governor, cpufreq updates including a new
driver for Armada 8K, thermal cleanups and more, some energy-aware
scheduling (EAS) enabling changes, new chips support in the intel_idle
and RAPL drivers and assorted cleanups in some other places.
Specifics:
- Update the PM-runtime framework to use ktime instead of jiffies for
accounting (Thara Gopinath, Vincent Guittot)
- Optimize the autosuspend code in the PM-runtime framework somewhat
(Ladislav Michl)
- Add a PM core flag to mark devices that don't need any form of
power management (Sudeep Holla)
- Introduce driver API documentation for cpuidle and add a new
cpuidle governor for tickless systems (Rafael Wysocki)
- Add Jacobsville support to the intel_idle driver (Zhang Rui)
- Clean up a cpuidle core header file and the cpuidle-dt and ACPI
processor-idle drivers (Yangtao Li, Joseph Lo, Yazen Ghannam)
- Add new cpufreq driver for Armada 8K (Gregory Clement)
- Fix and clean up cpufreq core (Rafael Wysocki, Viresh Kumar, Amit
Kucheria)
- Add support for light-weight tear-down and bring-up of CPUs to the
cpufreq core and use it in the cpufreq-dt driver (Viresh Kumar)
- Fix cpu_cooling Kconfig dependencies, add support for CPU cooling
auto-registration to the cpufreq core and use it in multiple
cpufreq drivers (Amit Kucheria)
- Fix some minor issues and do some cleanups in the davinci,
e_powersaver, ap806, s5pv210, qcom and kryo cpufreq drivers
(Bartosz Golaszewski, Gustavo Silva, Julia Lawall, Paweł Chmiel,
Taniya Das, Viresh Kumar)
- Add a Hisilicon CPPC quirk to the cppc_cpufreq driver (Xiongfeng
Wang)
- Clean up the intel_pstate and acpi-cpufreq drivers (Erwan Velu,
Rafael Wysocki)
- Clean up multiple cpufreq drivers (Yangtao Li)
- Update cpufreq-related MAINTAINERS entries (Baruch Siach, Lukas
Bulwahn)
- Add support for exposing the Energy Model via debugfs and make
multiple cpufreq drivers register an Energy Model to support
energy-aware scheduling (Quentin Perret, Dietmar Eggemann, Matthias
Kaehlcke)
- Add Ice Lake mobile and Jacobsville support to the Intel RAPL
power-capping driver (Gayatri Kammela, Zhang Rui)
- Add a power estimation helper to the operating performance points
(OPP) framework and clean up a core function in it (Quentin Perret,
Viresh Kumar)
- Make minor improvements in the generic power domains (genpd), OPP
and system suspend frameworks and in the PM core (Aditya Pakki,
Douglas Anderson, Greg Kroah-Hartman, Rafael Wysocki, Yangtao Li)"
* tag 'pm-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (80 commits)
cpufreq: kryo: Release OPP tables on module removal
cpufreq: ap806: add missing of_node_put after of_device_is_available
cpufreq: acpi-cpufreq: Report if CPU doesn't support boost technologies
cpufreq: Pass updated policy to driver ->setpolicy() callback
cpufreq: Fix two debug messages in cpufreq_set_policy()
cpufreq: Reorder and simplify cpufreq_update_policy()
cpufreq: Add kerneldoc comments for two core functions
PM / core: Add support to skip power management in device/driver model
cpufreq: intel_pstate: Rework iowait boosting to be less aggressive
cpufreq: intel_pstate: Eliminate intel_pstate_get_base_pstate()
cpufreq: intel_pstate: Avoid redundant initialization of local vars
powercap/intel_rapl: add Ice Lake mobile
ACPI / processor: Set P_LVL{2,3} idle state descriptions
cpufreq / cppc: Work around for Hisilicon CPPC cpufreq
ACPI / CPPC: Add a helper to get desired performance
cpufreq: davinci: move configuration to include/linux/platform_data
cpufreq: speedstep: convert BUG() to BUG_ON()
cpufreq: powernv: fix missing check of return value in init_powernv_pstates()
cpufreq: longhaul: remove unneeded semicolon
cpufreq: pcc-cpufreq: remove unneeded semicolon
..
Merge misc updates from Andrew Morton:
- a few misc things
- ocfs2 updates
- most of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (159 commits)
tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include
proc: more robust bulk read test
proc: test /proc/*/maps, smaps, smaps_rollup, statm
proc: use seq_puts() everywhere
proc: read kernel cpu stat pointer once
proc: remove unused argument in proc_pid_lookup()
fs/proc/thread_self.c: code cleanup for proc_setup_thread_self()
fs/proc/self.c: code cleanup for proc_setup_self()
proc: return exit code 4 for skipped tests
mm,mremap: bail out earlier in mremap_to under map pressure
mm/sparse: fix a bad comparison
mm/memory.c: do_fault: avoid usage of stale vm_area_struct
writeback: fix inode cgroup switching comment
mm/huge_memory.c: fix "orig_pud" set but not used
mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
mm/memcontrol.c: fix bad line in comment
mm/cma.c: cma_declare_contiguous: correct err handling
mm/page_ext.c: fix an imbalance with kmemleak
mm/compaction: pass pgdat to too_many_isolated() instead of zone
mm: remove zone_lru_lock() function, access ->lru_lock directly
...
Here is one more patch on top of the y2038 changes already pulled
for linux-5.1, for some reason this had escaped all testing.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJcf+C5AAoJEGCrR//JCVInRWkP/1YDi0/hDhmb8vRK+gDquo0a
DW4NE/XUaC8mBS2o3Wjqhqf9E9IgJ4o0NFhtttdsAow4xFWk0+SRyBm3bBPei5fH
CpX0/nfFZwpS67RMTHlX7Aq868mb+fNPnFdR+sP4u7oL7XCbxDAnKbERWMRi1JmU
/QV10l/MpJl0uQebm+43xbKLS4vo4pUOBVYHZ6KkhLXXEE4jiDO9d+OYICFT1EzX
GrCmNlc/+iMLKsZbZbrxDV5VSGhEudvv45SQM4QqnuaMPVsayf9ch5Rlq+uwAZtJ
OmMpyusi0FVKhbh8FCNYmu4cESXc9/ovbwCytfF/MSrQWuvd30fxDgmyv+YtOo8x
FDyfvTlK9wdzzKb/3K1D/ACsPw2dh3jxc0cZYsqjw4IDz7inMMtgQBpsbgNA62dr
THsTYKoGFsfy5Zic+jO1TNQRyTkCPnsi46N2EoKSj1k9Ck4C+Sw20Sd9FXzbzKt0
aTi+LjUjKlEAoifPaDnC+dIVyQv+GE9VjgCag4yxMEj/PBmUq1KgyGSwsNXOahpX
KNiR77EzUih6RHbD6U7IguGt+gxxMYw5e5So+sPJj9/5RsgW55g5ySdehslRBNw1
Crrih3iiMMhTi0CLBLt8/JxcqNEQvAWO7CVIWBeuRh0ojI+aUSbHO/waJyg+emgw
w2x3DkhCRHzjsLPPte66
=/MC2
-----END PGP SIGNATURE-----
Merge tag 'y2038-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground
Pull y2038 build fix for compat mode from Arnd Bergmann:
"Here is one more patch on top of the y2038 changes already pulled for
linux-5.1, for some reason this had escaped all testing"
* tag 'y2038-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground:
ipc: Fix building compat mode without sysvipc
Pull scheduler updates from Ingo Molnar:
"The main changes in this cycle were:
- refcount conversions
- Solve the rq->leaf_cfs_rq_list can of worms for real.
- improve power-aware scheduling
- add sysctl knob for Energy Aware Scheduling
- documentation updates
- misc other changes"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
kthread: Do not use TIMER_IRQSAFE
kthread: Convert worker lock to raw spinlock
sched/fair: Use non-atomic cpumask_{set,clear}_cpu()
sched/fair: Remove unused 'sd' parameter from select_idle_smt()
sched/wait: Use freezable_schedule() when possible
sched/fair: Prune, fix and simplify the nohz_balancer_kick() comment block
sched/fair: Explain LLC nohz kick condition
sched/fair: Simplify nohz_balancer_kick()
sched/topology: Fix percpu data types in struct sd_data & struct s_data
sched/fair: Simplify post_init_entity_util_avg() by calling it with a task_struct pointer argument
sched/fair: Fix O(nr_cgroups) in the load balancing path
sched/fair: Optimize update_blocked_averages()
sched/fair: Fix insertion in rq->leaf_cfs_rq_list
sched/fair: Add tmp_alone_branch assertion
sched/core: Use READ_ONCE()/WRITE_ONCE() in move_queued_task()/task_rq_lock()
sched/debug: Initialize sd_sysctl_cpus if !CONFIG_CPUMASK_OFFSTACK
sched/pelt: Skip updating util_est when utilization is higher than CPU's capacity
sched/fair: Update scale invariance of PELT
sched/fair: Move the rq_of() helper function
sched/core: Convert task_struct.stack_refcount to refcount_t
...
Pull perf updates from Ingo Molnar:
"Lots of tooling updates - too many to list, here's a few highlights:
- Various subcommand updates to 'perf trace', 'perf report', 'perf
record', 'perf annotate', 'perf script', 'perf test', etc.
- CPU and NUMA topology and affinity handling improvements,
- HW tracing and HW support updates:
- Intel PT updates
- ARM CoreSight updates
- vendor HW event updates
- BPF updates
- Tons of infrastructure updates, both on the build system and the
library support side
- Documentation updates.
- ... and lots of other changes, see the changelog for details.
Kernel side updates:
- Tighten up kprobes blacklist handling, reduce the number of places
where developers can install a kprobe and hang/crash the system.
- Fix/enhance vma address filter handling.
- Various PMU driver updates, small fixes and additions.
- refcount_t conversions
- BPF updates
- error code propagation enhancements
- misc other changes"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
perf script python: Add Python3 support to syscall-counts-by-pid.py
perf script python: Add Python3 support to syscall-counts.py
perf script python: Add Python3 support to stat-cpi.py
perf script python: Add Python3 support to stackcollapse.py
perf script python: Add Python3 support to sctop.py
perf script python: Add Python3 support to powerpc-hcalls.py
perf script python: Add Python3 support to net_dropmonitor.py
perf script python: Add Python3 support to mem-phys-addr.py
perf script python: Add Python3 support to failed-syscalls-by-pid.py
perf script python: Add Python3 support to netdev-times.py
perf tools: Add perf_exe() helper to find perf binary
perf script: Handle missing fields with -F +..
perf data: Add perf_data__open_dir_data function
perf data: Add perf_data__(create_dir|close_dir) functions
perf data: Fail check_backup in case of error
perf data: Make check_backup work over directories
perf tools: Add rm_rf_perf_data function
perf tools: Add pattern name checking to rm_rf
perf tools: Add depth checking to rm_rf
perf data: Add global path holder
...
Pull locking updates from Ingo Molnar:
"The biggest part of this tree is the new auto-generated atomics API
wrappers by Mark Rutland.
The primary motivation was to allow instrumentation without uglifying
the primary source code.
The linecount increase comes from adding the auto-generated files to
the Git space as well:
include/asm-generic/atomic-instrumented.h | 1689 ++++++++++++++++--
include/asm-generic/atomic-long.h | 1174 ++++++++++---
include/linux/atomic-fallback.h | 2295 +++++++++++++++++++++++++
include/linux/atomic.h | 1241 +------------
I preferred this approach, so that the full call stack of the (already
complex) locking APIs is still fully visible in 'git grep'.
But if this is excessive we could certainly hide them.
There's a separate build-time mechanism to determine whether the
headers are out of date (they should never be stale if we do our job
right).
Anyway, nothing from this should be visible to regular kernel
developers.
Other changes:
- Add support for dynamic keys, which removes a source of false
positives in the workqueue code, among other things (Bart Van
Assche)
- Updates to tools/memory-model (Andrea Parri, Paul E. McKenney)
- qspinlock, wake_q and lockdep micro-optimizations (Waiman Long)
- misc other updates and enhancements"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
locking/lockdep: Shrink struct lock_class_key
locking/lockdep: Add module_param to enable consistency checks
lockdep/lib/tests: Test dynamic key registration
lockdep/lib/tests: Fix run_tests.sh
kernel/workqueue: Use dynamic lockdep keys for workqueues
locking/lockdep: Add support for dynamic keys
locking/lockdep: Verify whether lock objects are small enough to be used as class keys
locking/lockdep: Check data structure consistency
locking/lockdep: Reuse lock chains that have been freed
locking/lockdep: Fix a comment in add_chain_cache()
locking/lockdep: Introduce lockdep_next_lockchain() and lock_chain_count()
locking/lockdep: Reuse list entries that are no longer in use
locking/lockdep: Free lock classes that are no longer in use
locking/lockdep: Update two outdated comments
locking/lockdep: Make it easy to detect whether or not inside a selftest
locking/lockdep: Split lockdep_free_key_range() and lockdep_reset_lock()
locking/lockdep: Initialize the locks_before and locks_after lists earlier
locking/lockdep: Make zap_class() remove all matching lock order entries
locking/lockdep: Reorder struct lock_class members
locking/lockdep: Avoid that add_chain_cache() adds an invalid chain to the cache
...
As John Stultz noticed, my y2038 syscall series caused a link
failure when CONFIG_SYSVIPC is disabled but CONFIG_COMPAT is
enabled:
arch/arm64/kernel/sys32.o:(.rodata+0x960): undefined reference to `__arm64_compat_sys_old_semctl'
arch/arm64/kernel/sys32.o:(.rodata+0x980): undefined reference to `__arm64_compat_sys_old_msgctl'
arch/arm64/kernel/sys32.o:(.rodata+0x9a0): undefined reference to `__arm64_compat_sys_old_shmctl'
Add the missing entries in kernel/sys_ni.c for the new system
calls.
Cc: Laura Abbott <labbott@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cgroup has a standardized poll/notification mechanism for waking all
pollers on all fds when a filesystem node changes. To allow polling for
custom events, add a .poll callback that can override the default.
This is in preparation for pollable cgroup pressure files which have
per-fd trigger configurations.
Link: http://lkml.kernel.org/r/20190124211518.244221-3-surenb@google.com
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Compaction is inherently race-prone as a suitable page freed during
compaction can be allocated by any parallel task. This patch uses a
capture_control structure to isolate a page immediately when it is freed
by a direct compactor in the slow path of the page allocator. The
intent is to avoid redundant scanning.
5.0.0-rc1 5.0.0-rc1
selective-v3r17 capture-v3r19
Amean fault-both-1 0.00 ( 0.00%) 0.00 * 0.00%*
Amean fault-both-3 2582.11 ( 0.00%) 2563.68 ( 0.71%)
Amean fault-both-5 4500.26 ( 0.00%) 4233.52 ( 5.93%)
Amean fault-both-7 5819.53 ( 0.00%) 6333.65 ( -8.83%)
Amean fault-both-12 9321.18 ( 0.00%) 9759.38 ( -4.70%)
Amean fault-both-18 9782.76 ( 0.00%) 10338.76 ( -5.68%)
Amean fault-both-24 15272.81 ( 0.00%) 13379.55 * 12.40%*
Amean fault-both-30 15121.34 ( 0.00%) 16158.25 ( -6.86%)
Amean fault-both-32 18466.67 ( 0.00%) 18971.21 ( -2.73%)
Latency is only moderately affected but the devil is in the details. A
closer examination indicates that base page fault latency is reduced but
latency of huge pages is increased as it takes creater care to succeed.
Part of the "problem" is that allocation success rates are close to 100%
even when under pressure and compaction gets harder
5.0.0-rc1 5.0.0-rc1
selective-v3r17 capture-v3r19
Percentage huge-3 96.70 ( 0.00%) 98.23 ( 1.58%)
Percentage huge-5 96.99 ( 0.00%) 95.30 ( -1.75%)
Percentage huge-7 94.19 ( 0.00%) 97.24 ( 3.24%)
Percentage huge-12 94.95 ( 0.00%) 97.35 ( 2.53%)
Percentage huge-18 96.74 ( 0.00%) 97.30 ( 0.58%)
Percentage huge-24 97.07 ( 0.00%) 97.55 ( 0.50%)
Percentage huge-30 95.69 ( 0.00%) 98.50 ( 2.95%)
Percentage huge-32 96.70 ( 0.00%) 99.27 ( 2.65%)
And scan rates are reduced as expected by 6% for the migration scanner
and 29% for the free scanner indicating that there is less redundant
work.
Compaction migrate scanned 20815362 19573286
Compaction free scanned 16352612 11510663
[mgorman@techsingularity.net: remove redundant check]
Link: http://lkml.kernel.org/r/20190201143853.GH9565@techsingularity.net
Link: http://lkml.kernel.org/r/20190118175136.31341-23-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sysctl_extfrag_handler() neglects to propagate the return value from
proc_dointvec_minmax() to its caller. It's a wrapper that doesn't need
to exist, so just use proc_dointvec_minmax() directly.
Link: http://lkml.kernel.org/r/20190104032557.3056-1-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reported-by: Aditya Pakki <pakki001@umn.edu>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "Replace all open encodings for NUMA_NO_NODE", v3.
All these places for replacement were found by running the following
grep patterns on the entire kernel code. Please let me know if this
might have missed some instances. This might also have replaced some
false positives. I will appreciate suggestions, inputs and review.
1. git grep "nid == -1"
2. git grep "node == -1"
3. git grep "nid = -1"
4. git grep "node = -1"
This patch (of 2):
At present there are multiple places where invalid node number is
encoded as -1. Even though implicitly understood it is always better to
have macros in there. Replace these open encodings for an invalid node
number with the global macro NUMA_NO_NODE. This helps remove NUMA
related assumptions like 'invalid node' from various places redirecting
them to a common definition.
Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> [ixgbe]
Acked-by: Jens Axboe <axboe@kernel.dk> [mtip32xx]
Acked-by: Vinod Koul <vkoul@kernel.org> [dmaengine.c]
Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Acked-by: Doug Ledford <dledford@redhat.com> [drivers/infiniband]
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The content of pages that are marked PG_offline is not of interest (e.g.
inflated by a balloon driver), let's skip these pages.
In saveable_highmem_page(), move the PageReserved() check to a new check
along with the PageOffline() check to separate it from the swsusp
checks.
[david@redhat.com: v2]
Link: http://lkml.kernel.org/r/20181122100627.5189-9-david@redhat.com
Link: http://lkml.kernel.org/r/20181119101616.8901-9-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <len.brown@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juergen Gross <jgross@suse.com>
Cc: Julien Freche <jfreche@vmware.com>
Cc: Kairui Song <kasong@redhat.com>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Pankaj gupta <pagupta@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Right now, pages inflated as part of a balloon driver will be dumped by
dump tools like makedumpfile. While XEN is able to check in the crash
kernel whether a certain pfn is actuall backed by memory in the
hypervisor (see xen_oldmem_pfn_is_ram) and optimize this case, dumps of
other balloon inflated memory will essentially result in zero pages
getting allocated by the hypervisor and the dump getting filled with
this data.
The allocation and reading of zero pages can directly be avoided if a
dumping tool could know which pages only contain stale information not
to be dumped.
We now have PG_offline which can be (and already is by virtio-balloon)
used for marking pages as logically offline. Follow up patches will
make use of this flag also in other balloon implementations.
Let's export PG_offline via PAGE_OFFLINE_MAPCOUNT_VALUE, so makedumpfile
can directly skip pages that are logically offline and the content
therefore stale.
Please note that this is also helpful for a problem we were seeing under
Hyper-V: Dumping logically offline memory (pages kept fake offline while
onlining a section via online_page_callback) would under some condicions
result in a kernel panic when dumping them.
Link: http://lkml.kernel.org/r/20181119101616.8901-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juergen Gross <jgross@suse.com>
Cc: Julien Freche <jfreche@vmware.com>
Cc: Kairui Song <kasong@redhat.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Pankaj gupta <pagupta@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull year 2038 updates from Thomas Gleixner:
"Another round of changes to make the kernel ready for 2038. After lots
of preparatory work this is the first set of syscalls which are 2038
safe:
403 clock_gettime64
404 clock_settime64
405 clock_adjtime64
406 clock_getres_time64
407 clock_nanosleep_time64
408 timer_gettime64
409 timer_settime64
410 timerfd_gettime64
411 timerfd_settime64
412 utimensat_time64
413 pselect6_time64
414 ppoll_time64
416 io_pgetevents_time64
417 recvmmsg_time64
418 mq_timedsend_time64
419 mq_timedreceiv_time64
420 semtimedop_time64
421 rt_sigtimedwait_time64
422 futex_time64
423 sched_rr_get_interval_time64
The syscall numbers are identical all over the architectures"
* 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
riscv: Use latest system call ABI
checksyscalls: fix up mq_timedreceive and stat exceptions
unicore32: Fix __ARCH_WANT_STAT64 definition
asm-generic: Make time32 syscall numbers optional
asm-generic: Drop getrlimit and setrlimit syscalls from default list
32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
compat ABI: use non-compat openat and open_by_handle_at variants
y2038: add 64-bit time_t syscalls to all 32-bit architectures
y2038: rename old time and utime syscalls
y2038: remove struct definition redirects
y2038: use time32 syscall names on 32-bit
syscalls: remove obsolete __IGNORE_ macros
y2038: syscalls: rename y2038 compat syscalls
x86/x32: use time64 versions of sigtimedwait and recvmmsg
timex: change syscalls to use struct __kernel_timex
timex: use __kernel_timex internally
sparc64: add custom adjtimex/clock_adjtime functions
time: fix sys_timer_settime prototype
time: Add struct __kernel_timex
time: make adjtime compat handling available for 32 bit
...
Pull irq updates from Thomas Gleixner:
"The interrupt departement delivers this time:
- New infrastructure to manage NMIs on platforms which have a sane
NMI delivery, i.e. identifiable NMI vectors instead of a single
lump.
- Simplification of the interrupt affinity management so drivers
don't have to implement ugly loops around the PCI/MSI enablement.
- Speedup for interrupt statistics in /proc/stat
- Provide a function to retrieve the default irq domain
- A new interrupt controller for the Loongson LS1X platform
- Affinity support for the SiFive PLIC
- Better support for the iMX irqsteer driver
- NUMA aware memory allocations for GICv3
- The usual small fixes, improvements and cleanups all over the
place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
irqchip/imx-irqsteer: Add multi output interrupts support
irqchip/imx-irqsteer: Change to use reg_num instead of irq_group
dt-bindings: irq: imx-irqsteer: Add multi output interrupts support
dt-binding: irq: imx-irqsteer: Use irq number instead of group number
irqchip/brcmstb-l2: Use _irqsave locking variants in non-interrupt code
irqchip/gicv3-its: Use NUMA aware memory allocation for ITS tables
irqdomain: Allow the default irq domain to be retrieved
irqchip/sifive-plic: Implement irq_set_affinity() for SMP host
irqchip/sifive-plic: Differentiate between PLIC handler and context
irqchip/sifive-plic: Add warning in plic_init() if handler already present
irqchip/sifive-plic: Pre-compute context hart base and enable base
PCI/MSI: Remove obsolete sanity checks for multiple interrupt sets
genirq/affinity: Remove the leftovers of the original set support
nvme-pci: Simplify interrupt allocation
genirq/affinity: Add new callback for (re)calculating interrupt sets
genirq/affinity: Store interrupt sets size in struct irq_affinity
genirq/affinity: Code consolidation
irqchip/irq-sifive-plic: Check and continue in case of an invalid cpuid.
irqchip/i8259: Fix shutdown order by moving syscore_ops registration
dt-bindings: interrupt-controller: loongson ls1x intc
...
Pull timer and clockevent updates from Thomas Gleixner:
"The time(r) core and clockevent updates are mostly boring this time:
- A new driver for the Tegra210 timer
- Small fixes and improvements alll over the place
- Documentation updates and cleanups"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
soc/tegra: default select TEGRA_TIMER for Tegra210
clocksource/drivers/tegra: Add Tegra210 timer support
dt-bindings: timer: add Tegra210 timer
clocksource/drivers/timer-cs5535: Rename the file for consistency
clocksource/drivers/timer-pxa: Rename the file for consistency
clocksource/drivers/tango-xtal: Rename the file for consistency
dt-bindings: timer: gpt: update binding doc
clocksource/drivers/exynos_mct: Remove unused header includes
dt-bindings: timer: mediatek: update bindings for MT7629 SoC
clocksource/drivers/exynos_mct: Fix error path in timer resources initialization
clocksource/drivers/exynos_mct: Remove dead code
clocksource/drivers/riscv: Add required checks during clock source init
dt-bindings: timer: renesas: tmu: Document r8a774c0 bindings
dt-bindings: timer: renesas, cmt: Document r8a774c0 CMT support
clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown
clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR
clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability
clocksource/drivers/sun5i: Fail gracefully when clock rate is unavailable
timers: Mark expected switch fall-throughs
timekeeping/debug: No need to check return value of debugfs_create functions
...
- A copy of Arnds compat wrapper generation series
- Pass information about the KVM guest to the host in form the control
program code and the control program version code
- Map IOV resources to support PCI physical functions on s390
- Add vector load and store alignment hints to improve performance
- Use the "jdd" constraint with gcc 9 to make jump labels working again
- Remove amode workaround for old z/VM releases from the DCSS code
- Add support for in-kernel performance measurements using the
CPU measurement counter facility
- Introduce a new PMU device cpum_cf_diag to capture counters and
store thenn as event raw data.
- Bug fixes and cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJcfh4QAAoJEDjwexyKj9rgXVAH/RzVbi3vznldujSNfCFTZKPu
EmFFAZIfbhifW3szfylyOJL52pFhxjcWzY0hkFEkbs2t90sn8l1BNkDscYZtfNHC
XvN3N9LsHyxOeyxvQuWLSio58qm+Lr1L0UrIhbMvqyAVkOLmIHvybFwi83OkMptm
djoL8NbuNsAA2s26y2bZLNtU7FmOW5smJIlnt7H4dmK4SFylqZKS/EnUZxGDgn+7
UrrTTOQUir0QZ8vraANsP1M0/LqPcd2YusLmj4jOdZ5Muc2Ch2AA991FofqdKShO
/8cGlsIzwHWGgdnP/YDea5gbetvonayYduixKy3EnYpWQ9iogiBjH4G7QNxcncs=
=v26J
-----END PGP SIGNATURE-----
Merge tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
- A copy of Arnds compat wrapper generation series
- Pass information about the KVM guest to the host in form the control
program code and the control program version code
- Map IOV resources to support PCI physical functions on s390
- Add vector load and store alignment hints to improve performance
- Use the "jdd" constraint with gcc 9 to make jump labels working again
- Remove amode workaround for old z/VM releases from the DCSS code
- Add support for in-kernel performance measurements using the CPU
measurement counter facility
- Introduce a new PMU device cpum_cf_diag to capture counters and store
thenn as event raw data.
- Bug fixes and cleanups
* tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits)
Revert "s390/cpum_cf: Add kernel message exaplanations"
s390/dasd: fix read device characteristic with CONFIG_VMAP_STACK=y
s390/suspend: fix prefix register reset in swsusp_arch_resume
s390: warn about clearing als implied facilities
s390: allow overriding facilities via command line
s390: clean up redundant facilities list setup
s390/als: remove duplicated in-place implementation of stfle
s390/cio: Use cpa range elsewhere within vfio-ccw
s390/cio: Fix vfio-ccw handling of recursive TICs
s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem
s390/cpum_cf: Handle EBUSY return code from CPU counter facility reservation
s390/cpum_cf: Add kernel message exaplanations
s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace
s390/cpum_cf: add ctr_stcctm() function
s390/cpum_cf: move common functions into a separate file
s390/cpum_cf: introduce kernel_cpumcf_avail() function
s390/cpu_mf: replace stcctm5() with the stcctm() function
s390/cpu_mf: add store cpu counter multiple instruction support
s390/cpum_cf: Add minimal in-kernel interface for counter measurements
s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts
...
Pull networking updates from David Miller:
"Here we go, another merge window full of networking and #ebpf changes:
1) Snoop DHCPACKS in batman-adv to learn MAC/IP pairs in the DHCP
range without dealing with floods of ARP traffic, from Linus
Lüssing.
2) Throttle buffered multicast packet transmission in mt76, from
Felix Fietkau.
3) Support adaptive interrupt moderation in ice, from Brett Creeley.
4) A lot of struct_size conversions, from Gustavo A. R. Silva.
5) Add peek/push/pop commands to bpftool, as well as bash completion,
from Stanislav Fomichev.
6) Optimize sk_msg_clone(), from Vakul Garg.
7) Add SO_BINDTOIFINDEX, from David Herrmann.
8) Be more conservative with local resends due to local congestion,
from Yuchung Cheng.
9) Allow vetoing of unsupported VXLAN FDBs, from Petr Machata.
10) Add health buffer support to devlink, from Eran Ben Elisha.
11) Add TXQ scheduling API to mac80211, from Toke Høiland-Jørgensen.
12) Add statistics to basic packet scheduler filter, from Cong Wang.
13) Add GRE tunnel support for mlxsw Spectrum-2, from Nir Dotan.
14) Lots of new IP tunneling forwarding tests, also from Nir Dotan.
15) Add 3ad stats to bonding, from Nikolay Aleksandrov.
16) Lots of probing improvements for bpftool, from Quentin Monnet.
17) Various nfp drive #ebpf JIT improvements from Jakub Kicinski.
18) Allow #ebpf programs to access gso_segs from skb shared info, from
Eric Dumazet.
19) Add sock_diag support for AF_XDP sockets, from Björn Töpel.
20) Support 22260 iwlwifi devices, from Luca Coelho.
21) Use rbtree for ipv6 defragmentation, from Peter Oskolkov.
22) Add JMP32 instruction class support to #ebpf, from Jiong Wang.
23) Add spinlock support to #ebpf, from Alexei Starovoitov.
24) Support 256-bit keys and TLS 1.3 in ktls, from Dave Watson.
25) Add device infomation API to devlink, from Jakub Kicinski.
26) Add new timestamping socket options which are y2038 safe, from
Deepa Dinamani.
27) Add RX checksum offloading for various sh_eth chips, from Sergei
Shtylyov.
28) Flow offload infrastructure, from Pablo Neira Ayuso.
29) Numerous cleanups, improvements, and bug fixes to the PHY layer
and many drivers from Heiner Kallweit.
30) Lots of changes to try and make packet scheduler classifiers run
lockless as much as possible, from Vlad Buslov.
31) Support BCM957504 chip in bnxt_en driver, from Erik Burrows.
32) Add concurrency tests to tc-tests infrastructure, from Vlad
Buslov.
33) Add hwmon support to aquantia, from Heiner Kallweit.
34) Allow 64-bit values for SO_MAX_PACING_RATE, from Eric Dumazet.
And I would be remiss if I didn't thank the various major networking
subsystem maintainers for integrating much of this work before I even
saw it. Alexei Starovoitov, Daniel Borkmann, Pablo Neira Ayuso,
Johannes Berg, Kalle Valo, and many others. Thank you!"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2207 commits)
net/sched: avoid unused-label warning
net: ignore sysctl_devconf_inherit_init_net without SYSCTL
phy: mdio-mux: fix Kconfig dependencies
net: phy: use phy_modify_mmd_changed in genphy_c45_an_config_aneg
net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework
selftest/net: Remove duplicate header
sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
net/mlx5e: Update tx reporter status in case channels were successfully opened
devlink: Add support for direct reporter health state update
devlink: Update reporter state to error even if recover aborted
sctp: call iov_iter_revert() after sending ABORT
team: Free BPF filter when unregistering netdev
ip6mr: Do not call __IP6_INC_STATS() from preemptible context
isdn: mISDN: Fix potential NULL pointer dereference of kzalloc
net: dsa: mv88e6xxx: support in-band signalling on SGMII ports with external PHYs
cxgb4/chtls: Prefix adapter flags with CXGB4
net-sysfs: Switch to bitmap_zalloc()
mellanox: Switch to bitmap_zalloc()
bpf: add test cases for non-pointer sanitiation logic
mlxsw: i2c: Extend initialization by querying resources data
...
Change "execuing" into "executing" and "guarnateed" into "guaranteed".
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
The atomic replace allows to create cumulative patches. They are useful when
you maintain many livepatches and want to remove one that is lower on the
stack. In addition it is very useful when more patches touch the same function
and there are dependencies between them.
It's also a feature some of the distros are using already to distribute
their patches.
Pull vfs fixes from Al Viro:
"Assorted fixes that sat in -next for a while, all over the place"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
aio: Fix locking in aio_poll()
exec: Fix mem leak in kernel_read_file
copy_mount_string: Limit string length to PATH_MAX
cgroup: saner refcounting for cgroup_root
fix cgroup_do_mount() handling of failure exits
Daniel Borkmann says:
====================
pull-request: bpf-next 2019-03-04
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add AF_XDP support to libbpf. Rationale is to facilitate writing
AF_XDP applications by offering higher-level APIs that hide many
of the details of the AF_XDP uapi. Sample programs are converted
over to this new interface as well, from Magnus.
2) Introduce a new cant_sleep() macro for annotation of functions
that cannot sleep and use it in BPF_PROG_RUN() to assert that
BPF programs run under preemption disabled context, from Peter.
3) Introduce per BPF prog stats in order to monitor the usage
of BPF; this is controlled by kernel.bpf_stats_enabled sysctl
knob where monitoring tools can make use of this to efficiently
determine the average cost of programs, from Alexei.
4) Split up BPF selftest's test_progs similarly as we already
did with test_verifier. This allows to further reduce merge
conflicts in future and to get more structure into our
quickly growing BPF selftest suite, from Stanislav.
5) Fix a bug in BTF's dedup algorithm which can cause an infinite
loop in some circumstances; also various BPF doc fixes and
improvements, from Andrii.
6) Various BPF sample cleanups and migration to libbpf in order
to further isolate the old sample loader code (so we can get
rid of it at some point), from Jakub.
7) Add a new BPF helper for BPF cgroup skb progs that allows
to set ECN CE code point and a Host Bandwidth Manager (HBM)
sample program for limiting the bandwidth used by v2 cgroups,
from Lawrence.
8) Enable write access to skb->queue_mapping from tc BPF egress
programs in order to let BPF pick TX queue, from Jesper.
9) Fix a bug in BPF spinlock handling for map-in-map which did
not propagate spin_lock_off to the meta map, from Yonghong.
10) Fix a bug in the new per-CPU BPF prog counters to properly
initialize stats for each CPU, from Eric.
11) Add various BPF helper prototypes to selftest's bpf_helpers.h,
from Willem.
12) Fix various BPF samples bugs in XDP and tracing progs,
from Toke, Daniel and Yonghong.
13) Silence preemption splat in test_bpf after BPF_PROG_RUN()
enforces it now everywhere, from Anders.
14) Fix a signedness bug in libbpf's btf_dedup_ref_type() to
get error handling working, from Dan.
15) Fix bpftool documentation and auto-completion with regards
to stream_{verdict,parser} attach types, from Alban.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
* pm-core:
PM / core: Add support to skip power management in device/driver model
PM / suspend: Print debug messages for device using direct-complete
PM-runtime: update time accounting only when enabled
PM-runtime: Switch accounting over to ktime_get_mono_fast_ns()
PM-runtime: Optimize pm_runtime_autosuspend_expiration()
PM-runtime: Replace jiffies-based accounting with ktime-based accounting
PM-runtime: update accounting_timestamp on enable
PM: clock_ops: fix missing clk_prepare() return value check
drm/i915: Move on the new pm runtime interface
PM-runtime: Add new interface to get accounted time
* pm-sleep:
PM / wakeup: fix kerneldoc comment for pm_wakeup_dev_event()
* pm-qos:
PM: QoS: no need to check return value of debugfs_create functions
* pm-domains:
PM / Domains: Mark "name" const in dev_pm_domain_attach_by_name()
PM / Domains: Mark "name" const in genpd_dev_pm_attach_by_name()
PM: domains: no need to check return value of debugfs_create functions
* pm-em:
PM / EM: Expose the Energy Model in debugfs
Marek reported that he saw an issue with the below snippet in that
timing measurements where off when loaded as unpriv while results
were reasonable when loaded as privileged:
[...]
uint64_t a = bpf_ktime_get_ns();
uint64_t b = bpf_ktime_get_ns();
uint64_t delta = b - a;
if ((int64_t)delta > 0) {
[...]
Turns out there is a bug where a corner case is missing in the fix
d3bd7413e0 ("bpf: fix sanitation of alu op with pointer / scalar
type from different paths"), namely fixup_bpf_calls() only checks
whether aux has a non-zero alu_state, but it also needs to test for
the case of BPF_ALU_NON_POINTER since in both occasions we need to
skip the masking rewrite (as there is nothing to mask).
Fixes: d3bd7413e0 ("bpf: fix sanitation of alu op with pointer / scalar type from different paths")
Reported-by: Marek Majkowski <marek@cloudflare.com>
Reported-by: Arthur Fabre <afabre@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/netdev/CAJPywTJqP34cK20iLM5YmUMz9KXQOdu1-+BZrGMAGgLuBWz7fg@mail.gmail.com/T/
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>