Return vm_fault_t codes directly from the appropriate mm routines instead
of converting from errnos ourselves. Fixes a minor bug where we'd return
SIGBUS instead of the correct OOM code if we ran out of memory allocating
page tables.
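
The difference between the two conversions can be modelled in userspace. This is a minimal sketch, not the kernel code: the FAULT_* names and values are hypothetical stand-ins for the real fault codes, and the shape of the old caller-side conversion (everything except 0 and -EBUSY becomes SIGBUS) is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for fault codes; not the kernel's definitions. */
enum fault_code { FAULT_NOPAGE, FAULT_OOM, FAULT_SIGBUS };

/* Assumed caller-side conversion: every error is reported as SIGBUS. */
static enum fault_code lossy_errno_to_fault(int err)
{
	if (err == 0 || err == -EBUSY)
		return FAULT_NOPAGE;
	return FAULT_SIGBUS;	/* -ENOMEM ends up here as well */
}

/* Conversion that keeps the out-of-memory case distinct. */
static enum fault_code errno_to_fault(int err)
{
	if (err == 0 || err == -EBUSY)
		return FAULT_NOPAGE;
	if (err == -ENOMEM)
		return FAULT_OOM;
	return FAULT_SIGBUS;
}

/* Exercises both converters on the out-of-memory case. */
static void check_oom_is_preserved(void)
{
	assert(lossy_errno_to_fault(-ENOMEM) == FAULT_SIGBUS);
	assert(errno_to_fault(-ENOMEM) == FAULT_OOM);
}
```

Returning the fault code from the routine that knows why it failed removes the need for any such conversion in the caller.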
Link: http://lkml.kernel.org/r/20180828145728.11873-5-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Like vm_insert_pfn_prot(), but returns a vm_fault_t instead of an errno.
Also unexport vm_insert_pfn_prot as it has no modular users.
Link: http://lkml.kernel.org/r/20180828145728.11873-4-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All callers are now converted to vmf_insert_mixed() so convert
vmf_insert_mixed() from being a compatibility wrapper into the real
function.
Link: http://lkml.kernel.org/r/20180828145728.11873-3-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cramfs is the only remaining user of vm_insert_mixed() and should be
converted to vmf_insert_mixed().
Based on a previous patch from Matthew Wilcox.
Link: http://lkml.kernel.org/r/nycvar.YSQ.7.76.1808290945450.10215@knanqh.ubzr
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As part of vm_fault_t conversion filemap_page_mkwrite() for the NOMMU case
was missed. Now converted.
Link: http://lkml.kernel.org/r/20180828174952.GA29229@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
check_for_memory() looks a bit confusing. First of all, we have this:
	if (N_MEMORY == N_NORMAL_MEMORY)
		return;
Checking the enum declaration, it looks like N_MEMORY cannot be equal to
N_NORMAL_MEMORY.
I could not find where N_MEMORY is set to N_NORMAL_MEMORY, or the other
way around either, so unless I am missing something, this condition will
never evaluate to true. It makes sense to get rid of it.
Moving forward, the operations within the loop look a bit confusing as
well.
We set N_HIGH_MEMORY unconditionally, and then we set N_NORMAL_MEMORY in
case we have CONFIG_HIGHMEM (N_NORMAL_MEMORY != N_HIGH_MEMORY) and zone <=
ZONE_NORMAL. (N_HIGH_MEMORY falls back to N_NORMAL_MEMORY on
!CONFIG_HIGHMEM systems, and that is why we can just go ahead and set
N_HIGH_MEMORY unconditionally)
Although this works, it is a bit subtle.
I think that this could be easier to follow:
First, we should only set N_HIGH_MEMORY in case we have CONFIG_HIGHMEM.
And then we should set N_NORMAL_MEMORY in case zone <= ZONE_NORMAL,
without further checking whether we have CONFIG_HIGHMEM or not.
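
The existing and the proposed flow can be compared in a small userspace model. This is only a sketch under stated assumptions: the enums are simplified stand-ins rather than the kernel's definitions, the CONFIG_HIGHMEM fallback is modelled with a runtime flag instead of the preprocessor, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins, not the kernel's enums. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM };
enum { NORMAL_MEM, HIGH_MEM, NR_STATES };

struct node_states {
	bool set[NR_STATES];
};

/* Existing flow: the high-memory slot aliases NORMAL_MEM when highmem is off. */
static struct node_states old_flow(enum zone_type zone, bool highmem)
{
	struct node_states s = { { false, false } };
	int high_slot = highmem ? HIGH_MEM : NORMAL_MEM;

	s.set[high_slot] = true;
	if (high_slot != NORMAL_MEM && zone <= ZONE_NORMAL)
		s.set[NORMAL_MEM] = true;
	return s;
}

/* Proposed flow: each state is set under its own explicit condition. */
static struct node_states new_flow(enum zone_type zone, bool highmem)
{
	struct node_states s = { { false, false } };

	if (highmem)
		s.set[HIGH_MEM] = true;
	if (zone <= ZONE_NORMAL)
		s.set[NORMAL_MEM] = true;
	return s;
}

/* True when both flows mark the same states for a populated zone. */
static bool flows_agree(enum zone_type zone, bool highmem)
{
	struct node_states a, b;

	/* In this model ZONE_HIGHMEM only exists when highmem is enabled. */
	assert(highmem || zone != ZONE_HIGHMEM);
	a = old_flow(zone, highmem);
	b = new_flow(zone, highmem);
	return a.set[NORMAL_MEM] == b.set[NORMAL_MEM] &&
	       a.set[HIGH_MEM] == b.set[HIGH_MEM];
}
```

In this model, flows_agree() holds for every valid zone and highmem combination, which matches the claim that the rewrite changes readability rather than behavior.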
Link: http://lkml.kernel.org/r/20180828210158.4617-1-osalvador@techadventures.net
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
si->swap_map[] of the swap entries in a cluster needs to be cleared during
freeing. Previously, this was done in the caller of swap_free_cluster().
This may cause code duplication (one user now, more users will be added
later) and unnecessary locking/unlocking of the cluster. In this patch,
the clearing code is moved to swap_free_cluster() to avoid these downsides.
Link: http://lkml.kernel.org/r/20180827075535.17406-4-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a code cleanup patch without functionality change.
Originally, when __swap_entry_free() is called, and its return value is 0,
free_swap_slot() will always be called to free the swap entry to the
per-CPU pool. So move the call to free_swap_slot() to __swap_entry_free()
to simplify the code.
Link: http://lkml.kernel.org/r/20180827075535.17406-3-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The code path to reclaim the swap entry in free_swap_and_cache() is
almost the same as that of __try_to_reclaim_swap(). The largest
difference is just coding style. So support for the additional
requirement of free_swap_and_cache() is added to
__try_to_reclaim_swap(), free_swap_and_cache() is changed to call
__try_to_reclaim_swap(), and the duplicated code is deleted. This will
improve code readability and reduce the potential for bugs.
There are 2 functionality differences between __try_to_reclaim_swap()
and swap entry reclaim code of free_swap_and_cache().
- free_swap_and_cache() only reclaims the swap entry if the page is
unmapped or swap is getting full. The support has been added into
__try_to_reclaim_swap().
- try_to_free_swap() (called by __try_to_reclaim_swap()) checks
pm_suspended_storage(), while free_swap_and_cache() does not. I think
this is OK, because the page and the swap entry can eventually be
reclaimed later.
Link: http://lkml.kernel.org/r/20180827075535.17406-2-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, kmemleak only prints the number of suspected leaks to dmesg but
requires the user to read a debugfs file to get the actual stack traces of
the objects' allocation points. Add a module option to print the full
object information to dmesg too. It can be enabled with
kmemleak.verbose=1 on the kernel command line, or "echo 1 >
/sys/module/kmemleak/parameters/verbose".
This allows easier integration of kmemleak into test systems: We have
automated test infrastructure to test our Linux systems. With this
option, running our tests with kmemleak is as simple as enabling kmemleak
and passing this command line option; the test infrastructure knows how to
save kernel logs, which will now include kmemleak reports. Without this
option, the test infrastructure needs to be specifically taught to read
out the kmemleak debugfs file. Removing this need for special handling
makes kmemleak more similar to other kernel debug options (slab debugging,
debug objects, etc).
Link: http://lkml.kernel.org/r/20180903144046.21023-1-vincent.whitchurch@axis.com
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert 5ff7091f5a ("mm, mmu_notifier: annotate mmu notifiers with
blockable invalidate callbacks").
The MMU_INVALIDATE_DOES_NOT_BLOCK flag was the only one used and it is no
longer needed since 93065ac753 ("mm, oom: distinguish blockable mode for
mmu notifiers"). We now have full support for per-range !blocking
behavior, so we can drop the stop-gap workaround that the per-notifier
flag was used for.
Link: http://lkml.kernel.org/r/20180827112623.8992-4-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If invalidate_range_start() is called for !blocking mode then all
callbacks have to guarantee they will not block/sleep. The same obviously
applies to invalidate_range_end because this operation pairs with the
former and they are called from the same context. Make sure this is
appropriately documented.
Link: http://lkml.kernel.org/r/20180827112623.8992-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tetsuo Handa has reported that it is possible to bypass the short sleep
for PF_WQ_WORKER threads which was introduced by commit 373ccbe592
("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make
any progress") and lock up the system if OOM.
The primary reason is that WQ_MEM_RECLAIM WQs are not guaranteed to run
even when they have a rescuer available. Those workers might be essential
for reclaim to make forward progress, however. If we are unlucky enough,
all the allocation requests can get stuck waiting for a WQ_MEM_RECLAIM
work item and the system is essentially stuck in an OOM condition without
much hope to move on. Tetsuo has seen the reclaim stuck on
drain_local_pages_wq or xlog_cil_push_work (xfs). There might be others.
Since should_reclaim_retry() should be a natural reschedule point,
let's do the short sleep for PF_WQ_WORKER threads unconditionally in
order to guarantee that other pending work items are started. This
will work around this problem and it is less fragile than hunting down
when the sleep is missed. Having a single sleeping point is more
robust.
[akpm@linux-foundation.org: reflow comment to 80 cols to save a couple of lines]
Link: http://lkml.kernel.org/r/20180827135101.15700-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I've noticed that dying memory cgroups are often pinned in memory by a
single pagecache page. Even under moderate memory pressure they sometimes
stayed in such a state for a long time. That looked strange.
My investigation showed that the problem is caused by applying the LRU
pressure balancing math:
scan = div64_u64(scan * fraction[lru], denominator),
where
denominator = fraction[anon] + fraction[file] + 1.
Because fraction[lru] is always less than denominator, if the initial scan
size is 1, the result is always 0.
This means the last page is not scanned and has no chance of being
reclaimed.
Fix this by rounding up the result of the division.
In practice this change significantly improves the speed of dying cgroups
reclaim.
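
The arithmetic can be checked in userspace. This is a minimal sketch, not the kernel implementation: plain C division stands in for div64_u64() and for the round-up helper, and the function names and the fraction values used with them are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Truncating scaling: scan * fraction / denominator. */
static uint64_t scale_floor(uint64_t scan, uint64_t fraction,
			    uint64_t denominator)
{
	assert(denominator != 0);
	return scan * fraction / denominator;
}

/* The same scaling with the result rounded up. */
static uint64_t scale_round_up(uint64_t scan, uint64_t fraction,
			       uint64_t denominator)
{
	assert(denominator != 0);
	return (scan * fraction + denominator - 1) / denominator;
}
```

With fraction[anon] = 3 and fraction[file] = 5 the denominator is 9, so scale_floor(1, 5, 9) yields 0 while scale_round_up(1, 5, 9) yields 1, and the single remaining page gets scanned. A fraction of 0 still scales to 0 in both variants.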
[guro@fb.com: prevent double calculation of DIV64_U64_ROUND_UP() arguments]
Link: http://lkml.kernel.org/r/20180829213311.GA13501@castle
Link: http://lkml.kernel.org/r/20180827162621.30187-3-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Memcg charge is batched using per-cpu stocks, so an offline memcg can be
pinned by a cached charge until a process belonging to some other cgroup
charges some memory on the same cpu. In other words,
cached charges can prevent a memory cgroup from being reclaimed for some
time, without any clear need.
Let's optimize it by explicit draining of all stocks on css offlining. As
draining is performed asynchronously, and is skipped if any parallel
draining is happening, it's cheap.
Link: http://lkml.kernel.org/r/20180827162621.30187-2-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If CONFIG_VMAP_STACK is set, kernel stacks are allocated using
__vmalloc_node_range() with __GFP_ACCOUNT. So kernel stack pages are
charged against corresponding memory cgroups on allocation and uncharged
on releasing them.
The problem is that we do cache kernel stacks in small per-cpu caches and
do reuse them for new tasks, which can belong to different memory cgroups.
Each stack page still holds a reference to the original cgroup, so the
cgroup can't be released until the vmap area is released.
To make this happen we need more than two subsequent exits without forks
in between on the current cpu, which makes it very unlikely to happen. As
a result, I saw a significant number of dying cgroups (in theory, up to 2
* number_of_cpu + number_of_tasks), which can't be released even by
significant memory pressure.
As a cgroup structure can take a significant amount of memory (first of
all, per-cpu data like memcg statistics), it leads to a noticeable waste
of memory.
Link: http://lkml.kernel.org/r/20180827162621.30187-1-guro@fb.com
Fixes: ac496bf48d ("fork: Optimize task creation by caching two thread stacks per CPU if CONFIG_VMAP_STACK=y")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Extend the slub_debug syntax to "slub_debug=<flags>[,<slub>]*", where
<slub> may contain an asterisk at the end. For example, the following
would poison all kmalloc slabs:
slub_debug=P,kmalloc*
and the following would apply the default flags to all kmalloc and all
block IO slabs:
slub_debug=,bio*,kmalloc*
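
The matching rule can be sketched in userspace as follows. This is an illustration only, not the slub parser: both function names are hypothetical, and the 64-byte limit on a single pattern is an arbitrary assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* A trailing '*' in the pattern matches any suffix of the cache name. */
static bool slab_name_matches(const char *pattern, const char *name)
{
	size_t len;

	assert(pattern != NULL && name != NULL);
	len = strlen(pattern);
	if (len > 0 && pattern[len - 1] == '*')
		return strncmp(pattern, name, len - 1) == 0;
	return strcmp(pattern, name) == 0;
}

/* Walks a comma-separated list such as "bio*,kmalloc*". */
static bool slab_list_matches(const char *list, const char *name)
{
	while (*list) {
		size_t n = strcspn(list, ",");
		char pat[64];	/* arbitrary limit for this sketch */

		if (n > 0 && n < sizeof(pat)) {
			memcpy(pat, list, n);
			pat[n] = '\0';
			if (slab_name_matches(pat, name))
				return true;
		}
		list += n;
		if (*list == ',')
			list++;
	}
	return false;
}
```

For example, slab_list_matches("bio*,kmalloc*", "kmalloc-64") is true, while the same list does not match "dentry".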
Please note that a similar patch was posted by Iliyan Malchev some time
ago but was never merged:
https://marc.info/?l=linux-mm&m=131283905330474&w=2
Link: http://lkml.kernel.org/r/20180928111139.27962-1-atomlin@redhat.com
Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Iliyan Malchev <malchev@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Slub does not call kmalloc_slab() for sizes > KMALLOC_MAX_CACHE_SIZE,
instead it falls back to kmalloc_large().
For slab KMALLOC_MAX_CACHE_SIZE == KMALLOC_MAX_SIZE and it calls
kmalloc_slab() for all allocations relying on NULL return value for
over-sized allocations.
This inconsistency leads to unwanted warnings from kmalloc_slab() for
over-sized allocations for slab. Returning NULL for failed allocations is
the expected behavior.
Make slub and slab code consistent by checking size >
KMALLOC_MAX_CACHE_SIZE in slab before calling kmalloc_slab().
While we are here also fix the check in kmalloc_slab(). We should check
against KMALLOC_MAX_CACHE_SIZE rather than KMALLOC_MAX_SIZE. It all kinda
worked because for slab the constants are the same, and slub always checks
the size against KMALLOC_MAX_CACHE_SIZE before kmalloc_slab(). But if we
get there with size > KMALLOC_MAX_CACHE_SIZE anyhow bad things will
happen. For example, in case of a newly introduced bug in slub code.
Also move the check in kmalloc_slab() from function entry to the size >
192 case. This partially compensates for the additional check in slab
code and makes slub code a bit faster (at least theoretically).
Also drop __GFP_NOWARN in the warning check. This warning means a bug in
slab code itself, user-passed flags have nothing to do with it.
None of this affects slob.
Link: http://lkml.kernel.org/r/20180927171502.226522-1-dvyukov@gmail.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: syzbot+87829a10073277282ad1@syzkaller.appspotmail.com
Reported-by: syzbot+ef4e8fc3a06e9019bb40@syzkaller.appspotmail.com
Reported-by: syzbot+6e438f4036df52cbb863@syzkaller.appspotmail.com
Reported-by: syzbot+8574471d8734457d98aa@syzkaller.appspotmail.com
Reported-by: syzbot+af1504df0807a083dbd9@syzkaller.appspotmail.com
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch to bitmap_zalloc() to show clearly what we are allocating. Besides
that, it returns a pointer of bitmap type instead of an opaque void *.
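
A userspace analogue makes the point about intent and return type concrete. This is a sketch only: calloc() stands in for the kernel allocator, there is no GFP flags argument, and the names bits_to_longs() and bitmap_zalloc_sketch() are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold nbits bits. */
static size_t bits_to_longs(size_t nbits)
{
	return (nbits + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
}

/* Returns a zeroed bitmap as unsigned long *, or NULL on failure. */
static unsigned long *bitmap_zalloc_sketch(size_t nbits)
{
	assert(nbits > 0);
	return calloc(bits_to_longs(nbits), sizeof(unsigned long));
}
```

The caller passes a bit count rather than a byte count, and the result can be assigned to an unsigned long * without relying on a conversion from void *.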
Link: http://lkml.kernel.org/r/20180830104301.61649-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
What xtensa has in asm/vga.h is the same as what can be found in
asm-generic/vga.h. So use the latter header.
Link: http://lkml.kernel.org/r/20180907132219.12979-1-jslaby@suse.cz
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change iomap_page_mkwrite() return type to vm_fault_t.
See commit 1c8f422059 ("mm: change return type to vm_fault_t") for
reference.
Link: http://lkml.kernel.org/r/20180827172050.GA18673@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes gcc '-Wunused-but-set-variable' warning:
fs/ocfs2/refcounttree.c: In function 'ocfs2_create_reflink_node':
fs/ocfs2/refcounttree.c:4138:31: warning:
variable 'rb' set but not used [-Wunused-but-set-variable]
Link: http://lkml.kernel.org/r/1536198443-113047-1-git-send-email-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The kernel module may sleep while holding a spinlock.
The function call paths (from bottom to top) in Linux-4.16 are:
[FUNC] get_zeroed_page(GFP_NOFS)
fs/ocfs2/dlm/dlmdebug.c, 332: get_zeroed_page in dlm_print_one_mle
fs/ocfs2/dlm/dlmmaster.c, 240: dlm_print_one_mle in __dlm_put_mle
fs/ocfs2/dlm/dlmmaster.c, 255: __dlm_put_mle in dlm_put_mle
fs/ocfs2/dlm/dlmmaster.c, 254: spin_lock in dlm_put_mle
[FUNC] get_zeroed_page(GFP_NOFS)
fs/ocfs2/dlm/dlmdebug.c, 332: get_zeroed_page in dlm_print_one_mle
fs/ocfs2/dlm/dlmmaster.c, 240: dlm_print_one_mle in __dlm_put_mle
fs/ocfs2/dlm/dlmmaster.c, 222: __dlm_put_mle in dlm_put_mle_inuse
fs/ocfs2/dlm/dlmmaster.c, 219: spin_lock in dlm_put_mle_inuse
To fix this bug, GFP_NOFS is replaced with GFP_ATOMIC.
This bug was found by my static analysis tool DSAC.
Link: http://lkml.kernel.org/r/20180901112528.27025-1-baijiaju1990@gmail.com
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Null check for kfree is unnecessary, so remove it.
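
The same pattern exists in userspace, where free(NULL) is defined to do nothing, just as kfree(NULL) is a no-op in the kernel. The struct and function below are hypothetical and only illustrate the cleanup shape; they are not taken from the patched code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner of an optional buffer. */
struct ctx {
	char *buf;
};

static void ctx_release(struct ctx *c)
{
	assert(c != NULL);
	/* No "if (c->buf)" guard needed: freeing NULL is a no-op. */
	free(c->buf);
	c->buf = NULL;
}
```

Calling ctx_release() on a context whose buf was never allocated is therefore safe, and the guard only adds noise.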
Link: http://lkml.kernel.org/r/1535704514-26559-1-git-send-email-dingxiang@cmss.chinamobile.com
Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pointer 'eb' is being assigned but is never used hence it is
redundant and can be removed.
Cleans up clang warning:
warning: variable 'eb' set but not used [-Wunused-but-set-variable]
Link: http://lkml.kernel.org/r/20180828141907.10826-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Clang warns when more than one set of parentheses is used for a
single conditional statement:
fs/ocfs2/dlm/dlmthread.c:534:18: warning: equality comparison with extraneous
parentheses [-Wparentheses-equality]
if ((res->owner == dlm->node_num)) {
~~~~~~~~~~~^~~~~~~~~~~~~~~~
fs/ocfs2/dlm/dlmthread.c:534:18: note: remove extraneous parentheses around the
comparison to silence this warning
if ((res->owner == dlm->node_num)) {
~ ^ ~
Link: http://lkml.kernel.org/r/20180924181929.6853-1-natechancellor@gmail.com
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arch code may have asm implementations of string/memory API functions
instead of using the generic ones from lib/string.c. KASAN doesn't see
memory accesses in asm code, and thus can miss many bugs.
E.g. on ARM64 KASAN doesn't see bugs in memchr(), memcmp(), str[r]chr(),
str[n]cmp(), str[n]len(). Add tests for these functions to be sure that
we notice the problem on other architectures.
Link: http://lkml.kernel.org/r/20180920135631.23833-3-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kyeongdon Kim <kyeongdon.kim@lge.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ARM64 has asm implementations of memchr(), memcmp(), str[r]chr(),
str[n]cmp(), str[n]len(). KASAN doesn't see memory accesses in asm code,
thus it can potentially miss many bugs.
Ifdef out __HAVE_ARCH_* defines of these functions when KASAN is enabled,
so the generic implementations from lib/string.c will be used.
We can't just remove the asm functions because efistub uses them. And we
can't have two non-weak functions either, so declare the asm functions as
weak.
Link: http://lkml.kernel.org/r/20180920135631.23833-2-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Kyeongdon Kim <kyeongdon.kim@lge.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WEAK() is supposed to be used instead of ENTRY() to define weak symbols,
but unlike ENTRY() it doesn't have an ALIGN directive. There seems to be
no actual reason not to have one, so let's add ALIGN to WEAK() too.
Link: http://lkml.kernel.org/r/20180920135631.23833-1-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Will Deacon <will.deacon@arm.com>, Catalin Marinas <catalin.marinas@arm.com>
Cc: Kyeongdon Kim <kyeongdon.kim@lge.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tracing the event "fs_dax:dax_pmd_insert_mapping" with perf produces this
warning:
[fs_dax:dax_pmd_insert_mapping] unknown op '~'
It is printed in process_op (tools/lib/traceevent/event-parse.c) because
'~' is parsed as a binary operator.
perf reads the format of fs_dax:dax_pmd_insert_mapping ("print fmt") from
/sys/kernel/debug/tracing/events/fs_dax/dax_pmd_insert_mapping/format .
The format contains:
~(((u64) ~(~(((1UL) << 12)-1)))
^
\ interpreted as a binary operator by process_op().
This part is generated in the declaration of the event class
dax_pmd_insert_mapping_class in include/trace/events/fs_dax.h :
__print_flags_u64(__entry->pfn_val & PFN_FLAGS_MASK, "|",
PFN_FLAGS_TRACE),
This patch adds a pair of parentheses in the declaration of PFN_FLAGS_MASK
to make sure that '~' is parsed as a unary operator by perf.
The part of the format that was problematic is now:
~(((u64) (~(~(((1UL) << 12)-1))))
Now, all the '~' are parsed as unary operators.
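
For a C compiler the added parentheses change nothing, which is why the fix is safe: a cast binds to the unary expression that follows it either way. The sketch below checks this for the cast part of the expression only. The SKETCH_* and MASK_* macros are hypothetical stand-ins assuming a page shift of 12, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins assuming 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_MASK  (~((1UL << SKETCH_PAGE_SHIFT) - 1))

/* Cast followed directly by '~', as in the old print format. */
#define MASK_BEFORE ((uint64_t) ~SKETCH_PAGE_MASK)
/* Same expression with the operand of the cast parenthesized. */
#define MASK_AFTER  ((uint64_t) (~SKETCH_PAGE_MASK))

static uint64_t low_bits_before(void)
{
	return MASK_BEFORE;
}

static uint64_t low_bits_after(void)
{
	return MASK_AFTER;
}

/* Both spellings must evaluate to the same value. */
static void check_masks_equal(void)
{
	assert(low_bits_before() == low_bits_after());
}
```

Both forms evaluate to 0xfff here, so only parsers that do not implement the full C grammar, such as the one in process_op(), see a difference.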
Link: http://lkml.kernel.org/r/20181021145939.8760-1-sebhtml@videotron.qc.ca
Signed-off-by: Sebastien Boisvert <sebhtml@videotron.qc.ca>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Elenie Godzaridis <arangradient@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
userfaultfd contains home-grown locking of the waitqueue lock, and does
not disable interrupts. This relies on the fact that no one else takes it
from interrupt context and violates an invariant of the normal waitqueue
locking scheme. With aio poll it is easy to trigger other locks that
disable interrupts (or are called from interrupt context).
Link: http://lkml.kernel.org/r/20181018154101.18750-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org> [4.19.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the big set of char/misc patches for 4.20-rc1.
Loads of things here, we have new code in all of these driver
subsystems:
fpga
stm
extcon
nvmem
eeprom
hyper-v
gsmi
coresight
thunderbolt
vmw_balloon
goldfish
soundwire
along with lots of fixes and minor changes to other small drivers.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'char-misc-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big set of char/misc patches for 4.20-rc1.
Loads of things here, we have new code in all of these driver
subsystems:
- fpga
- stm
- extcon
- nvmem
- eeprom
- hyper-v
- gsmi
- coresight
- thunderbolt
- vmw_balloon
- goldfish
- soundwire
along with lots of fixes and minor changes to other small drivers.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (245 commits)
Documentation/security-bugs: Clarify treatment of embargoed information
lib: Fix ia64 bootloader linkage
MAINTAINERS: Clarify UIO vs UIOVEC maintainer
docs/uio: fix a grammar nitpick
docs: fpga: document programming fpgas using regions
fpga: add devm_fpga_region_create
fpga: bridge: add devm_fpga_bridge_create
fpga: mgr: add devm_fpga_mgr_create
hv_balloon: Replace spin_is_locked() with lockdep
sgi-xp: Replace spin_is_locked() with lockdep
eeprom: New ee1004 driver for DDR4 memory
eeprom: at25: remove unneeded 'at25_remove'
w1: IAD Register is yet readable trough iad sys file. Fix snprintf (%u for unsigned, count for max size).
misc: mic: scif: remove set but not used variables 'src_dma_addr, dst_dma_addr'
misc: mic: fix a DMA pool free failure
platform: goldfish: pipe: Add a blank line to separate varibles and code
platform: goldfish: pipe: Remove redundant casting
platform: goldfish: pipe: Call misc_deregister if init fails
platform: goldfish: pipe: Move the file-scope goldfish_pipe_dev variable into the driver state
platform: goldfish: pipe: Move the file-scope goldfish_pipe_miscdev variable into the driver state
...
Driver core patches for 4.20-rc1
Here is a small number of driver core patches for 4.20-rc1.
Not much happened here this merge window, only a very tiny number of
patches that do:
- add BUS_ATTR_WO() for use by drivers
- component error path fixes
- kernfs range check fix
- other tiny error path fixes and const changes
All of these have been in linux-next with no reported issues for a
while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCW9Lhtw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykHTgCguaJ3SgRefuC/WijjqboTC/SikCoAnRVTUxfU
v8BisSN22kR3jmxwsXud
=/IvY
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is a small number of driver core patches for 4.20-rc1.
Not much happened here this merge window, only a very tiny number of
patches that do:
- add BUS_ATTR_WO() for use by drivers
- component error path fixes
- kernfs range check fix
- other tiny error path fixes and const changes
All of these have been in linux-next with no reported issues for a
while"
* tag 'driver-core-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
devres: provide devm_kstrdup_const()
mm: move is_kernel_rodata() to asm-generic/sections.h
devres: constify p in devm_kfree()
driver core: add BUS_ATTR_WO() macro
kernfs: Fix range checks in kernfs_get_target_path
component: fix loop condition to call unbind() if bind() fails
drivers/base/devtmpfs.c: don't pretend path is const in delete_path
kernfs: update comment about kernfs_path() return value
Here is the big USB/PHY driver patches for 4.20-rc1
Lots of USB changes in here, primarily in these areas:
- typec updates and new drivers
- new PHY drivers
- dwc2 driver updates and additions (this old core keeps getting added
to new devices.)
- usbtmc major update based on the industry group coming together and
working to add new features and performance to the driver.
- USB gadget additions for new features
- USB gadget configfs updates
- chipidea driver updates
- other USB gadget updates
- USB serial driver updates
- renesas driver updates
- xhci driver updates
- other tiny USB driver updates
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCW9LlHw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymnvwCffYmMWyMG9zSOw1oSzFPl7TVN1hYAoMyJqzLg
umyLwWxC9ZWWkrpc3iD8
=ux+Y
-----END PGP SIGNATURE-----
Merge tag 'usb-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB/PHY updates from Greg KH:
"Here is the big USB/PHY driver patches for 4.20-rc1
Lots of USB changes in here, primarily in these areas:
- typec updates and new drivers
- new PHY drivers
- dwc2 driver updates and additions (this old core keeps getting
added to new devices.)
- usbtmc major update based on the industry group coming together and
working to add new features and performance to the driver.
- USB gadget additions for new features
- USB gadget configfs updates
- chipidea driver updates
- other USB gadget updates
- USB serial driver updates
- renesas driver updates
- xhci driver updates
- other tiny USB driver updates
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (229 commits)
usb: phy: ab8500: silence some uninitialized variable warnings
usb: xhci: tegra: Add genpd support
usb: xhci: tegra: Power-off power-domains on removal
usbip:vudc: BUG kmalloc-2048 (Not tainted): Poison overwritten
usbip: tools: fix atoi() on non-null terminated string
USB: misc: appledisplay: fix backlight update_status return code
phy: phy-pxa-usb: add a new driver
usb: host: add DT bindings for faraday fotg2
usb: host: ohci-at91: fix request of irq for optional gpio
usb/early: remove set but not used variable 'remain_length'
usb: typec: Fix copy/paste on typec_set_vconn_role() kerneldoc
usb: typec: tcpm: Report back negotiated PPS voltage and current
USB: core: remove set but not used variable 'udev'
usb: core: fix memory leak on port_dev_path allocation
USB: net2280: Remove ->disconnect() callback from net2280_pullup()
usb: dwc2: disable power_down on rockchip devices
usb: gadget: udc: renesas_usb3: add support for r8a77990
dt-bindings: usb: renesas_usb3: add bindings for r8a77990
usb: gadget: udc: renesas_usb3: Add r8a774a1 support
USB: serial: cypress_m8: remove set but not used variable 'iflag'
...
This has been a smaller cycle with many of the commits being smallish code
fixes and improvements across the drivers.
- Driver updates for bnxt_re, cxgb4, hfi1, hns, mlx5, nes, qedr, and rxe
- Memory window support in hns
- mlx5 user API 'flow mutate/steering' allows accessing the full packet
mangling and matching machinery from user space
- Support inter-working with verbs API calls in the 'devx' mlx5 user API, and
provide options to use devx with less privilege
- Modernize the use of sysfs and the device interface to use attribute groups
and cdev properly for uverbs, and clean up some of the core code's device list
management
- More progress on net namespaces for RDMA devices
- Consolidate driver BAR mmapping support into core code helpers and rework
how RDMA holds pointers to mm_struct for get_user_pages cases
- First pass to use 'dev_name' instead of ib_device->name
- Device renaming for RDMA devices
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlvR7dUACgkQOG33FX4g
mxojiw//a9GU5kq4IZ3LNAEio/3Ql/NHRF0uie5tSzJgipRJA1Ln9zW0Cm1S/ms1
VCmaSJ3l3q3GC4i3tIlsZSIIkN5qtjv/FsT/i+TZwSJYx9BDpPbzWtG6Mp4PSDj0
v3xzklFCN5HMOmEcjkNmyZw3VjHOt2Iw2mKjqvGbI9imCPLOYnw+WQaZLmMWMH6p
GL0HDbAopN5Lv8ireWd8pOhPLVbSb12cWM1crx+yHOS3q8YNWjIXGiZr/QkOPtPr
cymSXB8yuITJ7gnjbs/GxZHg6rxU0knC/Ck8hE7FqqYYHgytTklOXDE2ef1J2lFe
1VmotD+nTsCir0mZWSdcRrszEk7tzaZT7n1oWggKvWySDB6qaH0II8vWumJchQnN
pElIQn/WDgpekIqplamNqXJnKnDXZJpEVA01OHHDN4MNSc+Ad08hQy4FyFzpB6/G
jv9TnDMfGC6ma9pr1ipOXyCgCa2pHYEUCaYxUqRA0O/4ATVl7/PplqT0rqtJ6hKg
o/hmaVCawIFOUKD87/bo7Em2HBs3xNwE/c5ggbsQElLYeydrgPrZfrPfjkshv5K3
eIKDb+HPyis0is1aiF7m/bz1hSIYZp0bQhuKCdzLRjZobwCm5WDPhtuuAWb7vYVw
GSLCJWyet+bLyZxynNOt67gKm9je9lt8YTr5nilz49KeDytspK0=
=pacJ
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"This has been a smaller cycle with many of the commits being smallish
code fixes and improvements across the drivers.
- Driver updates for bnxt_re, cxgb4, hfi1, hns, mlx5, nes, qedr, and
rxe
- Memory window support in hns
- mlx5 user API 'flow mutate/steering' allows accessing the full
packet mangling and matching machinery from user space
- Support inter-working with verbs API calls in the 'devx' mlx5 user
API, and provide options to use devx with less privilege
- Modernize the use of sysfs and the device interface to use attribute
groups and cdev properly for uverbs, and clean up some of the core
code's device list management
- More progress on net namespaces for RDMA devices
- Consolidate driver BAR mmapping support into core code helpers and
rework how RDMA holds pointers to mm_struct for get_user_pages
cases
- First pass to use 'dev_name' instead of ib_device->name
- Device renaming for RDMA devices"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (242 commits)
IB/mlx5: Add support for extended atomic operations
RDMA/core: Fix comment for hw stats init for port == 0
RDMA/core: Refactor ib_register_device() function
RDMA/core: Fix unwinding flow in case of error to register device
ib_srp: Remove WARN_ON in srp_terminate_io()
IB/mlx5: Allow scatter to CQE without global signaled WRs
IB/mlx5: Verify that driver supports user flags
IB/mlx5: Support scatter to CQE for DC transport type
RDMA/drivers: Use core provided API for registering device attributes
RDMA/core: Allow existing drivers to set one sysfs group per device
IB/rxe: Remove unnecessary enum values
RDMA/umad: Use kernel API to allocate umad indexes
RDMA/uverbs: Use kernel API to allocate uverbs indexes
RDMA/core: Increase total number of RDMA ports across all devices
IB/mlx4: Add port and TID to MAD debug print
IB/mlx4: Enable debug print of SMPs
RDMA/core: Rename ports_parent to ports_kobj
RDMA/core: Do not expose unsupported counters
IB/mlx4: Refer to the device kobject instead of ports_parent
RDMA/nldev: Allow IB device rename through RDMA netlink
...
This patch set contains a lot (at least, for me) of improvements to the
RISC-V kernel port:
* The removal of some cacheinfo values that were bogus.
* On systems with F but without D the kernel will not show the F
extension to userspace, as it isn't actually supported.
* Support for futexes.
* Removal of some unused code.
* Cleanup of some menuconfig entries.
* Support for systems without a floating-point unit, and for building
kernels that will never use the floating-point unit.
* More fixes to the RV32I port, which regressed again. It's really time
to get this into a regression test somewhere so I stop breaking it.
Thanks to Zong for resurrecting it again!
* Various fixes that resulted from a year-old review of our original
patch set that I finally got around to.
* Various improvements to SMP support, largely based around having
switched to logical hart numbering, as well as some interrupt
improvements. This one is in the same patch set as above, thanks to
Atish for shepherding everything through as my patch set was a bit of a
mess.
I'm pretty sure this is our largest patch set since the original kernel
contribution, and it's certainly the one with the most contributors.
While I don't have anything else I know I'm going to submit for the
merge window, I would be somewhat surprised if I didn't screw anything
up.
Thanks for the help, everyone!
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlvOdqMTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRDvTKFQLMurQSG+EAC37bHo+3WZhRrQhNn/NTXVNtyPq50X
6tHP5dqilt5ClygJgThs46uxES+TtxGuyJt/1+auGfJn/YDFxgO6pSeNukONG3ho
Vs7dbYWviZTH+nMIET/4s6vB+n6QEP2C3BguT6yrCBoFvUPojZXY7Rj1HVn15mu/
Uj5cJgHETw30o/sM022N5fl8/QeY3DTVmfRmrVV1OJIiEEJNu8vJcjt0zGOQPqDT
TZZ1oMUr+VmPQkR2AYGNnzJa8R3qrSsOYCwKlhRvvPWAph8qbKriiN+VXFfvv3ne
rum4l+p8/WDQ87AsDuC1oKCyjuXFxnl50F5fu5u00MEwszEhjB6zgfsYLU7StmB9
FLDtGhLQ7GsbY32Lu13kEchsiewY9EVlTuwVRwuRordAO+j3fSl73r4Gp61FlrfI
uW+LBr7qbh/eqiOF/PUa/3ivhwHEra+aTuRExUtGUy3Cx1IjzpApSINTnNShjSTn
tuQnCNkREUiOYSAQ+XqonvYeMOtvfqrtj2ts6da6BjLg3hwfOro1LIl1913289+p
taQRkll4k609x/EPyXOWOU5fkr0+T2bZq4Jfl/5YgfUOD+5x7bWJBQuZ4NNgj7mP
gBhQLewo7eKo7JZiWxoXzpHQjtJJpHwTgMJutMEAIUWfjhzR4cB3sZnooWSud2UN
smBehmFq2r1IRw==
=YL+A
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-4.20-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
Pull RISC-V updates from Palmer Dabbelt:
"This patch set contains a lot (at least, for me) of improvements to
the RISC-V kernel port:
- The removal of some cacheinfo values that were bogus.
- On systems with F but without D the kernel will not show the F
extension to userspace, as it isn't actually supported.
- Support for futexes.
- Removal of some unused code.
- Cleanup of some menuconfig entries.
- Support for systems without a floating-point unit, and for building
kernels that will never use the floating-point unit.
- More fixes to the RV32I port, which regressed again. It's really
time to get this into a regression test somewhere so I stop
breaking it. Thanks to Zong for resurrecting it again!
- Various fixes that resulted from a year-old review of our original
patch set that I finally got around to.
- Various improvements to SMP support, largely based around having
switched to logical hart numbering, as well as some interrupt
improvements. This one is in the same patch set as above, thanks to
Atish for shepherding everything through as my patch set was a bit of
a mess.
I'm pretty sure this is our largest patch set since the original
kernel contribution, and it's certainly the one with the most
contributors. While I don't have anything else I know I'm going to
submit for the merge window, I would be somewhat surprised if I didn't
screw anything up.
Thanks for the help, everyone!"
* tag 'riscv-for-linus-4.20-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux: (31 commits)
RISC-V: Cosmetic menuconfig changes
riscv: move GCC version check for ARCH_SUPPORTS_INT128 to Kconfig
RISC-V: remove the unused return_to_handler export
RISC-V: Add futex support.
RISC-V: Add FP register ptrace support for gdb.
RISC-V: Mask out the F extension on systems without D
RISC-V: Don't set cacheinfo.{physical_line_partition,attributes}
RISC-V: Show IPI stats
RISC-V: Show CPU ID and Hart ID separately in /proc/cpuinfo
RISC-V: Use Linux logical CPU number instead of hartid
RISC-V: Add logical CPU indexing for RISC-V
RISC-V: Use WRITE_ONCE instead of direct access
RISC-V: Use mmgrab()
RISC-V: Rename im_okay_therefore_i_am to found_boot_cpu
RISC-V: Rename riscv_of_processor_hart to riscv_of_processor_hartid
RISC-V: Provide a cleaner raw_smp_processor_id()
RISC-V: Disable preemption before enabling interrupts
RISC-V: Comment on the TLB flush in smp_callin()
RISC-V: Filter ISA and MMU values in cpuinfo
RISC-V: Don't set cacheinfo.{physical_line_partition,attributes}
...
ARM:
- Improved guest IPA space support (32 to 52 bits)
- RAS event delivery for 32bit
- PMU fixes
- Guest entry hardening
- Various cleanups
- Port of dirty_log_test selftest
PPC:
- Nested HV KVM support for radix guests on POWER9. The performance is
much better than with PR KVM. Migration and arbitrary levels of
nesting are supported.
- Disable nested HV-KVM on early POWER9 chips that need a particular hardware
bug workaround
- One VM per core mode to prevent potential data leaks
- PCI pass-through optimization
- merge ppc-kvm topic branch and kvm-ppc-fixes to get a better base
s390:
- Initial version of AP crypto virtualization via vfio-mdev
- Improvement for vfio-ap
- Set the host program identifier
- Optimize page table locking
x86:
- Enable nested virtualization by default
- Implement Hyper-V IPI hypercalls
- Improve #PF and #DB handling
- Allow guests to use Enlightened VMCS
- Add migration selftests for VMCS and Enlightened VMCS
- Allow coalesced PIO accesses
- Add an option to perform nested VMCS host state consistency check
through hardware
- Automatic tuning of lapic_timer_advance_ns
- Many fixes, minor improvements, and cleanups
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJb0FINAAoJEED/6hsPKofoI60IAJRS3vOAQ9Fav8cJsO1oBHcX
3+NexfnBke1bzrjIR3SUcHKGZbdnVPNZc+Q4JjIbPpPmmOMU5jc9BC1dmd5f4Vzh
BMnQ0yCvgFv3A3fy/Icx1Z8NJppxosdmqdQLrQrNo8aD3cjnqY2yQixdXrAfzLzw
XEgKdIFCCz8oVN/C9TT4wwJn6l9OE7BM5bMKGFy5VNXzMu7t64UDOLbbjZxNgi1g
teYvfVGdt5mH0N7b2GPPWRbJmgnz5ygVVpVNQUEFrdKZoCm6r5u9d19N+RRXAwan
ZYFj10W2T8pJOUf3tryev4V33X7MRQitfJBo4tP5hZfi9uRX89np5zP1CFE7AtY=
=yEPW
-----END PGP SIGNATURE-----
Merge tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Radim Krčmář:
"ARM:
- Improved guest IPA space support (32 to 52 bits)
- RAS event delivery for 32bit
- PMU fixes
- Guest entry hardening
- Various cleanups
- Port of dirty_log_test selftest
PPC:
- Nested HV KVM support for radix guests on POWER9. The performance
is much better than with PR KVM. Migration and arbitrary levels of
nesting are supported.
- Disable nested HV-KVM on early POWER9 chips that need a particular
hardware bug workaround
- One VM per core mode to prevent potential data leaks
- PCI pass-through optimization
- merge ppc-kvm topic branch and kvm-ppc-fixes to get a better base
s390:
- Initial version of AP crypto virtualization via vfio-mdev
- Improvement for vfio-ap
- Set the host program identifier
- Optimize page table locking
x86:
- Enable nested virtualization by default
- Implement Hyper-V IPI hypercalls
- Improve #PF and #DB handling
- Allow guests to use Enlightened VMCS
- Add migration selftests for VMCS and Enlightened VMCS
- Allow coalesced PIO accesses
- Add an option to perform nested VMCS host state consistency check
through hardware
- Automatic tuning of lapic_timer_advance_ns
- Many fixes, minor improvements, and cleanups"
* tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits)
KVM/nVMX: Do not validate that posted_intr_desc_addr is page aligned
Revert "kvm: x86: optimize dr6 restore"
KVM: PPC: Optimize clearing TCEs for sparse tables
x86/kvm/nVMX: tweak shadow fields
selftests/kvm: add missing executables to .gitignore
KVM: arm64: Safety check PSTATE when entering guest and handle IL
KVM: PPC: Book3S HV: Don't use streamlined entry path on early POWER9 chips
arm/arm64: KVM: Enable 32 bits kvm vcpu events support
arm/arm64: KVM: Rename function kvm_arch_dev_ioctl_check_extension()
KVM: arm64: Fix caching of host MDCR_EL2 value
KVM: VMX: enable nested virtualization by default
KVM/x86: Use 32bit xor to clear registers in svm.c
kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD
kvm: vmx: Defer setting of DR6 until #DB delivery
kvm: x86: Defer setting of CR2 until #PF delivery
kvm: x86: Add payload operands to kvm_multiple_exception
kvm: x86: Add exception payload fields to kvm_vcpu_events
kvm: x86: Add has_payload and payload to kvm_queued_exception
KVM: Documentation: Fix omission in struct kvm_vcpu_events
KVM: selftests: add Enlightened VMCS test
...
Pull cgroup updates from Tejun Heo:
"All trivial changes - simplification, typo fix and adding
cond_resched() in a netclassid update loop"
* 'for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup, netclassid: add a preemption point to write_classid
rdmacg: fix a typo in rdmacg documentation
cgroup: Simplify cgroup_ancestor
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJb0GRmAAoJEFKgDEdIgJTy/IEQAKOC4fHonA5LUa8GEO7s+byX
LNQzH/NAR+86CCKdWzaiCpyNEbwzXC/5kFuGB+NIKGrutQ+HO1haKG6URRvZiw0c
YgxaRpJ1h5OfZNuCjql5dX/bFAuBPwEPUAPusA4YJYSiXota2O76OW+RwEpr71i5
/Z2ygi3nlPECOhS1jTwY+cxGci67cfBIzKKdTXEft53xO38xAp0+Ea5Ljf2kIbgl
rNz6XqJcy7rcAwEvh1kHw0AVEauLWs4NRlLX5eX7FHnqoh4TVFxWhLfNKirRo7gb
vHemuucVUdvgG8yoFyg9CkFNLIMV9fWyDXkxab7dvrgD61oceLbNZ7dL86eijz7j
qBoQy/igiH1nqIiczhTtp+JltIMzjPmC3unaie7f+oTHnKinzAaaND3wUjqObdZm
MZQWsjIpBXC1nIcIs35NZiVMs8xcOG/sekkRcjU6/kbrBkoRqR5xhbm/tIcaCj0Z
wKTlgET9b4dnmX8ZiEpvrfeMGxEu4yqfh1O3rvKnk8hKgxTvnzsSriHKh86KSv1L
Gby4BA1zYwQxsJJ6LZMJjtHptxKBTcLANx8C/E9wPETP4EM5A1m5egJYRlDW3hb9
MhM4vzUQfq63b9gPduP10jlLrXsWBQRAcAvtvm2lou3TNqipm6ZqVn9vqkv0retR
Auk7mO33MVpHbDOQw1GK
=N+Ts
-----END PGP SIGNATURE-----
Merge tag 'printk-for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:
- Fix two more locations where printf formatting leaked pointers
- Better log_buf_len parameter handling
- Add prefix to messages from printk code
- Do not miss messages on other consoles when the log is replayed on a
new one
- Reduce race between console registration and panic() when the log
might get replayed on all consoles
- Some cont buffer code clean up
- Call console only when there is something to do (log vs cont buffer)
* tag 'printk-for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
lib/vsprintf: Hash printed address for netdev bits fallback
lib/vsprintf: Hash legacy clock addresses
lib/vsprintf: Prepare for more general use of ptr_to_id()
lib/vsprintf: Make ptr argument conts in ptr_to_id()
printk: fix integer overflow in setup_log_buf()
printk: do not preliminary split up cont buffer
printk: lock/unlock console only for new logbuf entries
printk: keep kernel cont support always enabled
printk: Give error on attempt to set log buffer length to over 2G
printk: Add KBUILD_MODNAME and remove a redundant print prefix
printk: Correct wrong casting
printk: Fix panic caused by passing log_buf_len to command line
printk: CON_PRINTBUFFER console registration is a bit racy
printk: Do not miss new messages when replaying the log
Pull LoadPin updates from James Morris:
"From Kees: This is a small reporting improvement and the param change
needed for the ordering series (but since the loadpin change is
desired and separable, I'm putting it here)"
* 'next-loadpin' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
LoadPin: Rename boot param "enabled" to "enforce"
LoadPin: Report friendly block device name
Pull smack updates from James Morris:
"From Casey: three patches for Smack for 4.20. Two clean up warnings
and one is a rarely encountered ptrace capability check"
* 'next-smack' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
Smack: Mark expected switch fall-through
Smack: ptrace capability use fixes
Smack: remove set but not used variable 'root_inode'
Pull TPM updates from James Morris:
"From Jarkko: The only new feature is non-blocking operation for
/dev/tpm0"
* 'next-tpm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
tpm: Restore functionality to xen vtpm driver.
tpm: add support for nonblocking operation
tpm: add ptr to the tpm_space struct to file_priv
tpm: Make SECURITYFS a weak dependency
tpm: suppress transmit cmd error logs when TPM 1.2 is disabled/deactivated
tpm: fix response size validation in tpm_get_random()
Pull integrity updates from James Morris:
"From Mimi: This contains a couple of bug fixes, including one for a
recent problem with calculating file hashes on overlayfs, and some
code cleanup"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
MAINTAINERS: add Jarkko as maintainer for trusted keys
ima: open a new file instance if no read permissions
ima: fix showing large 'violations' or 'runtime_measurements_count'
security/integrity: remove unnecessary 'init_keyring' variable
security/integrity: constify some read-only data
vfs: require i_size <= SIZE_MAX in kernel_read_file()
Pull more ->lookup() cleanups from Al Viro:
"Some ->lookup() instances are still overcomplicating the life
for themselves, open-coding the stuff that would be handled by
d_splice_alias() just fine.
Simplify a couple of such cases caught this cycle and document
d_splice_alias() intended use"
* 'work.lookup' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Document d_splice_alias() calling conventions for ->lookup() users.
simplify btrfs_lookup()
clean erofs_lookup()
Pull alpha syscall glue updates from Al Viro:
"Two old patches making alpha syscall glue a bit less mysterious"
* 'work.alpha' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
alpha: unify the glue for sigreturn-like syscalls
alpha: use alpha_ni_syscall only for syscall zero
Pull compat_ioctl fixes from Al Viro:
"A bunch of compat_ioctl fixes, mostly in bluetooth.
Hopefully, most of fs/compat_ioctl.c will get killed off over the next
few cycles; between this, tty series already merged and Arnd's work
this cycle ought to take a good chunk out of the damn thing..."
* 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
hidp: fix compat_ioctl
hidp: constify hidp_connection_add()
cmtp: fix compat_ioctl
bnep: fix compat_ioctl
compat_ioctl: trim the pointless includes