License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
|
|
|
/* SPDX-License-Identifier: GPL-2.0 */
|
2017-02-09 01:51:29 +08:00
|
|
|
#ifndef _LINUX_SCHED_MM_H
|
|
|
|
#define _LINUX_SCHED_MM_H
|
|
|
|
|
2017-02-09 01:51:54 +08:00
|
|
|
#include <linux/kernel.h>
|
|
|
|
#include <linux/atomic.h>
|
2017-02-09 01:51:29 +08:00
|
|
|
#include <linux/sched.h>
|
2017-02-04 07:16:44 +08:00
|
|
|
#include <linux/mm_types.h>
|
2017-02-03 03:56:33 +08:00
|
|
|
#include <linux/gfp.h>
|
2018-01-30 04:20:17 +08:00
|
|
|
#include <linux/sync_core.h>
|
2017-02-09 01:51:29 +08:00
|
|
|
|
2017-02-02 02:08:20 +08:00
|
|
|
/*
|
|
|
|
* Routines for handling mm_structs
|
|
|
|
*/
|
2018-02-01 08:15:51 +08:00
|
|
|
extern struct mm_struct *mm_alloc(void);
|
2017-02-02 02:08:20 +08:00
|
|
|
|
|
|
|
/**
|
|
|
|
* mmgrab() - Pin a &struct mm_struct.
|
|
|
|
* @mm: The &struct mm_struct to pin.
|
|
|
|
*
|
|
|
|
* Make sure that @mm will not get freed even after the owning task
|
|
|
|
* exits. This doesn't guarantee that the associated address space
|
|
|
|
* will still exist later on and mmget_not_zero() has to be used before
|
|
|
|
* accessing it.
|
|
|
|
*
|
|
|
|
* This is a preferred way to to pin @mm for a longer/unbounded amount
|
|
|
|
* of time.
|
|
|
|
*
|
|
|
|
* Use mmdrop() to release the reference acquired by mmgrab().
|
|
|
|
*
|
|
|
|
* See also <Documentation/vm/active_mm.txt> for an in-depth explanation
|
|
|
|
* of &mm_struct.mm_count vs &mm_struct.mm_users.
|
|
|
|
*/
|
|
|
|
static inline void mmgrab(struct mm_struct *mm)
|
|
|
|
{
|
|
|
|
atomic_inc(&mm->mm_count);
|
|
|
|
}
|
|
|
|
|
2018-02-01 08:15:51 +08:00
|
|
|
extern void mmdrop(struct mm_struct *mm);
|
2017-02-02 02:08:20 +08:00
|
|
|
|
|
|
|
/**
|
|
|
|
* mmget() - Pin the address space associated with a &struct mm_struct.
|
|
|
|
* @mm: The address space to pin.
|
|
|
|
*
|
|
|
|
* Make sure that the address space of the given &struct mm_struct doesn't
|
|
|
|
* go away. This does not protect against parts of the address space being
|
|
|
|
* modified or freed, however.
|
|
|
|
*
|
|
|
|
* Never use this function to pin this address space for an
|
|
|
|
* unbounded/indefinite amount of time.
|
|
|
|
*
|
|
|
|
* Use mmput() to release the reference acquired by mmget().
|
|
|
|
*
|
|
|
|
* See also <Documentation/vm/active_mm.txt> for an in-depth explanation
|
|
|
|
* of &mm_struct.mm_count vs &mm_struct.mm_users.
|
|
|
|
*/
|
|
|
|
static inline void mmget(struct mm_struct *mm)
|
|
|
|
{
|
|
|
|
atomic_inc(&mm->mm_users);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline bool mmget_not_zero(struct mm_struct *mm)
|
|
|
|
{
|
|
|
|
return atomic_inc_not_zero(&mm->mm_users);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* mmput gets rid of the mappings and all user-space */
|
|
|
|
extern void mmput(struct mm_struct *);
|
2017-10-04 07:15:00 +08:00
|
|
|
#ifdef CONFIG_MMU
|
|
|
|
/* same as above but performs the slow path from the async context. Can
|
|
|
|
* be called from the atomic context as well
|
|
|
|
*/
|
|
|
|
void mmput_async(struct mm_struct *);
|
|
|
|
#endif
|
2017-02-02 02:08:20 +08:00
|
|
|
|
|
|
|
/* Grab a reference to a task's mm, if it is not already going away */
|
|
|
|
extern struct mm_struct *get_task_mm(struct task_struct *task);
|
|
|
|
/*
|
|
|
|
* Grab a reference to a task's mm, if it is not already going away
|
|
|
|
* and ptrace_may_access with the mode parameter passed to it
|
|
|
|
* succeeds.
|
|
|
|
*/
|
|
|
|
extern struct mm_struct *mm_access(struct task_struct *task, unsigned int mode);
|
|
|
|
/* Remove the current tasks stale references to the old mm_struct */
|
|
|
|
extern void mm_release(struct task_struct *, struct mm_struct *);
|
|
|
|
|
2017-02-02 19:18:24 +08:00
|
|
|
#ifdef CONFIG_MEMCG
|
|
|
|
extern void mm_update_next_owner(struct mm_struct *mm);
|
|
|
|
#else
|
|
|
|
static inline void mm_update_next_owner(struct mm_struct *mm)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
#endif /* CONFIG_MEMCG */
|
|
|
|
|
|
|
|
#ifdef CONFIG_MMU
|
|
|
|
extern void arch_pick_mmap_layout(struct mm_struct *mm);
|
|
|
|
extern unsigned long
|
|
|
|
arch_get_unmapped_area(struct file *, unsigned long, unsigned long,
|
|
|
|
unsigned long, unsigned long);
|
|
|
|
extern unsigned long
|
|
|
|
arch_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
|
|
|
|
unsigned long len, unsigned long pgoff,
|
|
|
|
unsigned long flags);
|
|
|
|
#else
|
|
|
|
static inline void arch_pick_mmap_layout(struct mm_struct *mm) {}
|
|
|
|
#endif
|
|
|
|
|
2017-02-02 19:32:21 +08:00
|
|
|
static inline bool in_vfork(struct task_struct *tsk)
|
|
|
|
{
|
|
|
|
bool ret;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* need RCU to access ->real_parent if CLONE_VM was used along with
|
|
|
|
* CLONE_PARENT.
|
|
|
|
*
|
|
|
|
* We check real_parent->mm == tsk->mm because CLONE_VFORK does not
|
|
|
|
* imply CLONE_VM
|
|
|
|
*
|
|
|
|
* CLONE_VFORK can be used with CLONE_PARENT/CLONE_THREAD and thus
|
|
|
|
* ->real_parent is not necessarily the task doing vfork(), so in
|
|
|
|
* theory we can't rely on task_lock() if we want to dereference it.
|
|
|
|
*
|
|
|
|
* And in this case we can't trust the real_parent->mm == tsk->mm
|
|
|
|
* check, it can be false negative. But we do not care, if init or
|
|
|
|
* another oom-unkillable task does this it should blame itself.
|
|
|
|
*/
|
|
|
|
rcu_read_lock();
|
|
|
|
ret = tsk->vfork_done && tsk->real_parent->mm == tsk->mm;
|
|
|
|
rcu_read_unlock();
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
mm: introduce memalloc_nofs_{save,restore} API
GFP_NOFS context is used for the following 5 reasons currently:
- to prevent from deadlocks when the lock held by the allocation
context would be needed during the memory reclaim
- to prevent from stack overflows during the reclaim because the
allocation is performed from a deep context already
- to prevent lockups when the allocation context depends on other
reclaimers to make a forward progress indirectly
- just in case because this would be safe from the fs POV
- silence lockdep false positives
Unfortunately overuse of this allocation context brings some problems to
the MM. Memory reclaim is much weaker (especially during heavy FS
metadata workloads), OOM killer cannot be invoked because the MM layer
doesn't have enough information about how much memory is freeable by the
FS layer.
In many cases it is far from clear why the weaker context is even used
and so it might be used unnecessarily. We would like to get rid of
those as much as possible. One way to do that is to use the flag in
scopes rather than isolated cases. Such a scope is declared when really
necessary, tracked per task and all the allocation requests from within
the context will simply inherit the GFP_NOFS semantic.
Not only this is easier to understand and maintain because there are
much less problematic contexts than specific allocation requests, this
also helps code paths where FS layer interacts with other layers (e.g.
crypto, security modules, MM etc...) and there is no easy way to convey
the allocation context between the layers.
Introduce memalloc_nofs_{save,restore} API to control the scope of
GFP_NOFS allocation context. This is basically copying
memalloc_noio_{save,restore} API we have for other restricted allocation
context GFP_NOIO. The PF_MEMALLOC_NOFS flag already exists and it is
just an alias for PF_FSTRANS which has been xfs specific until recently.
There are no more PF_FSTRANS users anymore so let's just drop it.
PF_MEMALLOC_NOFS is now checked in the MM layer and drops __GFP_FS
implicitly same as PF_MEMALLOC_NOIO drops __GFP_IO. memalloc_noio_flags
is renamed to current_gfp_context because it now cares about both
PF_MEMALLOC_NOFS and PF_MEMALLOC_NOIO contexts. Xfs code paths preserve
their semantic. kmem_flags_convert() doesn't need to evaluate the flag
anymore.
This patch shouldn't introduce any functional changes.
Let's hope that filesystems will drop direct GFP_NOFS (resp. ~__GFP_FS)
usage as much as possible and only use a properly documented
memalloc_nofs_{save,restore} checkpoints where they are appropriate.
[akpm@linux-foundation.org: fix comment typo, reflow comment]
Link: http://lkml.kernel.org/r/20170306131408.9828-5-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Chris Mason <clm@fb.com>
Cc: David Sterba <dsterba@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Nikolay Borisov <nborisov@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-04 05:53:15 +08:00
|
|
|
/*
|
|
|
|
* Applies per-task gfp context to the given allocation flags.
|
|
|
|
* PF_MEMALLOC_NOIO implies GFP_NOIO
|
|
|
|
* PF_MEMALLOC_NOFS implies GFP_NOFS
|
2017-02-03 03:43:54 +08:00
|
|
|
*/
|
mm: introduce memalloc_nofs_{save,restore} API
GFP_NOFS context is used for the following 5 reasons currently:
- to prevent from deadlocks when the lock held by the allocation
context would be needed during the memory reclaim
- to prevent from stack overflows during the reclaim because the
allocation is performed from a deep context already
- to prevent lockups when the allocation context depends on other
reclaimers to make a forward progress indirectly
- just in case because this would be safe from the fs POV
- silence lockdep false positives
Unfortunately overuse of this allocation context brings some problems to
the MM. Memory reclaim is much weaker (especially during heavy FS
metadata workloads), OOM killer cannot be invoked because the MM layer
doesn't have enough information about how much memory is freeable by the
FS layer.
In many cases it is far from clear why the weaker context is even used
and so it might be used unnecessarily. We would like to get rid of
those as much as possible. One way to do that is to use the flag in
scopes rather than isolated cases. Such a scope is declared when really
necessary, tracked per task and all the allocation requests from within
the context will simply inherit the GFP_NOFS semantic.
Not only this is easier to understand and maintain because there are
much less problematic contexts than specific allocation requests, this
also helps code paths where FS layer interacts with other layers (e.g.
crypto, security modules, MM etc...) and there is no easy way to convey
the allocation context between the layers.
Introduce memalloc_nofs_{save,restore} API to control the scope of
GFP_NOFS allocation context. This is basically copying
memalloc_noio_{save,restore} API we have for other restricted allocation
context GFP_NOIO. The PF_MEMALLOC_NOFS flag already exists and it is
just an alias for PF_FSTRANS which has been xfs specific until recently.
There are no more PF_FSTRANS users anymore so let's just drop it.
PF_MEMALLOC_NOFS is now checked in the MM layer and drops __GFP_FS
implicitly same as PF_MEMALLOC_NOIO drops __GFP_IO. memalloc_noio_flags
is renamed to current_gfp_context because it now cares about both
PF_MEMALLOC_NOFS and PF_MEMALLOC_NOIO contexts. Xfs code paths preserve
their semantic. kmem_flags_convert() doesn't need to evaluate the flag
anymore.
This patch shouldn't introduce any functional changes.
Let's hope that filesystems will drop direct GFP_NOFS (resp. ~__GFP_FS)
usage as much as possible and only use a properly documented
memalloc_nofs_{save,restore} checkpoints where they are appropriate.
[akpm@linux-foundation.org: fix comment typo, reflow comment]
Link: http://lkml.kernel.org/r/20170306131408.9828-5-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Chris Mason <clm@fb.com>
Cc: David Sterba <dsterba@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Nikolay Borisov <nborisov@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-04 05:53:15 +08:00
|
|
|
static inline gfp_t current_gfp_context(gfp_t flags)
|
2017-02-03 03:43:54 +08:00
|
|
|
{
|
mm: introduce memalloc_nofs_{save,restore} API
GFP_NOFS context is used for the following 5 reasons currently:
- to prevent from deadlocks when the lock held by the allocation
context would be needed during the memory reclaim
- to prevent from stack overflows during the reclaim because the
allocation is performed from a deep context already
- to prevent lockups when the allocation context depends on other
reclaimers to make a forward progress indirectly
- just in case because this would be safe from the fs POV
- silence lockdep false positives
Unfortunately overuse of this allocation context brings some problems to
the MM. Memory reclaim is much weaker (especially during heavy FS
metadata workloads), OOM killer cannot be invoked because the MM layer
doesn't have enough information about how much memory is freeable by the
FS layer.
In many cases it is far from clear why the weaker context is even used
and so it might be used unnecessarily. We would like to get rid of
those as much as possible. One way to do that is to use the flag in
scopes rather than isolated cases. Such a scope is declared when really
necessary, tracked per task and all the allocation requests from within
the context will simply inherit the GFP_NOFS semantic.
Not only this is easier to understand and maintain because there are
much less problematic contexts than specific allocation requests, this
also helps code paths where FS layer interacts with other layers (e.g.
crypto, security modules, MM etc...) and there is no easy way to convey
the allocation context between the layers.
Introduce memalloc_nofs_{save,restore} API to control the scope of
GFP_NOFS allocation context. This is basically copying
memalloc_noio_{save,restore} API we have for other restricted allocation
context GFP_NOIO. The PF_MEMALLOC_NOFS flag already exists and it is
just an alias for PF_FSTRANS which has been xfs specific until recently.
There are no more PF_FSTRANS users anymore so let's just drop it.
PF_MEMALLOC_NOFS is now checked in the MM layer and drops __GFP_FS
implicitly same as PF_MEMALLOC_NOIO drops __GFP_IO. memalloc_noio_flags
is renamed to current_gfp_context because it now cares about both
PF_MEMALLOC_NOFS and PF_MEMALLOC_NOIO contexts. Xfs code paths preserve
their semantic. kmem_flags_convert() doesn't need to evaluate the flag
anymore.
This patch shouldn't introduce any functional changes.
Let's hope that filesystems will drop direct GFP_NOFS (resp. ~__GFP_FS)
usage as much as possible and only use a properly documented
memalloc_nofs_{save,restore} checkpoints where they are appropriate.
[akpm@linux-foundation.org: fix comment typo, reflow comment]
Link: http://lkml.kernel.org/r/20170306131408.9828-5-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Chris Mason <clm@fb.com>
Cc: David Sterba <dsterba@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Nikolay Borisov <nborisov@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-04 05:53:15 +08:00
|
|
|
/*
|
|
|
|
* NOIO implies both NOIO and NOFS and it is a weaker context
|
|
|
|
* so always make sure it makes precendence
|
|
|
|
*/
|
2017-02-03 03:43:54 +08:00
|
|
|
if (unlikely(current->flags & PF_MEMALLOC_NOIO))
|
|
|
|
flags &= ~(__GFP_IO | __GFP_FS);
|
mm: introduce memalloc_nofs_{save,restore} API
GFP_NOFS context is used for the following 5 reasons currently:
- to prevent from deadlocks when the lock held by the allocation
context would be needed during the memory reclaim
- to prevent from stack overflows during the reclaim because the
allocation is performed from a deep context already
- to prevent lockups when the allocation context depends on other
reclaimers to make a forward progress indirectly
- just in case because this would be safe from the fs POV
- silence lockdep false positives
Unfortunately overuse of this allocation context brings some problems to
the MM. Memory reclaim is much weaker (especially during heavy FS
metadata workloads), OOM killer cannot be invoked because the MM layer
doesn't have enough information about how much memory is freeable by the
FS layer.
In many cases it is far from clear why the weaker context is even used
and so it might be used unnecessarily. We would like to get rid of
those as much as possible. One way to do that is to use the flag in
scopes rather than isolated cases. Such a scope is declared when really
necessary, tracked per task and all the allocation requests from within
the context will simply inherit the GFP_NOFS semantic.
Not only this is easier to understand and maintain because there are
much less problematic contexts than specific allocation requests, this
also helps code paths where FS layer interacts with other layers (e.g.
crypto, security modules, MM etc...) and there is no easy way to convey
the allocation context between the layers.
Introduce memalloc_nofs_{save,restore} API to control the scope of
GFP_NOFS allocation context. This is basically copying
memalloc_noio_{save,restore} API we have for other restricted allocation
context GFP_NOIO. The PF_MEMALLOC_NOFS flag already exists and it is
just an alias for PF_FSTRANS which has been xfs specific until recently.
There are no more PF_FSTRANS users anymore so let's just drop it.
PF_MEMALLOC_NOFS is now checked in the MM layer and drops __GFP_FS
implicitly same as PF_MEMALLOC_NOIO drops __GFP_IO. memalloc_noio_flags
is renamed to current_gfp_context because it now cares about both
PF_MEMALLOC_NOFS and PF_MEMALLOC_NOIO contexts. Xfs code paths preserve
their semantic. kmem_flags_convert() doesn't need to evaluate the flag
anymore.
This patch shouldn't introduce any functional changes.
Let's hope that filesystems will drop direct GFP_NOFS (resp. ~__GFP_FS)
usage as much as possible and only use a properly documented
memalloc_nofs_{save,restore} checkpoints where they are appropriate.
[akpm@linux-foundation.org: fix comment typo, reflow comment]
Link: http://lkml.kernel.org/r/20170306131408.9828-5-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Chris Mason <clm@fb.com>
Cc: David Sterba <dsterba@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Nikolay Borisov <nborisov@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-04 05:53:15 +08:00
|
|
|
else if (unlikely(current->flags & PF_MEMALLOC_NOFS))
|
|
|
|
flags &= ~__GFP_FS;
|
2017-02-03 03:43:54 +08:00
|
|
|
return flags;
|
|
|
|
}
|
|
|
|
|
2017-03-03 17:13:38 +08:00
|
|
|
#ifdef CONFIG_LOCKDEP
|
|
|
|
extern void fs_reclaim_acquire(gfp_t gfp_mask);
|
|
|
|
extern void fs_reclaim_release(gfp_t gfp_mask);
|
|
|
|
#else
|
|
|
|
static inline void fs_reclaim_acquire(gfp_t gfp_mask) { }
|
|
|
|
static inline void fs_reclaim_release(gfp_t gfp_mask) { }
|
|
|
|
#endif
|
|
|
|
|
2017-02-03 03:43:54 +08:00
|
|
|
static inline unsigned int memalloc_noio_save(void)
|
|
|
|
{
|
|
|
|
unsigned int flags = current->flags & PF_MEMALLOC_NOIO;
|
|
|
|
current->flags |= PF_MEMALLOC_NOIO;
|
|
|
|
return flags;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void memalloc_noio_restore(unsigned int flags)
|
|
|
|
{
|
|
|
|
current->flags = (current->flags & ~PF_MEMALLOC_NOIO) | flags;
|
|
|
|
}
|
|
|
|
|
mm: introduce memalloc_nofs_{save,restore} API
GFP_NOFS context is used for the following 5 reasons currently:
- to prevent from deadlocks when the lock held by the allocation
context would be needed during the memory reclaim
- to prevent from stack overflows during the reclaim because the
allocation is performed from a deep context already
- to prevent lockups when the allocation context depends on other
reclaimers to make a forward progress indirectly
- just in case because this would be safe from the fs POV
- silence lockdep false positives
Unfortunately overuse of this allocation context brings some problems to
the MM. Memory reclaim is much weaker (especially during heavy FS
metadata workloads), OOM killer cannot be invoked because the MM layer
doesn't have enough information about how much memory is freeable by the
FS layer.
In many cases it is far from clear why the weaker context is even used
and so it might be used unnecessarily. We would like to get rid of
those as much as possible. One way to do that is to use the flag in
scopes rather than isolated cases. Such a scope is declared when really
necessary, tracked per task and all the allocation requests from within
the context will simply inherit the GFP_NOFS semantic.
Not only this is easier to understand and maintain because there are
much less problematic contexts than specific allocation requests, this
also helps code paths where FS layer interacts with other layers (e.g.
crypto, security modules, MM etc...) and there is no easy way to convey
the allocation context between the layers.
Introduce memalloc_nofs_{save,restore} API to control the scope of
GFP_NOFS allocation context. This is basically copying
memalloc_noio_{save,restore} API we have for other restricted allocation
context GFP_NOIO. The PF_MEMALLOC_NOFS flag already exists and it is
just an alias for PF_FSTRANS which has been xfs specific until recently.
There are no more PF_FSTRANS users anymore so let's just drop it.
PF_MEMALLOC_NOFS is now checked in the MM layer and drops __GFP_FS
implicitly same as PF_MEMALLOC_NOIO drops __GFP_IO. memalloc_noio_flags
is renamed to current_gfp_context because it now cares about both
PF_MEMALLOC_NOFS and PF_MEMALLOC_NOIO contexts. Xfs code paths preserve
their semantic. kmem_flags_convert() doesn't need to evaluate the flag
anymore.
This patch shouldn't introduce any functional changes.
Let's hope that filesystems will drop direct GFP_NOFS (resp. ~__GFP_FS)
usage as much as possible and only use a properly documented
memalloc_nofs_{save,restore} checkpoints where they are appropriate.
[akpm@linux-foundation.org: fix comment typo, reflow comment]
Link: http://lkml.kernel.org/r/20170306131408.9828-5-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Chris Mason <clm@fb.com>
Cc: David Sterba <dsterba@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Nikolay Borisov <nborisov@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-04 05:53:15 +08:00
|
|
|
static inline unsigned int memalloc_nofs_save(void)
|
|
|
|
{
|
|
|
|
unsigned int flags = current->flags & PF_MEMALLOC_NOFS;
|
|
|
|
current->flags |= PF_MEMALLOC_NOFS;
|
|
|
|
return flags;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void memalloc_nofs_restore(unsigned int flags)
|
|
|
|
{
|
|
|
|
current->flags = (current->flags & ~PF_MEMALLOC_NOFS) | flags;
|
|
|
|
}
|
|
|
|
|
2017-05-09 06:59:50 +08:00
|
|
|
static inline unsigned int memalloc_noreclaim_save(void)
|
|
|
|
{
|
|
|
|
unsigned int flags = current->flags & PF_MEMALLOC;
|
|
|
|
current->flags |= PF_MEMALLOC;
|
|
|
|
return flags;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void memalloc_noreclaim_restore(unsigned int flags)
|
|
|
|
{
|
|
|
|
current->flags = (current->flags & ~PF_MEMALLOC) | flags;
|
|
|
|
}
|
|
|
|
|
2017-10-20 01:30:15 +08:00
|
|
|
#ifdef CONFIG_MEMBARRIER
|
|
|
|
enum {
|
2018-01-30 04:20:13 +08:00
|
|
|
MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY = (1U << 0),
|
|
|
|
MEMBARRIER_STATE_PRIVATE_EXPEDITED = (1U << 1),
|
|
|
|
MEMBARRIER_STATE_GLOBAL_EXPEDITED_READY = (1U << 2),
|
|
|
|
MEMBARRIER_STATE_GLOBAL_EXPEDITED = (1U << 3),
|
2018-01-30 04:20:17 +08:00
|
|
|
MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY = (1U << 4),
|
|
|
|
MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE = (1U << 5),
|
|
|
|
};
|
|
|
|
|
|
|
|
enum {
|
|
|
|
MEMBARRIER_FLAG_SYNC_CORE = (1U << 0),
|
2017-10-20 01:30:15 +08:00
|
|
|
};
|
|
|
|
|
2018-01-30 04:20:11 +08:00
|
|
|
#ifdef CONFIG_ARCH_HAS_MEMBARRIER_CALLBACKS
|
|
|
|
#include <asm/membarrier.h>
|
|
|
|
#endif
|
|
|
|
|
2018-01-30 04:20:17 +08:00
|
|
|
static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
|
|
|
|
{
|
|
|
|
if (likely(!(atomic_read(&mm->membarrier_state) &
|
|
|
|
MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE)))
|
|
|
|
return;
|
|
|
|
sync_core_before_usermode();
|
|
|
|
}
|
|
|
|
|
2017-10-20 01:30:15 +08:00
|
|
|
static inline void membarrier_execve(struct task_struct *t)
|
|
|
|
{
|
|
|
|
atomic_set(&t->mm->membarrier_state, 0);
|
|
|
|
}
|
|
|
|
#else
|
2018-01-30 04:20:11 +08:00
|
|
|
#ifdef CONFIG_ARCH_HAS_MEMBARRIER_CALLBACKS
|
|
|
|
static inline void membarrier_arch_switch_mm(struct mm_struct *prev,
|
|
|
|
struct mm_struct *next,
|
|
|
|
struct task_struct *tsk)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
#endif
|
2017-10-20 01:30:15 +08:00
|
|
|
static inline void membarrier_execve(struct task_struct *t)
|
|
|
|
{
|
|
|
|
}
|
2018-01-30 04:20:17 +08:00
|
|
|
static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
|
|
|
|
{
|
|
|
|
}
|
2017-10-20 01:30:15 +08:00
|
|
|
#endif
|
|
|
|
|
2017-02-09 01:51:29 +08:00
|
|
|
#endif /* _LINUX_SCHED_MM_H */
|