License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note"; otherwise it was "GPL-2.0". The results were:
   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __LINUX_GFP_H
#define __LINUX_GFP_H

#include <linux/gfp_types.h>

#include <linux/mmzone.h>
#include <linux/topology.h>

struct vm_area_struct;

/* Convert GFP flags to their corresponding migrate type */
#define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
#define GFP_MOVABLE_SHIFT 3

static inline int gfp_migratetype(const gfp_t gfp_flags)
Print out statistics in relation to fragmentation avoidance to /proc/pagetypeinfo
This patch provides fragmentation avoidance statistics via /proc/pagetypeinfo.
The information is collected only on request so there is no runtime overhead.
The statistics are in three parts:
The first part prints information on the size of blocks that pages are
being grouped on and looks like
Page block order: 10
Pages per block: 1024
The second part is a more detailed version of /proc/buddyinfo and looks like
Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
Node    0, zone      DMA, type    Unmovable      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type  Reclaimable      1      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type      Movable      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type      Reserve      0      4      4      0      0      0      0      1      0      1      0
Node    0, zone   Normal, type    Unmovable    111      8      4      4      2      3      1      0      0      0      0
Node    0, zone   Normal, type  Reclaimable    293     89      8      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type      Movable      1      6     13      9      7      6      3      0      0      0      0
Node    0, zone   Normal, type      Reserve      0      0      0      0      0      0      0      0      0      0      4
The third part looks like
Number of blocks type     Unmovable  Reclaimable      Movable      Reserve
Node 0, zone      DMA             0            1            2            1
Node 0, zone   Normal             3           17           94            4
To walk the zones within a node with interrupts disabled, walk_zones_in_node()
is introduced and shared between /proc/buddyinfo, /proc/zoneinfo and
/proc/pagetypeinfo to reduce code duplication. It seems specific to what
vmstat.c requires but could be broken out as a general utility function in
mmzone.c if there were other potential users.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 16:26:02 +08:00
{
	VM_WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
	BUILD_BUG_ON((1UL << GFP_MOVABLE_SHIFT) != ___GFP_MOVABLE);
	BUILD_BUG_ON((___GFP_MOVABLE >> GFP_MOVABLE_SHIFT) != MIGRATE_MOVABLE);
	BUILD_BUG_ON((___GFP_RECLAIMABLE >> GFP_MOVABLE_SHIFT) != MIGRATE_RECLAIMABLE);
	BUILD_BUG_ON(((___GFP_MOVABLE | ___GFP_RECLAIMABLE) >>
		      GFP_MOVABLE_SHIFT) != MIGRATE_HIGHATOMIC);

	if (unlikely(page_group_by_mobility_disabled))
		return MIGRATE_UNMOVABLE;

	/* Group based on mobility */
	return (__force unsigned long)(gfp_flags & GFP_MOVABLE_MASK) >> GFP_MOVABLE_SHIFT;
}
#undef GFP_MOVABLE_MASK
#undef GFP_MOVABLE_SHIFT
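
The mask-and-shift conversion in gfp_migratetype() can be tried outside the kernel. The following is a minimal user-space sketch, not the kernel's code: every `sk_`/`SK_` name is hypothetical, and the flag and enum values are assumptions chosen so that the two mobility bits sit at bit positions 3 and 4, which is what the BUILD_BUG_ON() checks above require.

```c
#include <assert.h>

/* Assumed values for illustration only; check gfp_types.h and mmzone.h. */
#define SK_GFP_MOVABLE      0x08u
#define SK_GFP_RECLAIMABLE  0x10u
#define SK_MOVABLE_MASK     (SK_GFP_RECLAIMABLE | SK_GFP_MOVABLE)
#define SK_MOVABLE_SHIFT    3

enum sk_migratetype {
	SK_MIGRATE_UNMOVABLE   = 0,
	SK_MIGRATE_MOVABLE     = 1,
	SK_MIGRATE_RECLAIMABLE = 2,
};

/* Keep only the two mobility bits, then shift them down to a small index. */
static int sk_gfp_migratetype(unsigned int gfp_flags)
{
	/* Both mobility bits set at once is the case VM_WARN_ON() reports. */
	assert((gfp_flags & SK_MOVABLE_MASK) != SK_MOVABLE_MASK);
	return (int)((gfp_flags & SK_MOVABLE_MASK) >> SK_MOVABLE_SHIFT);
}
```

Because the flag bits were laid out to match the enum values, no lookup table or branch is needed; unrelated flag bits are simply masked away.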

static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
{
	return !!(gfp_flags & __GFP_DIRECT_RECLAIM);
}

#ifdef CONFIG_HIGHMEM
#define OPT_ZONE_HIGHMEM ZONE_HIGHMEM
#else
#define OPT_ZONE_HIGHMEM ZONE_NORMAL
#endif

#ifdef CONFIG_ZONE_DMA
#define OPT_ZONE_DMA ZONE_DMA
#else
#define OPT_ZONE_DMA ZONE_NORMAL
#endif

#ifdef CONFIG_ZONE_DMA32
#define OPT_ZONE_DMA32 ZONE_DMA32
#else
#define OPT_ZONE_DMA32 ZONE_NORMAL
#endif

/*
 * GFP_ZONE_TABLE is a word size bitstring that is used for looking up the
 * zone to use given the lowest 4 bits of gfp_t. Entries are GFP_ZONES_SHIFT
 * bits long and there are 16 of them to cover all possible combinations of
 * __GFP_DMA, __GFP_DMA32, __GFP_MOVABLE and __GFP_HIGHMEM.
 *
 * The zone fallback order is MOVABLE=>HIGHMEM=>NORMAL=>DMA32=>DMA.
 * But GFP_MOVABLE is not only a zone specifier but also an allocation
 * policy. Therefore __GFP_MOVABLE plus another zone selector is valid.
 * Only 1 bit of the lowest 3 bits (DMA,DMA32,HIGHMEM) can be set to "1".
 *
 *       bit       result
 *       =================
 *       0x0    => NORMAL
 *       0x1    => DMA or NORMAL
 *       0x2    => HIGHMEM or NORMAL
 *       0x3    => BAD (DMA+HIGHMEM)
 *       0x4    => DMA32 or NORMAL
 *       0x5    => BAD (DMA+DMA32)
 *       0x6    => BAD (HIGHMEM+DMA32)
 *       0x7    => BAD (HIGHMEM+DMA32+DMA)
 *       0x8    => NORMAL (MOVABLE+0)
 *       0x9    => DMA or NORMAL (MOVABLE+DMA)
 *       0xa    => MOVABLE (Movable is valid only if HIGHMEM is set too)
 *       0xb    => BAD (MOVABLE+HIGHMEM+DMA)
 *       0xc    => DMA32 or NORMAL (MOVABLE+DMA32)
 *       0xd    => BAD (MOVABLE+DMA32+DMA)
 *       0xe    => BAD (MOVABLE+DMA32+HIGHMEM)
 *       0xf    => BAD (MOVABLE+DMA32+HIGHMEM+DMA)
 *
 * GFP_ZONES_SHIFT must be <= 2 on 32 bit platforms.
 */

#if defined(CONFIG_ZONE_DEVICE) && (MAX_NR_ZONES-1) <= 4
/* ZONE_DEVICE is not a valid GFP zone specifier */
#define GFP_ZONES_SHIFT 2
#else
#define GFP_ZONES_SHIFT ZONES_SHIFT
#endif

#if 16 * GFP_ZONES_SHIFT > BITS_PER_LONG
#error GFP_ZONES_SHIFT too large to create GFP_ZONE_TABLE integer
#endif

#define GFP_ZONE_TABLE ( \
	(ZONE_NORMAL << 0 * GFP_ZONES_SHIFT) \
	| (OPT_ZONE_DMA << ___GFP_DMA * GFP_ZONES_SHIFT) \
	| (OPT_ZONE_HIGHMEM << ___GFP_HIGHMEM * GFP_ZONES_SHIFT) \
	| (OPT_ZONE_DMA32 << ___GFP_DMA32 * GFP_ZONES_SHIFT) \
	| (ZONE_NORMAL << ___GFP_MOVABLE * GFP_ZONES_SHIFT) \
	| (OPT_ZONE_DMA << (___GFP_MOVABLE | ___GFP_DMA) * GFP_ZONES_SHIFT) \
	| (ZONE_MOVABLE << (___GFP_MOVABLE | ___GFP_HIGHMEM) * GFP_ZONES_SHIFT) \
	| (OPT_ZONE_DMA32 << (___GFP_MOVABLE | ___GFP_DMA32) * GFP_ZONES_SHIFT) \
)

/*
 * GFP_ZONE_BAD is a bitmap for all combinations of __GFP_DMA, __GFP_DMA32
 * __GFP_HIGHMEM and __GFP_MOVABLE that are not permitted. One flag per
 * entry starting with bit 0. Bit is set if the combination is not
 * allowed.
 */
#define GFP_ZONE_BAD ( \
	1 << (___GFP_DMA | ___GFP_HIGHMEM) \
	| 1 << (___GFP_DMA | ___GFP_DMA32) \
	| 1 << (___GFP_DMA32 | ___GFP_HIGHMEM) \
	| 1 << (___GFP_DMA | ___GFP_DMA32 | ___GFP_HIGHMEM) \
	| 1 << (___GFP_MOVABLE | ___GFP_HIGHMEM | ___GFP_DMA) \
	| 1 << (___GFP_MOVABLE | ___GFP_DMA32 | ___GFP_DMA) \
	| 1 << (___GFP_MOVABLE | ___GFP_DMA32 | ___GFP_HIGHMEM) \
	| 1 << (___GFP_MOVABLE | ___GFP_DMA32 | ___GFP_DMA | ___GFP_HIGHMEM) \
)
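
To see how the "one flag per entry" bitmap is consulted, here is a small user-space sketch. It assumes the four zone modifiers occupy the lowest four bits in the order shown by the 0x0..0xf table in the GFP_ZONE_TABLE comment (DMA=0x1, HIGHMEM=0x2, DMA32=0x4, MOVABLE=0x8); the `SK_`/`sk_` names are hypothetical and not part of the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit values for the four zone modifier flags (illustrative). */
#define SK_DMA      0x1u
#define SK_HIGHMEM  0x2u
#define SK_DMA32    0x4u
#define SK_MOVABLE  0x8u

/* One bit per 4-bit combination; a set bit marks a forbidden combination. */
#define SK_ZONE_BAD ( \
	  1u << (SK_DMA | SK_HIGHMEM) \
	| 1u << (SK_DMA | SK_DMA32) \
	| 1u << (SK_DMA32 | SK_HIGHMEM) \
	| 1u << (SK_DMA | SK_DMA32 | SK_HIGHMEM) \
	| 1u << (SK_MOVABLE | SK_HIGHMEM | SK_DMA) \
	| 1u << (SK_MOVABLE | SK_DMA32 | SK_DMA) \
	| 1u << (SK_MOVABLE | SK_DMA32 | SK_HIGHMEM) \
	| 1u << (SK_MOVABLE | SK_DMA32 | SK_DMA | SK_HIGHMEM) \
)

/* Use the low four flag bits as an index into the bitmap. */
static bool sk_zone_bits_bad(unsigned int flags)
{
	return (SK_ZONE_BAD >> (flags & 0xfu)) & 1u;
}
```

With these assumed values the bitmap works out to 0xE8E8: the eight "BAD" rows of the table are exactly the indices whose bit is set.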

static inline enum zone_type gfp_zone(gfp_t flags)
{
	enum zone_type z;
	int bit = (__force int) (flags & GFP_ZONEMASK);

	z = (GFP_ZONE_TABLE >> (bit * GFP_ZONES_SHIFT)) &
					 ((1 << GFP_ZONES_SHIFT) - 1);
	VM_BUG_ON((GFP_ZONE_BAD >> bit) & 1);
	return z;
}
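
The packed-table lookup that gfp_zone() performs can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel's definitions: it pretends every optional zone is configured (so no OPT_ZONE_* fallback to NORMAL applies), uses a hypothetical zone enum with five entries, and therefore needs 3 bits per entry and a 64-bit word. All `sk_`/`SK_` names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical zone numbering with every zone present. */
enum sk_zone {
	SK_ZONE_DMA, SK_ZONE_DMA32, SK_ZONE_NORMAL, SK_ZONE_HIGHMEM, SK_ZONE_MOVABLE
};

/* Assumed bit values for the four zone modifier flags. */
#define SK_DMA      0x1u
#define SK_HIGHMEM  0x2u
#define SK_DMA32    0x4u
#define SK_MOVABLE  0x8u

#define SK_SHIFT 3	/* bits per table entry: five zones need three bits */

/* Place one zone number into the slot selected by a flag combination. */
#define SK_ENTRY(zone, bits) ((uint64_t)(zone) << ((bits) * SK_SHIFT))

/* 16 slots of SK_SHIFT bits each; slots for invalid combinations stay 0. */
static const uint64_t sk_zone_table =
	  SK_ENTRY(SK_ZONE_NORMAL, 0)
	| SK_ENTRY(SK_ZONE_DMA, SK_DMA)
	| SK_ENTRY(SK_ZONE_HIGHMEM, SK_HIGHMEM)
	| SK_ENTRY(SK_ZONE_DMA32, SK_DMA32)
	| SK_ENTRY(SK_ZONE_NORMAL, SK_MOVABLE)
	| SK_ENTRY(SK_ZONE_DMA, SK_MOVABLE | SK_DMA)
	| SK_ENTRY(SK_ZONE_MOVABLE, SK_MOVABLE | SK_HIGHMEM)
	| SK_ENTRY(SK_ZONE_DMA32, SK_MOVABLE | SK_DMA32);

/* Shift the selected slot down to bit 0 and mask off its neighbours. */
static enum sk_zone sk_gfp_zone(unsigned int flags)
{
	unsigned int bits = flags & 0xfu;

	return (enum sk_zone)((sk_zone_table >> (bits * SK_SHIFT)) &
			      ((1u << SK_SHIFT) - 1));
}
```

The whole table is a compile-time constant, so the lookup costs one shift and one mask. The `16 * GFP_ZONES_SHIFT > BITS_PER_LONG` check above exists because all 16 slots must fit in a single word.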

/*
 * There is only one page-allocator function, and two main namespaces to
 * it. The alloc_page*() variants return 'struct page *' and as such
 * can allocate highmem pages, the *get*page*() variants return
 * virtual kernel addresses to the allocated page(s).
 */

static inline int gfp_zonelist(gfp_t flags)
{
#ifdef CONFIG_NUMA
	if (unlikely(flags & __GFP_THISNODE))
		return ZONELIST_NOFALLBACK;
#endif
	return ZONELIST_FALLBACK;
}

/*
 * We get the zone list from the current node and the gfp_mask.
 * This zone list contains a maximum of MAX_NUMNODES*MAX_NR_ZONES zones.
 * There are two zonelists per node, one for all zones with memory and
 * one containing just zones from the node the zonelist belongs to.
 *
 * For the case of non-NUMA systems the NODE_DATA() gets optimized to
 * &contig_page_data at compile-time.
 */
static inline struct zonelist *node_zonelist(int nid, gfp_t flags)
{
	return NODE_DATA(nid)->node_zonelists + gfp_zonelist(flags);
}
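
The "two zonelists per node" selection above amounts to indexing a two-element array by whether the this-node-only flag is set. A minimal user-space sketch, assuming a NUMA build; the struct layout, the flag bit and all `sk_`/`SK_` names are hypothetical stand-ins rather than the kernel's types or values:

```c
#include <assert.h>

#define SK_GFP_THISNODE 0x1u	/* hypothetical bit, not the kernel's value */

enum { SK_ZONELIST_FALLBACK, SK_ZONELIST_NOFALLBACK, SK_MAX_ZONELISTS };

struct sk_zonelist { const char *name; };
struct sk_node { struct sk_zonelist node_zonelists[SK_MAX_ZONELISTS]; };

/* Choose the per-node list index from the flags. */
static int sk_gfp_zonelist(unsigned int flags)
{
	return (flags & SK_GFP_THISNODE) ? SK_ZONELIST_NOFALLBACK
					 : SK_ZONELIST_FALLBACK;
}

/* Pointer arithmetic on the array base, as node_zonelist() does. */
static struct sk_zonelist *sk_node_zonelist(struct sk_node *node,
					     unsigned int flags)
{
	return node->node_zonelists + sk_gfp_zonelist(flags);
}
```

A caller that passes the this-node flag gets the list restricted to the node's own zones; every other caller gets the list that may fall back to other nodes.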

#ifndef HAVE_ARCH_FREE_PAGE
static inline void arch_free_page(struct page *page, int order) { }
#endif
#ifndef HAVE_ARCH_ALLOC_PAGE
static inline void arch_alloc_page(struct page *page, int order) { }
#endif

struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
		nodemask_t *nodemask);
struct folio *__folio_alloc(gfp_t gfp, unsigned int order, int preferred_nid,
		nodemask_t *nodemask);

unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
				nodemask_t *nodemask, int nr_pages,
				struct list_head *page_list,
				struct page **page_array);

unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
				unsigned long nr_pages,
				struct page **page_array);

/* Bulk allocate order-0 pages */
static inline unsigned long
alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
{
	return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, list, NULL);
}

static inline unsigned long
alloc_pages_bulk_array(gfp_t gfp, unsigned long nr_pages, struct page **page_array)
{
	return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, NULL, page_array);
}

static inline unsigned long
alloc_pages_bulk_array_node(gfp_t gfp, int nid, unsigned long nr_pages, struct page **page_array)
{
	if (nid == NUMA_NO_NODE)
		nid = numa_mem_id();

	return __alloc_pages_bulk(gfp, nid, NULL, nr_pages, NULL, page_array);
}
mm: rename alloc_pages_exact_node() to __alloc_pages_node()
alloc_pages_exact_node() was introduced in commit 6484eb3e2a81 ("page
allocator: do not check NUMA node ID when the caller knows the node is
valid") as an optimized variant of alloc_pages_node(), that doesn't
fall back to the current node for nid == NUMA_NO_NODE. Unfortunately the
name of the function can easily suggest that the allocation is
restricted to the given node and fails otherwise. In truth, the node is
only preferred, unless __GFP_THISNODE is passed among the gfp flags.
The misleading name has led to mistakes in the past, see for example
commits 5265047ac301 ("mm, thp: really limit transparent hugepage
allocation to local node") and b360edb43f8e ("mm, mempolicy:
migrate_to_node should only migrate to node").
Another issue with the name is that there's a family of
alloc_pages_exact*() functions where 'exact' means exact size (instead
of page order), which leads to more confusion.
To prevent further mistakes, this patch effectively renames
alloc_pages_exact_node() to __alloc_pages_node() to better convey that
it's an optimized variant of alloc_pages_node() not intended for general
usage. Both functions get described in comments.
It has also been considered to really provide a convenience function for
allocations restricted to a node, but the prevailing opinion seems to be that
__GFP_THISNODE already provides that functionality and we shouldn't
duplicate the API needlessly. The number of users would be small
anyway.
Existing callers of alloc_pages_exact_node() are simply converted to
call __alloc_pages_node(), with the exception of sba_alloc_coherent()
which open-codes the check for NUMA_NO_NODE, so it is converted to use
alloc_pages_node() instead. This means it no longer performs some
VM_BUG_ON checks, and since the current check for nid in
alloc_pages_node() uses a 'nid < 0' comparison (which includes
NUMA_NO_NODE), it may hide wrong values which would be previously
exposed.
Both differences will be rectified by the next patch.
To sum up, this patch makes no functional changes, except temporarily
hiding potentially buggy callers. Restricting the checks in
alloc_pages_node() is left for the next patch which can in turn expose
more existing buggy callers.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Robin Holt <robinmholt@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Cliff Whickman <cpw@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-09 06:03:50 +08:00
/*
 * Allocate pages, preferring the node given as nid. The node must be valid and
 * online. For more general interface, see alloc_pages_node().
 */
static inline struct page *
__alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
{
	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
	VM_WARN_ON((gfp_mask & __GFP_THISNODE) && !node_online(nid));

	return __alloc_pages(gfp_mask, order, nid, NULL);
}

static inline
struct folio *__folio_alloc_node(gfp_t gfp, unsigned int order, int nid)
{
	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
	VM_WARN_ON((gfp & __GFP_THISNODE) && !node_online(nid));

	return __folio_alloc(gfp, order, nid, NULL);
}
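
The difference between the strict `__`-prefixed helpers and the more general interface comes down to how the node id is treated before the allocator is called. The sketch below models only that node-id handling in user space; it is not the kernel's implementation. The node count, the `local_nid` parameter (standing in for numa_mem_id()) and all `sk_`/`SK_` names are assumptions, and assert() stands in for VM_BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>

#define SK_NUMA_NO_NODE  (-1)
#define SK_MAX_NUMNODES  4	/* hypothetical node count */

static bool sk_nid_valid(int nid)
{
	return nid >= 0 && nid < SK_MAX_NUMNODES;
}

/* Strict variant: the caller guarantees a valid node id. */
static int sk_pick_node_strict(int nid)
{
	assert(sk_nid_valid(nid));
	return nid;
}

/* General variant: "no preference" is replaced by the local node first. */
static int sk_pick_node(int nid, int local_nid)
{
	if (nid == SK_NUMA_NO_NODE)
		nid = local_nid;
	return sk_pick_node_strict(nid);
}
```

alloc_pages_bulk_array_node() earlier in this header applies the same NUMA_NO_NODE substitution before calling __alloc_pages_bulk().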

/*
 * Allocate pages, preferring the node given as nid. When nid == NUMA_NO_NODE,
 * prefer the current CPU's closest node. Otherwise node must be valid and
 * online.
2015-09-09 06:03:50 +08:00
|
|
|
*/
|
|
|
|
static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
|
2009-06-17 06:31:54 +08:00
|
|
|
unsigned int order)
|
|
|
|
{
|
2015-09-09 06:03:53 +08:00
|
|
|
if (nid == NUMA_NO_NODE)
|
2015-09-09 06:03:56 +08:00
|
|
|
nid = numa_mem_id();
|
2009-06-17 06:31:54 +08:00
|
|
|
|
2015-09-09 06:03:53 +08:00
|
|
|
return __alloc_pages_node(nid, gfp_mask, order);
|
2009-06-17 06:31:54 +08:00
|
|
|
}
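The fallback described in the comment above can be modeled outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: `demo_numa_mem_id()` is a hypothetical stand-in for `numa_mem_id()` that always reports node 2, and `DEMO_NUMA_NO_NODE` mirrors the kernel's `NUMA_NO_NODE` sentinel (-1). It only shows which node ends up being preferred.

```c
#include <assert.h>

/* Mirrors the kernel's NUMA_NO_NODE sentinel. */
#define DEMO_NUMA_NO_NODE (-1)

/* Hypothetical stand-in for numa_mem_id(): pretend the nearest memory node is 2. */
static inline int demo_numa_mem_id(void)
{
	return 2;
}

/*
 * Return the node an allocation would prefer.
 * A caller passing DEMO_NUMA_NO_NODE gets the local node; any other
 * value is passed through unchanged and is assumed valid and online.
 */
static inline int demo_preferred_node(int nid)
{
	if (nid == DEMO_NUMA_NO_NODE)
		nid = demo_numa_mem_id();
	return nid;
}
```

Note that the node is only a preference here, as in the real function: nothing in this sketch restricts the allocation to that node.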
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
#ifdef CONFIG_NUMA
|
2021-04-30 14:01:18 +08:00
|
|
|
struct page *alloc_pages(gfp_t gfp, unsigned int order);
|
2020-12-16 11:55:54 +08:00
|
|
|
struct folio *folio_alloc(gfp_t gfp, unsigned int order);
|
2022-04-05 03:11:04 +08:00
|
|
|
struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma,
		unsigned long addr, bool hugepage);
|
2005-04-17 06:20:36 +08:00
|
|
|
#else
|
2020-08-18 01:17:20 +08:00
|
|
|
static inline struct page *alloc_pages(gfp_t gfp_mask, unsigned int order)
{
	return alloc_pages_node(numa_node_id(), gfp_mask, order);
}
|
2020-12-16 11:55:54 +08:00
|
|
|
static inline struct folio *folio_alloc(gfp_t gfp, unsigned int order)
{
	return __folio_alloc_node(gfp, order, numa_node_id());
}
|
2022-04-05 03:11:04 +08:00
|
|
|
#define vma_alloc_folio(gfp, order, vma, addr, hugepage)		\
	folio_alloc(gfp, order)
|
2005-04-17 06:20:36 +08:00
|
|
|
#endif
|
|
|
|
#define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0)
|
2022-05-13 11:23:01 +08:00
|
|
|
static inline struct page *alloc_page_vma(gfp_t gfp,
		struct vm_area_struct *vma, unsigned long addr)
{
	struct folio *folio = vma_alloc_folio(gfp, 0, vma, addr, false);

	return &folio->page;
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-02-14 07:03:15 +08:00
|
|
|
extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
extern unsigned long get_zeroed_page(gfp_t gfp_mask);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2021-11-06 04:36:38 +08:00
|
|
|
void *alloc_pages_exact(size_t size, gfp_t gfp_mask) __alloc_size(1);
|
2008-07-24 12:28:11 +08:00
|
|
|
void free_pages_exact(void *virt, size_t size);
|
2021-12-25 13:12:51 +08:00
|
|
|
__meminit void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask) __alloc_size(2);
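In the `alloc_pages_exact*()` family, the request is expressed as a byte size rather than a page order. The sketch below illustrates the arithmetic this implies, on the assumption that the allocator rounds up to a power-of-two block and then returns the unused tail pages. The helper names and the 4096-byte `DEMO_PAGE_SIZE` are illustrative only and are not part of this header. The helpers are meant for small, non-zero sizes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for the illustration; real systems may differ. */
#define DEMO_PAGE_SIZE 4096UL

/* Smallest order such that (DEMO_PAGE_SIZE << order) covers size. */
static inline unsigned int demo_get_order(size_t size)
{
	unsigned int order = 0;

	while ((DEMO_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Whole pages needed to hold size bytes, rounding up. */
static inline size_t demo_pages_needed(size_t size)
{
	return (size + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}

/* Pages in the power-of-two block that an exact-size request leaves unused. */
static inline size_t demo_pages_trimmed(size_t size)
{
	return ((size_t)1 << demo_get_order(size)) - demo_pages_needed(size);
}
```

For a five-page request, an order-based allocation would take eight pages, while an exact-size allocation keeps five and can give back three.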
|
2008-07-24 12:28:11 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
#define __get_free_page(gfp_mask) \
|
2010-05-25 05:32:45 +08:00
|
|
|
__get_free_pages((gfp_mask), 0)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
#define __get_dma_pages(gfp_mask, order) \
|
2010-05-25 05:32:45 +08:00
|
|
|
__get_free_pages((gfp_mask) | GFP_DMA, (order))
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-02-14 07:03:15 +08:00
|
|
|
extern void __free_pages(struct page *page, unsigned int order);
extern void free_pages(unsigned long addr, unsigned int order);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2015-05-07 12:11:57 +08:00
|
|
|
struct page_frag_cache;
|
2017-01-11 08:58:09 +08:00
|
|
|
extern void __page_frag_cache_drain(struct page *page, unsigned int count);
|
2021-02-04 18:56:35 +08:00
|
|
|
extern void *page_frag_alloc_align(struct page_frag_cache *nc,
		unsigned int fragsz, gfp_t gfp_mask,
		unsigned int align_mask);

static inline void *page_frag_alloc(struct page_frag_cache *nc,
		unsigned int fragsz, gfp_t gfp_mask)
{
	return page_frag_alloc_align(nc, fragsz, gfp_mask, ~0u);
}
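`page_frag_alloc()` passes `~0u` as the alignment mask. The sketch below shows why an all-ones mask means "no extra alignment", assuming the aligned variant applies the mask to the fragment offset with a bitwise AND. That implementation detail is not visible in this header, and both helper names are hypothetical.

```c
#include <assert.h>

/* Round offset down to the alignment encoded in align_mask. */
static inline unsigned int demo_apply_align_mask(unsigned int offset,
						 unsigned int align_mask)
{
	return offset & align_mask;
}

/* Build a mask for a power-of-two alignment: the low bits are cleared. */
static inline unsigned int demo_align_mask(unsigned int align)
{
	return ~(align - 1u);
}
```

With an all-ones mask, the AND leaves the offset untouched. With a mask built for 64-byte alignment, the offset is rounded down to a multiple of 64.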
|
|
|
|
|
2017-01-11 08:58:06 +08:00
|
|
|
extern void page_frag_free(void *addr);
|
2015-05-07 12:11:57 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
#define __free_page(page) __free_pages((page), 0)
|
2010-05-25 05:32:45 +08:00
|
|
|
#define free_page(addr) free_pages((addr), 0)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
void page_alloc_init(void);
|
2007-05-09 17:35:14 +08:00
|
|
|
void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
|
2014-12-11 07:43:01 +08:00
|
|
|
void drain_all_pages(struct zone *zone);
void drain_local_pages(struct zone *zone);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2015-07-01 05:57:27 +08:00
|
|
|
void page_alloc_init_late(void);
|
|
|
|
|
2012-01-11 07:07:15 +08:00
|
|
|
/*
 * gfp_allowed_mask is set to GFP_BOOT_MASK during early boot to restrict what
 * GFP flags are used before interrupts are enabled. Once interrupts are
 * enabled, it is set to __GFP_BITS_MASK while the system is running. During
 * hibernation, it is used by PM to avoid I/O during memory allocation while
 * devices are suspended.
 */
|
2009-06-18 11:24:12 +08:00
|
|
|
extern gfp_t gfp_allowed_mask;
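The role of a global "allowed" mask can be shown with a small userspace model. This is a sketch only: the `DEMO_GFP_*` bit values are made up and do not match the real `__GFP_*` flags, and `demo_restrict()` / `demo_restore()` are hypothetical helpers that loosely mirror the intent of the PM hooks (forbidding I/O and filesystem activity while devices are suspended).

```c
#include <assert.h>

typedef unsigned int demo_gfp_t;

/* Hypothetical flag bits; the real __GFP_* values differ. */
#define DEMO_GFP_IO      0x1u
#define DEMO_GFP_FS      0x2u
#define DEMO_GFP_RECLAIM 0x4u
#define DEMO_GFP_ALL     (DEMO_GFP_IO | DEMO_GFP_FS | DEMO_GFP_RECLAIM)

/* Global mask of flags that allocations may currently use. */
static demo_gfp_t demo_allowed_mask = DEMO_GFP_ALL;

/* Strip any requested flag that is not currently allowed. */
static inline demo_gfp_t demo_filter(demo_gfp_t requested)
{
	return requested & demo_allowed_mask;
}

/* Model of a suspend-time restriction: no I/O or filesystem activity. */
static inline void demo_restrict(void)
{
	demo_allowed_mask &= ~(DEMO_GFP_IO | DEMO_GFP_FS);
}

/* Model of resume: allow every flag again. */
static inline void demo_restore(void)
{
	demo_allowed_mask = DEMO_GFP_ALL;
}
```

Callers keep passing their usual flags; the restriction is applied centrally, so no caller needs to know that the system is suspending.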
|
|
|
|
|
2012-08-01 07:44:19 +08:00
|
|
|
/* Returns true if the gfp_mask allows use of ALLOC_NO_WATERMARK */
bool gfp_pfmemalloc_allowed(gfp_t gfp_mask);
|
|
|
|
|
2010-12-04 05:57:45 +08:00
|
|
|
extern void pm_restrict_gfp_mask(void);
extern void pm_restore_gfp_mask(void);
|
2009-06-18 11:24:12 +08:00
|
|
|
|
mm,thp,shmem: limit shmem THP alloc gfp_mask
Patch series "mm,thp,shm: limit shmem THP alloc gfp_mask", v6.
The allocation flags of anonymous transparent huge pages can be controlled
through the files in /sys/kernel/mm/transparent_hugepage/defrag, which can
help keep the system from getting bogged down in the page reclaim and
compaction code when many THPs are getting allocated simultaneously.
However, the gfp_mask for shmem THP allocations was not limited by those
configuration settings, and some workloads ended up with all CPUs stuck on
the LRU lock in the page reclaim code, trying to allocate dozens of THPs
simultaneously.
This patch applies the same configured limitation of THPs to shmem
hugepage allocations, to prevent that from happening.
This way a THP defrag setting of "never" or "defer+madvise" will result in
quick allocation failures without direct reclaim when no 2MB free pages
are available.
With this patch applied, THP allocations for tmpfs will be a little more
aggressive than today for files mmapped with MADV_HUGEPAGE, and a little
less aggressive for files that are not mmapped or mapped without that
flag.
This patch (of 4):
The allocation flags of anonymous transparent huge pages can be controlled
through the files in /sys/kernel/mm/transparent_hugepage/defrag, which can
help keep the system from getting bogged down in the page reclaim and
compaction code when many THPs are getting allocated simultaneously.
However, the gfp_mask for shmem THP allocations was not limited by those
configuration settings, and some workloads ended up with all CPUs stuck on
the LRU lock in the page reclaim code, trying to allocate dozens of THPs
simultaneously.
This patch applies the same configured limitation of THPs to shmem
hugepage allocations, to prevent that from happening.
Controlling the gfp_mask of THP allocations through the knobs in sysfs
allows users to determine the balance between how aggressively the system
tries to allocate THPs at fault time, and how much the application may end
up stalling attempting those allocations.
This way a THP defrag setting of "never" or "defer+madvise" will result in
quick allocation failures without direct reclaim when no 2MB free pages
are available.
With this patch applied, THP allocations for tmpfs will be a little more
aggressive than today for files mmapped with MADV_HUGEPAGE, and a little
less aggressive for files that are not mmapped or mapped without that
flag.
Link: https://lkml.kernel.org/r/20201124194925.623931-1-riel@surriel.com
Link: https://lkml.kernel.org/r/20201124194925.623931-2-riel@surriel.com
Signed-off-by: Rik van Riel <riel@surriel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Xu Yu <xuyu@linux.alibaba.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:16:18 +08:00
|
|
|
extern gfp_t vma_thp_gfp_mask(struct vm_area_struct *vma);
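How a defrag setting translates into allocation behaviour can be sketched as a small decision function. This is not the body of `vma_thp_gfp_mask()`. It is an illustrative model based on the documented meaning of the defrag modes, and the enum and function names are hypothetical. It answers only one question: may a THP allocation stall in direct reclaim?

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the values accepted by the THP defrag sysfs file. */
enum demo_thp_defrag {
	DEMO_DEFRAG_ALWAYS,
	DEMO_DEFRAG_DEFER,
	DEMO_DEFRAG_DEFER_MADVISE,
	DEMO_DEFRAG_MADVISE,
	DEMO_DEFRAG_NEVER,
};

/*
 * Return true if a THP allocation may enter direct reclaim under the
 * given setting. madvised is true for regions marked MADV_HUGEPAGE.
 */
static inline bool demo_thp_may_direct_reclaim(enum demo_thp_defrag mode,
					       bool madvised)
{
	switch (mode) {
	case DEMO_DEFRAG_ALWAYS:
		return true;
	case DEMO_DEFRAG_DEFER_MADVISE:
	case DEMO_DEFRAG_MADVISE:
		/* Only regions that asked for huge pages may stall. */
		return madvised;
	case DEMO_DEFRAG_DEFER:
	case DEMO_DEFRAG_NEVER:
	default:
		/* Fail quickly instead of reclaiming in the caller's context. */
		return false;
	}
}
```

Under this model, a mapping without `MADV_HUGEPAGE` fails quickly under "never" or "defer+madvise", which matches the behaviour the commit message describes for shmem.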
|
|
|
|
|
2012-01-11 07:07:15 +08:00
|
|
|
#ifdef CONFIG_PM_SLEEP
extern bool pm_suspended_storage(void);
#else
static inline bool pm_suspended_storage(void)
{
	return false;
}
#endif /* CONFIG_PM_SLEEP */
|
|
|
|
|
2019-05-14 08:19:00 +08:00
|
|
|
#ifdef CONFIG_CONTIG_ALLOC
|
2011-12-29 20:09:50 +08:00
|
|
|
/* The below functions must be run on a range from a single zone. */
|
2012-04-03 21:06:15 +08:00
|
|
|
extern int alloc_contig_range(unsigned long start, unsigned long end,
|
2017-02-25 06:58:37 +08:00
|
|
|
unsigned migratetype, gfp_t gfp_mask);
|
2019-12-01 09:55:06 +08:00
|
|
|
extern struct page *alloc_contig_pages(unsigned long nr_pages, gfp_t gfp_mask,
		int nid, nodemask_t *nodemask);
|
2016-02-06 07:36:41 +08:00
|
|
|
#endif
|
2021-05-05 09:37:34 +08:00
|
|
|
void free_contig_range(unsigned long pfn, unsigned long nr_pages);
|
2011-12-29 20:09:50 +08:00
|
|
|
|
2016-02-06 07:36:41 +08:00
|
|
|
#ifdef CONFIG_CMA
|
2011-12-29 20:09:50 +08:00
|
|
|
/* CMA stuff */
extern void init_cma_reserved_pageblock(struct page *page);
|
2011-12-29 20:09:50 +08:00
|
|
|
#endif
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
#endif /* __LINUX_GFP_H */
|