Merge branch 'akpm' (patches from Andrew)

Merge misc updates from Andrew Morton:

 - a few misc things

 - ocfs2 updates

 - most of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (159 commits)
  tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include
  proc: more robust bulk read test
  proc: test /proc/*/maps, smaps, smaps_rollup, statm
  proc: use seq_puts() everywhere
  proc: read kernel cpu stat pointer once
  proc: remove unused argument in proc_pid_lookup()
  fs/proc/thread_self.c: code cleanup for proc_setup_thread_self()
  fs/proc/self.c: code cleanup for proc_setup_self()
  proc: return exit code 4 for skipped tests
  mm,mremap: bail out earlier in mremap_to under map pressure
  mm/sparse: fix a bad comparison
  mm/memory.c: do_fault: avoid usage of stale vm_area_struct
  writeback: fix inode cgroup switching comment
  mm/huge_memory.c: fix "orig_pud" set but not used
  mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
  mm/memcontrol.c: fix bad line in comment
  mm/cma.c: cma_declare_contiguous: correct err handling
  mm/page_ext.c: fix an imbalance with kmemleak
  mm/compaction: pass pgdat to too_many_isolated() instead of zone
  mm: remove zone_lru_lock() function, access ->lru_lock directly
  ...
Linus Torvalds, 2019-03-06 10:31:36 -08:00
commit 8dcd175bc3
213 changed files with 4924 additions and 2321 deletions

@@ -1189,6 +1189,10 @@ PAGE_SIZE multiple when read back.
 Amount of cached filesystem data that was modified and
 is currently being written back to disk
+anon_thp
+Amount of memory used in anonymous mappings backed by
+transparent hugepages
 inactive_anon, active_anon, inactive_file, active_file, unevictable
 Amount of memory, swap-backed and filesystem-backed,
 on the internal memory management lists used by the
@@ -1248,6 +1252,18 @@ PAGE_SIZE multiple when read back.
 Amount of reclaimed lazyfree pages
+thp_fault_alloc
+Number of transparent hugepages which were allocated to satisfy
+a page fault, including COW faults. This counter is not present
+when CONFIG_TRANSPARENT_HUGEPAGE is not set.
+thp_collapse_alloc
+Number of transparent hugepages which were allocated to allow
+collapsing an existing range of pages. This counter is not
+present when CONFIG_TRANSPARENT_HUGEPAGE is not set.
 memory.swap.current
 A read-only single value file which exists on non-root
 cgroups.
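
The hunks above document new memory.stat keys. As a point of reference only, a minimal user-space sketch (not part of this commit; the cgroup path is an example) that picks those keys out of a cgroup v2 memory.stat file:

#include <stdio.h>
#include <string.h>

int main(void)
{
	/* Example cgroup path; adjust to a real cgroup v2 group. */
	FILE *f = fopen("/sys/fs/cgroup/example/memory.stat", "r");
	char key[64];
	unsigned long long val;

	if (!f)
		return 1;
	while (fscanf(f, "%63s %llu", key, &val) == 2) {
		/* Keys documented by the hunks above. */
		if (!strcmp(key, "anon_thp") ||
		    !strcmp(key, "thp_fault_alloc") ||
		    !strcmp(key, "thp_collapse_alloc"))
			printf("%s = %llu\n", key, val);
	}
	fclose(f);
	return 0;
}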

@@ -75,9 +75,10 @@ number of times a page is mapped.
 20. NOPAGE
 21. KSM
 22. THP
-23. BALLOON
+23. OFFLINE
 24. ZERO_PAGE
 25. IDLE
+26. PGTABLE
 * ``/proc/kpagecgroup``. This file contains a 64-bit inode number of the
 memory cgroup each page is charged to, indexed by PFN. Only available when
@@ -118,8 +119,8 @@ Short descriptions to the page flags
 identical memory pages dynamically shared between one or more processes
 22 - THP
 contiguous pages which construct transparent hugepages
-23 - BALLOON
-balloon compaction page
+23 - OFFLINE
+page is logically offline
 24 - ZERO_PAGE
 zero page for pfn_zero or huge_zero page
 25 - IDLE
@@ -128,6 +129,8 @@ Short descriptions to the page flags
 Note that this flag may be stale in case the page was accessed via
 a PTE. To make sure the flag is up-to-date one has to read
 ``/sys/kernel/mm/page_idle/bitmap`` first.
+26 - PGTABLE
+page is in use as a page table
 IO related page flags
 ---------------------
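
The numbers documented above are bit positions in the 64-bit words exported by /proc/kpageflags. A minimal sketch (not from this commit; the PFN is an example and reading requires privileges) of testing the OFFLINE (23) and PGTABLE (26) bits for one page:

#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	uint64_t pfn = 0x1000;	/* example PFN */
	uint64_t flags;
	int fd = open("/proc/kpageflags", O_RDONLY);

	if (fd < 0)
		return 1;
	/* One 64-bit flags word per PFN, indexed by PFN. */
	if (pread(fd, &flags, sizeof(flags), pfn * sizeof(flags)) == (ssize_t)sizeof(flags))
		printf("pfn 0x%llx: offline=%d pgtable=%d\n",
		       (unsigned long long)pfn,
		       (int)((flags >> 23) & 1), (int)((flags >> 26) & 1));
	close(fd);
	return 0;
}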

@@ -107,9 +107,9 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
 8. LRU
 Each memcg has its own private LRU. Now, its handling is under global
-VM's control (means that it's handled under global zone_lru_lock).
+VM's control (means that it's handled under global pgdat->lru_lock).
 Almost all routines around memcg's LRU is called by global LRU's
-list management functions under zone_lru_lock().
+list management functions under pgdat->lru_lock.
 A special function is mem_cgroup_isolate_pages(). This scans
 memcg's private LRU and call __isolate_lru_page() to extract a page

@@ -267,11 +267,11 @@ When oom event notifier is registered, event will be delivered.
 Other lock order is following:
 PG_locked.
 mm->page_table_lock
-zone_lru_lock
+pgdat->lru_lock
 lock_page_cgroup.
 In many cases, just lock_page_cgroup() is called.
 per-zone-per-cgroup LRU (cgroup's private LRU) is just guarded by
-zone_lru_lock, it has no lock of its own.
+pgdat->lru_lock, it has no lock of its own.
 2.7 Kernel Memory Extension (CONFIG_MEMCG_KMEM)

@@ -9835,6 +9835,14 @@ F: kernel/sched/membarrier.c
 F: include/uapi/linux/membarrier.h
 F: arch/powerpc/include/asm/membarrier.h
+MEMBLOCK
+M: Mike Rapoport <rppt@linux.ibm.com>
+L: linux-mm@kvack.org
+S: Maintained
+F: include/linux/memblock.h
+F: mm/memblock.c
+F: Documentation/core-api/boot-time-mm.rst
 MEMORY MANAGEMENT
 L: linux-mm@kvack.org
 W: http://www.linux-mm.org

@@ -4,6 +4,7 @@
 #include <linux/smp.h>
 #include <linux/threads.h>
+#include <linux/numa.h>
 #include <asm/machvec.h>
 #ifdef CONFIG_NUMA
@@ -29,7 +30,7 @@ static const struct cpumask *cpumask_of_node(int node)
 {
 int cpu;
-if (node == -1)
+if (node == NUMA_NO_NODE)
 return cpu_all_mask;
 cpumask_clear(&node_to_cpumask_map[node]);

@@ -1467,6 +1467,10 @@ config SYSVIPC_COMPAT
 def_bool y
 depends on COMPAT && SYSVIPC
+config ARCH_ENABLE_HUGEPAGE_MIGRATION
+def_bool y
+depends on HUGETLB_PAGE && MIGRATION
 menu "Power management options"
 source "kernel/power/Kconfig"

@@ -20,6 +20,11 @@
 #include <asm/page.h>
+#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
+#define arch_hugetlb_migration_supported arch_hugetlb_migration_supported
+extern bool arch_hugetlb_migration_supported(struct hstate *h);
+#endif
 #define __HAVE_ARCH_HUGE_PTEP_GET
 static inline pte_t huge_ptep_get(pte_t *ptep)
 {

@@ -80,11 +80,7 @@
 */
 #ifdef CONFIG_KASAN
 #define KASAN_SHADOW_SIZE (UL(1) << (VA_BITS - KASAN_SHADOW_SCALE_SHIFT))
-#ifdef CONFIG_KASAN_EXTRA
-#define KASAN_THREAD_SHIFT 2
-#else
 #define KASAN_THREAD_SHIFT 1
-#endif /* CONFIG_KASAN_EXTRA */
 #else
 #define KASAN_SHADOW_SIZE (0)
 #define KASAN_THREAD_SHIFT 0

@@ -321,7 +321,7 @@ void crash_post_resume(void)
 * but does not hold any data of loaded kernel image.
 *
 * Note that all the pages in crash dump kernel memory have been initially
-* marked as Reserved in kexec_reserve_crashkres_pages().
+* marked as Reserved as memory was allocated via memblock_reserve().
 *
 * In hibernation, the pages which are Reserved and yet "nosave" are excluded
 * from the hibernation iamge. crash_is_nosave() does thich check for crash
@@ -361,7 +361,6 @@ void crash_free_reserved_phys_range(unsigned long begin, unsigned long end)
 for (addr = begin; addr < end; addr += PAGE_SIZE) {
 page = phys_to_page(addr);
-ClearPageReserved(page);
 free_reserved_page(page);
 }
 }

@@ -27,6 +27,26 @@
 #include <asm/tlbflush.h>
 #include <asm/pgalloc.h>
+#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
+bool arch_hugetlb_migration_supported(struct hstate *h)
+{
+size_t pagesize = huge_page_size(h);
+switch (pagesize) {
+#ifdef CONFIG_ARM64_4K_PAGES
+case PUD_SIZE:
+#endif
+case PMD_SIZE:
+case CONT_PMD_SIZE:
+case CONT_PTE_SIZE:
+return true;
+}
+pr_warn("%s: unrecognized huge page size 0x%lx\n",
+__func__, pagesize);
+return false;
+}
+#endif
 int pmd_huge(pmd_t pmd)
 {
 return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);

@@ -118,35 +118,10 @@ static void __init reserve_crashkernel(void)
 crashk_res.start = crash_base;
 crashk_res.end = crash_base + crash_size - 1;
 }
-static void __init kexec_reserve_crashkres_pages(void)
-{
-#ifdef CONFIG_HIBERNATION
-phys_addr_t addr;
-struct page *page;
-if (!crashk_res.end)
-return;
-/*
-* To reduce the size of hibernation image, all the pages are
-* marked as Reserved initially.
-*/
-for (addr = crashk_res.start; addr < (crashk_res.end + 1);
-addr += PAGE_SIZE) {
-page = phys_to_page(addr);
-SetPageReserved(page);
-}
-#endif
-}
 #else
 static void __init reserve_crashkernel(void)
 {
 }
-static void __init kexec_reserve_crashkres_pages(void)
-{
-}
 #endif /* CONFIG_KEXEC_CORE */
 #ifdef CONFIG_CRASH_DUMP
@@ -586,8 +561,6 @@ void __init mem_init(void)
 /* this will put all unused low memory onto the freelists */
 memblock_free_all();
-kexec_reserve_crashkres_pages();
 mem_init_print_info(NULL);
 /*

@@ -120,7 +120,7 @@ static void __init setup_node_to_cpumask_map(void)
 }
 /* cpumask_of_node() will now work */
-pr_debug("Node to cpumask map for %d nodes\n", nr_node_ids);
+pr_debug("Node to cpumask map for %u nodes\n", nr_node_ids);
 }
 /*

@@ -74,7 +74,7 @@ void __init build_cpu_to_node_map(void)
 cpumask_clear(&node_to_cpu_mask[node]);
 for_each_possible_early_cpu(cpu) {
-node = -1;
+node = NUMA_NO_NODE;
 for (i = 0; i < NR_CPUS; ++i)
 if (cpu_physical_id(cpu) == node_cpuid[i].phys_id) {
 node = node_cpuid[i].nid;

@@ -583,17 +583,6 @@ pfm_put_task(struct task_struct *task)
 if (task != current) put_task_struct(task);
 }
-static inline void
-pfm_reserve_page(unsigned long a)
-{
-SetPageReserved(vmalloc_to_page((void *)a));
-}
-static inline void
-pfm_unreserve_page(unsigned long a)
-{
-ClearPageReserved(vmalloc_to_page((void*)a));
-}
 static inline unsigned long
 pfm_protect_ctx_ctxsw(pfm_context_t *x)
 {
@@ -816,44 +805,6 @@ pfm_reset_msgq(pfm_context_t *ctx)
 DPRINT(("ctx=%p msgq reset\n", ctx));
 }
-static void *
-pfm_rvmalloc(unsigned long size)
-{
-void *mem;
-unsigned long addr;
-size = PAGE_ALIGN(size);
-mem = vzalloc(size);
-if (mem) {
-//printk("perfmon: CPU%d pfm_rvmalloc(%ld)=%p\n", smp_processor_id(), size, mem);
-addr = (unsigned long)mem;
-while (size > 0) {
-pfm_reserve_page(addr);
-addr+=PAGE_SIZE;
-size-=PAGE_SIZE;
-}
-}
-return mem;
-}
-static void
-pfm_rvfree(void *mem, unsigned long size)
-{
-unsigned long addr;
-if (mem) {
-DPRINT(("freeing physical buffer @%p size=%lu\n", mem, size));
-addr = (unsigned long) mem;
-while ((long) size > 0) {
-pfm_unreserve_page(addr);
-addr+=PAGE_SIZE;
-size-=PAGE_SIZE;
-}
-vfree(mem);
-}
-return;
-}
 static pfm_context_t *
 pfm_context_alloc(int ctx_flags)
 {
@@ -1498,7 +1449,7 @@ pfm_free_smpl_buffer(pfm_context_t *ctx)
 /*
 * free the buffer
 */
-pfm_rvfree(ctx->ctx_smpl_hdr, ctx->ctx_smpl_size);
+vfree(ctx->ctx_smpl_hdr);
 ctx->ctx_smpl_hdr = NULL;
 ctx->ctx_smpl_size = 0UL;
@@ -2137,7 +2088,7 @@ doit:
 * All memory free operations (especially for vmalloc'ed memory)
 * MUST be done with interrupts ENABLED.
 */
-if (smpl_buf_addr) pfm_rvfree(smpl_buf_addr, smpl_buf_size);
+vfree(smpl_buf_addr);
 /*
 * return the memory used by the context
@@ -2266,10 +2217,8 @@ pfm_smpl_buffer_alloc(struct task_struct *task, struct file *filp, pfm_context_t
 /*
 * We do the easy to undo allocations first.
-*
-* pfm_rvmalloc(), clears the buffer, so there is no leak
 */
-smpl_buf = pfm_rvmalloc(size);
+smpl_buf = vzalloc(size);
 if (smpl_buf == NULL) {
 DPRINT(("Can't allocate sampling buffer\n"));
 return -ENOMEM;
@@ -2346,7 +2295,7 @@ pfm_smpl_buffer_alloc(struct task_struct *task, struct file *filp, pfm_context_t
 error:
 vm_area_free(vma);
 error_kmem:
-pfm_rvfree(smpl_buf, size);
+vfree(smpl_buf);
 return -ENOMEM;
 }

@@ -227,7 +227,7 @@ void __init setup_per_cpu_areas(void)
 * CPUs are put into groups according to node. Walk cpu_map
 * and create new groups at node boundaries.
 */
-prev_node = -1;
+prev_node = NUMA_NO_NODE;
 ai->nr_groups = 0;
 for (unit = 0; unit < nr_units; unit++) {
 cpu = cpu_map[unit];
@@ -435,7 +435,7 @@ static void __init *memory_less_node_alloc(int nid, unsigned long pernodesize)
 {
 void *ptr = NULL;
 u8 best = 0xff;
-int bestnode = -1, node, anynode = 0;
+int bestnode = NUMA_NO_NODE, node, anynode = 0;
 for_each_online_node(node) {
 if (node_isset(node, memory_less_mask))
@@ -447,7 +447,7 @@ static void __init *memory_less_node_alloc(int nid, unsigned long pernodesize)
 anynode = node;
 }
-if (bestnode == -1)
+if (bestnode == NUMA_NO_NODE)
 bestnode = anynode;
 ptr = memblock_alloc_try_nid(pernodesize, PERCPU_PAGE_SIZE,

@@ -51,7 +51,7 @@ void __init init_pointer_table(unsigned long ptable)
 pr_debug("init_pointer_table: %lx, %x\n", ptable, PD_MARKBITS(dp));
 /* unreserve the page so it's possible to free that page */
-PD_PAGE(dp)->flags &= ~(1 << PG_reserved);
+__ClearPageReserved(PD_PAGE(dp));
 init_page_count(PD_PAGE(dp));
 return;

@@ -13,6 +13,10 @@ radix__hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 unsigned long len, unsigned long pgoff,
 unsigned long flags);
+extern void radix__huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+pte_t old_pte, pte_t pte);
 static inline int hstate_get_psize(struct hstate *hstate)
 {
 unsigned long shift;
@@ -42,4 +46,12 @@ static inline bool gigantic_page_supported(void)
 /* hugepd entry valid bit */
 #define HUGEPD_VAL_BITS (0x8000000000000000UL)
+#define huge_ptep_modify_prot_start huge_ptep_modify_prot_start
+extern pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep);
+#define huge_ptep_modify_prot_commit huge_ptep_modify_prot_commit
+extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+pte_t old_pte, pte_t new_pte);
 #endif

@@ -1306,6 +1306,24 @@ static inline int pud_pfn(pud_t pud)
 BUILD_BUG();
 return 0;
 }
+#define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
+pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
+pte_t *, pte_t, pte_t);
+/*
+* Returns true for a R -> RW upgrade of pte
+*/
+static inline bool is_pte_rw_upgrade(unsigned long old_val, unsigned long new_val)
+{
+if (!(old_val & _PAGE_READ))
+return false;
+if ((!(old_val & _PAGE_WRITE)) && (new_val & _PAGE_WRITE))
+return true;
+return false;
+}
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */

@@ -127,6 +127,10 @@ extern void radix__ptep_set_access_flags(struct vm_area_struct *vma, pte_t *ptep
 pte_t entry, unsigned long address,
 int psize);
+extern void radix__ptep_modify_prot_commit(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+pte_t old_pte, pte_t pte);
 static inline unsigned long __radix_pte_update(pte_t *ptep, unsigned long clr,
 unsigned long set)
 {

@@ -10,6 +10,7 @@
 #include <linux/pci.h>
 #include <linux/list.h>
 #include <linux/ioport.h>
+#include <linux/numa.h>
 struct device_node;
@@ -265,7 +266,7 @@ extern int pcibios_map_io_space(struct pci_bus *bus);
 #ifdef CONFIG_NUMA
 #define PHB_SET_NODE(PHB, NODE) ((PHB)->node = (NODE))
 #else
-#define PHB_SET_NODE(PHB, NODE) ((PHB)->node = -1)
+#define PHB_SET_NODE(PHB, NODE) ((PHB)->node = NUMA_NO_NODE)
 #endif
 #endif /* CONFIG_PPC64 */

@@ -11,6 +11,7 @@
 #include <linux/export.h>
 #include <linux/memblock.h>
 #include <linux/sched/task.h>
+#include <linux/numa.h>
 #include <asm/lppaca.h>
 #include <asm/paca.h>
@@ -36,7 +37,7 @@ static void *__init alloc_paca_data(unsigned long size, unsigned long align,
 * which will put its paca in the right place.
 */
 if (cpu == boot_cpuid) {
-nid = -1;
+nid = NUMA_NO_NODE;
 memblock_set_bottom_up(true);
 } else {
 nid = early_cpu_to_node(cpu);

@@ -32,6 +32,7 @@
 #include <linux/vmalloc.h>
 #include <linux/slab.h>
 #include <linux/vgaarb.h>
+#include <linux/numa.h>
 #include <asm/processor.h>
 #include <asm/io.h>
@@ -132,7 +133,7 @@ struct pci_controller *pcibios_alloc_controller(struct device_node *dev)
 int nid = of_node_to_nid(dev);
 if (nid < 0 || !node_online(nid))
-nid = -1;
+nid = NUMA_NO_NODE;
 PHB_SET_NODE(phb, nid);
 }

@@ -798,7 +798,6 @@ static int __init vdso_init(void)
 BUG_ON(vdso32_pagelist == NULL);
 for (i = 0; i < vdso32_pages; i++) {
 struct page *pg = virt_to_page(vdso32_kbase + i*PAGE_SIZE);
-ClearPageReserved(pg);
 get_page(pg);
 vdso32_pagelist[i] = pg;
 }
@@ -812,7 +811,6 @@ static int __init vdso_init(void)
 BUG_ON(vdso64_pagelist == NULL);
 for (i = 0; i < vdso64_pages; i++) {
 struct page *pg = virt_to_page(vdso64_kbase + i*PAGE_SIZE);
-ClearPageReserved(pg);
 get_page(pg);
 vdso64_pagelist[i] = pg;
 }

@@ -121,3 +121,28 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 *ptep = __pte(new_pte & ~H_PAGE_BUSY);
 return 0;
 }
+pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep)
+{
+unsigned long pte_val;
+/*
+* Clear the _PAGE_PRESENT so that no hardware parallel update is
+* possible. Also keep the pte_present true so that we don't take
+* wrong fault.
+*/
+pte_val = pte_update(vma->vm_mm, addr, ptep,
+_PAGE_PRESENT, _PAGE_INVALID, 1);
+return __pte(pte_val);
+}
+void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
+pte_t *ptep, pte_t old_pte, pte_t pte)
+{
+if (radix_enabled())
+return radix__huge_ptep_modify_prot_commit(vma, addr, ptep,
+old_pte, pte);
+set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+}

@@ -90,3 +90,20 @@ radix__hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 return vm_unmapped_area(&info);
 }
+void radix__huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+pte_t old_pte, pte_t pte)
+{
+struct mm_struct *mm = vma->vm_mm;
+/*
+* To avoid NMMU hang while relaxing access we need to flush the tlb before
+* we set the new value.
+*/
+if (is_pte_rw_upgrade(pte_val(old_pte), pte_val(pte)) &&
+(atomic_read(&mm->context.copros) > 0))
+radix__flush_hugetlb_page(vma, addr);
+set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+}

@@ -21,6 +21,7 @@
 #include <linux/sizes.h>
 #include <asm/mmu_context.h>
 #include <asm/pte-walk.h>
+#include <linux/mm_inline.h>
 static DEFINE_MUTEX(mem_list_mutex);
@@ -34,8 +35,18 @@ struct mm_iommu_table_group_mem_t {
 atomic64_t mapped;
 unsigned int pageshift;
 u64 ua; /* userspace address */
-u64 entries; /* number of entries in hpas[] */
-u64 *hpas; /* vmalloc'ed */
+u64 entries; /* number of entries in hpas/hpages[] */
+/*
+* in mm_iommu_get we temporarily use this to store
+* struct page address.
+*
+* We need to convert ua to hpa in real mode. Make it
+* simpler by storing physical address.
+*/
+union {
+struct page **hpages; /* vmalloc'ed */
+phys_addr_t *hpas;
+};
 #define MM_IOMMU_TABLE_INVALID_HPA ((uint64_t)-1)
 u64 dev_hpa; /* Device memory base address */
 };
@@ -80,64 +91,13 @@ bool mm_iommu_preregistered(struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(mm_iommu_preregistered);
-/*
-* Taken from alloc_migrate_target with changes to remove CMA allocations
-*/
-struct page *new_iommu_non_cma_page(struct page *page, unsigned long private)
-{
-gfp_t gfp_mask = GFP_USER;
-struct page *new_page;
-if (PageCompound(page))
-return NULL;
-if (PageHighMem(page))
-gfp_mask |= __GFP_HIGHMEM;
-/*
-* We don't want the allocation to force an OOM if possibe
-*/
-new_page = alloc_page(gfp_mask | __GFP_NORETRY | __GFP_NOWARN);
-return new_page;
-}
-static int mm_iommu_move_page_from_cma(struct page *page)
-{
-int ret = 0;
-LIST_HEAD(cma_migrate_pages);
-/* Ignore huge pages for now */
-if (PageCompound(page))
-return -EBUSY;
-lru_add_drain();
-ret = isolate_lru_page(page);
-if (ret)
-return ret;
-list_add(&page->lru, &cma_migrate_pages);
-put_page(page); /* Drop the gup reference */
-ret = migrate_pages(&cma_migrate_pages, new_iommu_non_cma_page,
-NULL, 0, MIGRATE_SYNC, MR_CONTIG_RANGE);
-if (ret) {
-if (!list_empty(&cma_migrate_pages))
-putback_movable_pages(&cma_migrate_pages);
-}
-return 0;
-}
 static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
 unsigned long entries, unsigned long dev_hpa,
 struct mm_iommu_table_group_mem_t **pmem)
 {
 struct mm_iommu_table_group_mem_t *mem;
-long i, j, ret = 0, locked_entries = 0;
+long i, ret, locked_entries = 0;
 unsigned int pageshift;
-unsigned long flags;
-unsigned long cur_ua;
-struct page *page = NULL;
 mutex_lock(&mem_list_mutex);
@@ -187,62 +147,43 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
 goto unlock_exit;
 }
-for (i = 0; i < entries; ++i) {
-cur_ua = ua + (i << PAGE_SHIFT);
-if (1 != get_user_pages_fast(cur_ua,
-1/* pages */, 1/* iswrite */, &page)) {
-ret = -EFAULT;
-for (j = 0; j < i; ++j)
-put_page(pfn_to_page(mem->hpas[j] >>
-PAGE_SHIFT));
+down_read(&mm->mmap_sem);
+ret = get_user_pages_longterm(ua, entries, FOLL_WRITE, mem->hpages, NULL);
+up_read(&mm->mmap_sem);
+if (ret != entries) {
+/* free the reference taken */
+for (i = 0; i < ret; i++)
+put_page(mem->hpages[i]);
 vfree(mem->hpas);
 kfree(mem);
-goto unlock_exit;
-}
-/*
-* If we get a page from the CMA zone, since we are going to
-* be pinning these entries, we might as well move them out
-* of the CMA zone if possible. NOTE: faulting in + migration
-* can be expensive. Batching can be considered later
-*/
-if (is_migrate_cma_page(page)) {
-if (mm_iommu_move_page_from_cma(page))
-goto populate;
-if (1 != get_user_pages_fast(cur_ua,
-1/* pages */, 1/* iswrite */,
-&page)) {
 ret = -EFAULT;
-for (j = 0; j < i; ++j)
-put_page(pfn_to_page(mem->hpas[j] >>
-PAGE_SHIFT));
-vfree(mem->hpas);
-kfree(mem);
 goto unlock_exit;
 }
-}
-populate:
 pageshift = PAGE_SHIFT;
-if (mem->pageshift > PAGE_SHIFT && PageCompound(page)) {
-pte_t *pte;
+for (i = 0; i < entries; ++i) {
+struct page *page = mem->hpages[i];
+/*
+* Allow to use larger than 64k IOMMU pages. Only do that
+* if we are backed by hugetlb.
+*/
+if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page)) {
 struct page *head = compound_head(page);
-unsigned int compshift = compound_order(head);
-unsigned int pteshift;
-local_irq_save(flags); /* disables as well */
-pte = find_linux_pte(mm->pgd, cur_ua, NULL, &pteshift);
-/* Double check it is still the same pinned page */
-if (pte && pte_page(*pte) == head &&
-pteshift == compshift + PAGE_SHIFT)
-pageshift = max_t(unsigned int, pteshift,
-PAGE_SHIFT);
-local_irq_restore(flags);
+pageshift = compound_order(head) + PAGE_SHIFT;
 }
 mem->pageshift = min(mem->pageshift, pageshift);
+/*
+* We don't need struct page reference any more, switch
+* to physical address.
+*/
 mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
 }
 good_exit:
+ret = 0;
 atomic64_set(&mem->mapped, 1);
 mem->used = 1;
 mem->ua = ua;

@@ -84,7 +84,7 @@ static void __init setup_node_to_cpumask_map(void)
 alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]);
 /* cpumask_of_node() will now work */
-dbg("Node to cpumask map for %d nodes\n", nr_node_ids);
+dbg("Node to cpumask map for %u nodes\n", nr_node_ids);
 }
 static int __init fake_numa_create_new_node(unsigned long end_pfn,
@@ -215,7 +215,7 @@ static void initialize_distance_lookup_table(int nid,
 */
 static int associativity_to_nid(const __be32 *associativity)
 {
-int nid = -1;
+int nid = NUMA_NO_NODE;
 if (min_common_depth == -1)
 goto out;
@@ -225,7 +225,7 @@ static int associativity_to_nid(const __be32 *associativity)
 /* POWER4 LPAR uses 0xffff as invalid node */
 if (nid == 0xffff || nid >= MAX_NUMNODES)
-nid = -1;
+nid = NUMA_NO_NODE;
 if (nid > 0 &&
 of_read_number(associativity, 1) >= distance_ref_points_depth) {
@@ -244,7 +244,7 @@ out:
 */
 static int of_node_to_nid_single(struct device_node *device)
 {
-int nid = -1;
+int nid = NUMA_NO_NODE;
 const __be32 *tmp;
 tmp = of_get_associativity(device);
@@ -256,7 +256,7 @@ static int of_node_to_nid_single(struct device_node *device)
 /* Walk the device tree upwards, looking for an associativity id */
 int of_node_to_nid(struct device_node *device)
 {
-int nid = -1;
+int nid = NUMA_NO_NODE;
 of_node_get(device);
 while (device) {
@@ -454,7 +454,7 @@ static int of_drconf_to_nid_single(struct drmem_lmb *lmb)
 */
 static int numa_setup_cpu(unsigned long lcpu)
 {
-int nid = -1;
+int nid = NUMA_NO_NODE;
 struct device_node *cpu;
 /*
@@ -930,7 +930,7 @@ static int hot_add_drconf_scn_to_nid(unsigned long scn_addr)
 {
 struct drmem_lmb *lmb;
 unsigned long lmb_size;
-int nid = -1;
+int nid = NUMA_NO_NODE;
 lmb_size = drmem_lmb_size();
@@ -960,7 +960,7 @@ static int hot_add_drconf_scn_to_nid(unsigned long scn_addr)
 static int hot_add_node_scn_to_nid(unsigned long scn_addr)
 {
 struct device_node *memory;
-int nid = -1;
+int nid = NUMA_NO_NODE;
 for_each_node_by_type(memory, "memory") {
 unsigned long start, size;

@@ -401,6 +401,31 @@ void arch_report_meminfo(struct seq_file *m)
 }
 #endif /* CONFIG_PROC_FS */
+pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
+pte_t *ptep)
+{
+unsigned long pte_val;
+/*
+* Clear the _PAGE_PRESENT so that no hardware parallel update is
+* possible. Also keep the pte_present true so that we don't take
+* wrong fault.
+*/
+pte_val = pte_update(vma->vm_mm, addr, ptep, _PAGE_PRESENT, _PAGE_INVALID, 0);
+return __pte(pte_val);
+}
+void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
+pte_t *ptep, pte_t old_pte, pte_t pte)
+{
+if (radix_enabled())
+return radix__ptep_modify_prot_commit(vma, addr,
+ptep, old_pte, pte);
+set_pte_at(vma->vm_mm, addr, ptep, pte);
+}
 /*
 * For hash translation mode, we use the deposited table to store hash slot
 * information and they are stored at PTRS_PER_PMD offset from related pmd

@@ -1063,3 +1063,21 @@ void radix__ptep_set_access_flags(struct vm_area_struct *vma, pte_t *ptep,
 }
 /* See ptesync comment in radix__set_pte_at */
 }
+void radix__ptep_modify_prot_commit(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+pte_t old_pte, pte_t pte)
+{
+struct mm_struct *mm = vma->vm_mm;
+/*
+* To avoid NMMU hang while relaxing access we need to flush the tlb before
+* we set the new value. We need to do this only for radix, because hash
+* translation does flush when updating the linux pte.
+*/
+if (is_pte_rw_upgrade(pte_val(old_pte), pte_val(pte)) &&
+(atomic_read(&mm->context.copros) > 0))
+radix__flush_tlb_page(vma, addr);
+set_pte_at(mm, addr, ptep, pte);
+}

@@ -20,6 +20,7 @@
 #include <linux/slab.h>
 #include <linux/memory.h>
 #include <linux/memory_hotplug.h>
+#include <linux/numa.h>
 #include <asm/machdep.h>
 #include <asm/debugfs.h>
@@ -223,7 +224,7 @@ static int memtrace_online(void)
 ent = &memtrace_array[i];
 /* We have onlined this chunk previously */
-if (ent->nid == -1)
+if (ent->nid == NUMA_NO_NODE)
 continue;
 /* Remove from io mappings */
@@ -257,7 +258,7 @@ static int memtrace_online(void)
 */
 debugfs_remove_recursive(ent->dir);
 pr_info("Added trace memory back to node %d\n", ent->nid);
-ent->size = ent->start = ent->nid = -1;
+ent->size = ent->start = ent->nid = NUMA_NO_NODE;
 }
 if (ret)
 return ret;

@@ -54,7 +54,6 @@ static int __init vdso_init(void)
 struct page *pg;
 pg = virt_to_page(vdso_start + (i << PAGE_SHIFT));
-ClearPageReserved(pg);
 vdso_pagelist[i] = pg;
 }
 vdso_pagelist[i] = virt_to_page(vdso_data);

@@ -1069,8 +1069,9 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 }
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
-pte_t ptep_modify_prot_start(struct mm_struct *, unsigned long, pte_t *);
-void ptep_modify_prot_commit(struct mm_struct *, unsigned long, pte_t *, pte_t);
+pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
+pte_t *, pte_t, pte_t);
 #define __HAVE_ARCH_PTEP_CLEAR_FLUSH
 static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,

@@ -291,7 +291,6 @@ static int __init vdso_init(void)
 BUG_ON(vdso32_pagelist == NULL);
 for (i = 0; i < vdso32_pages - 1; i++) {
 struct page *pg = virt_to_page(vdso32_kbase + i*PAGE_SIZE);
-ClearPageReserved(pg);
 get_page(pg);
 vdso32_pagelist[i] = pg;
 }
@@ -309,7 +308,6 @@ static int __init vdso_init(void)
 BUG_ON(vdso64_pagelist == NULL);
 for (i = 0; i < vdso64_pages - 1; i++) {
 struct page *pg = virt_to_page(vdso64_kbase + i*PAGE_SIZE);
-ClearPageReserved(pg);
 get_page(pg);
 vdso64_pagelist[i] = pg;
 }

@@ -301,12 +301,13 @@ pte_t ptep_xchg_lazy(struct mm_struct *mm, unsigned long addr,
 }
 EXPORT_SYMBOL(ptep_xchg_lazy);
-pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr,
+pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
 pte_t *ptep)
 {
 pgste_t pgste;
 pte_t old;
 int nodat;
+struct mm_struct *mm = vma->vm_mm;
 preempt_disable();
 pgste = ptep_xchg_start(mm, addr, ptep);
@@ -319,10 +320,11 @@ pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr,
 return old;
 }
-void ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
-pte_t *ptep, pte_t pte)
+void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
+pte_t *ptep, pte_t old_pte, pte_t pte)
 {
 pgste_t pgste;
+struct mm_struct *mm = vma->vm_mm;
 if (!MACHINE_HAS_NX)
 pte_val(pte) &= ~_PAGE_NOEXEC;

@@ -13,10 +13,10 @@ emit() {
 t_entry="$3"
 while [ $t_nxt -lt $t_nr ]; do
-printf "__SYSCALL(%s, sys_ni_syscall, )\n" "${t_nxt}"
+printf "__SYSCALL(%s,sys_ni_syscall)\n" "${t_nxt}"
 t_nxt=$((t_nxt+1))
 done
-printf "__SYSCALL(%s, %s, )\n" "${t_nxt}" "${t_entry}"
+printf "__SYSCALL(%s,%s)\n" "${t_nxt}" "${t_entry}"
 }
 grep -E "^[0-9A-Fa-fXx]+[[:space:]]+${my_abis}" "$in" | sort -n | (

@@ -10,7 +10,7 @@
 #include <linux/sys.h>
 #include <linux/linkage.h>
-#define __SYSCALL(nr, entry, nargs) .long entry
+#define __SYSCALL(nr, entry) .long entry
 .data
 ENTRY(sys_call_table)
 #include <asm/syscall_table.h>

@@ -11,6 +11,7 @@
 #include <linux/export.h>
 #include <linux/irq.h>
 #include <linux/of_device.h>
+#include <linux/numa.h>
 #include <asm/prom.h>
 #include <asm/irq.h>
@@ -416,7 +417,7 @@ static int pci_fire_pbm_init(struct pci_pbm_info *pbm,
 struct device_node *dp = op->dev.of_node;
 int err;
-pbm->numa_node = -1;
+pbm->numa_node = NUMA_NO_NODE;
 pbm->pci_ops = &sun4u_pci_ops;
 pbm->config_space_reg_bits = 12;

@@ -12,6 +12,7 @@
 #include <linux/export.h>
 #include <linux/interrupt.h>
 #include <linux/of_device.h>
+#include <linux/numa.h>
 #include <asm/iommu.h>
 #include <asm/irq.h>
@@ -1347,7 +1348,7 @@ static int schizo_pbm_init(struct pci_pbm_info *pbm,
 pbm->next = pci_pbm_root;
 pci_pbm_root = pbm;
-pbm->numa_node = -1;
+pbm->numa_node = NUMA_NO_NODE;
 pbm->pci_ops = &sun4u_pci_ops;
 pbm->config_space_reg_bits = 8;

@@ -5,6 +5,7 @@
 */
 #include <linux/kernel.h>
 #include <linux/interrupt.h>
+#include <linux/numa.h>
 #include <asm/upa.h>
@@ -454,7 +455,7 @@ void psycho_pbm_init_common(struct pci_pbm_info *pbm, struct platform_device *op
 struct device_node *dp = op->dev.of_node;
 pbm->name = dp->full_name;
-pbm->numa_node = -1;
+pbm->numa_node = NUMA_NO_NODE;
 pbm->chip_type = chip_type;
 pbm->chip_version = of_getintprop_default(dp, "version#", 0);
 pbm->chip_revision = of_getintprop_default(dp, "module-revision#", 0);

@@ -15,6 +15,7 @@
 #include <linux/interrupt.h>
 #include <linux/of.h>
 #include <linux/of_device.h>
+#include <linux/numa.h>
 #include <asm/page.h>
 #include <asm/io.h>
@@ -561,7 +562,7 @@ static void __init sbus_iommu_init(struct platform_device *op)
 op->dev.archdata.iommu = iommu;
 op->dev.archdata.stc = strbuf;
-op->dev.archdata.numa_node = -1;
+op->dev.archdata.numa_node = NUMA_NO_NODE;
 reg_base = regs + SYSIO_IOMMUREG_BASE;
 iommu->iommu_control = reg_base + IOMMU_CONTROL;

@@ -976,13 +976,13 @@ static u64 __init memblock_nid_range_sun4u(u64 start, u64 end, int *nid)
 {
 int prev_nid, new_nid;
-prev_nid = -1;
+prev_nid = NUMA_NO_NODE;
 for ( ; start < end; start += PAGE_SIZE) {
 for (new_nid = 0; new_nid < num_node_masks; new_nid++) {
 struct node_mem_mask *p = &node_masks[new_nid];
 if ((start & p->mask) == p->match) {
-if (prev_nid == -1)
+if (prev_nid == NUMA_NO_NODE)
 prev_nid = new_nid;
 break;
 }
@@ -1208,7 +1208,7 @@ int of_node_to_nid(struct device_node *dp)
 md = mdesc_grab();
 count = 0;
-nid = -1;
+nid = NUMA_NO_NODE;
 mdesc_for_each_node_by_name(md, grp, "group") {
 if (!scan_arcs_for_cfg_handle(md, grp, cfg_handle)) {
 nid = count;

@@ -422,25 +422,26 @@ static inline pgdval_t pgd_val(pgd_t pgd)
 }
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
-static inline pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr,
+static inline pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
 pte_t *ptep)
 {
 pteval_t ret;
-ret = PVOP_CALL3(pteval_t, mmu.ptep_modify_prot_start, mm, addr, ptep);
+ret = PVOP_CALL3(pteval_t, mmu.ptep_modify_prot_start, vma, addr, ptep);
 return (pte_t) { .pte = ret };
 }
-static inline void ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
-pte_t *ptep, pte_t pte)
+static inline void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
+pte_t *ptep, pte_t old_pte, pte_t pte)
 {
 if (sizeof(pteval_t) > sizeof(long))
 /* 5 arg words */
-pv_ops.mmu.ptep_modify_prot_commit(mm, addr, ptep, pte);
+pv_ops.mmu.ptep_modify_prot_commit(vma, addr, ptep, pte);
 else
 PVOP_VCALL4(mmu.ptep_modify_prot_commit,
-mm, addr, ptep, pte.pte);
+vma, addr, ptep, pte.pte);
 }
 static inline void set_pte(pte_t *ptep, pte_t pte)

@@ -55,6 +55,7 @@ struct task_struct;
 struct cpumask;
 struct flush_tlb_info;
 struct mmu_gather;
+struct vm_area_struct;
 /*
 * Wrapper type for pointers to code which uses the non-standard
@@ -254,9 +255,9 @@ struct pv_mmu_ops {
 pte_t *ptep, pte_t pteval);
 void (*set_pmd)(pmd_t *pmdp, pmd_t pmdval);
-pte_t (*ptep_modify_prot_start)(struct mm_struct *mm, unsigned long addr,
+pte_t (*ptep_modify_prot_start)(struct vm_area_struct *vma, unsigned long addr,
 pte_t *ptep);
-void (*ptep_modify_prot_commit)(struct mm_struct *mm, unsigned long addr,
+void (*ptep_modify_prot_commit)(struct vm_area_struct *vma, unsigned long addr,
 pte_t *ptep, pte_t pte);
 struct paravirt_callee_save pte_val;

@@ -7,6 +7,7 @@
 #include <linux/slab.h>
 #include <linux/string.h>
 #include <linux/scatterlist.h>
+#include <linux/numa.h>
 #include <asm/io.h>
 #include <asm/pat.h>
 #include <asm/x86_init.h>
@@ -141,7 +142,7 @@ cpumask_of_pcibus(const struct pci_bus *bus)
 int node;
 node = __pcibus_to_node(bus);
-return (node == -1) ? cpu_online_mask :
+return (node == NUMA_NO_NODE) ? cpu_online_mask :
 cpumask_of_node(node);
 }
 #endif

@@ -75,7 +75,7 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un
 #endif
 /**
-* access_ok: - Checks if a user space pointer is valid
+* access_ok - Checks if a user space pointer is valid
 * @addr: User space pointer to start of block to check
 * @size: Size of block to check
 *
@@ -84,12 +84,12 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un
 *
 * Checks if a pointer to a block of memory in user space is valid.
 *
-* Returns true (nonzero) if the memory block may be valid, false (zero)
-* if it is definitely invalid.
-*
 * Note that, depending on architecture, this function probably just
 * checks that the pointer is in the user space range - after calling
 * this function, memory access functions may still return -EFAULT.
+*
+* Return: true (nonzero) if the memory block may be valid, false (zero)
+* if it is definitely invalid.
 */
 #define access_ok(addr, size) \
 ({ \
@@ -134,7 +134,7 @@ extern int __get_user_bad(void);
 __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
 /**
-* get_user: - Get a simple variable from user space.
+* get_user - Get a simple variable from user space.
 * @x: Variable to store result.
 * @ptr: Source address, in user space.
 *
@@ -148,7 +148,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
 * @ptr must have pointer-to-simple-variable type, and the result of
 * dereferencing @ptr must be assignable to @x without a cast.
 *
-* Returns zero on success, or -EFAULT on error.
+* Return: zero on success, or -EFAULT on error.
 * On error, the variable @x is set to zero.
 */
 /*
@@ -226,7 +226,7 @@ extern void __put_user_4(void);
 extern void __put_user_8(void);
 /**
-* put_user: - Write a simple value into user space.
+* put_user - Write a simple value into user space.
 * @x: Value to copy to user space.
 * @ptr: Destination address, in user space.
 *
@@ -240,7 +240,7 @@ extern void __put_user_8(void);
 * @ptr must have pointer-to-simple-variable type, and @x must be assignable
 * to the result of dereferencing @ptr.
 *
-* Returns zero on success, or -EFAULT on error.
+* Return: zero on success, or -EFAULT on error.
 */
 #define put_user(x, ptr) \
 ({ \
@@ -502,7 +502,7 @@ struct __large_struct { unsigned long buf[100]; };
 } while (0)
 /**
-* __get_user: - Get a simple variable from user space, with less checking.
+* __get_user - Get a simple variable from user space, with less checking.
 * @x: Variable to store result.
 * @ptr: Source address, in user space.
 *
@@ -519,7 +519,7 @@ struct __large_struct { unsigned long buf[100]; };
 * Caller must check the pointer with access_ok() before calling this
 * function.
 *
-* Returns zero on success, or -EFAULT on error.
+* Return: zero on success, or -EFAULT on error.
 * On error, the variable @x is set to zero.
 */
@@ -527,7 +527,7 @@ struct __large_struct { unsigned long buf[100]; };
 __get_user_nocheck((x), (ptr), sizeof(*(ptr)))
 /**
-* __put_user: - Write a simple value into user space, with less checking.
+* __put_user - Write a simple value into user space, with less checking.
 * @x: Value to copy to user space.
 * @ptr: Destination address, in user space.
 *
@@ -544,7 +544,7 @@ struct __large_struct { unsigned long buf[100]; };
 * Caller must check the pointer with access_ok() before calling this
 * function.
 *
-* Returns zero on success, or -EFAULT on error.
+* Return: zero on success, or -EFAULT on error.
 */
 #define __put_user(x, ptr) \
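
The kernel-doc blocks touched above describe the x86 user-access helpers. As a hedged illustration only (a hypothetical driver ioctl handler, not code from this commit), the documented return conventions are typically used like this:

#include <linux/fs.h>
#include <linux/uaccess.h>

/* Hypothetical example: read a value from user space, double it, write it back. */
static long example_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
	int __user *uptr = (int __user *)arg;
	int val;

	if (get_user(val, uptr))	/* Return: zero on success, or -EFAULT on error */
		return -EFAULT;
	val *= 2;
	if (put_user(val, uptr))
		return -EFAULT;
	return 0;
}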

@@ -27,6 +27,7 @@
 #include <linux/crash_dump.h>
 #include <linux/reboot.h>
 #include <linux/memory.h>
+#include <linux/numa.h>
 #include <asm/uv/uv_mmrs.h>
 #include <asm/uv/uv_hub.h>
@@ -1390,7 +1391,7 @@ static void __init build_socket_tables(void)
 }
 /* Set socket -> node values: */
-lnid = -1;
+lnid = NUMA_NO_NODE;
 for_each_present_cpu(cpu) {
 int nid = cpu_to_node(cpu);
 int apicid, sockid;
@@ -1521,7 +1522,7 @@ static void __init uv_system_init_hub(void)
 new_hub->pnode = 0xffff;
 new_hub->numa_blade_id = uv_node_to_blade_id(nodeid);
-new_hub->memory_nid = -1;
+new_hub->memory_nid = NUMA_NO_NODE;
 new_hub->nr_possible_cpus = 0;
 new_hub->nr_online_cpus = 0;
 }
@@ -1538,7 +1539,7 @@
 uv_cpu_info_per(cpu)->p_uv_hub_info = uv_hub_info_list(nodeid);
 uv_cpu_info_per(cpu)->blade_cpu_id = uv_cpu_hub_info(cpu)->nr_possible_cpus++;
-if (uv_cpu_hub_info(cpu)->memory_nid == -1)
+if (uv_cpu_hub_info(cpu)->memory_nid == NUMA_NO_NODE)
 uv_cpu_hub_info(cpu)->memory_nid = cpu_to_node(cpu);
 /* Init memoryless node: */

@@ -171,7 +171,7 @@ void __init setup_per_cpu_areas(void)
 unsigned long delta;
 int rc;
-pr_info("NR_CPUS:%d nr_cpumask_bits:%d nr_cpu_ids:%u nr_node_ids:%d\n",
+pr_info("NR_CPUS:%d nr_cpumask_bits:%d nr_cpu_ids:%u nr_node_ids:%u\n",
 NR_CPUS, nr_cpumask_bits, nr_cpu_ids, nr_node_ids);
 /*

@@ -56,6 +56,7 @@
 #include <linux/stackprotector.h>
 #include <linux/gfp.h>
 #include <linux/cpuidle.h>
+#include <linux/numa.h>
 #include <asm/acpi.h>
 #include <asm/desc.h>
@@ -841,7 +842,7 @@ wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip)
 /* reduce the number of lines printed when booting a large cpu count system */
 static void announce_cpu(int cpu, int apicid)
 {
-static int current_node = -1;
+static int current_node = NUMA_NO_NODE;
 int node = early_cpu_to_node(cpu);
 static int width, node_width;

@@ -54,13 +54,13 @@ do { \
 } while (0)
 /**
-* clear_user: - Zero a block of memory in user space.
+* clear_user - Zero a block of memory in user space.
 * @to: Destination address, in user space.
 * @n: Number of bytes to zero.
 *
 * Zero a block of memory in user space.
 *
-* Returns number of bytes that could not be cleared.
+* Return: number of bytes that could not be cleared.
 * On success, this will be zero.
 */
 unsigned long
@@ -74,14 +74,14 @@ clear_user(void __user *to, unsigned long n)
 EXPORT_SYMBOL(clear_user);
 /**
-* __clear_user: - Zero a block of memory in user space, with less checking.
+* __clear_user - Zero a block of memory in user space, with less checking.
 * @to: Destination address, in user space.
 * @n: Number of bytes to zero.
 *
 * Zero a block of memory in user space. Caller must check
 * the specified block with access_ok() before calling this function.
 *
-* Returns number of bytes that could not be cleared.
+* Return: number of bytes that could not be cleared.
 * On success, this will be zero.
 */
 unsigned long

@@ -123,7 +123,7 @@ void __init setup_node_to_cpumask_map(void)
 alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]);
 /* cpumask_of_node() will now work */
-pr_debug("Node to cpumask map for %d nodes\n", nr_node_ids);
+pr_debug("Node to cpumask map for %u nodes\n", nr_node_ids);
 }
 static int __init numa_add_memblk_to(int nid, u64 start, u64 end,
@@ -866,7 +866,7 @@ const struct cpumask *cpumask_of_node(int node)
 {
 if (node >= nr_node_ids) {
 printk(KERN_WARNING
-"cpumask_of_node(%d): node > nr_node_ids(%d)\n",
+"cpumask_of_node(%d): node > nr_node_ids(%u)\n",
 node, nr_node_ids);
 dump_stack();
 return cpu_none_mask;

@@ -17,8 +17,8 @@ bool __set_phys_to_machine(unsigned long pfn, unsigned long mfn);
 void set_pte_mfn(unsigned long vaddr, unsigned long pfn, pgprot_t flags);
-pte_t xen_ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
-void xen_ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
+pte_t xen_ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep);
+void xen_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
 pte_t *ptep, pte_t pte);
 unsigned long xen_read_cr2_direct(void);

@@ -306,20 +306,20 @@ static void xen_set_pte_at(struct mm_struct *mm, unsigned long addr,
 __xen_set_pte(ptep, pteval);
 }
-pte_t xen_ptep_modify_prot_start(struct mm_struct *mm,
+pte_t xen_ptep_modify_prot_start(struct vm_area_struct *vma,
 unsigned long addr, pte_t *ptep)
 {
 /* Just return the pte as-is. We preserve the bits on commit */
-trace_xen_mmu_ptep_modify_prot_start(mm, addr, ptep, *ptep);
+trace_xen_mmu_ptep_modify_prot_start(vma->vm_mm, addr, ptep, *ptep);
 return *ptep;
 }
-void xen_ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
+void xen_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
 pte_t *ptep, pte_t pte)
 {
 struct mmu_update u;
-trace_xen_mmu_ptep_modify_prot_commit(mm, addr, ptep, pte);
+trace_xen_mmu_ptep_modify_prot_commit(vma->vm_mm, addr, ptep, pte);
 xen_mc_batch();
 u.ptr = virt_to_machine(ptep).maddr | MMU_PT_UPDATE_PRESERVE_AD;


@ -40,6 +40,7 @@
#include <linux/export.h> #include <linux/export.h>
#include <linux/debugfs.h> #include <linux/debugfs.h>
#include <linux/prefetch.h> #include <linux/prefetch.h>
#include <linux/numa.h>
#include "mtip32xx.h" #include "mtip32xx.h"
#define HW_CMD_SLOT_SZ (MTIP_MAX_COMMAND_SLOTS * 32) #define HW_CMD_SLOT_SZ (MTIP_MAX_COMMAND_SLOTS * 32)
@ -4018,9 +4019,9 @@ static int get_least_used_cpu_on_node(int node)
/* Helper for selecting a node in round robin mode */ /* Helper for selecting a node in round robin mode */
static inline int mtip_get_next_rr_node(void) static inline int mtip_get_next_rr_node(void)
{ {
static int next_node = -1; static int next_node = NUMA_NO_NODE;
if (next_node == -1) { if (next_node == NUMA_NO_NODE) {
next_node = first_online_node; next_node = first_online_node;
return next_node; return next_node;
} }
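
Several hunks in this series replace a literal -1 node id with NUMA_NO_NODE from <linux/numa.h>. A minimal sketch of the idiom, mirroring the mtip32xx change above (illustrative only; cached_node is a hypothetical name):

	#include <linux/numa.h>

	static int cached_node = NUMA_NO_NODE;		/* "no node chosen yet" */

	if (cached_node == NUMA_NO_NODE)
		cached_node = first_online_node;	/* lazily pick a real node */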


@ -163,7 +163,6 @@ static int efficeon_free_gatt_table(struct agp_bridge_data *bridge)
unsigned long page = efficeon_private.l1_table[index]; unsigned long page = efficeon_private.l1_table[index];
if (page) { if (page) {
efficeon_private.l1_table[index] = 0; efficeon_private.l1_table[index] = 0;
ClearPageReserved(virt_to_page((char *)page));
free_page(page); free_page(page);
freed++; freed++;
} }
@ -219,7 +218,6 @@ static int efficeon_create_gatt_table(struct agp_bridge_data *bridge)
efficeon_free_gatt_table(agp_bridge); efficeon_free_gatt_table(agp_bridge);
return -ENOMEM; return -ENOMEM;
} }
SetPageReserved(virt_to_page((char *)page));
for (offset = 0; offset < PAGE_SIZE; offset += clflush_chunk) for (offset = 0; offset < PAGE_SIZE; offset += clflush_chunk)
clflush((char *)page+offset); clflush((char *)page+offset);


@ -63,6 +63,7 @@
#include <linux/acpi_dma.h> #include <linux/acpi_dma.h>
#include <linux/of_dma.h> #include <linux/of_dma.h>
#include <linux/mempool.h> #include <linux/mempool.h>
#include <linux/numa.h>
static DEFINE_MUTEX(dma_list_mutex); static DEFINE_MUTEX(dma_list_mutex);
static DEFINE_IDA(dma_ida); static DEFINE_IDA(dma_ida);
@ -386,7 +387,8 @@ EXPORT_SYMBOL(dma_issue_pending_all);
static bool dma_chan_is_local(struct dma_chan *chan, int cpu) static bool dma_chan_is_local(struct dma_chan *chan, int cpu)
{ {
int node = dev_to_node(chan->device->dev); int node = dev_to_node(chan->device->dev);
return node == -1 || cpumask_test_cpu(cpu, cpumask_of_node(node)); return node == NUMA_NO_NODE ||
cpumask_test_cpu(cpu, cpumask_of_node(node));
} }
/** /**


@ -123,12 +123,6 @@ static inline u64 ptr_to_u64(const void *ptr)
#include <linux/list.h> #include <linux/list.h>
static inline int list_is_first(const struct list_head *list,
const struct list_head *head)
{
return head->next == list;
}
static inline void __list_del_many(struct list_head *head, static inline void __list_del_many(struct list_head *head,
struct list_head *first) struct list_head *first)
{ {


@ -681,8 +681,13 @@ static struct notifier_block hv_memory_nb = {
/* Check if the particular page is backed and can be onlined and online it. */ /* Check if the particular page is backed and can be onlined and online it. */
static void hv_page_online_one(struct hv_hotadd_state *has, struct page *pg) static void hv_page_online_one(struct hv_hotadd_state *has, struct page *pg)
{ {
if (!has_pfn_is_backed(has, page_to_pfn(pg))) if (!has_pfn_is_backed(has, page_to_pfn(pg))) {
if (!PageOffline(pg))
__SetPageOffline(pg);
return; return;
}
if (PageOffline(pg))
__ClearPageOffline(pg);
/* This frame is currently backed; online the page. */ /* This frame is currently backed; online the page. */
__online_page_set_limits(pg); __online_page_set_limits(pg);
@ -771,7 +776,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
} }
} }
static void hv_online_page(struct page *pg) static void hv_online_page(struct page *pg, unsigned int order)
{ {
struct hv_hotadd_state *has; struct hv_hotadd_state *has;
unsigned long flags; unsigned long flags;
@ -780,10 +785,11 @@ static void hv_online_page(struct page *pg)
spin_lock_irqsave(&dm_device.ha_lock, flags); spin_lock_irqsave(&dm_device.ha_lock, flags);
list_for_each_entry(has, &dm_device.ha_region_list, list) { list_for_each_entry(has, &dm_device.ha_region_list, list) {
/* The page belongs to a different HAS. */ /* The page belongs to a different HAS. */
if ((pfn < has->start_pfn) || (pfn >= has->end_pfn)) if ((pfn < has->start_pfn) ||
(pfn + (1UL << order) > has->end_pfn))
continue; continue;
hv_page_online_one(has, pg); hv_bring_pgs_online(has, pfn, 1UL << order);
break; break;
} }
spin_unlock_irqrestore(&dm_device.ha_lock, flags); spin_unlock_irqrestore(&dm_device.ha_lock, flags);
@ -1201,6 +1207,7 @@ static void free_balloon_pages(struct hv_dynmem_device *dm,
for (i = 0; i < num_pages; i++) { for (i = 0; i < num_pages; i++) {
pg = pfn_to_page(i + start_frame); pg = pfn_to_page(i + start_frame);
__ClearPageOffline(pg);
__free_page(pg); __free_page(pg);
dm->num_pages_ballooned--; dm->num_pages_ballooned--;
} }
@ -1213,7 +1220,7 @@ static unsigned int alloc_balloon_pages(struct hv_dynmem_device *dm,
struct dm_balloon_response *bl_resp, struct dm_balloon_response *bl_resp,
int alloc_unit) int alloc_unit)
{ {
unsigned int i = 0; unsigned int i, j;
struct page *pg; struct page *pg;
if (num_pages < alloc_unit) if (num_pages < alloc_unit)
@ -1245,6 +1252,10 @@ static unsigned int alloc_balloon_pages(struct hv_dynmem_device *dm,
if (alloc_unit != 1) if (alloc_unit != 1)
split_page(pg, get_order(alloc_unit << PAGE_SHIFT)); split_page(pg, get_order(alloc_unit << PAGE_SHIFT));
/* mark all pages offline */
for (j = 0; j < (1 << get_order(alloc_unit << PAGE_SHIFT)); j++)
__SetPageOffline(pg + j);
bl_resp->range_count++; bl_resp->range_count++;
bl_resp->range_array[i].finfo.start_page = bl_resp->range_array[i].finfo.start_page =
page_to_pfn(pg); page_to_pfn(pg);


@ -48,6 +48,7 @@
#include <linux/cpumask.h> #include <linux/cpumask.h>
#include <linux/module.h> #include <linux/module.h>
#include <linux/interrupt.h> #include <linux/interrupt.h>
#include <linux/numa.h>
#include "hfi.h" #include "hfi.h"
#include "affinity.h" #include "affinity.h"
@ -777,7 +778,7 @@ void hfi1_dev_affinity_clean_up(struct hfi1_devdata *dd)
_dev_comp_vect_cpu_mask_clean_up(dd, entry); _dev_comp_vect_cpu_mask_clean_up(dd, entry);
unlock: unlock:
mutex_unlock(&node_affinity.lock); mutex_unlock(&node_affinity.lock);
dd->node = -1; dd->node = NUMA_NO_NODE;
} }
/* /*


@ -54,6 +54,7 @@
#include <linux/printk.h> #include <linux/printk.h>
#include <linux/hrtimer.h> #include <linux/hrtimer.h>
#include <linux/bitmap.h> #include <linux/bitmap.h>
#include <linux/numa.h>
#include <rdma/rdma_vt.h> #include <rdma/rdma_vt.h>
#include "hfi.h" #include "hfi.h"
@ -1303,7 +1304,7 @@ static struct hfi1_devdata *hfi1_alloc_devdata(struct pci_dev *pdev,
dd->unit = ret; dd->unit = ret;
list_add(&dd->list, &hfi1_dev_list); list_add(&dd->list, &hfi1_dev_list);
} }
dd->node = -1; dd->node = NUMA_NO_NODE;
spin_unlock_irqrestore(&hfi1_devs_lock, flags); spin_unlock_irqrestore(&hfi1_devs_lock, flags);
idr_preload_end(); idr_preload_end();


@ -39,6 +39,7 @@
#include <linux/dmi.h> #include <linux/dmi.h>
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/iommu.h> #include <linux/iommu.h>
#include <linux/numa.h>
#include <asm/irq_remapping.h> #include <asm/irq_remapping.h>
#include <asm/iommu_table.h> #include <asm/iommu_table.h>
@ -477,7 +478,7 @@ static int dmar_parse_one_rhsa(struct acpi_dmar_header *header, void *arg)
int node = acpi_map_pxm_to_node(rhsa->proximity_domain); int node = acpi_map_pxm_to_node(rhsa->proximity_domain);
if (!node_online(node)) if (!node_online(node))
node = -1; node = NUMA_NO_NODE;
drhd->iommu->node = node; drhd->iommu->node = node;
return 0; return 0;
} }
@ -1062,7 +1063,7 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
iommu->msagaw = msagaw; iommu->msagaw = msagaw;
iommu->segment = drhd->segment; iommu->segment = drhd->segment;
iommu->node = -1; iommu->node = NUMA_NO_NODE;
ver = readl(iommu->reg + DMAR_VER_REG); ver = readl(iommu->reg + DMAR_VER_REG);
pr_info("%s: reg_base_addr %llx ver %d:%d cap %llx ecap %llx\n", pr_info("%s: reg_base_addr %llx ver %d:%d cap %llx ecap %llx\n",


@ -47,6 +47,7 @@
#include <linux/dma-contiguous.h> #include <linux/dma-contiguous.h>
#include <linux/dma-direct.h> #include <linux/dma-direct.h>
#include <linux/crash_dump.h> #include <linux/crash_dump.h>
#include <linux/numa.h>
#include <asm/irq_remapping.h> #include <asm/irq_remapping.h>
#include <asm/cacheflush.h> #include <asm/cacheflush.h>
#include <asm/iommu.h> #include <asm/iommu.h>
@ -1716,7 +1717,7 @@ static struct dmar_domain *alloc_domain(int flags)
return NULL; return NULL;
memset(domain, 0, sizeof(*domain)); memset(domain, 0, sizeof(*domain));
domain->nid = -1; domain->nid = NUMA_NO_NODE;
domain->flags = flags; domain->flags = flags;
domain->has_iotlb_device = false; domain->has_iotlb_device = false;
INIT_LIST_HEAD(&domain->devices); INIT_LIST_HEAD(&domain->devices);


@ -22,6 +22,7 @@
#include <linux/module.h> #include <linux/module.h>
#include <linux/err.h> #include <linux/err.h>
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/numa.h>
#include <asm/uv/uv_hub.h> #include <asm/uv/uv_hub.h>
#if defined CONFIG_X86_64 #if defined CONFIG_X86_64
#include <asm/uv/bios.h> #include <asm/uv/bios.h>
@ -61,7 +62,7 @@ static struct xpc_heartbeat_uv *xpc_heartbeat_uv;
XPC_NOTIFY_MSG_SIZE_UV) XPC_NOTIFY_MSG_SIZE_UV)
#define XPC_NOTIFY_IRQ_NAME "xpc_notify" #define XPC_NOTIFY_IRQ_NAME "xpc_notify"
static int xpc_mq_node = -1; static int xpc_mq_node = NUMA_NO_NODE;
static struct xpc_gru_mq_uv *xpc_activate_mq_uv; static struct xpc_gru_mq_uv *xpc_activate_mq_uv;
static struct xpc_gru_mq_uv *xpc_notify_mq_uv; static struct xpc_gru_mq_uv *xpc_notify_mq_uv;


@ -556,6 +556,36 @@ vmballoon_page_in_frames(enum vmballoon_page_size_type page_size)
return 1 << vmballoon_page_order(page_size); return 1 << vmballoon_page_order(page_size);
} }
/**
* vmballoon_mark_page_offline() - mark a page as offline
* @page: pointer for the page.
* @page_size: the size of the page.
*/
static void
vmballoon_mark_page_offline(struct page *page,
enum vmballoon_page_size_type page_size)
{
int i;
for (i = 0; i < vmballoon_page_in_frames(page_size); i++)
__SetPageOffline(page + i);
}
/**
* vmballoon_mark_page_online() - mark a page as online
* @page: pointer for the page.
* @page_size: the size of the page.
*/
static void
vmballoon_mark_page_online(struct page *page,
enum vmballoon_page_size_type page_size)
{
int i;
for (i = 0; i < vmballoon_page_in_frames(page_size); i++)
__ClearPageOffline(page + i);
}
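
The two helpers above must stay paired: every subpage of an inflated allocation is marked PageOffline, and the mark is cleared again before the pages go back to the page allocator. A hedged sketch of that pairing, condensed from the alloc/release paths changed below (page_size and order stand for the balloon's chosen page size and its matching allocation order):

	/* inflate: hide the pages from dump/offlining logic */
	page = alloc_pages(GFP_KERNEL | __GFP_NOWARN, order);
	if (page)
		vmballoon_mark_page_offline(page, page_size);

	/* deflate: clear the mark before handing the pages back */
	vmballoon_mark_page_online(page, page_size);
	__free_pages(page, order);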
/** /**
* vmballoon_send_get_target() - Retrieve desired balloon size from the host. * vmballoon_send_get_target() - Retrieve desired balloon size from the host.
* *
@ -612,6 +642,7 @@ static int vmballoon_alloc_page_list(struct vmballoon *b,
ctl->page_size); ctl->page_size);
if (page) { if (page) {
vmballoon_mark_page_offline(page, ctl->page_size);
/* Success. Add the page to the list and continue. */ /* Success. Add the page to the list and continue. */
list_add(&page->lru, &ctl->pages); list_add(&page->lru, &ctl->pages);
continue; continue;
@ -850,6 +881,7 @@ static void vmballoon_release_page_list(struct list_head *page_list,
list_for_each_entry_safe(page, tmp, page_list, lru) { list_for_each_entry_safe(page, tmp, page_list, lru) {
list_del(&page->lru); list_del(&page->lru);
vmballoon_mark_page_online(page, page_size);
__free_pages(page, vmballoon_page_order(page_size)); __free_pages(page, vmballoon_page_order(page_size));
} }


@ -27,6 +27,7 @@
#include <linux/bpf.h> #include <linux/bpf.h>
#include <linux/bpf_trace.h> #include <linux/bpf_trace.h>
#include <linux/atomic.h> #include <linux/atomic.h>
#include <linux/numa.h>
#include <scsi/fc/fc_fcoe.h> #include <scsi/fc/fc_fcoe.h>
#include <net/udp_tunnel.h> #include <net/udp_tunnel.h>
#include <net/pkt_cls.h> #include <net/pkt_cls.h>
@ -6418,7 +6419,7 @@ int ixgbe_setup_tx_resources(struct ixgbe_ring *tx_ring)
{ {
struct device *dev = tx_ring->dev; struct device *dev = tx_ring->dev;
int orig_node = dev_to_node(dev); int orig_node = dev_to_node(dev);
int ring_node = -1; int ring_node = NUMA_NO_NODE;
int size; int size;
size = sizeof(struct ixgbe_tx_buffer) * tx_ring->count; size = sizeof(struct ixgbe_tx_buffer) * tx_ring->count;
@ -6512,7 +6513,7 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter,
{ {
struct device *dev = rx_ring->dev; struct device *dev = rx_ring->dev;
int orig_node = dev_to_node(dev); int orig_node = dev_to_node(dev);
int ring_node = -1; int ring_node = NUMA_NO_NODE;
int size; int size;
size = sizeof(struct ixgbe_rx_buffer) * rx_ring->count; size = sizeof(struct ixgbe_rx_buffer) * rx_ring->count;


@ -369,14 +369,20 @@ static enum bp_state reserve_additional_memory(void)
return BP_ECANCELED; return BP_ECANCELED;
} }
static void xen_online_page(struct page *page) static void xen_online_page(struct page *page, unsigned int order)
{ {
__online_page_set_limits(page); unsigned long i, size = (1 << order);
unsigned long start_pfn = page_to_pfn(page);
struct page *p;
pr_debug("Online %lu pages starting at pfn 0x%lx\n", size, start_pfn);
mutex_lock(&balloon_mutex); mutex_lock(&balloon_mutex);
for (i = 0; i < size; i++) {
__balloon_append(page); p = pfn_to_page(start_pfn + i);
__online_page_set_limits(p);
__SetPageOffline(p);
__balloon_append(p);
}
mutex_unlock(&balloon_mutex); mutex_unlock(&balloon_mutex);
} }
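
Both the Hyper-V and Xen balloon hunks adapt to the memory-hotplug online-page callback now receiving an order, so one call covers a whole contiguous range. A hedged sketch of the new callback shape (my own illustration; my_online_page is a hypothetical driver function):

	static void my_online_page(struct page *page, unsigned int order)
	{
		unsigned long i, pfn = page_to_pfn(page);

		/* the callback is now handed 1 << order contiguous pages */
		for (i = 0; i < (1UL << order); i++)
			__online_page_set_limits(pfn_to_page(pfn + i));
	}

	/* registration is unchanged: */
	set_online_page_callback(&my_online_page);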
@ -441,6 +447,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
xenmem_reservation_va_mapping_update(1, &page, &frame_list[i]); xenmem_reservation_va_mapping_update(1, &page, &frame_list[i]);
/* Relinquish the page back to the allocator. */ /* Relinquish the page back to the allocator. */
__ClearPageOffline(page);
free_reserved_page(page); free_reserved_page(page);
} }
@ -467,6 +474,7 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
state = BP_EAGAIN; state = BP_EAGAIN;
break; break;
} }
__SetPageOffline(page);
adjust_managed_page_count(page, -1); adjust_managed_page_count(page, -1);
xenmem_reservation_scrub_page(page); xenmem_reservation_scrub_page(page);
list_add(&page->lru, &pages); list_add(&page->lru, &pages);


@ -457,6 +457,7 @@ struct files_struct init_files = {
.full_fds_bits = init_files.full_fds_bits_init, .full_fds_bits = init_files.full_fds_bits_init,
}, },
.file_lock = __SPIN_LOCK_UNLOCKED(init_files.file_lock), .file_lock = __SPIN_LOCK_UNLOCKED(init_files.file_lock),
.resize_wait = __WAIT_QUEUE_HEAD_INITIALIZER(init_files.resize_wait),
}; };
static unsigned int find_next_fd(struct fdtable *fdt, unsigned int start) static unsigned int find_next_fd(struct fdtable *fdt, unsigned int start)


@ -530,7 +530,7 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
inode_lock(inode); inode_lock(inode);
/* protected by i_mutex */ /* protected by i_mutex */
if (info->seals & F_SEAL_WRITE) { if (info->seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE)) {
inode_unlock(inode); inode_unlock(inode);
return -EPERM; return -EPERM;
} }
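
For context, a hedged user-space illustration of the seal this hunk now honours when punching holes in a hugetlbfs-backed memfd (sketch only; len is assumed, error handling omitted):

	#include <fcntl.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int fd = memfd_create("buf", MFD_HUGETLB | MFD_ALLOW_SEALING);

	ftruncate(fd, len);
	/* existing writable mappings keep working, but new write access,
	 * including fallocate(FALLOC_FL_PUNCH_HOLE) on this file, is now
	 * rejected (the hunk above returns -EPERM for sealed inodes) */
	fcntl(fd, F_ADD_SEALS, F_SEAL_FUTURE_WRITE);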


@ -2093,14 +2093,8 @@ EXPORT_SYMBOL(inode_dio_wait);
void inode_set_flags(struct inode *inode, unsigned int flags, void inode_set_flags(struct inode *inode, unsigned int flags,
unsigned int mask) unsigned int mask)
{ {
unsigned int old_flags, new_flags;
WARN_ON_ONCE(flags & ~mask); WARN_ON_ONCE(flags & ~mask);
do { set_mask_bits(&inode->i_flags, mask, flags);
old_flags = READ_ONCE(inode->i_flags);
new_flags = (old_flags & ~mask) | flags;
} while (unlikely(cmpxchg(&inode->i_flags, old_flags,
new_flags) != old_flags));
} }
EXPORT_SYMBOL(inode_set_flags); EXPORT_SYMBOL(inode_set_flags);
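
The open-coded cmpxchg() loop is replaced by the set_mask_bits() helper, which performs the same atomic read-modify-write. A hedged example of a typical caller (illustrative; new_fl is an assumed variable, and the mask limits which i_flags bits the call may touch):

	/* propagate on-disk attribute bits into the VFS inode atomically */
	inode_set_flags(inode, new_fl,
			S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME | S_DIRSYNC);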


@ -832,26 +832,35 @@ void kernfs_drain_open_files(struct kernfs_node *kn)
* to see if it supports poll (Neither 'poll' nor 'select' return * to see if it supports poll (Neither 'poll' nor 'select' return
* an appropriate error code). When in doubt, set a suitable timeout value. * an appropriate error code). When in doubt, set a suitable timeout value.
*/ */
__poll_t kernfs_generic_poll(struct kernfs_open_file *of, poll_table *wait)
{
struct kernfs_node *kn = kernfs_dentry_node(of->file->f_path.dentry);
struct kernfs_open_node *on = kn->attr.open;
poll_wait(of->file, &on->poll, wait);
if (of->event != atomic_read(&on->event))
return DEFAULT_POLLMASK|EPOLLERR|EPOLLPRI;
return DEFAULT_POLLMASK;
}
static __poll_t kernfs_fop_poll(struct file *filp, poll_table *wait) static __poll_t kernfs_fop_poll(struct file *filp, poll_table *wait)
{ {
struct kernfs_open_file *of = kernfs_of(filp); struct kernfs_open_file *of = kernfs_of(filp);
struct kernfs_node *kn = kernfs_dentry_node(filp->f_path.dentry); struct kernfs_node *kn = kernfs_dentry_node(filp->f_path.dentry);
struct kernfs_open_node *on = kn->attr.open; __poll_t ret;
if (!kernfs_get_active(kn)) if (!kernfs_get_active(kn))
goto trigger; return DEFAULT_POLLMASK|EPOLLERR|EPOLLPRI;
poll_wait(filp, &on->poll, wait); if (kn->attr.ops->poll)
ret = kn->attr.ops->poll(of, wait);
else
ret = kernfs_generic_poll(of, wait);
kernfs_put_active(kn); kernfs_put_active(kn);
return ret;
if (of->event != atomic_read(&on->event))
goto trigger;
return DEFAULT_POLLMASK;
trigger:
return DEFAULT_POLLMASK|EPOLLERR|EPOLLPRI;
} }
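
A hedged sketch of how a kernfs user can now supply its own poll while still reusing the default behaviour through the new kernfs_generic_poll() helper (my_poll, my_ready and my_ops are hypothetical names):

	static __poll_t my_poll(struct kernfs_open_file *of,
				struct poll_table_struct *pt)
	{
		/* default handling: wait on the node and report ->event changes */
		__poll_t ret = kernfs_generic_poll(of, pt);

		/* layer driver-specific readiness on top */
		if (my_ready(of->kn->priv))
			ret |= EPOLLIN | EPOLLRDNORM;
		return ret;
	}

	static const struct kernfs_ops my_ops = {
		/* .seq_show / .write etc. as usual */
		.poll	= my_poll,
	};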
static void kernfs_notify_workfn(struct work_struct *work) static void kernfs_notify_workfn(struct work_struct *work)


@ -7532,10 +7532,11 @@ static int ocfs2_trim_group(struct super_block *sb,
return count; return count;
} }
int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range) static
int ocfs2_trim_mainbm(struct super_block *sb, struct fstrim_range *range)
{ {
struct ocfs2_super *osb = OCFS2_SB(sb); struct ocfs2_super *osb = OCFS2_SB(sb);
u64 start, len, trimmed, first_group, last_group, group; u64 start, len, trimmed = 0, first_group, last_group = 0, group = 0;
int ret, cnt; int ret, cnt;
u32 first_bit, last_bit, minlen; u32 first_bit, last_bit, minlen;
struct buffer_head *main_bm_bh = NULL; struct buffer_head *main_bm_bh = NULL;
@ -7543,7 +7544,6 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
struct buffer_head *gd_bh = NULL; struct buffer_head *gd_bh = NULL;
struct ocfs2_dinode *main_bm; struct ocfs2_dinode *main_bm;
struct ocfs2_group_desc *gd = NULL; struct ocfs2_group_desc *gd = NULL;
struct ocfs2_trim_fs_info info, *pinfo = NULL;
start = range->start >> osb->s_clustersize_bits; start = range->start >> osb->s_clustersize_bits;
len = range->len >> osb->s_clustersize_bits; len = range->len >> osb->s_clustersize_bits;
@ -7552,6 +7552,9 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
if (minlen >= osb->bitmap_cpg || range->len < sb->s_blocksize) if (minlen >= osb->bitmap_cpg || range->len < sb->s_blocksize)
return -EINVAL; return -EINVAL;
trace_ocfs2_trim_mainbm(start, len, minlen);
next_group:
main_bm_inode = ocfs2_get_system_file_inode(osb, main_bm_inode = ocfs2_get_system_file_inode(osb,
GLOBAL_BITMAP_SYSTEM_INODE, GLOBAL_BITMAP_SYSTEM_INODE,
OCFS2_INVALID_SLOT); OCFS2_INVALID_SLOT);
@ -7570,64 +7573,34 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
} }
main_bm = (struct ocfs2_dinode *)main_bm_bh->b_data; main_bm = (struct ocfs2_dinode *)main_bm_bh->b_data;
/*
* Do some checks before trimming the first group.
*/
if (!group) {
if (start >= le32_to_cpu(main_bm->i_clusters)) { if (start >= le32_to_cpu(main_bm->i_clusters)) {
ret = -EINVAL; ret = -EINVAL;
goto out_unlock; goto out_unlock;
} }
len = range->len >> osb->s_clustersize_bits;
if (start + len > le32_to_cpu(main_bm->i_clusters)) if (start + len > le32_to_cpu(main_bm->i_clusters))
len = le32_to_cpu(main_bm->i_clusters) - start; len = le32_to_cpu(main_bm->i_clusters) - start;
trace_ocfs2_trim_fs(start, len, minlen); /*
* Determine first and last group to examine based on
ocfs2_trim_fs_lock_res_init(osb); * start and len
ret = ocfs2_trim_fs_lock(osb, NULL, 1); */
if (ret < 0) {
if (ret != -EAGAIN) {
mlog_errno(ret);
ocfs2_trim_fs_lock_res_uninit(osb);
goto out_unlock;
}
mlog(ML_NOTICE, "Wait for trim on device (%s) to "
"finish, which is running from another node.\n",
osb->dev_str);
ret = ocfs2_trim_fs_lock(osb, &info, 0);
if (ret < 0) {
mlog_errno(ret);
ocfs2_trim_fs_lock_res_uninit(osb);
goto out_unlock;
}
if (info.tf_valid && info.tf_success &&
info.tf_start == start && info.tf_len == len &&
info.tf_minlen == minlen) {
/* Avoid sending duplicated trim to a shared device */
mlog(ML_NOTICE, "The same trim on device (%s) was "
"just done from node (%u), return.\n",
osb->dev_str, info.tf_nodenum);
range->len = info.tf_trimlen;
goto out_trimunlock;
}
}
info.tf_nodenum = osb->node_num;
info.tf_start = start;
info.tf_len = len;
info.tf_minlen = minlen;
/* Determine first and last group to examine based on start and len */
first_group = ocfs2_which_cluster_group(main_bm_inode, start); first_group = ocfs2_which_cluster_group(main_bm_inode, start);
if (first_group == osb->first_cluster_group_blkno) if (first_group == osb->first_cluster_group_blkno)
first_bit = start; first_bit = start;
else else
first_bit = start - ocfs2_blocks_to_clusters(sb, first_group); first_bit = start - ocfs2_blocks_to_clusters(sb,
last_group = ocfs2_which_cluster_group(main_bm_inode, start + len - 1); first_group);
last_bit = osb->bitmap_cpg; last_group = ocfs2_which_cluster_group(main_bm_inode,
start + len - 1);
group = first_group;
}
trimmed = 0; do {
for (group = first_group; group <= last_group;) {
if (first_bit + len >= osb->bitmap_cpg) if (first_bit + len >= osb->bitmap_cpg)
last_bit = osb->bitmap_cpg; last_bit = osb->bitmap_cpg;
else else
@ -7659,21 +7632,81 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
group = ocfs2_clusters_to_blocks(sb, osb->bitmap_cpg); group = ocfs2_clusters_to_blocks(sb, osb->bitmap_cpg);
else else
group += ocfs2_clusters_to_blocks(sb, osb->bitmap_cpg); group += ocfs2_clusters_to_blocks(sb, osb->bitmap_cpg);
} } while (0);
range->len = trimmed * sb->s_blocksize;
info.tf_trimlen = range->len;
info.tf_success = (ret ? 0 : 1);
pinfo = &info;
out_trimunlock:
ocfs2_trim_fs_unlock(osb, pinfo);
ocfs2_trim_fs_lock_res_uninit(osb);
out_unlock: out_unlock:
ocfs2_inode_unlock(main_bm_inode, 0); ocfs2_inode_unlock(main_bm_inode, 0);
brelse(main_bm_bh); brelse(main_bm_bh);
main_bm_bh = NULL;
out_mutex: out_mutex:
inode_unlock(main_bm_inode); inode_unlock(main_bm_inode);
iput(main_bm_inode); iput(main_bm_inode);
/*
* If not all groups have been trimmed yet and nothing has failed, go on
* to trim the next group; the main_bm related locks were released above
* so that other I/O is not starved in the meantime.
*/
if (ret >= 0 && group <= last_group)
goto next_group;
out: out:
range->len = trimmed * sb->s_blocksize;
return ret;
}
int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
{
int ret;
struct ocfs2_super *osb = OCFS2_SB(sb);
struct ocfs2_trim_fs_info info, *pinfo = NULL;
ocfs2_trim_fs_lock_res_init(osb);
trace_ocfs2_trim_fs(range->start, range->len, range->minlen);
ret = ocfs2_trim_fs_lock(osb, NULL, 1);
if (ret < 0) {
if (ret != -EAGAIN) {
mlog_errno(ret);
ocfs2_trim_fs_lock_res_uninit(osb);
return ret;
}
mlog(ML_NOTICE, "Wait for trim on device (%s) to "
"finish, which is running from another node.\n",
osb->dev_str);
ret = ocfs2_trim_fs_lock(osb, &info, 0);
if (ret < 0) {
mlog_errno(ret);
ocfs2_trim_fs_lock_res_uninit(osb);
return ret;
}
if (info.tf_valid && info.tf_success &&
info.tf_start == range->start &&
info.tf_len == range->len &&
info.tf_minlen == range->minlen) {
/* Avoid sending duplicated trim to a shared device */
mlog(ML_NOTICE, "The same trim on device (%s) was "
"just done from node (%u), return.\n",
osb->dev_str, info.tf_nodenum);
range->len = info.tf_trimlen;
goto out;
}
}
info.tf_nodenum = osb->node_num;
info.tf_start = range->start;
info.tf_len = range->len;
info.tf_minlen = range->minlen;
ret = ocfs2_trim_mainbm(sb, range);
info.tf_trimlen = range->len;
info.tf_success = (ret < 0 ? 0 : 1);
pinfo = &info;
out:
ocfs2_trim_fs_unlock(osb, pinfo);
ocfs2_trim_fs_lock_res_uninit(osb);
return ret; return ret;
} }
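
Since the interleaved diff above is hard to follow, here is a hedged summary of the reworked control flow, written as a comment rather than literal kernel code:

	/*
	 * ocfs2_trim_fs(sb, range)
	 *   ocfs2_trim_fs_lock_res_init()  -- also serialises trim threads locally
	 *   ocfs2_trim_fs_lock()           -- cluster-wide trimfs lock
	 *   if another node just trimmed the same range, reuse its tf_trimlen
	 *   ocfs2_trim_mainbm(sb, range)   -- per-group loop; drops and retakes
	 *                                     the main bitmap lock between groups
	 *                                     (the next_group: label) so other
	 *                                     I/O is not starved during the trim
	 *   publish the tf_* result, unlock, uninit
	 */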


@ -621,6 +621,7 @@ static void o2nm_node_group_drop_item(struct config_group *group,
struct o2nm_node *node = to_o2nm_node(item); struct o2nm_node *node = to_o2nm_node(item);
struct o2nm_cluster *cluster = to_o2nm_cluster(group->cg_item.ci_parent); struct o2nm_cluster *cluster = to_o2nm_cluster(group->cg_item.ci_parent);
if (cluster->cl_nodes[node->nd_num] == node) {
o2net_disconnect_node(node); o2net_disconnect_node(node);
if (cluster->cl_has_local && if (cluster->cl_has_local &&
@ -629,6 +630,7 @@ static void o2nm_node_group_drop_item(struct config_group *group,
cluster->cl_local_node = O2NM_INVALID_NODE_NUM; cluster->cl_local_node = O2NM_INVALID_NODE_NUM;
o2net_stop_listening(node); o2net_stop_listening(node);
} }
}
/* XXX call into net to stop this node from trading messages */ /* XXX call into net to stop this node from trading messages */


@ -686,6 +686,9 @@ void ocfs2_trim_fs_lock_res_init(struct ocfs2_super *osb)
{ {
struct ocfs2_lock_res *lockres = &osb->osb_trim_fs_lockres; struct ocfs2_lock_res *lockres = &osb->osb_trim_fs_lockres;
/* Only one trimfs thread is allowed to work at the same time. */
mutex_lock(&osb->obs_trim_fs_mutex);
ocfs2_lock_res_init_once(lockres); ocfs2_lock_res_init_once(lockres);
ocfs2_build_lock_name(OCFS2_LOCK_TYPE_TRIM_FS, 0, 0, lockres->l_name); ocfs2_build_lock_name(OCFS2_LOCK_TYPE_TRIM_FS, 0, 0, lockres->l_name);
ocfs2_lock_res_init_common(osb, lockres, OCFS2_LOCK_TYPE_TRIM_FS, ocfs2_lock_res_init_common(osb, lockres, OCFS2_LOCK_TYPE_TRIM_FS,
@ -698,6 +701,8 @@ void ocfs2_trim_fs_lock_res_uninit(struct ocfs2_super *osb)
ocfs2_simple_drop_lockres(osb, lockres); ocfs2_simple_drop_lockres(osb, lockres);
ocfs2_lock_res_free(lockres); ocfs2_lock_res_free(lockres);
mutex_unlock(&osb->obs_trim_fs_mutex);
} }
static void ocfs2_orphan_scan_lock_res_init(struct ocfs2_lock_res *res, static void ocfs2_orphan_scan_lock_res_init(struct ocfs2_lock_res *res,


@ -407,6 +407,7 @@ struct ocfs2_super
struct ocfs2_lock_res osb_rename_lockres; struct ocfs2_lock_res osb_rename_lockres;
struct ocfs2_lock_res osb_nfs_sync_lockres; struct ocfs2_lock_res osb_nfs_sync_lockres;
struct ocfs2_lock_res osb_trim_fs_lockres; struct ocfs2_lock_res osb_trim_fs_lockres;
struct mutex obs_trim_fs_mutex;
struct ocfs2_dlm_debug *osb_dlm_debug; struct ocfs2_dlm_debug *osb_dlm_debug;
struct dentry *osb_debug_root; struct dentry *osb_debug_root;


@ -712,6 +712,8 @@ TRACE_EVENT(ocfs2_trim_extent,
DEFINE_OCFS2_ULL_UINT_UINT_UINT_EVENT(ocfs2_trim_group); DEFINE_OCFS2_ULL_UINT_UINT_UINT_EVENT(ocfs2_trim_group);
DEFINE_OCFS2_ULL_ULL_ULL_EVENT(ocfs2_trim_mainbm);
DEFINE_OCFS2_ULL_ULL_ULL_EVENT(ocfs2_trim_fs); DEFINE_OCFS2_ULL_ULL_ULL_EVENT(ocfs2_trim_fs);
/* End of trace events for fs/ocfs2/alloc.c. */ /* End of trace events for fs/ocfs2/alloc.c. */


@ -55,7 +55,7 @@ struct ocfs2_slot_info {
unsigned int si_blocks; unsigned int si_blocks;
struct buffer_head **si_bh; struct buffer_head **si_bh;
unsigned int si_num_slots; unsigned int si_num_slots;
struct ocfs2_slot *si_slots; struct ocfs2_slot si_slots[];
}; };
@ -420,9 +420,7 @@ int ocfs2_init_slot_info(struct ocfs2_super *osb)
struct inode *inode = NULL; struct inode *inode = NULL;
struct ocfs2_slot_info *si; struct ocfs2_slot_info *si;
si = kzalloc(sizeof(struct ocfs2_slot_info) + si = kzalloc(struct_size(si, si_slots, osb->max_slots), GFP_KERNEL);
(sizeof(struct ocfs2_slot) * osb->max_slots),
GFP_KERNEL);
if (!si) { if (!si) {
status = -ENOMEM; status = -ENOMEM;
mlog_errno(status); mlog_errno(status);
@ -431,8 +429,6 @@ int ocfs2_init_slot_info(struct ocfs2_super *osb)
si->si_extended = ocfs2_uses_extended_slot_map(osb); si->si_extended = ocfs2_uses_extended_slot_map(osb);
si->si_num_slots = osb->max_slots; si->si_num_slots = osb->max_slots;
si->si_slots = (struct ocfs2_slot *)((char *)si +
sizeof(struct ocfs2_slot_info));
inode = ocfs2_get_system_file_inode(osb, SLOT_MAP_SYSTEM_INODE, inode = ocfs2_get_system_file_inode(osb, SLOT_MAP_SYSTEM_INODE,
OCFS2_INVALID_SLOT); OCFS2_INVALID_SLOT);
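
The slot array becomes a C99 flexible array member and the allocation size is computed with struct_size() from <linux/overflow.h>, which checks the size arithmetic for overflow. A minimal, generic sketch of the pattern (struct foo and struct item are illustrative names, not from this diff):

	#include <linux/overflow.h>
	#include <linux/slab.h>

	struct item { u64 val; };

	struct foo {
		unsigned int nr;
		struct item items[];	/* flexible array member, must be last */
	};

	/* header plus nr trailing elements in one overflow-checked allocation */
	struct foo *f = kzalloc(struct_size(f, items, nr), GFP_KERNEL);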


@ -1847,6 +1847,8 @@ static int ocfs2_mount_volume(struct super_block *sb)
if (ocfs2_is_hard_readonly(osb)) if (ocfs2_is_hard_readonly(osb))
goto leave; goto leave;
mutex_init(&osb->obs_trim_fs_mutex);
status = ocfs2_dlm_init(osb); status = ocfs2_dlm_init(osb);
if (status < 0) { if (status < 0) {
mlog_errno(status); mlog_errno(status);


@ -140,7 +140,6 @@ static int anon_pipe_buf_steal(struct pipe_inode_info *pipe,
struct page *page = buf->page; struct page *page = buf->page;
if (page_count(page) == 1) { if (page_count(page) == 1) {
if (memcg_kmem_enabled())
memcg_kmem_uncharge(page, 0); memcg_kmem_uncharge(page, 0);
__SetPageLocked(page); __SetPageLocked(page);
return 0; return 0;


@ -343,28 +343,28 @@ static inline void task_seccomp(struct seq_file *m, struct task_struct *p)
#ifdef CONFIG_SECCOMP #ifdef CONFIG_SECCOMP
seq_put_decimal_ull(m, "\nSeccomp:\t", p->seccomp.mode); seq_put_decimal_ull(m, "\nSeccomp:\t", p->seccomp.mode);
#endif #endif
seq_printf(m, "\nSpeculation_Store_Bypass:\t"); seq_puts(m, "\nSpeculation_Store_Bypass:\t");
switch (arch_prctl_spec_ctrl_get(p, PR_SPEC_STORE_BYPASS)) { switch (arch_prctl_spec_ctrl_get(p, PR_SPEC_STORE_BYPASS)) {
case -EINVAL: case -EINVAL:
seq_printf(m, "unknown"); seq_puts(m, "unknown");
break; break;
case PR_SPEC_NOT_AFFECTED: case PR_SPEC_NOT_AFFECTED:
seq_printf(m, "not vulnerable"); seq_puts(m, "not vulnerable");
break; break;
case PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE: case PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE:
seq_printf(m, "thread force mitigated"); seq_puts(m, "thread force mitigated");
break; break;
case PR_SPEC_PRCTL | PR_SPEC_DISABLE: case PR_SPEC_PRCTL | PR_SPEC_DISABLE:
seq_printf(m, "thread mitigated"); seq_puts(m, "thread mitigated");
break; break;
case PR_SPEC_PRCTL | PR_SPEC_ENABLE: case PR_SPEC_PRCTL | PR_SPEC_ENABLE:
seq_printf(m, "thread vulnerable"); seq_puts(m, "thread vulnerable");
break; break;
case PR_SPEC_DISABLE: case PR_SPEC_DISABLE:
seq_printf(m, "globally mitigated"); seq_puts(m, "globally mitigated");
break; break;
default: default:
seq_printf(m, "vulnerable"); seq_puts(m, "vulnerable");
break; break;
} }
seq_putc(m, '\n'); seq_putc(m, '\n');
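
The rule these /proc hunks apply, as a short hedged sketch: use seq_puts() for fixed strings so no format string has to be parsed, and keep seq_printf() only where arguments remain.

	seq_puts(m, "not vulnerable");	/* constant string: no formatting needed */
	seq_printf(m, "cpu%d", i);	/* still takes an argument: keep seq_printf */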


@ -456,7 +456,7 @@ static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task) struct pid *pid, struct task_struct *task)
{ {
if (unlikely(!sched_info_on())) if (unlikely(!sched_info_on()))
seq_printf(m, "0 0 0\n"); seq_puts(m, "0 0 0\n");
else else
seq_printf(m, "%llu %llu %lu\n", seq_printf(m, "%llu %llu %lu\n",
(unsigned long long)task->se.sum_exec_runtime, (unsigned long long)task->se.sum_exec_runtime,
@ -3161,7 +3161,7 @@ static struct dentry *proc_pid_instantiate(struct dentry * dentry,
return d_splice_alias(inode, dentry); return d_splice_alias(inode, dentry);
} }
struct dentry *proc_pid_lookup(struct inode *dir, struct dentry * dentry, unsigned int flags) struct dentry *proc_pid_lookup(struct dentry *dentry, unsigned int flags)
{ {
struct task_struct *task; struct task_struct *task;
unsigned tgid; unsigned tgid;


@ -162,7 +162,7 @@ extern struct inode *proc_pid_make_inode(struct super_block *, struct task_struc
extern void pid_update_inode(struct task_struct *, struct inode *); extern void pid_update_inode(struct task_struct *, struct inode *);
extern int pid_delete_dentry(const struct dentry *); extern int pid_delete_dentry(const struct dentry *);
extern int proc_pid_readdir(struct file *, struct dir_context *); extern int proc_pid_readdir(struct file *, struct dir_context *);
extern struct dentry *proc_pid_lookup(struct inode *, struct dentry *, unsigned int); struct dentry *proc_pid_lookup(struct dentry *, unsigned int);
extern loff_t mem_lseek(struct file *, loff_t, int); extern loff_t mem_lseek(struct file *, loff_t, int);
/* Lookups */ /* Lookups */


@ -152,8 +152,8 @@ u64 stable_page_flags(struct page *page)
else if (page_count(page) == 0 && is_free_buddy_page(page)) else if (page_count(page) == 0 && is_free_buddy_page(page))
u |= 1 << KPF_BUDDY; u |= 1 << KPF_BUDDY;
if (PageBalloon(page)) if (PageOffline(page))
u |= 1 << KPF_BALLOON; u |= 1 << KPF_OFFLINE;
if (PageTable(page)) if (PageTable(page))
u |= 1 << KPF_PGTABLE; u |= 1 << KPF_PGTABLE;


@ -154,7 +154,7 @@ static int proc_root_getattr(const struct path *path, struct kstat *stat,
static struct dentry *proc_root_lookup(struct inode * dir, struct dentry * dentry, unsigned int flags) static struct dentry *proc_root_lookup(struct inode * dir, struct dentry * dentry, unsigned int flags)
{ {
if (!proc_pid_lookup(dir, dentry, flags)) if (!proc_pid_lookup(dentry, flags))
return NULL; return NULL;
return proc_lookup(dir, dentry, flags); return proc_lookup(dir, dentry, flags);


@ -38,6 +38,7 @@ int proc_setup_self(struct super_block *s)
struct inode *root_inode = d_inode(s->s_root); struct inode *root_inode = d_inode(s->s_root);
struct pid_namespace *ns = proc_pid_ns(root_inode); struct pid_namespace *ns = proc_pid_ns(root_inode);
struct dentry *self; struct dentry *self;
int ret = -ENOMEM;
inode_lock(root_inode); inode_lock(root_inode);
self = d_alloc_name(s->s_root, "self"); self = d_alloc_name(s->s_root, "self");
@ -51,20 +52,19 @@ int proc_setup_self(struct super_block *s)
inode->i_gid = GLOBAL_ROOT_GID; inode->i_gid = GLOBAL_ROOT_GID;
inode->i_op = &proc_self_inode_operations; inode->i_op = &proc_self_inode_operations;
d_add(self, inode); d_add(self, inode);
ret = 0;
} else { } else {
dput(self); dput(self);
self = ERR_PTR(-ENOMEM);
} }
} else {
self = ERR_PTR(-ENOMEM);
} }
inode_unlock(root_inode); inode_unlock(root_inode);
if (IS_ERR(self)) {
if (ret)
pr_err("proc_fill_super: can't allocate /proc/self\n"); pr_err("proc_fill_super: can't allocate /proc/self\n");
return PTR_ERR(self); else
}
ns->proc_self = self; ns->proc_self = self;
return 0;
return ret;
} }
void __init proc_self_init(void) void __init proc_self_init(void)


@ -23,21 +23,21 @@
#ifdef arch_idle_time #ifdef arch_idle_time
static u64 get_idle_time(int cpu) static u64 get_idle_time(struct kernel_cpustat *kcs, int cpu)
{ {
u64 idle; u64 idle;
idle = kcpustat_cpu(cpu).cpustat[CPUTIME_IDLE]; idle = kcs->cpustat[CPUTIME_IDLE];
if (cpu_online(cpu) && !nr_iowait_cpu(cpu)) if (cpu_online(cpu) && !nr_iowait_cpu(cpu))
idle += arch_idle_time(cpu); idle += arch_idle_time(cpu);
return idle; return idle;
} }
static u64 get_iowait_time(int cpu) static u64 get_iowait_time(struct kernel_cpustat *kcs, int cpu)
{ {
u64 iowait; u64 iowait;
iowait = kcpustat_cpu(cpu).cpustat[CPUTIME_IOWAIT]; iowait = kcs->cpustat[CPUTIME_IOWAIT];
if (cpu_online(cpu) && nr_iowait_cpu(cpu)) if (cpu_online(cpu) && nr_iowait_cpu(cpu))
iowait += arch_idle_time(cpu); iowait += arch_idle_time(cpu);
return iowait; return iowait;
@ -45,7 +45,7 @@ static u64 get_iowait_time(int cpu)
#else #else
static u64 get_idle_time(int cpu) static u64 get_idle_time(struct kernel_cpustat *kcs, int cpu)
{ {
u64 idle, idle_usecs = -1ULL; u64 idle, idle_usecs = -1ULL;
@ -54,14 +54,14 @@ static u64 get_idle_time(int cpu)
if (idle_usecs == -1ULL) if (idle_usecs == -1ULL)
/* !NO_HZ or cpu offline so we can rely on cpustat.idle */ /* !NO_HZ or cpu offline so we can rely on cpustat.idle */
idle = kcpustat_cpu(cpu).cpustat[CPUTIME_IDLE]; idle = kcs->cpustat[CPUTIME_IDLE];
else else
idle = idle_usecs * NSEC_PER_USEC; idle = idle_usecs * NSEC_PER_USEC;
return idle; return idle;
} }
static u64 get_iowait_time(int cpu) static u64 get_iowait_time(struct kernel_cpustat *kcs, int cpu)
{ {
u64 iowait, iowait_usecs = -1ULL; u64 iowait, iowait_usecs = -1ULL;
@ -70,7 +70,7 @@ static u64 get_iowait_time(int cpu)
if (iowait_usecs == -1ULL) if (iowait_usecs == -1ULL)
/* !NO_HZ or cpu offline so we can rely on cpustat.iowait */ /* !NO_HZ or cpu offline so we can rely on cpustat.iowait */
iowait = kcpustat_cpu(cpu).cpustat[CPUTIME_IOWAIT]; iowait = kcs->cpustat[CPUTIME_IOWAIT];
else else
iowait = iowait_usecs * NSEC_PER_USEC; iowait = iowait_usecs * NSEC_PER_USEC;
@ -120,16 +120,18 @@ static int show_stat(struct seq_file *p, void *v)
getboottime64(&boottime); getboottime64(&boottime);
for_each_possible_cpu(i) { for_each_possible_cpu(i) {
user += kcpustat_cpu(i).cpustat[CPUTIME_USER]; struct kernel_cpustat *kcs = &kcpustat_cpu(i);
nice += kcpustat_cpu(i).cpustat[CPUTIME_NICE];
system += kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM]; user += kcs->cpustat[CPUTIME_USER];
idle += get_idle_time(i); nice += kcs->cpustat[CPUTIME_NICE];
iowait += get_iowait_time(i); system += kcs->cpustat[CPUTIME_SYSTEM];
irq += kcpustat_cpu(i).cpustat[CPUTIME_IRQ]; idle += get_idle_time(kcs, i);
softirq += kcpustat_cpu(i).cpustat[CPUTIME_SOFTIRQ]; iowait += get_iowait_time(kcs, i);
steal += kcpustat_cpu(i).cpustat[CPUTIME_STEAL]; irq += kcs->cpustat[CPUTIME_IRQ];
guest += kcpustat_cpu(i).cpustat[CPUTIME_GUEST]; softirq += kcs->cpustat[CPUTIME_SOFTIRQ];
guest_nice += kcpustat_cpu(i).cpustat[CPUTIME_GUEST_NICE]; steal += kcs->cpustat[CPUTIME_STEAL];
guest += kcs->cpustat[CPUTIME_GUEST];
guest_nice += kcs->cpustat[CPUTIME_GUEST_NICE];
sum += kstat_cpu_irqs_sum(i); sum += kstat_cpu_irqs_sum(i);
sum += arch_irq_stat_cpu(i); sum += arch_irq_stat_cpu(i);
@ -155,17 +157,19 @@ static int show_stat(struct seq_file *p, void *v)
seq_putc(p, '\n'); seq_putc(p, '\n');
for_each_online_cpu(i) { for_each_online_cpu(i) {
struct kernel_cpustat *kcs = &kcpustat_cpu(i);
/* Copy values here to work around gcc-2.95.3, gcc-2.96 */ /* Copy values here to work around gcc-2.95.3, gcc-2.96 */
user = kcpustat_cpu(i).cpustat[CPUTIME_USER]; user = kcs->cpustat[CPUTIME_USER];
nice = kcpustat_cpu(i).cpustat[CPUTIME_NICE]; nice = kcs->cpustat[CPUTIME_NICE];
system = kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM]; system = kcs->cpustat[CPUTIME_SYSTEM];
idle = get_idle_time(i); idle = get_idle_time(kcs, i);
iowait = get_iowait_time(i); iowait = get_iowait_time(kcs, i);
irq = kcpustat_cpu(i).cpustat[CPUTIME_IRQ]; irq = kcs->cpustat[CPUTIME_IRQ];
softirq = kcpustat_cpu(i).cpustat[CPUTIME_SOFTIRQ]; softirq = kcs->cpustat[CPUTIME_SOFTIRQ];
steal = kcpustat_cpu(i).cpustat[CPUTIME_STEAL]; steal = kcs->cpustat[CPUTIME_STEAL];
guest = kcpustat_cpu(i).cpustat[CPUTIME_GUEST]; guest = kcs->cpustat[CPUTIME_GUEST];
guest_nice = kcpustat_cpu(i).cpustat[CPUTIME_GUEST_NICE]; guest_nice = kcs->cpustat[CPUTIME_GUEST_NICE];
seq_printf(p, "cpu%d", i); seq_printf(p, "cpu%d", i);
seq_put_decimal_ull(p, " ", nsec_to_clock_t(user)); seq_put_decimal_ull(p, " ", nsec_to_clock_t(user));
seq_put_decimal_ull(p, " ", nsec_to_clock_t(nice)); seq_put_decimal_ull(p, " ", nsec_to_clock_t(nice));


@ -948,10 +948,12 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
pte_t ptent = *pte; pte_t ptent = *pte;
if (pte_present(ptent)) { if (pte_present(ptent)) {
ptent = ptep_modify_prot_start(vma->vm_mm, addr, pte); pte_t old_pte;
ptent = pte_wrprotect(ptent);
old_pte = ptep_modify_prot_start(vma, addr, pte);
ptent = pte_wrprotect(old_pte);
ptent = pte_clear_soft_dirty(ptent); ptent = pte_clear_soft_dirty(ptent);
ptep_modify_prot_commit(vma->vm_mm, addr, pte, ptent); ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
} else if (is_swap_pte(ptent)) { } else if (is_swap_pte(ptent)) {
ptent = pte_swp_clear_soft_dirty(ptent); ptent = pte_swp_clear_soft_dirty(ptent);
set_pte_at(vma->vm_mm, addr, pte, ptent); set_pte_at(vma->vm_mm, addr, pte, ptent);


@ -178,7 +178,7 @@ static int nommu_vma_show(struct seq_file *m, struct vm_area_struct *vma)
seq_file_path(m, file, ""); seq_file_path(m, file, "");
} else if (mm && is_stack(vma)) { } else if (mm && is_stack(vma)) {
seq_pad(m, ' '); seq_pad(m, ' ');
seq_printf(m, "[stack]"); seq_puts(m, "[stack]");
} }
seq_putc(m, '\n'); seq_putc(m, '\n');


@ -38,6 +38,7 @@ int proc_setup_thread_self(struct super_block *s)
struct inode *root_inode = d_inode(s->s_root); struct inode *root_inode = d_inode(s->s_root);
struct pid_namespace *ns = proc_pid_ns(root_inode); struct pid_namespace *ns = proc_pid_ns(root_inode);
struct dentry *thread_self; struct dentry *thread_self;
int ret = -ENOMEM;
inode_lock(root_inode); inode_lock(root_inode);
thread_self = d_alloc_name(s->s_root, "thread-self"); thread_self = d_alloc_name(s->s_root, "thread-self");
@ -51,20 +52,19 @@ int proc_setup_thread_self(struct super_block *s)
inode->i_gid = GLOBAL_ROOT_GID; inode->i_gid = GLOBAL_ROOT_GID;
inode->i_op = &proc_thread_self_inode_operations; inode->i_op = &proc_thread_self_inode_operations;
d_add(thread_self, inode); d_add(thread_self, inode);
ret = 0;
} else { } else {
dput(thread_self); dput(thread_self);
thread_self = ERR_PTR(-ENOMEM);
} }
} else {
thread_self = ERR_PTR(-ENOMEM);
} }
inode_unlock(root_inode); inode_unlock(root_inode);
if (IS_ERR(thread_self)) {
if (ret)
pr_err("proc_fill_super: can't allocate /proc/thread_self\n"); pr_err("proc_fill_super: can't allocate /proc/thread_self\n");
return PTR_ERR(thread_self); else
}
ns->proc_thread_self = thread_self; ns->proc_thread_self = thread_self;
return 0;
return ret;
} }
void __init proc_thread_self_init(void) void __init proc_thread_self_init(void)


@ -606,7 +606,7 @@ static inline int pmd_none_or_clear_bad(pmd_t *pmd)
return 0; return 0;
} }
static inline pte_t __ptep_modify_prot_start(struct mm_struct *mm, static inline pte_t __ptep_modify_prot_start(struct vm_area_struct *vma,
unsigned long addr, unsigned long addr,
pte_t *ptep) pte_t *ptep)
{ {
@ -615,10 +615,10 @@ static inline pte_t __ptep_modify_prot_start(struct mm_struct *mm,
* non-present, preventing the hardware from asynchronously * non-present, preventing the hardware from asynchronously
* updating it. * updating it.
*/ */
return ptep_get_and_clear(mm, addr, ptep); return ptep_get_and_clear(vma->vm_mm, addr, ptep);
} }
static inline void __ptep_modify_prot_commit(struct mm_struct *mm, static inline void __ptep_modify_prot_commit(struct vm_area_struct *vma,
unsigned long addr, unsigned long addr,
pte_t *ptep, pte_t pte) pte_t *ptep, pte_t pte)
{ {
@ -626,7 +626,7 @@ static inline void __ptep_modify_prot_commit(struct mm_struct *mm,
* The pte is non-present, so there's no hardware state to * The pte is non-present, so there's no hardware state to
* preserve. * preserve.
*/ */
set_pte_at(mm, addr, ptep, pte); set_pte_at(vma->vm_mm, addr, ptep, pte);
} }
#ifndef __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION #ifndef __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
@ -644,22 +644,22 @@ static inline void __ptep_modify_prot_commit(struct mm_struct *mm,
* queue the update to be done at some later time. The update must be * queue the update to be done at some later time. The update must be
* actually committed before the pte lock is released, however. * actually committed before the pte lock is released, however.
*/ */
static inline pte_t ptep_modify_prot_start(struct mm_struct *mm, static inline pte_t ptep_modify_prot_start(struct vm_area_struct *vma,
unsigned long addr, unsigned long addr,
pte_t *ptep) pte_t *ptep)
{ {
return __ptep_modify_prot_start(mm, addr, ptep); return __ptep_modify_prot_start(vma, addr, ptep);
} }
/* /*
* Commit an update to a pte, leaving any hardware-controlled bits in * Commit an update to a pte, leaving any hardware-controlled bits in
* the PTE unmodified. * the PTE unmodified.
*/ */
static inline void ptep_modify_prot_commit(struct mm_struct *mm, static inline void ptep_modify_prot_commit(struct vm_area_struct *vma,
unsigned long addr, unsigned long addr,
pte_t *ptep, pte_t pte) pte_t *ptep, pte_t old_pte, pte_t pte)
{ {
__ptep_modify_prot_commit(mm, addr, ptep, pte); __ptep_modify_prot_commit(vma, addr, ptep, pte);
} }
#endif /* __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION */ #endif /* __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION */
#endif /* CONFIG_MMU */ #endif /* CONFIG_MMU */
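
With the vma (rather than the bare mm) passed through, and the old pte value handed back to the commit step, a caller of this transaction API now looks like the clear_soft_dirty() hunk earlier in this diff (shown here only as a condensed restatement of that hunk):

	pte_t old_pte, ptent;

	old_pte = ptep_modify_prot_start(vma, addr, pte);	/* pte made not-present */
	ptent = pte_wrprotect(old_pte);				/* modify the value in software */
	ptent = pte_clear_soft_dirty(ptent);
	ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);	/* re-install */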


@ -365,7 +365,7 @@ unlocked_inode_to_wb_begin(struct inode *inode, struct wb_lock_cookie *cookie)
rcu_read_lock(); rcu_read_lock();
/* /*
* Paired with store_release in inode_switch_wb_work_fn() and * Paired with store_release in inode_switch_wbs_work_fn() and
* ensures that we see the new wb if we see cleared I_WB_SWITCH. * ensures that we see the new wb if we see cleared I_WB_SWITCH.
*/ */
cookie->locked = smp_load_acquire(&inode->i_state) & I_WB_SWITCH; cookie->locked = smp_load_acquire(&inode->i_state) & I_WB_SWITCH;


@ -4,15 +4,18 @@
* *
* Common interface definitions for making balloon pages movable by compaction. * Common interface definitions for making balloon pages movable by compaction.
* *
* Despite being perfectly possible to perform ballooned pages migration, they * Balloon page migration makes use of the general non-lru movable page
* make a special corner case to compaction scans because balloon pages are not * feature.
* enlisted at any LRU list like the other pages we do compact / migrate. *
* page->private is used to reference the responsible balloon device.
* page->mapping is used in context of non-lru page migration to reference
* the address space operations for page isolation/migration/compaction.
* *
* As the page isolation scanning step a compaction thread does is a lockless * As the page isolation scanning step a compaction thread does is a lockless
* procedure (from a page standpoint), it might bring some racy situations while * procedure (from a page standpoint), it might bring some racy situations while
* performing balloon page compaction. In order to sort out these racy scenarios * performing balloon page compaction. In order to sort out these racy scenarios
* and safely perform balloon's page compaction and migration we must, always, * and safely perform balloon's page compaction and migration we must, always,
* ensure following these three simple rules: * ensure following these simple rules:
* *
* i. when updating a balloon's page ->mapping element, strictly do it under * i. when updating a balloon's page ->mapping element, strictly do it under
* the following lock order, independently of the far superior * the following lock order, independently of the far superior
@ -21,19 +24,8 @@
* +--spin_lock_irq(&b_dev_info->pages_lock); * +--spin_lock_irq(&b_dev_info->pages_lock);
* ... page->mapping updates here ... * ... page->mapping updates here ...
* *
* ii. before isolating or dequeueing a balloon page from the balloon device * ii. isolation or dequeueing procedure must remove the page from balloon
* pages list, the page reference counter must be raised by one and the * device page list under b_dev_info->pages_lock.
* extra refcount must be dropped when the page is enqueued back into
* the balloon device page list, thus a balloon page keeps its reference
* counter raised only while it is under our special handling;
*
* iii. after the lockless scan step have selected a potential balloon page for
* isolation, re-test the PageBalloon mark and the PagePrivate flag
* under the proper page lock, to ensure isolating a valid balloon page
* (not yet isolated, nor under release procedure)
*
* iv. isolation or dequeueing procedure must clear PagePrivate flag under
* page lock together with removing page from balloon device page list.
* *
* The functions provided by this interface are placed to help on coping with * The functions provided by this interface are placed to help on coping with
* the aforementioned balloon page corner case, as well as to ensure the simple * the aforementioned balloon page corner case, as well as to ensure the simple
@ -103,7 +95,7 @@ extern int balloon_page_migrate(struct address_space *mapping,
static inline void balloon_page_insert(struct balloon_dev_info *balloon, static inline void balloon_page_insert(struct balloon_dev_info *balloon,
struct page *page) struct page *page)
{ {
__SetPageBalloon(page); __SetPageOffline(page);
__SetPageMovable(page, balloon->inode->i_mapping); __SetPageMovable(page, balloon->inode->i_mapping);
set_page_private(page, (unsigned long)balloon); set_page_private(page, (unsigned long)balloon);
list_add(&page->lru, &balloon->pages); list_add(&page->lru, &balloon->pages);
@ -119,7 +111,7 @@ static inline void balloon_page_insert(struct balloon_dev_info *balloon,
*/ */
static inline void balloon_page_delete(struct page *page) static inline void balloon_page_delete(struct page *page)
{ {
__ClearPageBalloon(page); __ClearPageOffline(page);
__ClearPageMovable(page); __ClearPageMovable(page);
set_page_private(page, 0); set_page_private(page, 0);
/* /*
@ -149,13 +141,13 @@ static inline gfp_t balloon_mapping_gfp_mask(void)
static inline void balloon_page_insert(struct balloon_dev_info *balloon, static inline void balloon_page_insert(struct balloon_dev_info *balloon,
struct page *page) struct page *page)
{ {
__SetPageBalloon(page); __SetPageOffline(page);
list_add(&page->lru, &balloon->pages); list_add(&page->lru, &balloon->pages);
} }
static inline void balloon_page_delete(struct page *page) static inline void balloon_page_delete(struct page *page)
{ {
__ClearPageBalloon(page); __ClearPageOffline(page);
list_del(&page->lru); list_del(&page->lru);
} }
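
A hedged sketch of locking rules (i)/(ii) above as seen from a balloon driver, using the helpers defined in this header (b_dev_info, page and flags stand for the driver's balloon_dev_info, a balloon page and an irq-flags variable):

	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
	balloon_page_insert(b_dev_info, page);	/* sets PageOffline (+ movable ops) */
	spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);

	/* ... later, when deflating or migrating the page away ... */
	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
	balloon_page_delete(page);		/* clears PageOffline, unlinks the page */
	spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);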


@ -32,6 +32,7 @@ struct kernfs_node;
struct kernfs_ops; struct kernfs_ops;
struct kernfs_open_file; struct kernfs_open_file;
struct seq_file; struct seq_file;
struct poll_table_struct;
#define MAX_CGROUP_TYPE_NAMELEN 32 #define MAX_CGROUP_TYPE_NAMELEN 32
#define MAX_CGROUP_ROOT_NAMELEN 64 #define MAX_CGROUP_ROOT_NAMELEN 64
@ -574,6 +575,9 @@ struct cftype {
ssize_t (*write)(struct kernfs_open_file *of, ssize_t (*write)(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off); char *buf, size_t nbytes, loff_t off);
__poll_t (*poll)(struct kernfs_open_file *of,
struct poll_table_struct *pt);
#ifdef CONFIG_DEBUG_LOCK_ALLOC #ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lock_class_key lockdep_key; struct lock_class_key lockdep_key;
#endif #endif


@ -88,14 +88,13 @@ extern int sysctl_compact_memory;
extern int sysctl_compaction_handler(struct ctl_table *table, int write, extern int sysctl_compaction_handler(struct ctl_table *table, int write,
void __user *buffer, size_t *length, loff_t *ppos); void __user *buffer, size_t *length, loff_t *ppos);
extern int sysctl_extfrag_threshold; extern int sysctl_extfrag_threshold;
extern int sysctl_extfrag_handler(struct ctl_table *table, int write,
void __user *buffer, size_t *length, loff_t *ppos);
extern int sysctl_compact_unevictable_allowed; extern int sysctl_compact_unevictable_allowed;
extern int fragmentation_index(struct zone *zone, unsigned int order); extern int fragmentation_index(struct zone *zone, unsigned int order);
extern enum compact_result try_to_compact_pages(gfp_t gfp_mask, extern enum compact_result try_to_compact_pages(gfp_t gfp_mask,
unsigned int order, unsigned int alloc_flags, unsigned int order, unsigned int alloc_flags,
const struct alloc_context *ac, enum compact_priority prio); const struct alloc_context *ac, enum compact_priority prio,
struct page **page);
extern void reset_isolation_suitable(pg_data_t *pgdat); extern void reset_isolation_suitable(pg_data_t *pgdat);
extern enum compact_result compaction_suitable(struct zone *zone, int order, extern enum compact_result compaction_suitable(struct zone *zone, int order,
unsigned int alloc_flags, int classzone_idx); unsigned int alloc_flags, int classzone_idx);
@ -227,8 +226,8 @@ static inline void wakeup_kcompactd(pg_data_t *pgdat, int order, int classzone_i
#endif /* CONFIG_COMPACTION */ #endif /* CONFIG_COMPACTION */
#if defined(CONFIG_COMPACTION) && defined(CONFIG_SYSFS) && defined(CONFIG_NUMA)
struct node; struct node;
#if defined(CONFIG_COMPACTION) && defined(CONFIG_SYSFS) && defined(CONFIG_NUMA)
extern int compaction_register_node(struct node *node); extern int compaction_register_node(struct node *node);
extern void compaction_unregister_node(struct node *node); extern void compaction_unregister_node(struct node *node);


@ -1095,7 +1095,7 @@ static inline void set_dev_node(struct device *dev, int node)
#else #else
static inline int dev_to_node(struct device *dev) static inline int dev_to_node(struct device *dev)
{ {
return -1; return NUMA_NO_NODE;
} }
static inline void set_dev_node(struct device *dev, int node) static inline void set_dev_node(struct device *dev, int node)
{ {


@ -7,6 +7,13 @@
#include <linux/bitops.h> #include <linux/bitops.h>
#include <linux/jump_label.h> #include <linux/jump_label.h>
/*
* Return code to denote that requested number of
* frontswap pages are unused (moved to the page cache).
* Used in shmem_unuse() and try_to_unuse().
*/
#define FRONTSWAP_PAGES_UNUSED 2
struct frontswap_ops { struct frontswap_ops {
void (*init)(unsigned); /* this swap type was just swapon'ed */ void (*init)(unsigned); /* this swap type was just swapon'ed */
int (*store)(unsigned, pgoff_t, struct page *); /* store a page */ int (*store)(unsigned, pgoff_t, struct page *); /* store a page */


@ -2091,7 +2091,7 @@ static inline void init_sync_kiocb(struct kiocb *kiocb, struct file *filp)
* I_WB_SWITCH Cgroup bdi_writeback switching in progress. Used to * I_WB_SWITCH Cgroup bdi_writeback switching in progress. Used to
* synchronize competing switching instances and to tell * synchronize competing switching instances and to tell
* wb stat updates to grab the i_pages lock. See * wb stat updates to grab the i_pages lock. See
* inode_switch_wb_work_fn() for details. * inode_switch_wbs_work_fn() for details.
* *
* I_OVL_INUSE Used by overlayfs to get exclusive ownership on upper * I_OVL_INUSE Used by overlayfs to get exclusive ownership on upper
* and work dirs among overlayfs mounts. * and work dirs among overlayfs mounts.


@ -24,21 +24,21 @@ struct vm_area_struct;
#define ___GFP_HIGH 0x20u #define ___GFP_HIGH 0x20u
#define ___GFP_IO 0x40u #define ___GFP_IO 0x40u
#define ___GFP_FS 0x80u #define ___GFP_FS 0x80u
#define ___GFP_WRITE 0x100u #define ___GFP_ZERO 0x100u
#define ___GFP_NOWARN 0x200u #define ___GFP_ATOMIC 0x200u
#define ___GFP_RETRY_MAYFAIL 0x400u #define ___GFP_DIRECT_RECLAIM 0x400u
#define ___GFP_NOFAIL 0x800u #define ___GFP_KSWAPD_RECLAIM 0x800u
#define ___GFP_NORETRY 0x1000u #define ___GFP_WRITE 0x1000u
#define ___GFP_MEMALLOC 0x2000u #define ___GFP_NOWARN 0x2000u
#define ___GFP_COMP 0x4000u #define ___GFP_RETRY_MAYFAIL 0x4000u
#define ___GFP_ZERO 0x8000u #define ___GFP_NOFAIL 0x8000u
#define ___GFP_NOMEMALLOC 0x10000u #define ___GFP_NORETRY 0x10000u
#define ___GFP_HARDWALL 0x20000u #define ___GFP_MEMALLOC 0x20000u
#define ___GFP_THISNODE 0x40000u #define ___GFP_COMP 0x40000u
#define ___GFP_ATOMIC 0x80000u #define ___GFP_NOMEMALLOC 0x80000u
#define ___GFP_ACCOUNT 0x100000u #define ___GFP_HARDWALL 0x100000u
#define ___GFP_DIRECT_RECLAIM 0x200000u #define ___GFP_THISNODE 0x200000u
#define ___GFP_KSWAPD_RECLAIM 0x400000u #define ___GFP_ACCOUNT 0x400000u
#ifdef CONFIG_LOCKDEP #ifdef CONFIG_LOCKDEP
#define ___GFP_NOLOCKDEP 0x800000u #define ___GFP_NOLOCKDEP 0x800000u
#else #else


@ -371,6 +371,8 @@ struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
nodemask_t *nmask); nodemask_t *nmask);
struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma, struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
unsigned long address); unsigned long address);
struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
int nid, nodemask_t *nmask);
int huge_add_to_page_cache(struct page *page, struct address_space *mapping, int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
pgoff_t idx); pgoff_t idx);
@ -493,17 +495,54 @@ static inline pgoff_t basepage_index(struct page *page)
extern int dissolve_free_huge_page(struct page *page);
extern int dissolve_free_huge_pages(unsigned long start_pfn,
unsigned long end_pfn);
-static inline bool hugepage_migration_supported(struct hstate *h)
-{
#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
#ifndef arch_hugetlb_migration_supported
static inline bool arch_hugetlb_migration_supported(struct hstate *h)
{
if ((huge_page_shift(h) == PMD_SHIFT) ||
(huge_page_shift(h) == PUD_SHIFT) ||
(huge_page_shift(h) == PGDIR_SHIFT))
return true;
else
return false;
-#else
}
-return false;
#endif
#else
static inline bool arch_hugetlb_migration_supported(struct hstate *h)
{
return false;
}
#endif
static inline bool hugepage_migration_supported(struct hstate *h)
{
return arch_hugetlb_migration_supported(h);
}
/*
* The movability check is different from the migration check.
* It determines whether or not a huge page should be placed in
* a movable zone. Movability of any huge page should be
* required only if the huge page size is supported for migration.
* There won't be any reason for the huge page to be movable if
* it is not migratable to start with. Also, the size of the huge
* page should be large enough to be placed under a movable zone
* and still feasible enough to be migratable. Mere presence
* in a movable zone does not make migration feasible.
*
* So even though large huge page sizes like the gigantic ones
* are migratable, they should not be movable because it is not
* feasible to migrate them from a movable zone.
*/
static inline bool hugepage_movable_supported(struct hstate *h)
{
if (!hugepage_migration_supported(h))
return false;
if (hstate_is_gigantic(h))
return false;
return true;
}
static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
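The new hugepage_movable_supported() helper separates "can this huge page size be migrated at all" from "should it be placed in a movable zone". A hedged sketch of the intended consumer, based on htlb_alloc_mask() in mm/hugetlb.c (reproduced from memory, not part of this header hunk):

	/* Only hand out a movable placement mask when the huge page size is
	 * genuinely movable; gigantic pages stay out of ZONE_MOVABLE. */
	static inline gfp_t htlb_alloc_mask(struct hstate *h)
	{
		if (hugepage_movable_supported(h))
			return GFP_HIGHUSER_MOVABLE;
		else
			return GFP_HIGHUSER;
	}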
@ -543,6 +582,26 @@ static inline void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr
set_huge_pte_at(mm, addr, ptep, pte);
}
#endif
#ifndef huge_ptep_modify_prot_start
#define huge_ptep_modify_prot_start huge_ptep_modify_prot_start
static inline pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep)
{
return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
}
#endif
#ifndef huge_ptep_modify_prot_commit
#define huge_ptep_modify_prot_commit huge_ptep_modify_prot_commit
static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
pte_t old_pte, pte_t pte)
{
set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
}
#endif
#else /* CONFIG_HUGETLB_PAGE */
struct hstate {};
#define alloc_huge_page(v, a, r) NULL
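The huge_ptep_modify_prot_start/commit fallbacks added above give every architecture a transactional way to change a huge PTE's protection: _start clears the entry and returns the old value, _commit installs the new one, and architectures can override the pair to batch or defer the hardware update. A minimal sketch of the call sequence, assuming the caller already holds the page-table lock and handles TLB flushing (huge_pte_modify() applies the new protection bits; the shape follows the hugetlb change-protection path from memory, not this hunk):

	static void sketch_change_huge_pte(struct vm_area_struct *vma,
					   unsigned long addr, pte_t *ptep,
					   pgprot_t newprot)
	{
		pte_t old_pte, new_pte;

		/* Transiently clear the entry and remember its old value. */
		old_pte = huge_ptep_modify_prot_start(vma, addr, ptep);
		/* Build the entry with the updated protection bits. */
		new_pte = huge_pte_modify(old_pte, newprot);
		/* Install it; arch code may use old_pte to optimise the update. */
		huge_ptep_modify_prot_commit(vma, addr, ptep, old_pte, new_pte);
	}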
@ -602,6 +661,11 @@ static inline bool hugepage_migration_supported(struct hstate *h)
return false;
}
static inline bool hugepage_movable_supported(struct hstate *h)
{
return false;
}
static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
struct mm_struct *mm, pte_t *pte)
{

include/linux/kasan-checks.h

@ -2,7 +2,7 @@
#ifndef _LINUX_KASAN_CHECKS_H
#define _LINUX_KASAN_CHECKS_H
-#ifdef CONFIG_KASAN
#if defined(__SANITIZE_ADDRESS__) || defined(__KASAN_INTERNAL)
void kasan_check_read(const volatile void *p, unsigned int size);
void kasan_check_write(const volatile void *p, unsigned int size);
#else
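With this change the prototypes track the compiler's view (__SANITIZE_ADDRESS__ is defined per translation unit when a file is actually built with ASan instrumentation, and __KASAN_INTERNAL marks KASAN's own sources) rather than the global CONFIG_KASAN switch. For context, these hooks are intended for wrappers that want KASAN to validate a buffer before a raw access; an illustrative sketch, not taken from this diff:

	#include <linux/kasan-checks.h>
	#include <linux/string.h>

	/* Illustrative only: check both buffers up front so a KASAN report
	 * points at this wrapper's caller rather than at memcpy() internals. */
	static inline void checked_copy(void *dst, const void *src, unsigned int len)
	{
		kasan_check_read(src, len);
		kasan_check_write(dst, len);
		memcpy(dst, src, len);
	}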

include/linux/kernfs.h

@ -25,6 +25,7 @@ struct seq_file;
struct vm_area_struct;
struct super_block;
struct file_system_type;
struct poll_table_struct;
struct kernfs_open_node;
struct kernfs_iattrs;
@ -261,6 +262,9 @@ struct kernfs_ops {
ssize_t (*write)(struct kernfs_open_file *of, char *buf, size_t bytes,
loff_t off);
__poll_t (*poll)(struct kernfs_open_file *of,
struct poll_table_struct *pt);
int (*mmap)(struct kernfs_open_file *of, struct vm_area_struct *vma);
#ifdef CONFIG_DEBUG_LOCK_ALLOC
@ -350,6 +354,8 @@ int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
const char *new_name, const void *new_ns);
int kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr);
__poll_t kernfs_generic_poll(struct kernfs_open_file *of,
struct poll_table_struct *pt);
void kernfs_notify(struct kernfs_node *kn);
const void *kernfs_super_ns(struct super_block *sb);
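The new ->poll callback lets an individual kernfs attribute supply its own poll semantics, with kernfs_generic_poll() exported so custom implementations can keep the default kernfs_notify()-based wakeup. A hedged sketch of an op that layers a private wait source on top of the generic behaviour (my_waitqueue, my_state_changed() and my_seq_show() are hypothetical, and <linux/poll.h> is assumed for poll_wait()/EPOLLPRI):

	static __poll_t my_kernfs_poll(struct kernfs_open_file *of,
				       struct poll_table_struct *pt)
	{
		/* Default behaviour: wake up when kernfs_notify() hits this node. */
		__poll_t ret = kernfs_generic_poll(of, pt);

		/* Add a driver-private wait source and readiness condition. */
		poll_wait(of->file, &my_waitqueue, pt);
		if (my_state_changed())
			ret |= EPOLLPRI;

		return ret;
	}

	static const struct kernfs_ops my_ops = {
		.seq_show = my_seq_show,
		.poll     = my_kernfs_poll,
	};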

Some files were not shown because too many files have changed in this diff.