numa: mempolicy: dynamic interleave map for system init

This converts the default system init memory policy to use a dynamically
created node map instead of defaulting to all online nodes.  Nodes of a
certain size (>= 16MB) are judged to be suitable for interleave, and are added
to the map.  If all nodes are smaller in size, the largest one is
automatically selected.

Without this, tiny nodes find themselves out of memory before we even make it
to userspace.  Systems with large nodes will notice no change.

Only the system init policy is affected by this change.  The regular
MPOL_DEFAULT policy is still switched to later in the boot process, as
before.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Mundt 2007-07-15 23:38:15 -07:00 committed by Linus Torvalds
parent f0630fff54
commit b71636e298
1 changed file with 28 additions and 3 deletions

@@ -1597,6 +1597,10 @@ void mpol_free_shared_policy(struct shared_policy *p)
 /* assumes fs == KERNEL_DS */
 void __init numa_policy_init(void)
 {
+	nodemask_t interleave_nodes;
+	unsigned long largest = 0;
+	int nid, prefer = 0;
+
 	policy_cache = kmem_cache_create("numa_policy",
 					 sizeof(struct mempolicy),
 					 0, SLAB_PANIC, NULL, NULL);
@@ -1605,10 +1609,31 @@ void __init numa_policy_init(void)
 				     sizeof(struct sp_node),
 				     0, SLAB_PANIC, NULL, NULL);
 
-	/* Set interleaving policy for system init. This way not all
-	   the data structures allocated at system boot end up in node zero. */
+	/*
+	 * Set interleaving policy for system init. Interleaving is only
+	 * enabled across suitably sized nodes (default is >= 16MB), or
+	 * fall back to the largest node if they're all smaller.
+	 */
+	nodes_clear(interleave_nodes);
+	for_each_online_node(nid) {
+		unsigned long total_pages = node_present_pages(nid);
 
-	if (do_set_mempolicy(MPOL_INTERLEAVE, &node_online_map))
+		/* Preserve the largest node */
+		if (largest < total_pages) {
+			largest = total_pages;
+			prefer = nid;
+		}
+
+		/* Interleave this node? */
+		if ((total_pages << PAGE_SHIFT) >= (16 << 20))
+			node_set(nid, interleave_nodes);
+	}
+
+	/* All too small, use the largest */
+	if (unlikely(nodes_empty(interleave_nodes)))
+		node_set(prefer, interleave_nodes);
+
+	if (do_set_mempolicy(MPOL_INTERLEAVE, &interleave_nodes))
 		printk("numa_policy_init: interleaving failed\n");
 }
 