lib/bitmap.c: enhance bitmap syntax

Today there are platforms with many CPUs (up to 4K).  Trying to boot only
part of the CPUs may result in a very long CPU list string.

For example, take the NPS platform that is part of arch/arc.  This
platform has an SMP system with 256 cores, each with 16 HW threads (an SMT
machine), where each HW thread appears as a CPU to the kernel.  In this
example there is a total of 4K CPUs.  When one tries to boot only part of
the HW threads from each core, the string representing the map may be
long.  For example, if for the sake of performance we decide to boot only
the first half of the HW threads of each core, the map will look like:
0-7,16-23,32-39,...,4080-4087

This patch introduces new syntax to accommodate such a use case.  I
added an optional postfix to a range of CPUs which selects, according
to a given modulo, the desired range of remainders, i.e.:

    <cpus range>:used_size/group_size

For example, the above map can be described in the new syntax like this:
0-4095:8/16
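
The selection rule behind this syntax can be sketched in a few lines of
userspace C.  This is only an illustration under stated assumptions, not
code from the patch: the helper names cpu_in_partial_range() and
count_selected() are hypothetical, and the sketch assumes a non-zero
group_size.  A CPU is selected when its offset from the start of the range,
taken modulo group_size, is smaller than used_size.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the patch): returns true when `cpu` is
 * selected by the item "first-last:used_size/group_size".
 * Assumption of this sketch: group_size is non-zero.
 */
static bool cpu_in_partial_range(unsigned int cpu, unsigned int first,
				 unsigned int last, unsigned int used_size,
				 unsigned int group_size)
{
	assert(group_size > 0 && used_size <= group_size);

	if (cpu < first || cpu > last)
		return false;
	/* Position of the CPU inside its group, counted from the range start. */
	return (cpu - first) % group_size < used_size;
}

/* Counts the CPUs selected by "first-last:used_size/group_size". */
static unsigned int count_selected(unsigned int first, unsigned int last,
				   unsigned int used_size,
				   unsigned int group_size)
{
	unsigned int cpu, n = 0;

	for (cpu = first; cpu <= last; cpu++)
		if (cpu_in_partial_range(cpu, first, last, used_size,
					 group_size))
			n++;
	return n;
}
```

With first=0, last=4095, used_size=8 and group_size=16 the predicate accepts
CPUs 0-7, 16-23 and so on up to 4080-4087, which is the long map shown
earlier.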

Note that this patch is backward compatible with the current syntax.

[akpm@linux-foundation.org: rework documentation]
Link: http://lkml.kernel.org/r/1473579629-4283-1-git-send-email-noamca@mellanox.com
Signed-off-by: Noam Camus <noamca@mellanox.com>
Cc: David Decotigny <decot@googlers.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: David S. Miller <davem@davemloft.net>
Cc: Pan Xinhui <xinhui@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Noam Camus 2016-10-11 13:51:35 -07:00 committed by Linus Torvalds
parent 8cfd56d479
commit 2d13e6ca42
2 changed files with 82 additions and 18 deletions

@@ -33,6 +33,37 @@ can also be entered as
 Double-quotes can be used to protect spaces in values, e.g.:
 param="spaces in here"
+cpu lists:
+----------
+Some kernel parameters take a list of CPUs as a value, e.g. isolcpus,
+nohz_full, irqaffinity, rcu_nocbs. The format of this list is:
+<cpu number>,...,<cpu number>
+or
+<cpu number>-<cpu number>
+(must be a positive range in ascending order)
+or a mixture
+<cpu number>,...,<cpu number>-<cpu number>
+Note that for the special case of a range one can split the range into equal
+sized groups and for each group use some amount from the beginning of that
+group:
+<cpu number>-cpu number>:<used size>/<group size>
+For example one can add to the command line following parameter:
+isolcpus=1,2,10-20,100-2000:2/25
+where the final item represents CPUs 100,101,125,126,150,151,...
 This document may not be entirely up to date and comprehensive. The command
 "modinfo -p ${modulename}" shows a current list of all parameters of a loadable
 module. Loadable modules, after being loaded into the running kernel, also
@@ -1789,13 +1820,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 See Documentation/filesystems/nfs/nfsroot.txt.
 irqaffinity= [SMP] Set the default irq affinity mask
-Format:
-<cpu number>,...,<cpu number>
-or
-<cpu number>-<cpu number>
-(must be a positive range in ascending order)
-or a mixture
-<cpu number>,...,<cpu number>-<cpu number>
+The argument is a cpu list, as described above.
 irqfixup [HW]
 When an interrupt is not handled search all handlers
@@ -1812,13 +1837,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 Format: <RDP>,<reset>,<pci_scan>,<verbosity>
 isolcpus= [KNL,SMP] Isolate CPUs from the general scheduler.
-Format:
-<cpu number>,...,<cpu number>
-or
-<cpu number>-<cpu number>
-(must be a positive range in ascending order)
-or a mixture
-<cpu number>,...,<cpu number>-<cpu number>
+The argument is a cpu list, as described above.
 This option can be used to specify one or more CPUs
 to isolate from the general SMP balancing and scheduling
@@ -2680,6 +2699,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 Default: on
 nohz_full= [KNL,BOOT]
+The argument is a cpu list, as described above.
 In kernels built with CONFIG_NO_HZ_FULL=y, set
 the specified list of CPUs whose tick will be stopped
 whenever possible. The boot CPU will be forced outside
@@ -3285,6 +3305,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 See Documentation/blockdev/ramdisk.txt.
 rcu_nocbs= [KNL]
+The argument is a cpu list, as described above.
 In kernels built with CONFIG_RCU_NOCB_CPU=y, set
 the specified list of CPUs to be no-callback CPUs.
 Invocation of these CPUs' RCU callbacks will

@@ -496,6 +496,11 @@ EXPORT_SYMBOL(bitmap_print_to_pagebuf);
  * ranges. Consecutively set bits are shown as two hyphen-separated
  * decimal numbers, the smallest and largest bit numbers set in
  * the range.
+ * Optionally each range can be postfixed to denote that only parts of it
+ * should be set. The range will divided to groups of specific size.
+ * From each group will be used only defined amount of bits.
+ * Syntax: range:used_size/group_size
+ * Example: 0-1023:2/256 ==> 0,1,256,257,512,513,768,769
  *
  * Returns 0 on success, -errno on invalid input strings.
  * Error values:
@@ -507,16 +512,20 @@ static int __bitmap_parselist(const char *buf, unsigned int buflen,
 		int is_user, unsigned long *maskp,
 		int nmaskbits)
 {
-	unsigned a, b;
+	unsigned int a, b, old_a, old_b;
+	unsigned int group_size, used_size;
 	int c, old_c, totaldigits, ndigits;
 	const char __user __force *ubuf = (const char __user __force *)buf;
-	int at_start, in_range;
+	int at_start, in_range, in_partial_range;
 	totaldigits = c = 0;
+	old_a = old_b = 0;
+	group_size = used_size = 0;
 	bitmap_zero(maskp, nmaskbits);
 	do {
 		at_start = 1;
 		in_range = 0;
+		in_partial_range = 0;
 		a = b = 0;
 		ndigits = totaldigits;
@@ -547,6 +556,24 @@ static int __bitmap_parselist(const char *buf, unsigned int buflen,
 			if ((totaldigits != ndigits) && isspace(old_c))
 				return -EINVAL;
+			if (c == '/') {
+				used_size = a;
+				at_start = 1;
+				in_range = 0;
+				a = b = 0;
+				continue;
+			}
+			if (c == ':') {
+				old_a = a;
+				old_b = b;
+				at_start = 1;
+				in_range = 0;
+				in_partial_range = 1;
+				a = b = 0;
+				continue;
+			}
 			if (c == '-') {
 				if (at_start || in_range)
 					return -EINVAL;
@@ -567,15 +594,30 @@ static int __bitmap_parselist(const char *buf, unsigned int buflen,
 		}
 		if (ndigits == totaldigits)
 			continue;
+		if (in_partial_range) {
+			group_size = a;
+			a = old_a;
+			b = old_b;
+			old_a = old_b = 0;
+		}
 		/* if no digit is after '-', it's wrong*/
 		if (at_start && in_range)
 			return -EINVAL;
-		if (!(a <= b))
+		if (!(a <= b) || !(used_size <= group_size))
 			return -EINVAL;
 		if (b >= nmaskbits)
 			return -ERANGE;
 		while (a <= b) {
-			set_bit(a, maskp);
+			if (in_partial_range) {
+				static int pos_in_group = 1;
+				if (pos_in_group <= used_size)
+					set_bit(a, maskp);
+				if (a == b || ++pos_in_group > group_size)
+					pos_in_group = 1;
+			} else
+				set_bit(a, maskp);
 			a++;
 		}
 	} while (buflen && c == ',');
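
The range-filling step in the last hunk can be tried outside the kernel with
a small userspace sketch.  It is not the kernel code: fill_partial_range()
and NBITS are hypothetical names, sscanf() stands in for the kernel's
character-by-character parser (so it is more lenient about whitespace and
signs), the mask is one byte per bit, and a plain local counter replaces the
function-static pos_in_group of the patch.  The validation order and error
codes are meant to mirror the hunk above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define NBITS 64	/* assumed mask width for this sketch */

/*
 * Hypothetical userspace stand-in for the range-filling step of
 * __bitmap_parselist(): parses one item of the form "a-b:used/group"
 * and marks the selected positions in mask[] (one byte per bit).
 */
static int fill_partial_range(const char *item, unsigned char mask[NBITS])
{
	unsigned int a, b, used_size, group_size, pos_in_group = 1;
	char trailing;

	/* Exactly four numbers and nothing after them. */
	if (sscanf(item, "%u-%u:%u/%u%c", &a, &b, &used_size,
		   &group_size, &trailing) != 4)
		return -EINVAL;
	if (a > b || used_size > group_size)
		return -EINVAL;
	if (b >= NBITS)
		return -ERANGE;

	memset(mask, 0, NBITS);
	for (; a <= b; a++) {
		/* Only the first used_size positions of each group are set. */
		if (pos_in_group <= used_size)
			mask[a] = 1;
		/* Wrap around at the end of the group. */
		if (++pos_in_group > group_size)
			pos_in_group = 1;
	}
	return 0;
}
```

For the item "0-15:2/8" this marks positions 0, 1, 8 and 9.  An item such as
"0-15:9/8" is rejected with -EINVAL because used_size exceeds group_size.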