Patch series "kvmalloc", v5.
There are many open coded kmalloc with vmalloc fallback instances in the
tree. Most of them are not careful enough or simply do not care about
the underlying semantic of the kmalloc/page allocator which means that
a) some vmalloc fallbacks are basically unreachable because the kmalloc
part will keep retrying until it succeeds b) the page allocator can
invoke a really disruptive steps like the OOM killer to move forward
which doesn't sound appropriate when we consider that the vmalloc
fallback is available.
As it can be seen implementing kvmalloc requires quite an intimate
knowledge if the page allocator and the memory reclaim internals which
strongly suggests that a helper should be implemented in the memory
subsystem proper.
Most callers, I could find, have been converted to use the helper
instead. This is patch 6. There are some more relying on __GFP_REPEAT
in the networking stack which I have converted as well and Eric Dumazet
was not opposed [2] to convert them as well.
[1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com
This patch (of 9):
Using kmalloc with the vmalloc fallback for larger allocations is a
common pattern in the kernel code. Yet we do not have any common helper
for that and so users have invented their own helpers. Some of them are
really creative when doing so. Let's just add kv[mz]alloc and make sure
it is implemented properly. This implementation makes sure to not make
a large memory pressure for > PAGE_SZE requests (__GFP_NORETRY) and also
to not warn about allocation failures. This also rules out the OOM
killer as the vmalloc is a more approapriate fallback than a disruptive
user visible action.
This patch also changes some existing users and removes helpers which
are specific for them. In some cases this is not possible (e.g.
ext4_kvmalloc, libcfs_kvzalloc) because those seems to be broken and
require GFP_NO{FS,IO} context which is not vmalloc compatible in general
(note that the page table allocation is GFP_KERNEL). Those need to be
fixed separately.
While we are at it, document that __vmalloc{_node} about unsupported gfp
mask because there seems to be a lot of confusion out there.
kvmalloc_node will warn about GFP_KERNEL incompatible (which are not
superset) flags to catch new abusers. Existing ones would have to die
slowly.
[sfr@canb.auug.org.au: f2fs fixup]
Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Andreas Dilger <adilger@dilger.ca> [ext4 part]
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
security/apparmor/lib.c:132:9-10: WARNING: return of 0/1 in function 'aa_policy_init' with return type bool
Return statements in functions returning bool should use
true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
The aad macro can replace aad strings when it is not intended to. Switch
to a fn macro so it is only applied when intended.
Also at the same time cleanup audit_data initialization by putting
common boiler plate behind a macro, and dropping the gfp_t parameter
which will become useless.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Calling kmalloc(GFP_NOIO) with order == PAGE_ALLOC_COSTLY_ORDER is not
recommended because it might fall into infinite retry loop without
invoking the OOM killer.
Since aa_dfa_unpack() is the only caller of kvzalloc() and
aa_dfa_unpack() which is calling kvzalloc() via unpack_table() is
doing kzalloc(GFP_KERNEL), it is safe to use GFP_KERNEL from
__aa_kvmalloc().
Since aa_simple_write_to_buffer() is the only caller of kvmalloc()
and aa_simple_write_to_buffer() is calling copy_from_user() which
is GFP_KERNEL context (see memdup_user_nul()), it is safe to use
GFP_KERNEL from __aa_kvmalloc().
Therefore, replace GFP_NOIO with GFP_KERNEL. Also, since we have
vmalloc() fallback, add __GFP_NORETRY so that we don't invoke the OOM
killer by kmalloc(GFP_KERNEL) with order == PAGE_ALLOC_COSTLY_ORDER.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: John Johansen <john.johansen@canonical.com>
This is a follow-up to commit b5b3ee6c "apparmor: no need to delay vfree()".
Since vmalloc() will do "size = PAGE_ALIGN(size);",
we don't need to check for "size >= sizeof(struct work_struct)".
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: John Johansen <john.johansen@canonical.com>
vfree() can be called from interrupt contexts now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
currently apparmor name parsing is only correctly handling
:<NS>:<profile>
but
:<NS>://<profile>
is also a valid form and what is exported to userspace.
Signed-off-by: John Johansen <john.johansen@canonical.com>
It isn't needed. If you don't set the type of the data associated with
that type it is a pretty obvious programming bug. So why waste the cycles?
Signed-off-by: Eric Paris <eparis@redhat.com>
Linus found that the gigantic size of the common audit data caused a big
perf hit on something as simple as running stat() in a loop. This patch
requires LSMs to declare the LSM specific portion separately rather than
doing it in a union. Thus each LSM can be responsible for shrinking their
portion and don't have to pay a penalty just because other LSMs have a
bigger space requirement.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the following sparse warnings:
security/apparmor/lib.c:37:6: warning: symbol 'aa_split_fqname' was not declared. Should it be static?
security/apparmor/lib.c:63:6: warning: symbol 'aa_info_message' was not declared. Should it be static?
security/apparmor/lib.c:83:6: warning: symbol 'kvmalloc' was not declared. Should it be static?
security/apparmor/lib.c:123:6: warning: symbol 'kvfree' was not declared. Should it be static?
Signed-off-by: James Morris <jmorris@namei.org>
Remove linux/mm.h inclusion from netdevice.h -- it's unused (I've checked manually).
To prevent mm.h inclusion via other channels also extract "enum dma_data_direction"
definition into separate header. This tiny piece is what gluing netdevice.h with mm.h
via "netdevice.h => dmaengine.h => dma-mapping.h => scatterlist.h => mm.h".
Removal of mm.h from scatterlist.h was tried and was found not feasible
on most archs, so the link was cutoff earlier.
Hope people are OK with tiny include file.
Note, that mm_types.h is still dragged in, but it is a separate story.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per Dan Carpenter <error27@gmail.com>
If we have a ns name without a following profile then in the original
code it did "*ns_name = &name[1];". "name" is NULL so "*ns_name" is
0x1. That isn't useful and could cause an oops when this function is
called from aa_remove_profiles().
Beyond this the assignment of the namespace name was wrong in the case
where the profile name was provided as it was being set to &name[1]
after name = skip_spaces(split + 1);
Move the ns_name assignment before updating name for the split and
also add skip_spaces, making the interface more robust.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
Miscellaneous functions and defines needed by AppArmor, including
the base path resolution routines.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>