This patch adds further attributes to the event. These attributes are
helpful to understand the context of the message and can be used
to filter the events.
There are three common items. Source context, target context and tclass.
There are also items from the outcome of operation performed.
An event is similar to:
<...>-1309 [002] .... 6346.691689: selinux_audited:
requested=0x4000000 denied=0x4000000 audited=0x4000000
result=-13
scontext=system_u:system_r:cupsd_t:s0-s0:c0.c1023
tcontext=system_u:object_r:bin_t:s0 tclass=file
With systems where many denials are occurring, it is useful to apply a
filter. The filtering is a set of logic that is inserted with
the filter file. Example:
echo "tclass==\"file\" " > events/avc/selinux_audited/filter
This adds that we only get tclass=file.
The trace can also have extra properties. Adding the user stack
can be done with
echo 1 > options/userstacktrace
Now the output will be
runcon-1365 [003] .... 6960.955530: selinux_audited:
requested=0x4000000 denied=0x4000000 audited=0x4000000
result=-13
scontext=system_u:system_r:cupsd_t:s0-s0:c0.c1023
tcontext=system_u:object_r:bin_t:s0 tclass=file
runcon-1365 [003] .... 6960.955560: <user stack trace>
=> <00007f325b4ce45b>
=> <00005607093efa57>
Signed-off-by: Peter Enderborg <peter.enderborg@sony.com>
Reviewed-by: Thiébaud Weksteen <tweek@google.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The audit data currently captures which process and which target
is responsible for a denial. There is no data on where exactly in the
process that call occurred. Debugging can be made easier by being able to
reconstruct the unified kernel and userland stack traces [1]. Add a
tracepoint on the SELinux denials which can then be used by userland
(i.e. perf).
Although this patch could manually be added by each OS developer to
trouble shoot a denial, adding it to the kernel streamlines the
developers workflow.
It is possible to use perf for monitoring the event:
# perf record -e avc:selinux_audited -g -a
^C
# perf report -g
[...]
6.40% 6.40% audited=800000 tclass=4
|
__libc_start_main
|
|--4.60%--__GI___ioctl
| entry_SYSCALL_64
| do_syscall_64
| __x64_sys_ioctl
| ksys_ioctl
| binder_ioctl
| binder_set_nice
| can_nice
| capable
| security_capable
| cred_has_capability.isra.0
| slow_avc_audit
| common_lsm_audit
| avc_audit_post_callback
| avc_audit_post_callback
|
It is also possible to use the ftrace interface:
# echo 1 > /sys/kernel/debug/tracing/events/avc/selinux_audited/enable
# cat /sys/kernel/debug/tracing/trace
tracer: nop
entries-in-buffer/entries-written: 1/1 #P:8
[...]
dmesg-3624 [001] 13072.325358: selinux_denied: audited=800000 tclass=4
The tclass value can be mapped to a class by searching
security/selinux/flask.h. The audited value is a bit field of the
permissions described in security/selinux/av_permissions.h for the
corresponding class.
[1] https://source.android.com/devices/tech/debug/native_stack_dump
Signed-off-by: Thiébaud Weksteen <tweek@google.com>
Suggested-by: Joel Fernandes <joelaf@google.com>
Reviewed-by: Peter Enderborg <peter.enderborg@sony.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
In AVC update we don't call avc_node_kill() when avc_xperms_populate()
fails, resulting in the avc->avc_cache.active_nodes counter having a
false value. In last patch this changes was missed , so correcting it.
Fixes: fa1aa143ac ("selinux: extended permissions for ioctls")
Signed-off-by: Jaihind Yadav <jaihindyadav@codeaurora.org>
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
[PM: merge fuzz, minor description cleanup]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Fix avc_insert() to call avc_node_kill() if we've already allocated
an AVC node and the code fails to insert the node in the cache.
Fixes: fa1aa143ac ("selinux: extended permissions for ioctls")
Reported-by: rsiddoji@codeaurora.org
Suggested-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
commit bda0be7ad9 ("security: make inode_follow_link RCU-walk aware")
passed down the rcu flag to the SELinux AVC, but failed to adjust the
test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY.
Previously, we only returned -ECHILD if generating an audit record with
LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission.
Move the handling of MAY_NOT_BLOCK to avc_audit() and its inlined
equivalent in selinux_inode_permission() immediately after we determine
that audit is required, and always fall back to ref-walk in this case.
Fixes: bda0be7ad9 ("security: make inode_follow_link RCU-walk aware")
Reported-by: Will Deacon <will@kernel.org>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
This reverts commit e46e01eebb ("selinux: stop passing MAY_NOT_BLOCK
to the AVC upon follow_link"). The correct fix is to instead fall
back to ref-walk if audit is required irrespective of the specific
audit data type. This is done in the next commit.
Fixes: e46e01eebb ("selinux: stop passing MAY_NOT_BLOCK to the AVC upon follow_link")
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
These strings may come from untrusted sources (e.g. file xattrs) so they
need to be properly escaped.
Reproducer:
# setenforce 0
# touch /tmp/test
# setfattr -n security.selinux -v 'kuřecí řízek' /tmp/test
# runcon system_u:system_r:sshd_t:s0 cat /tmp/test
(look at the generated AVCs)
Actual result:
type=AVC [...] trawcon=kuřecí řízek
Expected result:
type=AVC [...] trawcon=6B75C5996563C3AD20C599C3AD7A656B
Fixes: fede148324 ("selinux: log invalid contexts in AVCs")
Cc: stable@vger.kernel.org # v5.1+
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
commit a2c513835b ("selinux: inline some AVC functions used only once")
introduced usage of audit_log_string() in place of audit_log_format()
for fixed strings. However, audit_log_string() quotes the string.
This breaks the avc audit message format and userspace audit parsers.
Switch back to using audit_log_format().
Fixes: a2c513835b ("selinux: inline some AVC functions used only once")
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
These checks are only guarding against programming errors that could
silently grant too many permissions. These cases are better handled with
WARN_ON(), since it doesn't really help much to crash the machine in
this case.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
In case a file has an invalid context set, in an AVC record generated
upon access to such file, the target context is always reported as
unlabeled. This patch adds new optional fields to the AVC record
(srawcon and trawcon) that report the actual context string if it
differs from the one reported in scontext/tcontext. This is useful for
diagnosing SELinux denials involving invalid contexts.
To trigger an AVC that illustrates this situation:
# setenforce 0
# touch /tmp/testfile
# setfattr -n security.selinux -v system_u:object_r:banana_t:s0 /tmp/testfile
# runcon system_u:system_r:sshd_t:s0 cat /tmp/testfile
AVC before:
type=AVC msg=audit(1547801083.248:11): avc: denied { open } for pid=1149 comm="cat" path="/tmp/testfile" dev="tmpfs" ino=6608 scontext=system_u:system_r:sshd_t:s0 tcontext=system_u:object_r:unlabeled_t:s15:c0.c1023 tclass=file permissive=1
AVC after:
type=AVC msg=audit(1547801083.248:11): avc: denied { open } for pid=1149 comm="cat" path="/tmp/testfile" dev="tmpfs" ino=6608 scontext=system_u:system_r:sshd_t:s0 tcontext=system_u:object_r:unlabeled_t:s15:c0.c1023 tclass=file permissive=1 trawcon=system_u:object_r:banana_t:s0
Note that it is also possible to encounter this situation with the
'scontext' field - e.g. when a new policy is loaded while a process is
running, whose context is not valid in the new policy.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1135683
Cc: Daniel Walsh <dwalsh@redhat.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
We don't need to crash the machine in these cases. Let's just detect the
buggy state early and error out with a warning.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
avc_dump_av() and avc_dump_query() are each used only in one place. Get
rid of them and open code their contents in the call sites.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
commit bda0be7ad9 ("security: make inode_follow_link RCU-walk aware")
switched selinux_inode_follow_link() to use avc_has_perm_flags() and
pass down the MAY_NOT_BLOCK flag if called during RCU walk. However,
the only test of MAY_NOT_BLOCK occurs during slow_avc_audit()
and only if passing an inode as audit data (LSM_AUDIT_DATA_INODE). Since
selinux_inode_follow_link() passes a dentry directly, passing MAY_NOT_BLOCK
here serves no purpose. Switch selinux_inode_follow_link() to use
avc_has_perm() and drop avc_has_perm_flags() since there are no other
users.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
commit 0dc1ba24f7 ("SELINUX: Make selinux cache VFS RCU walks safe")
results in no audit messages at all if in permissive mode because the
cache is updated during the rcu walk and thus no denial occurs on
the subsequent ref walk. Fix this by not updating the cache when
performing a non-blocking permission check. This only affects search
and symlink read checks during rcu walk.
Fixes: 0dc1ba24f7 ("SELINUX: Make selinux cache VFS RCU walks safe")
Reported-by: BMK <bmktuwien@gmail.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Replace printk with pr_* to avoid checkpatch warnings.
Signed-off-by: Peter Enderborg <peter.enderborg@sony.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Wrap the AVC state within the selinux_state structure and
pass it explicitly to all AVC functions. The AVC private state
is encapsulated in a selinux_avc structure that is referenced
from the selinux_state.
This change should have no effect on SELinux behavior or
APIs (userspace or LSM).
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Define a selinux state structure (struct selinux_state) for
global SELinux state and pass it explicitly to all security server
functions. The public portion of the structure contains state
that is used throughout the SELinux code, such as the enforcing mode.
The structure also contains a pointer to a selinux_ss structure whose
definition is private to the security server and contains security
server specific state such as the policy database and SID table.
This change should have no effect on SELinux behavior or APIs
(userspace or LSM). It merely wraps SELinux state and passes it
explicitly as needed.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: minor fixups needed due to collisions with the SCTP patches]
Signed-off-by: Paul Moore <paul@paul-moore.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEcQCq365ubpQNLgrWVeRaWujKfIoFAlmoS/wUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQVeRaWujKfIqubxAAhkcmOgf+bh881VOWjkrl0MpO6n30
LAyLHNpa95xYw7AuxSrx+XP21hZHVWOSPEZdDjC+BOTToqv025XyYUAh+vvhm1pc
HgT7oNOyfEnGdXG8VtluC2zhSunw/gDz7uoUh7+dHpVqa+NayRqaopNY+4tgtVjT
6/DMwfvonTD5GWaNxraFZLaOENXAjbdVBcqoHhnY9cp4w5uGQ3rt6dFpLpW/gW7n
/fUzsjnLTztrsRx3nyEkwJuo/pxugbmZU5sjVgCFd7P729CfBVKqoToIh0CqJfj6
s4RIb//XmRxxiTF1EO7N1suPaqnESjT+Ua3moIuEixs4QjiEu25TNZy8K0b2zLsL
sTt40F5KAbKYXH/WyZxEtPf0HOUwL68oFZ+c4VYcCK6LwJmBLnfhan4BSZgH0/EO
rBIlb5O1znyfuGmLnjUfn+BlPuP35PhRpZVWP2eLZtOC4lY+yaVqzauFIEY/wY96
dYM6YwtJYuZ3C8sQxjT6UWuOYyj/02EgPbvlS7nv4zp1pZNnZ0dx8sfEu6FNeakY
QZAaI4oDvkpj7x4a0biNinacCYIUacRDF63jcKQnaNp3F3Nf1Vh4DKQWbFLfMidN
luWsEsVrPfLynUMZLq3KVUg825bTQw1MapqzlADmOyX6Dq/87/a+nY9IXWOH9TSm
fJjuSsMAtnui1/k=
=/6oy
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20170831' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
"A relatively quiet period for SELinux, 11 patches with only two/three
having any substantive changes.
These noteworthy changes include another tweak to the NNP/nosuid
handling, per-file labeling for cgroups, and an object class fix for
AF_UNIX/SOCK_RAW sockets; the rest of the changes are minor tweaks or
administrative updates (Stephen's email update explains the file
explosion in the diffstat).
Everything passes the selinux-testsuite"
[ Also a couple of small patches from the security tree from Tetsuo
Handa for Tomoyo and LSM cleanup. The separation of security policy
updates wasn't all that clean - Linus ]
* tag 'selinux-pr-20170831' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: constify nf_hook_ops
selinux: allow per-file labeling for cgroupfs
lsm_audit: update my email address
selinux: update my email address
MAINTAINERS: update the NetLabel and Labeled Networking information
selinux: use GFP_NOWAIT in the AVC kmem_caches
selinux: Generalize support for NNP/nosuid SELinux domain transitions
selinux: genheaders should fail if too many permissions are defined
selinux: update the selinux info in MAINTAINERS
credits: update Paul Moore's info
selinux: Assign proper class to PF_UNIX/SOCK_RAW sockets
tomoyo: Update URLs in Documentation/admin-guide/LSM/tomoyo.rst
LSM: Remove security_task_create() hook.
In the process of normalizing audit log messages, it was noticed that the AVC
initialization code registered an audit log KERNEL record that didn't fit the
standard format. In the process of attempting to normalize it it was
determined that this record was not even necessary. Remove it.
Ref: http://marc.info/?l=selinux&m=149614868525826&w=2
See: https://github.com/linux-audit/audit-kernel/issues/48
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Update my email address since epoch.ncsc.mil no longer exists.
MAINTAINERS and CREDITS are already correct.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
There is a strange __GFP_NOMEMALLOC usage pattern in SELinux,
specifically GFP_ATOMIC | __GFP_NOMEMALLOC which doesn't make much
sense. GFP_ATOMIC on its own allows to access memory reserves while
__GFP_NOMEMALLOC dictates we cannot use memory reserves. Replace this
with the much more sane GFP_NOWAIT in the AVC code as we can tolerate
memory allocation failures in that code.
Signed-off-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Add extended permissions logic to selinux. Extended permissions
provides additional permissions in 256 bit increments. Extend the
generic ioctl permission check to use the extended permissions for
per-command filtering. Source/target/class sets including the ioctl
permission may additionally include a set of commands. Example:
allowxperm <source> <target>:<class> ioctl unpriv_app_socket_cmds
auditallowxperm <source> <target>:<class> ioctl priv_gpu_cmds
Where unpriv_app_socket_cmds and priv_gpu_cmds are macros
representing commonly granted sets of ioctl commands.
When ioctl commands are omitted only the permissions are checked.
This feature is intended to provide finer granularity for the ioctl
permission that may be too imprecise. For example, the same driver
may use ioctls to provide important and benign functionality such as
driver version or socket type as well as dangerous capabilities such
as debugging features, read/write/execute to physical memory or
access to sensitive data. Per-command filtering provides a mechanism
to reduce the attack surface of the kernel, and limit applications
to the subset of commands required.
The format of the policy binary has been modified to include ioctl
commands, and the policy version number has been incremented to
POLICYDB_VERSION_XPERMS_IOCTL=30 to account for the format
change.
The extended permissions logic is deliberately generic to allow
components to be reused e.g. netlink filters
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Nick Kralevich <nnk@google.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
This allows MAY_NOT_BLOCK to be passed, in RCU-walk mode, through
the new avc_has_perm_flags() to avc_audit() and thence the slow_avc_audit.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Commit f01e1af445 ("selinux: don't pass in NULL avd to avc_has_perm_noaudit")
made this pointer reassignment unnecessary. Avd should continue to reference
the stack-based copy.
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: tweaked subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
Remove the function avc_sidcmp() that is not used anywhere.
This was partially found by using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
[PM: rewrite the patch subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
We cannot presently tell from an avc: denied message whether access was in
fact denied or was allowed due to global or per-domain permissive mode.
Add a permissive= field to the avc message to reflect this information.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
.. so get rid of it. The only indirect users were all the
avc_has_perm() callers which just expanded to have a zero flags
argument.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Failing to allocate a cache entry will only harm performance not
correctness. Do not consume valuable reserve pages for something like
that.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
avc_add_callback now just used for registering reset functions
in initcalls, and the callback functions just did reset operations.
So, reducing the arguments to only one event is enough now.
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
avc_add_callback now only called from initcalls, so replace the
weak GFP_ATOMIC to GFP_KERNEL, and mark this function __init
to make a warning when not been called from initcalls.
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
We no longer need the distinction. We only need data after we decide to do an
audit. So turn the "late" audit data into just "data" and remove what we
currently have as "data".
Signed-off-by: Eric Paris <eparis@redhat.com>
It isn't needed. If you don't set the type of the data associated with
that type it is a pretty obvious programming bug. So why waste the cycles?
Signed-off-by: Eric Paris <eparis@redhat.com>
We pay a rather large overhead initializing the common_audit_data.
Since we only need this information if we actually emit an audit
message there is little need to set it up in the hot path. This patch
splits the functionality of avc_has_perm() into avc_has_perm_noaudit(),
avc_audit_required() and slow_avc_audit(). But we take care of setting
up to audit between required() and the actual audit call. Thus saving
measurable time in a hot path.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Eric Paris <eparis@redhat.com>
It just bloats the audit data structure for no good reason, since the
only time those fields are filled are just before calling the
common_lsm_audit() function, which is also the only user of those
fields.
So just make them be the arguments to common_lsm_audit(), rather than
bloating that structure that is passed around everywhere, and is
initialized in hot paths.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of declaring the entire selinux_audit_data on the stack when we
start an operation on declare it on the stack if we are going to use it.
We know it's usefulness at the end of the security decision and can declare
it there.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus found that the gigantic size of the common audit data caused a big
perf hit on something as simple as running stat() in a loop. This patch
requires LSMs to declare the LSM specific portion separately rather than
doing it in a union. Thus each LSM can be responsible for shrinking their
portion and don't have to pay a penalty just because other LSMs have a
bigger space requirement.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that all the slow-path code is gone from these functions, we can
inline them into the main caller - avc_has_perm_flags().
Now the compiler can see that 'avc' is allocated on the stack for this
case, which helps register pressure a bit. It also actually shrinks the
total stack frame, because the stack frame that avc_has_perm_flags()
always needed (for that 'avc' allocation) is now sufficient for the
inlined functions too.
Inlining isn't bad - but mindless inlining of cold code (see the
previous commit) is.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The selinux AVC paths remain some of the hottest (and deepest) codepaths
at filename lookup time, and we make it worse by having the slow path
cases take up I$ and stack space even when they don't trigger. Gcc
tends to always want to inline functions that are just called once -
never mind that this might make for slower and worse code in the caller.
So this tries to improve on it a bit by making the slow-path cases
explicitly separate functions that are marked noinline, causing gcc to
at least no longer allocate stack space for them unless they are
actually called. It also seems to help register allocation a tiny bit,
since gcc now doesn't take the slow case code into account.
Uninlining the slow path may also allow us to inline the remaining hot
path into the one caller that actually matters: avc_has_perm_flags().
I'll have to look at that separately, but both avc_audit() and
avc_has_perm_noaudit() are now small and lean enough that inlining them
may make sense.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
avc_audit() did a lot of jumping around and had a big stack frame, all
for the uncommon case.
Split up the uncommon case (which we really can't make go fast anyway)
into its own slow function, and mark the conditional branches
appropriately for the common likely case.
This causes avc_audit() to no longer show up as one of the hottest
functions on the branch profiles (the new "perf -b" thing), and makes
the cycle profiles look really nice and dense too.
The whole audit path is still annoyingly very much one of the biggest
costs of name lookup, so these things are worth optimizing for. I wish
we could just tell people to turn it off, but realistically we do need
it: we just need to make sure that the overhead of the necessary evil is
as low as possible.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Right now security_get_user_sids() will pass in a NULL avd pointer to
avc_has_perm_noaudit(), which then forces that function to have a dummy
entry for that case and just generally test it.
Don't do it. The normal callers all pass a real avd pointer, and this
helper function is incredibly hot. So don't make avc_has_perm_noaudit()
do conditional stuff that isn't needed for the common case.
This also avoids some duplicated stack space.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>