Pull vfs fixes from Al Viro:
"Assorted fixes all over the place"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
aio: fix io_destroy(2) vs. lookup_ioctx() race
ext2: fix a block leak
nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed
cachefiles: vfs_mkdir() might succeed leaving dentry negative unhashed
unfuck sysfs_mount()
kernfs: deal with kernfs_fill_super() failures
cramfs: Fix IS_ENABLED typo
befs_lookup(): use d_splice_alias()
affs_lookup: switch to d_splice_alias()
affs_lookup(): close a race with affs_remove_link()
fix breakage caused by d_find_alias() semantics change
fs: don't scan the inode cache before SB_BORN is set
do d_instantiate/unlock_new_inode combinations safely
iov_iter: fix memory leak in pipe_get_pages_alloc()
iov_iter: fix return type of __pipe_get_pages()
Sites may wish to provide additional metadata alongside files in order
to make more fine-grained security decisions[1]. The security of this is
enhanced if this metadata is protected, something that EVM makes
possible. However, the kernel cannot know about the set of extended
attributes that local admins may wish to protect, and hardcoding this
policy in the kernel makes it difficult to change over time and less
convenient for distributions to enable.
This patch adds a new /sys/kernel/security/integrity/evm/evm_xattrs node,
which can be read to obtain the current set of EVM-protected extended
attributes or written to in order to add new entries. Extending this list
will not change the validity of any existing signatures provided that the
file in question does not have any of the additional extended attributes -
missing xattrs are skipped when calculating the EVM hash.
[1] For instance, a package manager could install information about the
package uploader in an additional extended attribute. Local LSM policy
could then be associated with that extended attribute in order to
restrict the privileges available to packages from less trusted
uploaders.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Use a list of xattrs rather than an array - this makes it easier to
extend the list at runtime.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEcQCq365ubpQNLgrWVeRaWujKfIoFAlr8kO8UHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQVeRaWujKfIrEtg/5AWIHjkXWgUnwtG+zswaZmzXRCIHi
Ixz/R7gDLBstLDORr0mZ19sllo9iQfiFfeKQL+8ewn5CM7vGViASBDrbscsU9QDI
imy5PLcJ4iVRcLhpgKCQWrz2kE3lIkK1UlpMTnsHR7wXeLrTKF4bSI/Rdyu6jApB
VnyOaeTp3BUKpY5mKURVP+N8jG/MF/kCx94lNlsBnVmPkbI8A8wALyZPZt9D7YRu
3FGRQQ9FM0HTGTplnfvDLoEH97Dk4MRTGaKpHj/kKuqviQDpf/JH6/fk1nQDgHkW
Mzj6YbMZddee7TDbhmmyvymaYNqcjbRiOiPBEodoDMHcN9Cba7gvtGA0J4/WSLaz
ZdVUdqG1E0P3qsda4/pf1FLDTXOtwmxk0J/fwOixnfnVIvb/mUGzJrxb2HqXQBjH
Mycd260b4LmZg1XSkAiBvF6XLanOx3VZHTMg5rsMgM2lZ8o7mH3nWwbEhy9qIuHp
gSq63NU/X43pB8dfGVxWvVKild2uA2wKO4Kl6hZ0DW4VdM5423qz67aYy38EIguk
cEvTGrFBqZy5ib1XzXSYjMsmHRZQAU2SDI4g6gjSTjK+WnzaUgliFN0EyS7IIK1c
us1gYIPa3LrQ7giUsCqyKAcp08tHSAHYw6z1vHS1tlu447EkTX6QzO99dMPtMzWd
69zSUhOtbYamaiA=
=SSWK
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20180516' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux fixes from Paul Moore:
"A small pull request to fix a few regressions in the SELinux/SCTP code
with applications that call bind() with AF_UNSPEC/INADDR_ANY.
The individual commit descriptions have more information, but the
commits themselves should be self explanatory"
* tag 'selinux-pr-20180516' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: correctly handle sa_family cases in selinux_sctp_bind_connect()
selinux: fix address family in bind() and connect() to match address/port
selinux: add AF_UNSPEC and INADDR_ANY checks to selinux_socket_bind()
We want to add additional evm control nodes, and it'd be preferable not
to clutter up the securityfs root directory any further. Create a new
integrity directory, move the ima directory into it, create an evm
directory for the evm attribute and add compatibility symlinks.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Commit a756024 ("ima: added ima_policy_flag variable") replaced
ima_initialized with ima_policy_flag, but didn't remove ima_initialized.
This patch removes it.
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Reviewed-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Define pr_fmt everywhere.
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> (powerpc build error)
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Changelog:
Previous pr_fmt definition was too late and caused problems in powerpc
allyesconfg build.
Kernel configured as CONFIG_IMA_READ_POLICY=y && CONFIG_IMA_WRITE_POLICY=n
keeps 0600 mode after loading policy. Remove write permission to state
that policy file no longer be written.
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Variants of proc_create{,_data} that directly take a struct seq_operations
argument and drastically reduces the boilerplate code in the callers.
All trivial callers converted over.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Recognizing that the audit context is an internal audit value, use an
access function to retrieve the audit context pointer for the task
rather than reaching directly into the task struct to get it.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: merge fuzz in auditsc.c and selinuxfs.c, checkpatch.pl fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Allow to pass the socket address structure with AF_UNSPEC family for
compatibility purposes. selinux_socket_bind() will further check it
for INADDR_ANY and selinux_socket_connect_helper() should return
EINVAL.
For a bad address family return EINVAL instead of AFNOSUPPORT error,
i.e. what is expected from SCTP protocol in such case.
Fixes: d452930fd3 ("selinux: Add SCTP support")
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Since sctp_bindx() and sctp_connectx() can have multiple addresses,
sk_family can differ from sa_family. Therefore, selinux_socket_bind()
and selinux_socket_connect_helper(), which process sockaddr structure
(address and port), should use the address family from that structure
too, and not from the socket one.
The initialization of the data for the audit record is moved above,
in selinux_socket_bind(), so that there is no duplicate changes and
code.
Fixes: d452930fd3 ("selinux: Add SCTP support")
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Commit d452930fd3 ("selinux: Add SCTP support") breaks compatibility
with the old programs that can pass sockaddr_in structure with AF_UNSPEC
and INADDR_ANY to bind(). As a result, bind() returns EAFNOSUPPORT error.
This was found with LTP/asapi_01 test.
Similar to commit 29c486df6a ("net: ipv4: relax AF_INET check in
bind()"), which relaxed AF_INET check for compatibility, add AF_UNSPEC
case to AF_INET and make sure that the address is INADDR_ANY.
Fixes: d452930fd3 ("selinux: Add SCTP support")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
"VFS: don't keep disconnected dentries on d_anon" had a non-trivial
side-effect - d_unhashed() now returns true for those dentries,
making d_find_alias() skip them altogether. For most of its callers
that's fine - we really want a connected alias there. However,
there is a codepath where we relied upon picking such aliases
if nothing else could be found - selinux delayed initialization
of contexts for inodes on already mounted filesystems used to
rely upon that.
Cc: stable@kernel.org # f1ee616214 "VFS: don't keep disconnected dentries on d_anon"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We're interested in getting rid of all of the stack allocated arrays in
the kernel: https://lkml.org/lkml/2018/3/7/621
This case is interesting, since we really just need an array of bytes that
are zero. The loop already ensures that if the array isn't exactly the
right size that enough zero bytes will be copied in. So, instead of
choosing this value to be the size of the hash, let's just choose it to be
32, since that is a common size, is not too big, and will not result in too
many extra iterations of the loop.
v2: split out from other patch, just hardcode array size instead of
dynamically allocating something the right size
v3: fix typo of 256 -> 32
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Reviewed-by: Kees Cook <keescook@chromium.org>
CC: David Howells <dhowells@redhat.com>
CC: James Morris <jmorris@namei.org>
CC: "Serge E. Hallyn" <serge@hallyn.com>
CC: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
We're interested in getting rid of all of the stack allocated arrays in the
kernel: https://lkml.org/lkml/2018/3/7/621
This particular vla is used as a temporary output buffer in case there is
too much hash output for the destination buffer. Instead, let's just
allocate a buffer that's big enough initially, but only copy back to
userspace the amount that was originally asked for.
v2: allocate enough in the original output buffer vs creating a temporary
output buffer
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Reviewed-by: Kees Cook <keescook@chromium.org>
CC: David Howells <dhowells@redhat.com>
CC: James Morris <jmorris@namei.org>
CC: "Serge E. Hallyn" <serge@hallyn.com>
CC: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
We're interested in getting rid of all of the stack allocated arrays in the
kernel [1]. This patch simply hardcodes the iv length to match that of the
hardcoded cipher.
[1]: https://lkml.org/lkml/2018/3/7/621
v2: hardcode the length of the nonce to be the GCM AES IV length, and do a
sanity check in init(), Eric Biggers
v3: * remember to free big_key_aead when sanity check fails
* define a constant for big key IV size so it can be changed along side
the algorithm in the code
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Reviewed-by: Kees Cook <keescook@chromium.org>
CC: David Howells <dhowells@redhat.com>
CC: James Morris <jmorris@namei.org>
CC: "Serge E. Hallyn" <serge@hallyn.com>
CC: Jason A. Donenfeld <Jason@zx2c4.com>
CC: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Make sure to implement the new socketpair callback so the SO_PEERSEC
call on socketpair(2)s will return correct information.
Signed-off-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Make sure to implement the new socketpair callback so the SO_PEERSEC
call on socketpair(2)s will return correct information.
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Right now the LSM labels for socketpairs are always uninitialized,
since there is no security hook for the socketpair() syscall. This
patch adds the required hooks so LSMs can properly label socketpairs.
This allows SO_PEERSEC to return useful information on those sockets.
Note that the behavior of socketpair() can be emulated by creating a
listener socket, connecting to it, and then discarding the initial
listener socket. With this workaround, SO_PEERSEC would return the
caller's security context. However, with socketpair(), the uninitialized
context is returned unconditionally. This is unexpected and makes
socketpair() less useful in situations where the security context is
crucial to the application.
With the new socketpair-hook this disparity can be solved by making
socketpair() return the expected security context.
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Buildable skeleton of AF_XDP without any functionality. Just what it
takes to register a new address family.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The kernel should not calculate new hmacs for mounts done by
non-root users. Update evm_calc_hmac_or_hash() to refuse to
calculate new hmacs for mounts for non-init user namespaces.
Cc: linux-integrity@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: James Morris <james.l.morris@oracle.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Dongsu Park <dongsu@kinvolk.io>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Acked-by: Christian Boltz <apparmor@cboltz.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Acked-by: Christian Boltz <apparmor@cboltz.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Acked-by: Christian Boltz <apparmor@cboltz.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Acked-by: Christian Boltz <apparmor@cboltz.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Acked-by: Christian Boltz <apparmor@cboltz.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Acked-by: Christian Boltz <apparmor@cboltz.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Acked-by: Christian Boltz <apparmor@cboltz.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Use a radix tree to provide a map between the secid and the label,
and along with it a basic ability to provide secctx conversion.
Shared/cached secctx will be added later.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Pull userns bug fix from Eric Biederman:
"Just a small fix to properly set the return code on error"
* 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
commoncap: Handle memory allocation failure.
The audit MAC_POLICY_LOAD record had redundant dangling keywords and was
missing information about which LSM was responsible and its completion
status. While this record is only issued on success, the parser expects
the res= field to be present.
Old record:
type=MAC_POLICY_LOAD msg=audit(1479299795.404:43): policy loaded auid=0 ses=1
Delete the redundant dangling keywords, add the lsm= field and the res=
field.
New record:
type=MAC_POLICY_LOAD msg=audit(1523293846.204:894): auid=0 ses=1 lsm=selinux res=1
See: https://github.com/linux-audit/audit-kernel/issues/47
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
There were two formats of the audit MAC_STATUS record, one of which was more
standard than the other. One listed enforcing status changes and the
other listed enabled status changes with a non-standard label. In
addition, the record was missing information about which LSM was
responsible and the operation's completion status. While this record is
only issued on success, the parser expects the res= field to be present.
old enforcing/permissive:
type=MAC_STATUS msg=audit(1523312831.378:24514): enforcing=0 old_enforcing=1 auid=0 ses=1
old enable/disable:
type=MAC_STATUS msg=audit(1523312831.378:24514): selinux=0 auid=0 ses=1
List both sets of status and old values and add the lsm= field and the
res= field.
Here is the new format:
type=MAC_STATUS msg=audit(1523293828.657:891): enforcing=0 old_enforcing=1 auid=0 ses=1 enabled=1 old-enabled=1 lsm=selinux res=1
This record already accompanied a SYSCALL record.
See: https://github.com/linux-audit/audit-kernel/issues/46
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: 80-char fixes, merge fuzz, use new SELinux state functions]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Use new return type vm_fault_t for fault handler
in struct vm_operations_struct.
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
- add base infrastructure for socket mediation. ABI bump and
additional checks to ensure only v8 compliant policy uses
socket af mediation.
- improve and cleanup dfa verification
- improve profile attachment logic
- improve overlapping expression handling
- add the xattr matching to the attachment logic
- improve signal mediation handling with stacked labels
- improve handling of no_new_privs in a label stack
+ Cleanups and changes
- use dfa to parse string split
- bounded version of label_parse
- proper line wrap nulldfa.in
- split context out into task and cred naming to better match usage
- simplify code in aafs
+ Bug fixes
- fix display of .ns_name for containers
- fix resource audit messages when auditing peer
- fix logging of the existence test for signals
- fix resource audit messages when auditing peer
- fix display of .ns_name for containers
- fix an error code in verify_table_headers()
- fix memory leak on buffer on error exit path
- fix error returns checks by making size a ssize_t
-----BEGIN PGP SIGNATURE-----
iQIcBAABCgAGBQJazWpMAAoJEAUvNnAY1cPY2wwP/2ZmzyITY7xW3Cpz8ynKOTyZ
hD2ahIjLWxcQwMZUoHXIa4TTK5EThlhKcTa0+sdMJGsIsRyXLoyBcd/VST0F9ZrA
OWn1uL2ASeNroNw+88P6qU03+cT2eEohM3vvlNy2ud98EBiTyxB6L4VLpy3xDKAd
zblojxqegRO7WRfEFCR2kHmnrL0Z3oxPBahnuVitfrwO76WFUSM9EYm67Xtf4yjJ
qQ7ocGdhxiULNdceoIke11e8iNwiQyY4O+E24qVAJw66arxIByMKo+cLjeTxMbZR
z4/pVd664wiK7mW0In7bJWOfXLJHxHALpuCc82wFgiLPdfSpJzT1nx+Xjaw8DhdZ
FBoHLpHjJT3dTpYoQTjqtNdvHgXryL/OOllm+I8DPMu/nfcp8qsOru5bEXg+j/90
CRo1OqrWZhUkKHnQs12QIJS+Gt7qByQB6tDMDbjkIC71vKUWA4wnp7zLZHYd9T0L
6kZ2aWKiOXM6VRZ5V5HVLhrTajiubyBg3y3Eur4HwuGzquBmxAp1RhS8oiOpgzgW
jVI92/P2XjhnU9E2J5m+mzjh11i+D51homtz1y4vB53Ye/WLy1S0o4StDAiLfgw3
q/581V342vl6X46GlgcS5G7QeIkzFiCUe5H3t2/unCRnI+PxabwRmbaTqWq47xzQ
umwlYfok3ALSzdgnv2sT
=XhxG
-----END PGP SIGNATURE-----
Merge tag 'apparmor-pr-2018-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor updates from John Johansen:
"Features:
- add base infrastructure for socket mediation. ABI bump and
additional checks to ensure only v8 compliant policy uses socket af
mediation.
- improve and cleanup dfa verification
- improve profile attachment logic
- improve overlapping expression handling
- add the xattr matching to the attachment logic
- improve signal mediation handling with stacked labels
- improve handling of no_new_privs in a label stack
Cleanups and changes:
- use dfa to parse string split
- bounded version of label_parse
- proper line wrap nulldfa.in
- split context out into task and cred naming to better match usage
- simplify code in aafs
Bug fixes:
- fix display of .ns_name for containers
- fix resource audit messages when auditing peer
- fix logging of the existence test for signals
- fix resource audit messages when auditing peer
- fix display of .ns_name for containers
- fix an error code in verify_table_headers()
- fix memory leak on buffer on error exit path
- fix error returns checks by making size a ssize_t"
* tag 'apparmor-pr-2018-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (36 commits)
apparmor: fix memory leak on buffer on error exit path
apparmor: fix dangling symlinks to policy rawdata after replacement
apparmor: Fix an error code in verify_table_headers()
apparmor: fix error returns checks by making size a ssize_t
apparmor: update MAINTAINERS file git and wiki locations
apparmor: remove POLICY_MEDIATES_SAFE
apparmor: add base infastructure for socket mediation
apparmor: improve overlapping domain attachment resolution
apparmor: convert attaching profiles via xattrs to use dfa matching
apparmor: Add support for attaching profiles via xattr, presence and value
apparmor: cleanup: simplify code to get ns symlink name
apparmor: cleanup create_aafs() error path
apparmor: dfa split verification of table headers
apparmor: dfa add support for state differential encoding
apparmor: dfa move character match into a macro
apparmor: update domain transitions that are subsets of confinement at nnp
apparmor: move context.h to cred.h
apparmor: move task related defines and fns to task.X files
apparmor: cleanup, drop unused fn __aa_task_is_confined()
apparmor: cleanup fixup description of aa_replace_profiles
...
There is a permission discrepancy when consulting msq ipc object
metadata between /proc/sysvipc/msg (0444) and the MSG_STAT shmctl
command. The later does permission checks for the object vs S_IRUGO.
As such there can be cases where EACCESS is returned via syscall but the
info is displayed anyways in the procfs files.
While this might have security implications via info leaking (albeit no
writing to the msq metadata), this behavior goes way back and showing
all the objects regardless of the permissions was most likely an
overlook - so we are stuck with it. Furthermore, modifying either the
syscall or the procfs file can cause userspace programs to break (ie
ipcs). Some applications require getting the procfs info (without root
privileges) and can be rather slow in comparison with a syscall -- up to
500x in some reported cases for shm.
This patch introduces a new MSG_STAT_ANY command such that the msq ipc
object permissions are ignored, and only audited instead. In addition,
I've left the lsm security hook checks in place, as if some policy can
block the call, then the user has no other choice than just parsing the
procfs file.
Link: http://lkml.kernel.org/r/20180215162458.10059-4-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Robert Kettler <robert.kettler@outlook.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a permission discrepancy when consulting shm ipc object
metadata between /proc/sysvipc/sem (0444) and the SEM_STAT semctl
command. The later does permission checks for the object vs S_IRUGO.
As such there can be cases where EACCESS is returned via syscall but the
info is displayed anyways in the procfs files.
While this might have security implications via info leaking (albeit no
writing to the sma metadata), this behavior goes way back and showing
all the objects regardless of the permissions was most likely an
overlook - so we are stuck with it. Furthermore, modifying either the
syscall or the procfs file can cause userspace programs to break (ie
ipcs). Some applications require getting the procfs info (without root
privileges) and can be rather slow in comparison with a syscall -- up to
500x in some reported cases for shm.
This patch introduces a new SEM_STAT_ANY command such that the sem ipc
object permissions are ignored, and only audited instead. In addition,
I've left the lsm security hook checks in place, as if some policy can
block the call, then the user has no other choice than just parsing the
procfs file.
Link: http://lkml.kernel.org/r/20180215162458.10059-3-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Robert Kettler <robert.kettler@outlook.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "sysvipc: introduce STAT_ANY commands", v2.
The following patches adds the discussed (see [1]) new command for shm
as well as for sems and msq as they are subject to the same
discrepancies for ipc object permission checks between the syscall and
via procfs. These new commands are justified in that (1) we are stuck
with this semantics as changing syscall and procfs can break userland;
and (2) some users can benefit from performance (for large amounts of
shm segments, for example) from not having to parse the procfs
interface.
Once merged, I will submit the necesary manpage updates. But I'm thinking
something like:
: diff --git a/man2/shmctl.2 b/man2/shmctl.2
: index 7bb503999941..bb00bbe21a57 100644
: --- a/man2/shmctl.2
: +++ b/man2/shmctl.2
: @@ -41,6 +41,7 @@
: .\" 2005-04-25, mtk -- noted aberrant Linux behavior w.r.t. new
: .\" attaches to a segment that has already been marked for deletion.
: .\" 2005-08-02, mtk: Added IPC_INFO, SHM_INFO, SHM_STAT descriptions.
: +.\" 2018-02-13, dbueso: Added SHM_STAT_ANY description.
: .\"
: .TH SHMCTL 2 2017-09-15 "Linux" "Linux Programmer's Manual"
: .SH NAME
: @@ -242,6 +243,18 @@ However, the
: argument is not a segment identifier, but instead an index into
: the kernel's internal array that maintains information about
: all shared memory segments on the system.
: +.TP
: +.BR SHM_STAT_ANY " (Linux-specific)"
: +Return a
: +.I shmid_ds
: +structure as for
: +.BR SHM_STAT .
: +However, the
: +.I shm_perm.mode
: +is not checked for read access for
: +.IR shmid ,
: +resembing the behaviour of
: +/proc/sysvipc/shm.
: .PP
: The caller can prevent or allow swapping of a shared
: memory segment with the following \fIcmd\fP values:
: @@ -287,7 +300,7 @@ operation returns the index of the highest used entry in the
: kernel's internal array recording information about all
: shared memory segments.
: (This information can be used with repeated
: -.B SHM_STAT
: +.B SHM_STAT/SHM_STAT_ANY
: operations to obtain information about all shared memory segments
: on the system.)
: A successful
: @@ -328,7 +341,7 @@ isn't accessible.
: \fIshmid\fP is not a valid identifier, or \fIcmd\fP
: is not a valid command.
: Or: for a
: -.B SHM_STAT
: +.B SHM_STAT/SHM_STAT_ANY
: operation, the index value specified in
: .I shmid
: referred to an array slot that is currently unused.
This patch (of 3):
There is a permission discrepancy when consulting shm ipc object metadata
between /proc/sysvipc/shm (0444) and the SHM_STAT shmctl command. The
later does permission checks for the object vs S_IRUGO. As such there can
be cases where EACCESS is returned via syscall but the info is displayed
anyways in the procfs files.
While this might have security implications via info leaking (albeit no
writing to the shm metadata), this behavior goes way back and showing all
the objects regardless of the permissions was most likely an overlook - so
we are stuck with it. Furthermore, modifying either the syscall or the
procfs file can cause userspace programs to break (ie ipcs). Some
applications require getting the procfs info (without root privileges) and
can be rather slow in comparison with a syscall -- up to 500x in some
reported cases.
This patch introduces a new SHM_STAT_ANY command such that the shm ipc
object permissions are ignored, and only audited instead. In addition,
I've left the lsm security hook checks in place, as if some policy can
block the call, then the user has no other choice than just parsing the
procfs file.
[1] https://lkml.org/lkml/2017/12/19/220
Link: http://lkml.kernel.org/r/20180215162458.10059-2-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Robert Kettler <robert.kettler@outlook.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Tom Zanussi's extended histogram work
This adds the synthetic events to have histograms from multiple event data
Adds triggers "onmatch" and "onmax" to call the synthetic events
Several updates to the histogram code from this
- Allow way to nest ring buffer calls in the same context
- Allow absolute time stamps in ring buffer
- Rewrite of filter code parsing based on Al Viro's suggestions
- Setting of trace_clock to global if TSC is unstable (on boot)
- Better OOM handling when allocating large ring buffers
- Added initcall tracepoints (consolidated initcall_debug code with them)
And other various fixes and clean ups
-----BEGIN PGP SIGNATURE-----
iQHIBAABCgAyFiEEPm6V/WuN2kyArTUe1a05Y9njSUkFAlrLoCAUHHJvc3RlZHRA
Z29vZG1pcy5vcmcACgkQ1a05Y9njSUks/QwAn/ky8WgfjcRdjKmBYuEwDedvm9iI
V9G5kpv5JMw5dLz4l1pS3tA3M9Lyuc5z3Shw92FTy36vdU1wxEjQgHa7viB1xk9x
KsiTyNjTsgrRd7GVHMy/8Be2RRiTRLaXKAsLCoj/c7QWzagV1P8XWlWK5mojYkh/
DrSXyg9Avkp30+sU1bvcLWnmmZUFqMxs+bWipD9uFc98USMMyeP25nrnhrj0gDTg
Q93cjXUuyVRC4lJ2YTW0GCSKhMKEw5f/ltEOT1hwScqYkCJj1EubKqS53R/9h21z
IPUrYcqLnMRu0j2ejR+UAy5Vsy3gJUrPMQb0F6hlu1DwbMd0d/9SGh1c+Sm+zorh
yftWTdCZsYrXkaOuB6V5M30X+KBwbWO0Xc9VCvgJ/IU5vMlgLSt5itTWbT/Fmfhb
ll5/RXP7zhSXRv5sdl/BP3/4dd6F8jpyKyaR2Rk2+XjBOGIq5mvqNGr4Vj9AzxW8
E0nvq7l7e0dbxZNM42gEm3cht1VUg7Zz0Y0+
=91oN
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"New features:
- Tom Zanussi's extended histogram work.
This adds the synthetic events to have histograms from multiple
event data Adds triggers "onmatch" and "onmax" to call the
synthetic events Several updates to the histogram code from this
- Allow way to nest ring buffer calls in the same context
- Allow absolute time stamps in ring buffer
- Rewrite of filter code parsing based on Al Viro's suggestions
- Setting of trace_clock to global if TSC is unstable (on boot)
- Better OOM handling when allocating large ring buffers
- Added initcall tracepoints (consolidated initcall_debug code with
them)
And other various fixes and clean ups"
* tag 'trace-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (68 commits)
init: Have initcall_debug still work without CONFIG_TRACEPOINTS
init, tracing: Have printk come through the trace events for initcall_debug
init, tracing: instrument security and console initcall trace events
init, tracing: Add initcall trace events
tracing: Add rcu dereference annotation for test func that touches filter->prog
tracing: Add rcu dereference annotation for filter->prog
tracing: Fixup logic inversion on setting trace_global_clock defaults
tracing: Hide global trace clock from lockdep
ring-buffer: Add set/clear_current_oom_origin() during allocations
ring-buffer: Check if memory is available before allocation
lockdep: Add print_irqtrace_events() to __warn
vsprintf: Do not preprocess non-dereferenced pointers for bprintf (%px and %pK)
tracing: Uninitialized variable in create_tracing_map_fields()
tracing: Make sure variable string fields are NULL-terminated
tracing: Add action comparisons when testing matching hist triggers
tracing: Don't add flag strings when displaying variable references
tracing: Fix display of hist trigger expressions containing timestamps
ftrace: Drop a VLA in module_exists()
tracing: Mention trace_clock=global when warning about unstable clocks
tracing: Default to using trace_global_clock if sched_clock is unstable
...
Commit 0619f0f5e3 ("selinux: wrap selinuxfs state") triggers a BUG
when SELinux is runtime-disabled (i.e. systemd or equivalent disables
SELinux before initial policy load via /sys/fs/selinux/disable based on
/etc/selinux/config SELINUX=disabled).
This does not manifest if SELinux is disabled via kernel command line
argument or if SELinux is enabled (permissive or enforcing).
Before:
SELinux: Disabled at runtime.
BUG: Dentry 000000006d77e5c7{i=17,n=null} still in use (1) [unmount of selinuxfs selinuxfs]
After:
SELinux: Disabled at runtime.
Fixes: 0619f0f5e3 ("selinux: wrap selinuxfs state")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull integrity updates from James Morris:
"A mixture of bug fixes, code cleanup, and continues to close
IMA-measurement, IMA-appraisal, and IMA-audit gaps.
Also note the addition of a new cred_getsecid LSM hook by Matthew
Garrett:
For IMA purposes, we want to be able to obtain the prepared secid
in the bprm structure before the credentials are committed. Add a
cred_getsecid hook that makes this possible.
which is used by a new CREDS_CHECK target in IMA:
In ima_bprm_check(), check with both the existing process
credentials and the credentials that will be committed when the new
process is started. This will not change behaviour unless the
system policy is extended to include CREDS_CHECK targets -
BPRM_CHECK will continue to check the same credentials that it did
previously"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
ima: Fallback to the builtin hash algorithm
ima: Add smackfs to the default appraise/measure list
evm: check for remount ro in progress before writing
ima: Improvements in ima_appraise_measurement()
ima: Simplify ima_eventsig_init()
integrity: Remove unused macro IMA_ACTION_RULE_FLAGS
ima: drop vla in ima_audit_measurement()
ima: Fix Kconfig to select TPM 2.0 CRB interface
evm: Constify *integrity_status_msg[]
evm: Move evm_hmac and evm_hash from evm_main.c to evm_crypto.c
fuse: define the filesystem as untrusted
ima: fail signature verification based on policy
ima: clear IMA_HASH
ima: re-evaluate files on privileged mounted filesystems
ima: fail file signature verification on non-init mounted filesystems
IMA: Support using new creds in appraisal policy
security: Add a cred_getsecid hook
Pull smack update from James Morris:
"One small change for Automotive Grade Linux"
* 'next-smack' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
Smack: Handle CGROUP2 in the same way that CGROUP
Pull general security layer updates from James Morris:
- Convert security hooks from list to hlist, a nice cleanup, saving
about 50% of space, from Sargun Dhillon.
- Only pass the cred, not the secid, to kill_pid_info_as_cred and
security_task_kill (as the secid can be determined from the cred),
from Stephen Smalley.
- Close a potential race in kernel_read_file(), by making the file
unwritable before calling the LSM check (vs after), from Kees Cook.
* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: convert security hooks to use hlist
exec: Set file unwritable before LSM check
usb, signal, security: only pass the cred, not the secid, to kill_pid_info_as_cred and security_task_kill
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEcQCq365ubpQNLgrWVeRaWujKfIoFAlrD6XoUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQVeRaWujKfIpy9RAAjwhkNBNJhw1UlGggVvst8lzJBdMp
XxL7cg+1TcZkB12yrghILg+gY4j5PzY4GJo1gvllWIHsT8Ud6cQTI/AzeYR2OfZ3
mHv3gtyzmHsPGBdqhmgC7R10tpyXFXwDc3VLMtuuDiUl/seFEaJWOMYP7zj+tRil
XoOCyoV9bb1wb7vNAzQikK8yhz3fu72Y5QOODLfaYeYojMKs8Q8pMZgi68oVQUXk
SmS2mj0k2P3UqeOSk+8phJQhilm32m0tE0YnLvzAhblJLqeS2DUNnWORP1j4oQ/Q
aOOu4ZQ9PA1N7VAIGceuf2HZHhnrFzWdvggp2bxegcRSIfUZ84FuZbrj60RUz2ja
V6GmKYACnyd28TAWdnzjKEd4dc36LSPxnaj8hcrvyO2V34ozVEsvIEIJREoXRUJS
heJ9HT+VIvmguzRCIPPeC1ZYopIt8M1kTRrszigU80TuZjIP0VJHLGQn/rgRQzuO
cV5gmJ6TSGn1l54H13koBzgUCo0cAub8Nl+288qek+jLWoHnKwzLB+1HCWuyeCHt
2q6wdFfenYH0lXdIzCeC7NNHRKCrPNwkZ/32d4ZQf4cu5tAn8bOk8dSHchoAfZG8
p7N6jPPoxmi2F/GRKrTiUNZvQpyvgX3hjtJS6ljOTSYgRhjeNYeCP8U+BlOpLVQy
U4KzB9wOAngTEpo=
=p2Sh
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux updates from Paul Moore:
"A bigger than usual pull request for SELinux, 13 patches (lucky!)
along with a scary looking diffstat.
Although if you look a bit closer, excluding the usual minor
tweaks/fixes, there are really only two significant changes in this
pull request: the addition of proper SELinux access controls for SCTP
and the encapsulation of a lot of internal SELinux state.
The SCTP changes are the result of a multi-month effort (maybe even a
year or longer?) between the SELinux folks and the SCTP folks to add
proper SELinux controls. A special thanks go to Richard for seeing
this through and keeping the effort moving forward.
The state encapsulation work is a bit of janitorial work that came out
of some early work on SELinux namespacing. The question of namespacing
is still an open one, but I believe there is some real value in the
encapsulation work so we've split that out and are now sending that up
to you"
* tag 'selinux-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: wrap AVC state
selinux: wrap selinuxfs state
selinux: fix handling of uninitialized selinux state in get_bools/classes
selinux: Update SELinux SCTP documentation
selinux: Fix ltp test connect-syscall failure
selinux: rename the {is,set}_enforcing() functions
selinux: wrap global selinux state
selinux: fix typo in selinux_netlbl_sctp_sk_clone declaration
selinux: Add SCTP support
sctp: Add LSM hooks
sctp: Add ip option support
security: Add support for SCTP security hooks
netlabel: If PF_INET6, check sk_buff ip header version
Merge updates from Andrew Morton:
- a few misc things
- ocfs2 updates
- the v9fs maintainers have been missing for a long time. I've taken
over v9fs patch slinging.
- most of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (116 commits)
mm,oom_reaper: check for MMF_OOM_SKIP before complaining
mm/ksm: fix interaction with THP
mm/memblock.c: cast constant ULLONG_MAX to phys_addr_t
headers: untangle kmemleak.h from mm.h
include/linux/mmdebug.h: make VM_WARN* non-rvals
mm/page_isolation.c: make start_isolate_page_range() fail if already isolated
mm: change return type to vm_fault_t
mm, oom: remove 3% bonus for CAP_SYS_ADMIN processes
mm, page_alloc: wakeup kcompactd even if kswapd cannot free more memory
kernel/fork.c: detect early free of a live mm
mm: make counting of list_lru_one::nr_items lockless
mm/swap_state.c: make bool enable_vma_readahead and swap_vma_readahead() static
block_invalidatepage(): only release page if the full page was invalidated
mm: kernel-doc: add missing parameter descriptions
mm/swap.c: remove @cold parameter description for release_pages()
mm/nommu: remove description of alloc_vm_area
zram: drop max_zpage_size and use zs_huge_class_size()
zsmalloc: introduce zs_huge_class_size()
mm: fix races between swapoff and flush dcache
fs/direct-io.c: minor cleanups in do_blockdev_direct_IO
...
Trace events have been added around the initcall functions defined in
init/main.c. But console and security have their own initcalls. This adds
the trace events associated for those initcall functions.
Link: http://lkml.kernel.org/r/1521765208.19745.2.camel@polymtl.ca
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Abderrahmane Benbachir <abderrahmane.benbachir@polymtl.ca>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Currently <linux/slab.h> #includes <linux/kmemleak.h> for no obvious
reason. It looks like it's only a convenience, so remove kmemleak.h
from slab.h and add <linux/kmemleak.h> to any users of kmemleak_* that
don't already #include it. Also remove <linux/kmemleak.h> from source
files that do not use it.
This is tested on i386 allmodconfig and x86_64 allmodconfig. It would
be good to run it through the 0day bot for other $ARCHes. I have
neither the horsepower nor the storage space for the other $ARCHes.
Update: This patch has been extensively build-tested by both the 0day
bot & kisskb/ozlabs build farms. Both of them reported 2 build failures
for which patches are included here (in v2).
[ slab.h is the second most used header file after module.h; kernel.h is
right there with slab.h. There could be some minor error in the
counting due to some #includes having comments after them and I didn't
combine all of those. ]
[akpm@linux-foundation.org: security/keys/big_key.c needs vmalloc.h, per sfr]
Link: http://lkml.kernel.org/r/e4309f98-3749-93e1-4bb7-d9501a39d015@infradead.org
Link: http://kisskb.ellerman.id.au/kisskb/head/13396/
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Michael Ellerman <mpe@ellerman.id.au> [2 build failures]
Reported-by: Fengguang Wu <fengguang.wu@intel.com> [2 build failures]
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull namespace updates from Eric Biederman:
"There was a lot of work this cycle fixing bugs that were discovered
after the merge window and getting everything ready where we can
reasonably support fully unprivileged fuse. The bug fixes you already
have and much of the unprivileged fuse work is coming in via other
trees.
Still left for fully unprivileged fuse is figuring out how to cleanly
handle .set_acl and .get_acl in the legacy case, and properly handling
of evm xattrs on unprivileged mounts.
Included in the tree is a cleanup from Alexely that replaced a linked
list with a statically allocated fix sized array for the pid caches,
which simplifies and speeds things up.
Then there is are some cleanups and fixes for the ipc namespace. The
motivation was that in reviewing other code it was discovered that
access ipc objects from different pid namespaces recorded pids in such
a way that when asked the wrong pids were returned. In the worst case
there has been a measured 30% performance impact for sysvipc
semaphores. Other test cases showed no measurable performance impact.
Manfred Spraul and Davidlohr Bueso who tend to work on sysvipc
performance both gave the nod that this is good enough.
Casey Schaufler and James Morris have given their approval to the LSM
side of the changes.
I simplified the types and the code dealing with sysvipc to pass just
kern_ipc_perm for all three types of ipc. Which reduced the header
dependencies throughout the kernel and simplified the lsm code.
Which let me work on the pid fixes without having to worry about
trivial changes causing complete kernel recompiles"
* 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ipc/shm: Fix pid freeing.
ipc/shm: fix up for struct file no longer being available in shm.h
ipc/smack: Tidy up from the change in type of the ipc security hooks
ipc: Directly call the security hook in ipc_ops.associate
ipc/sem: Fix semctl(..., GETPID, ...) between pid namespaces
ipc/msg: Fix msgctl(..., IPC_STAT, ...) between pid namespaces
ipc/shm: Fix shmctl(..., IPC_STAT, ...) between pid namespaces.
ipc/util: Helpers for making the sysvipc operations pid namespace aware
ipc: Move IPCMNI from include/ipc.h into ipc/util.h
msg: Move struct msg_queue into ipc/msg.c
shm: Move struct shmid_kernel into ipc/shm.c
sem: Move struct sem and struct sem_array into ipc/sem.c
msg/security: Pass kern_ipc_perm not msg_queue into the msg_queue security hooks
shm/security: Pass kern_ipc_perm not shmid_kernel into the shm security hooks
sem/security: Pass kern_ipc_perm not sem_array into the sem security hooks
pidns: simpler allocation of pid_* caches
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-03-31
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add raw BPF tracepoint API in order to have a BPF program type that
can access kernel internal arguments of the tracepoints in their
raw form similar to kprobes based BPF programs. This infrastructure
also adds a new BPF_RAW_TRACEPOINT_OPEN command to BPF syscall which
returns an anon-inode backed fd for the tracepoint object that allows
for automatic detach of the BPF program resp. unregistering of the
tracepoint probe on fd release, from Alexei.
2) Add new BPF cgroup hooks at bind() and connect() entry in order to
allow BPF programs to reject, inspect or modify user space passed
struct sockaddr, and as well a hook at post bind time once the port
has been allocated. They are used in FB's container management engine
for implementing policy, replacing fragile LD_PRELOAD wrapper
intercepting bind() and connect() calls that only works in limited
scenarios like glibc based apps but not for other runtimes in
containerized applications, from Andrey.
3) BPF_F_INGRESS flag support has been added to sockmap programs for
their redirect helper call bringing it in line with cls_bpf based
programs. Support is added for both variants of sockmap programs,
meaning for tx ULP hooks as well as recv skb hooks, from John.
4) Various improvements on BPF side for the nfp driver, besides others
this work adds BPF map update and delete helper call support from
the datapath, JITing of 32 and 64 bit XADD instructions as well as
offload support of bpf_get_prandom_u32() call. Initial implementation
of nfp packet cache has been tackled that optimizes memory access
(see merge commit for further details), from Jakub and Jiong.
5) Removal of struct bpf_verifier_env argument from the print_bpf_insn()
API has been done in order to prepare to use print_bpf_insn() soon
out of perf tool directly. This makes the print_bpf_insn() API more
generic and pushes the env into private data. bpftool is adjusted
as well with the print_bpf_insn() argument removal, from Jiri.
6) Couple of cleanups and prep work for the upcoming BTF (BPF Type
Format). The latter will reuse the current BPF verifier log as
well, thus bpf_verifier_log() is further generalized, from Martin.
7) For bpf_getsockopt() and bpf_setsockopt() helpers, IPv4 IP_TOS read
and write support has been added in similar fashion to existing
IPv6 IPV6_TCLASS socket option we already have, from Nikita.
8) Fixes in recent sockmap scatterlist API usage, which did not use
sg_init_table() for initialization thus triggering a BUG_ON() in
scatterlist API when CONFIG_DEBUG_SG was enabled. This adds and
uses a small helper sg_init_marker() to properly handle the affected
cases, from Prashant.
9) Let the BPF core follow IDR code convention and therefore use the
idr_preload() and idr_preload_end() helpers, which would also help
idr_alloc_cyclic() under GFP_ATOMIC to better succeed under memory
pressure, from Shaohua.
10) Last but not least, a spelling fix in an error message for the
BPF cookie UID helper under BPF sample code, from Colin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently on the error exit path the allocated buffer is not free'd
causing a memory leak. Fix this by kfree'ing it.
Detected by CoverityScan, CID#1466876 ("Resource leaks")
Fixes: 1180b4c757 ("apparmor: fix dangling symlinks to policy rawdata after replacement")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
This changes security_hook_heads to use hlist_heads instead of
the circular doubly-linked list heads. This should cut down
the size of the struct by about half.
In addition, it allows mutation of the hooks at the tail of the
callback list without having to modify the head. The longer-term
purpose of this is to enable making the heads read only.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Reviewed-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
rt_genid_bump_all() consists of ipv4 and ipv6 part.
ipv4 part is incrementing of net::ipv4::rt_genid,
and I see many places, where it's read without rtnl_lock().
ipv6 part calls __fib6_clean_all(), and it's also
called without rtnl_lock() in other places.
So, rtnl_lock() here was used to iterate net_namespace_list only,
and we can remove it.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rtnl_lock() is used everywhere, and contention is very high.
When someone wants to iterate over alive net namespaces,
he/she has no a possibility to do that without exclusive lock.
But the exclusive rtnl_lock() in such places is overkill,
and it just increases the contention. Yes, there is already
for_each_net_rcu() in kernel, but it requires rcu_read_lock(),
and this can't be sleepable. Also, sometimes it may be need
really prevent net_namespace_list growth, so for_each_net_rcu()
is not fit there.
This patch introduces new rw_semaphore, which will be used
instead of rtnl_mutex to protect net_namespace_list. It is
sleepable and allows not-exclusive iterations over net
namespaces list. It allows to stop using rtnl_lock()
in several places (what is made in next patches) and makes
less the time, we keep rtnl_mutex. Here we just add new lock,
while the explanation of we can remove rtnl_lock() there are
in next patches.
Fine grained locks generally are better, then one big lock,
so let's do that with net_namespace_list, while the situation
allows that.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
move COUNT_ARGS() macro from apparmor to generic header and extend it
to count till twelve.
COUNT() was an alternative name for this logic, but it's used for
different purpose in many other places.
Similarly for CONCATENATE() macro.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Rename the variables shp, sma, msq to isp. As that is how the code already
refers to those variables.
Collapse smack_of_shm, smack_of_sem, and smack_of_msq into smack_of_ipc,
as the three functions had become completely identical.
Collapse smack_shm_alloc_security, smack_sem_alloc_security and
smack_msg_queue_alloc_security into smack_ipc_alloc_security as the three
functions had become identical.
Collapse smack_shm_free_security, smack_sem_free_security and
smack_msg_queue_free_security into smack_ipc_free_security as the
three functions had become identical.
Requested-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is required to use SMACK and IMA/EVM together. Add it to the
default nomeasure/noappraise list like other pseudo filesystems.
Signed-off-by: Martin Townsend <mtownsend1973@gmail.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
EVM might update the evm xattr while the VFS performs a remount to
readonly mode. This is not properly checked for, additionally check
the s_readonly_remount superblock flag before writing.
The bug can for example be observed with UBIFS. UBIFS checks the free
space on the device before and after a remount. With EVM enabled the
free space sometimes differs between both checks.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Replace nested ifs in the EVM xattr verification logic with a switch
statement, making the code easier to understand.
Also, add comments to the if statements in the out section and constify the
cause variable.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
The "goto out" statement doesn't have any purpose since there's no cleanup
to be done when returning early, so remove it. This also makes the rc
variable unnecessary so remove it as well.
Also, the xattr_len and fmt variables are redundant so remove them as well.
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
This macro isn't used anymore since commit 0d73a55208 ("ima: re-introduce
own integrity cache lock"), so remove it.
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
In keeping with the directive to get rid of VLAs [1], let's drop the VLA
from ima_audit_measurement(). We need to adjust the return type of
ima_audit_measurement, because now this function can fail if an allocation
fails.
[1]: https://lkml.org/lkml/2018/3/7/621
v2: just use audit_log_format instead of doing a second allocation
v3: ignore failures in ima_audit_measurement()
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
TPM_CRB driver provides TPM CRB 2.0 support. If it is built as a
module, the TPM chip is registered after IMA init. tpm_pcr_read() in
IMA fails and displays the following message even though eventually
there is a TPM chip on the system.
ima: No TPM chip found, activating TPM-bypass! (rc=-19)
Fix IMA Kconfig to select TPM_CRB so TPM_CRB driver is built in the kernel
and initializes before IMA.
Signed-off-by: Jiandi An <anjiandi@codeaurora.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
When policy replacement occurs the symlinks in the profile directory
need to be updated to point to the new rawdata, otherwise once the
old rawdata is removed the symlink becomes broken.
Fix this by dynamically generating the symlink everytime it is read.
These links are used enough that their value needs to be cached and
this way we can avoid needing locking to read and update the link
value.
Fixes: a481f4d917 ("apparmor: add custom apparmorfs that will be used by policy namespace files")
BugLink: http://bugs.launchpad.net/bugs/1755563
Signed-off-by: John Johansen <john.johansen@canonical.com>
We accidentally return a positive EPROTO instead of a negative -EPROTO.
Since 71 is not an error pointer, that means it eventually results in an
Oops in the caller.
Fixes: d901d6a298 ("apparmor: dfa split verification of table headers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Currently variable size is a unsigned size_t, hence comparisons to
see if it is less than zero (for error checking) will always be
false. Fix this by making size a ssize_t
Detected by CoverityScan, CID#1466080 ("Unsigned compared against 0")
Fixes: 8e51f9087f ("apparmor: Add support for attaching profiles via xattr, presence and value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
There is no gain from doing this except for some self-documenting.
Signed-off-by: Hernán Gonzalez <hernan@vanguardiasur.com.ar>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
These variables are not used where they are was defined. There is no
point in declaring them there as extern. Move and constify them, saving
2 bytes.
Function old new delta
init_desc 273 271 -2
Total: Before=2112094, After=2112092, chg -0.00%
Signed-off-by: Hernán Gonzalez <hernan@vanguardiasur.com.ar>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
This patch addresses the fuse privileged mounted filesystems in
environments which are unwilling to accept the risk of trusting the
signature verification and want to always fail safe, but are for example
using a pre-built kernel.
This patch defines a new builtin policy named "fail_securely", which can
be specified on the boot command line as an argument to "ima_policy=".
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Dongsu Park <dongsu@kinvolk.io>
Cc: Alban Crequy <alban@kinvolk.io>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
The IMA_APPRAISE and IMA_HASH policies overlap. Clear IMA_HASH properly.
Fixes: da1b0029f5 ("ima: support new "hash" and "dont_hash" policy actions")
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
This patch addresses the fuse privileged mounted filesystems in a "secure"
environment, with a correctly enforced security policy, which is willing
to assume the inherent risk of specific fuse filesystems that are well
defined and properly implemented.
As there is no way for the kernel to detect file changes, the kernel
ignores the cached file integrity results and re-measures, re-appraises,
and re-audits the file.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Dongsu Park <dongsu@kinvolk.io>
Cc: Alban Crequy <alban@kinvolk.io>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
FUSE can be mounted by unprivileged users either today with fusermount
installed with setuid, or soon with the upcoming patches to allow FUSE
mounts in a non-init user namespace.
This patch addresses the new unprivileged non-init mounted filesystems,
which are untrusted, by failing the signature verification.
This patch defines two new flags SB_I_IMA_UNVERIFIABLE_SIGNATURE and
SB_I_UNTRUSTED_MOUNTER.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Dongsu Park <dongsu@kinvolk.io>
Cc: Alban Crequy <alban@kinvolk.io>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
The existing BPRM_CHECK functionality in IMA validates against the
credentials of the existing process, not any new credentials that the
child process may transition to. Add an additional CREDS_CHECK target
and refactor IMA to pass the appropriate creds structure. In
ima_bprm_check(), check with both the existing process credentials and
the credentials that will be committed when the new process is started.
This will not change behaviour unless the system policy is extended to
include CREDS_CHECK targets - BPRM_CHECK will continue to check the same
credentials that it did previously.
After this patch, an IMA policy rule along the lines of:
measure func=CREDS_CHECK subj_type=unconfined_t
will trigger if a process is executed and runs as unconfined_t, ignoring
the context of the parent process. This is in contrast to:
measure func=BPRM_CHECK subj_type=unconfined_t
which will trigger if the process that calls exec() is already executing
in unconfined_t, ignoring the context that the child process executes
into.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Changelog:
- initialize ima_creds_status
For IMA purposes, we want to be able to obtain the prepared secid in the
bprm structure before the credentials are committed. Add a cred_getsecid
hook that makes this possible.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
All of the implementations of security hooks that take msg_queue only
access q_perm the struct kern_ipc_perm member. This means the
dependencies of the msg_queue security hooks can be simplified by
passing the kern_ipc_perm member of msg_queue.
Making this change will allow struct msg_queue to become private to
ipc/msg.c.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
All of the implementations of security hooks that take shmid_kernel only
access shm_perm the struct kern_ipc_perm member. This means the
dependencies of the shm security hooks can be simplified by passing
the kern_ipc_perm member of shmid_kernel..
Making this change will allow struct shmid_kernel to become private to ipc/shm.c.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
All of the implementations of security hooks that take sem_array only
access sem_perm the struct kern_ipc_perm member. This means the
dependencies of the sem security hooks can be simplified by passing
the kern_ipc_perm member of sem_array.
Making this change will allow struct sem and struct sem_array
to become private to ipc/sem.c.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlqvCPYeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGOaAH/171cgZGFEXSONxK
3O1AAv61wN5K/ISMt6mnelWR6fZg195FarOx0Rnq7Ot8OWuVa8CGcyT4vX4Z7nb9
SVMQKNMPCVQE4WCDOv6S0njChmRC0BxBoVJtTN9fhywdYgX1KcaTS/drMRHACF5n
rB9eouMQScfMzKGAW08gp5NvEGJ6W1SLX7La3/u0751dYisdJSP7+vFZNxUrGXEA
yIPOQjFu0Tfo8GXz/BwC678RZVzVLN0sE6+/vM7zNnoDlsRVkdDIVMo3UiVqm/NK
B37/TlZz8CYoapoKnRRB5giXnSPDSXtsikbGy3mcy0u5imGe+ZgdjrdYSaLk31cR
NVZY08k=
=pu3X
-----END PGP SIGNATURE-----
Merge tag 'v4.16-rc6' into next-general
Merge to Linux 4.16-rc6 at the request of Jarkko, for his TPM updates.
Wrap the AVC state within the selinux_state structure and
pass it explicitly to all AVC functions. The AVC private state
is encapsulated in a selinux_avc structure that is referenced
from the selinux_state.
This change should have no effect on SELinux behavior or
APIs (userspace or LSM).
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Move global selinuxfs state to a per-instance structure (selinux_fs_info),
and include a pointer to the selinux_state in this structure.
Pass this selinux_state to all security server operations, thereby
ensuring that each selinuxfs instance presents a view of and acts
as an interface to a particular selinux_state instance.
This change should have no effect on SELinux behavior or APIs
(userspace or LSM). It merely wraps the selinuxfs global state,
links it to a particular selinux_state (currently always the single
global selinux_state) and uses that state for all operations.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
If security_get_bools/classes are called before the selinux state is
initialized (i.e. before first policy load), then they should just
return immediately with no booleans/classes.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The unpack code now makes sure every profile has a dfa so the safe
version of POLICY_MEDIATES is no longer needed.
Signed-off-by: John Johansen <john.johansen@canonical.com>
version 2 - Force an abi break. Network mediation will only be
available in v8 abi complaint policy.
Provide a basic mediation of sockets. This is not a full net mediation
but just whether a spcific family of socket can be used by an
application, along with setting up some basic infrastructure for
network mediation to follow.
the user space rule hav the basic form of
NETWORK RULE = [ QUALIFIERS ] 'network' [ DOMAIN ]
[ TYPE | PROTOCOL ]
DOMAIN = ( 'inet' | 'ax25' | 'ipx' | 'appletalk' | 'netrom' |
'bridge' | 'atmpvc' | 'x25' | 'inet6' | 'rose' |
'netbeui' | 'security' | 'key' | 'packet' | 'ash' |
'econet' | 'atmsvc' | 'sna' | 'irda' | 'pppox' |
'wanpipe' | 'bluetooth' | 'netlink' | 'unix' | 'rds' |
'llc' | 'can' | 'tipc' | 'iucv' | 'rxrpc' | 'isdn' |
'phonet' | 'ieee802154' | 'caif' | 'alg' | 'nfc' |
'vsock' | 'mpls' | 'ib' | 'kcm' ) ','
TYPE = ( 'stream' | 'dgram' | 'seqpacket' | 'rdm' | 'raw' |
'packet' )
PROTOCOL = ( 'tcp' | 'udp' | 'icmp' )
eg.
network,
network inet,
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
commit d178bc3a70 ("user namespace: usb:
make usb urbs user namespace aware (v2)") changed kill_pid_info_as_uid
to kill_pid_info_as_cred, saving and passing a cred structure instead of
uids. Since the secid can be obtained from the cred, drop the secid fields
from the usb_dev_state and async structures, and drop the secid argument to
kill_pid_info_as_cred. Replace the secid argument to security_task_kill
with the cred. Update SELinux, Smack, and AppArmor to use the cred, which
avoids the need for Smack and AppArmor to use a secid at all in this hook.
Further changes to Smack might still be required to take full advantage of
this change, since it should now be possible to perform capability
checking based on the supplied cred. The changes to Smack and AppArmor
have only been compile-tested.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Define a selinux state structure (struct selinux_state) for
global SELinux state and pass it explicitly to all security server
functions. The public portion of the structure contains state
that is used throughout the SELinux code, such as the enforcing mode.
The structure also contains a pointer to a selinux_ss structure whose
definition is private to the security server and contains security
server specific state such as the policy database and SID table.
This change should have no effect on SELinux behavior or APIs
(userspace or LSM). It merely wraps SELinux state and passes it
explicitly as needed.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: minor fixups needed due to collisions with the SCTP patches]
Signed-off-by: Paul Moore <paul@paul-moore.com>
The new file system CGROUP2 isn't actually handled
by smack. This changes makes Smack treat equally
CGROUP and CGROUP2 items.
Signed-off-by: José Bollo <jose.bollo@iot.bzh>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
These pernet_operations only register and unregister nf hooks.
So, they are able to be marked as async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These pernet_operations only register and unregister nf hooks.
So, they are able to be marked as async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A missing 'struct' keyword caused a build error when CONFIG_NETLABEL
is disabled:
In file included from security/selinux/hooks.c:99:
security/selinux/include/netlabel.h:135:66: error: unknown type name 'sock'
static inline void selinux_netlbl_sctp_sk_clone(struct sock *sk, sock *newsk)
^~~~
security/selinux/hooks.c: In function 'selinux_sctp_sk_clone':
security/selinux/hooks.c:5188:2: error: implicit declaration of function 'selinux_netlbl_sctp_sk_clone'; did you mean 'selinux_netlbl_inet_csk_clone'? [-Werror=implicit-function-declaration]
Fixes: db97c9f9d312 ("selinux: Add SCTP support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The SELinux SCTP implementation is explained in:
Documentation/security/SELinux-sctp.rst
Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>