policy load failure always return EINVAL even if the failure was for some
other reason (usually ENOMEM). This patch passes error codes back up the
stack where they will make their way to userspace. This might help in
debugging future problems with policy load.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Reduce MAX_AVTAB_HASH_BITS so that the avtab allocation is an order 2
allocation rather than an order 4 allocation on x86_64. This
addresses reports of page allocation failures:
http://marc.info/?l=selinux&m=126757230625867&w=2https://bugzilla.redhat.com/show_bug.cgi?id=570433
Reported-by: Russell Coker <russell@coker.com.au>
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
trying to grep everything that messes with a sk_security_struct isn't easy
since we don't always call it sksec. Just rename everything sksec.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Reduce MAX_AVTAB_HASH_BITS so that the avtab allocation is an order 2
allocation rather than an order 4 allocation on x86_64. This
addresses reports of page allocation failures:
http://marc.info/?l=selinux&m=126757230625867&w=2https://bugzilla.redhat.com/show_bug.cgi?id=570433
Reported-by: Russell Coker <russell@coker.com.au>
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Several places strings tables are used that should be declared
const.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: James Morris <jmorris@namei.org>
skbuff.h is already included by netlink.h, so remove it.
Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Signed-off-by: James Morris <jmorris@namei.org>
slab.h is unused in symtab.c, so remove it.
Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Signed-off-by: James Morris <jmorris@namei.org>
list.h is unused in netlink.c, so remove it.
Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Signed-off-by: James Morris <jmorris@namei.org>
Make selinux_kernel_create_files_as() return an error when it gets one, rather
than unconditionally returning 0.
Without this, cachefiles doesn't return an error if the SELinux policy doesn't
let it create files with the label of the directory at the base of the cache.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
This fixes corrupted CIPSO packets when SELinux categories greater than 127
are used. The bug occured on the second (and later) loops through the
while; the inner for loop through the ebitmap->maps array used the same
index as the NetLabel catmap->bitmap array, even though the NetLabel bitmap
is twice as long as the SELinux bitmap.
Signed-off-by: Joshua Roys <joshua.roys@gtri.gatech.edu>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Enhance the security framework to support resetting the active security
module. This eliminates the need for direct use of the security_ops and
default_security_ops variables outside of security.c, so make security_ops
and default_security_ops static. Also remove the secondary_ops variable as
a cleanup since there is no use for that. secondary_ops was originally used by
SELinux to call the "secondary" security module (capability or dummy),
but that was replaced by direct calls to capability and the only
remaining use is to save and restore the original security ops pointer
value if SELinux is disabled by early userspace based on /etc/selinux/config.
Further, if we support this directly in the security framework, then we can
just use &default_security_ops for this purpose since that is now available.
Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
This patch revert the commit of 7d52a155e3
which removed a part of type_attribute_bounds_av as a dead code.
However, at that time, we didn't find out the target side boundary allows
to handle some of pseudo /proc/<pid>/* entries with its process's security
context well.
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
--
security/selinux/ss/services.c | 43 ++++++++++++++++++++++++++++++++++++---
1 files changed, 39 insertions(+), 4 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
Fix a couple of sparse warnings for callers of
context_struct_to_string, which takes a *u32, not an *int.
These cases are harmless as the values are not used.
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
In sel_make_bools, kernel allocates memory for bool_pending_names[i]
with security_get_bools. So if we just free bool_pending_names, those
memories for bool_pending_names[i] will be leaked.
This patch resolves dozens of following kmemleak report after resuming
from suspend:
unreferenced object 0xffff88022e4c7380 (size 32):
comm "init", pid 1, jiffies 4294677173
backtrace:
[<ffffffff810f76b5>] create_object+0x1a2/0x2a9
[<ffffffff810f78bb>] kmemleak_alloc+0x26/0x4b
[<ffffffff810ef3eb>] __kmalloc+0x18f/0x1b8
[<ffffffff811cd511>] security_get_bools+0xd7/0x16f
[<ffffffff811c48c0>] sel_write_load+0x12e/0x62b
[<ffffffff810f9a39>] vfs_write+0xae/0x10b
[<ffffffff810f9b56>] sys_write+0x4a/0x6e
[<ffffffff81011b82>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Right now the syslog "type" action are just raw numbers which makes
the source difficult to follow. This patch replaces the raw numbers
with defined constants for some level of sanity.
Signed-off-by: Kees Cook <kees.cook@canonical.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
This allows the LSM to distinguish between syslog functions originating
from /proc/kmsg access and direct syscalls. By default, the commoncaps
will now no longer require CAP_SYS_ADMIN to read an opened /proc/kmsg
file descriptor. For example the kernel syslog reader can now drop
privileges after opening /proc/kmsg, instead of staying privileged with
CAP_SYS_ADMIN. MAC systems that implement security_syslog have unchanged
behavior.
Signed-off-by: Kees Cook <kees.cook@canonical.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
Allow runtime switching between different policy types (e.g. from a MLS/MCS
policy to a non-MLS/non-MCS policy or viceversa).
Signed-off-by: Guido Trentalancia <guido@trentalancia.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Always load the initial SIDs, even in the case of a policy
reload and not just at the initial policy load. This comes
particularly handy after the introduction of a recent
patch for enabling runtime switching between different
policy types, although this patch is in theory independent
from that feature.
Signed-off-by: Guido Trentalancia <guido@trentalancia.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
This patch removes dead code in type_attribute_bounds_av().
Due to the historical reason, the type boundary feature is delivered
from hierarchical types in libsepol, it has supported boundary features
both of subject type (domain; in most cases) and target type.
However, we don't have any actual use cases in bounded target types,
and it tended to make conceptual confusion.
So, this patch removes the dead code to apply boundary checks on the
target types. I makes clear the TYPEBOUNDS restricts privileges of
a certain domain bounded to any other domain.
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
--
security/selinux/ss/services.c | 43 +++------------------------------------
1 files changed, 4 insertions(+), 39 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
Per https://bugzilla.redhat.com/show_bug.cgi?id=548145
there are sufficient range transition rules in modern (Fedora) policy to
make mls_compute_sid a significant factor on the shmem file setup path
due to the length of the range_tr list. Replace the simple range_tr
list with a hashtab inside the security server to help mitigate this
problem.
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
If allow_unknown==deny, SELinux treats an undefined kernel security
class as an error condition rather than as a typical permission denial
and thus does not allow permissions on undefined classes even when in
permissive mode. Change the SELinux logic so that this case is handled
as a typical permission denial, subject to the usual permissive mode and
permissive domain handling.
Also drop the 'requested' argument from security_compute_av() and
helpers as it is a legacy of the original security server interface and
is unused.
Changes:
- Handle permissive domains consistently by moving up the test for a
permissive domain.
- Make security_compute_av_user() consistent with security_compute_av();
the only difference now is that security_compute_av() performs mapping
between the kernel-private class and permission indices and the policy
values. In the userspace case, this mapping is handled by libselinux.
- Moved avd_init inside the policy lock.
Based in part on a patch by Paul Moore <paul.moore@hp.com>.
Reported-by: Andrew Worsley <amworsley@gmail.com>
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Don't pass current RLIMIT_RTTIME to update_rlimit_cpu() in
selinux_bprm_committing_creds, since update_rlimit_cpu expects
RLIMIT_CPU limit.
Use proper rlim[RLIMIT_CPU].rlim_cur instead to fix that.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: David Howells <dhowells@redhat.com>
The last return is unreachable, remove the 'return'
in default, let it fall through.
Signed-off-by: WANG Cong <amwang@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
The size argument to kcalloc should be the size of desired structure,
not the pointer to it.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@expression@
expression *x;
@@
x =
<+...
-sizeof(x)
+sizeof(*x)
...+>// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Historically we've seen cases where permissions are requested for classes
where they do not exist. In particular we have seen CIFS forget to set
i_mode to indicate it is a directory so when we later check something like
remove_name we have problems since it wasn't defined in tclass file. This
used to result in a avc which included the permission 0x2000 or something.
Currently the kernel will deny the operations (good thing) but will not
print ANY information (bad thing). First the auditdeny field is no
extended to include unknown permissions. After that is fixed the logic in
avc_dump_query to output this information isn't right since it will remove
the permission from the av and print the phrase "<NULL>". This takes us
back to the behavior before the classmap rewrite.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
For SELinux to do better filtering in userspace we send the name of the
module along with the AVC denial when a program is denied module_request.
Example output:
type=SYSCALL msg=audit(11/03/2009 10:59:43.510:9) : arch=x86_64 syscall=write success=yes exit=2 a0=3 a1=7fc28c0d56c0 a2=2 a3=7fffca0d7440 items=0 ppid=1727 pid=1729 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=rpc.nfsd exe=/usr/sbin/rpc.nfsd subj=system_u:system_r:nfsd_t:s0 key=(null)
type=AVC msg=audit(11/03/2009 10:59:43.510:9) : avc: denied { module_request } for pid=1729 comm=rpc.nfsd kmod="net-pf-10" scontext=system_u:system_r:nfsd_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=system
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
The SELinux dynamic class work in c6d3aaa4e3
creates a number of dynamic header files and scripts. Add .gitignore files
so git doesn't complain about these.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Ensure that we release the policy read lock on all exit paths from
security_compute_av.
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Drop remapping of netlink classes and bypass of permission checking
based on netlink message type for policy version < 18. This removes
compatibility code introduced when the original single netlink
security class used for all netlink sockets was split into
finer-grained netlink classes based on netlink protocol and when
permission checking was added based on netlink message type in Linux
2.6.8. The only known distribution that shipped with SELinux and
policy < 18 was Fedora Core 2, which was EOL'd on 2005-04-11.
Given that the remapping code was never updated to address the
addition of newer netlink classes, that the corresponding userland
support was dropped in 2005, and that the assumptions made by the
remapping code about the fixed ordering among netlink classes in the
policy may be violated in the future due to the dynamic class/perm
discovery support, we should drop this compatibility code now.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Add a simple utility (scripts/selinux/genheaders) and invoke it to
generate the kernel-private class and permission indices in flask.h
and av_permissions.h automatically during the kernel build from the
security class mapping definitions in classmap.h. Adding new kernel
classes and permissions can then be done just by adding them to classmap.h.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Modify SELinux to dynamically discover class and permission values
upon policy load, based on the dynamic object class/perm discovery
logic from libselinux. A mapping is created between kernel-private
class and permission indices used outside the security server and the
policy values used within the security server.
The mappings are only applied upon kernel-internal computations;
similar mappings for the private indices of userspace object managers
is handled on a per-object manager basis by the userspace AVC. The
interfaces for compute_av and transition_sid are split for kernel
vs. userspace; the userspace functions are distinguished by a _user
suffix.
The kernel-private class indices are no longer tied to the policy
values and thus do not need to skip indices for userspace classes;
thus the kernel class index values are compressed. The flask.h
definitions were regenerated by deleting the userspace classes from
refpolicy's definitions and then regenerating the headers. Going
forward, we can just maintain the flask.h, av_permissions.h, and
classmap.h definitions separately from policy as they are no longer
tied to the policy values. The next patch introduces a utility to
automate generation of flask.h and av_permissions.h from the
classmap.h definitions.
The older kernel class and permission string tables are removed and
replaced by a single security class mapping table that is walked at
policy load to generate the mapping. The old kernel class validation
logic is completely replaced by the mapping logic.
The handle unknown logic is reworked. reject_unknown=1 is handled
when the mappings are computed at policy load time, similar to the old
handling by the class validation logic. allow_unknown=1 is handled
when computing and mapping decisions - if the permission was not able
to be mapped (i.e. undefined, mapped to zero), then it is
automatically added to the allowed vector. If the class was not able
to be mapped (i.e. undefined, mapped to zero), then all permissions
are allowed for it if allow_unknown=1.
avc_audit leverages the new security class mapping table to lookup the
class and permission names from the kernel-private indices.
The mdp program is updated to use the new table when generating the
class definitions and allow rules for a minimal boot policy for the
kernel. It should be noted that this policy will not include any
userspace classes, nor will its policy index values for the kernel
classes correspond with the ones in refpolicy (they will instead match
the kernel-private indices).
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
This patch resets the security_ops to the secondary_ops before it flushes
the avc. It's still possible that a task on another processor could have
already passed the security_ops dereference and be executing an selinux hook
function which would add a new avc entry. That entry would still not be
freed. This should however help to reduce the number of needless avcs the
kernel has when selinux is disabled at run time. There is no wasted
memory if selinux is disabled on the command line or not compiled.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Ratan Nalumasu reported that in a process with many threads doing
unnecessary wakeups. Every waiting thread in the process wakes up to loop
through the children and see that the only ones it cares about are still
not ready.
Now that we have struct wait_opts we can change do_wait/__wake_up_parent
to use filtered wakeups.
We can make child_wait_callback() more clever later, right now it only
checks eligible_child().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ratan Nalumasu <rnalumasu@gmail.com>
Cc: Vitaly Mayatskikh <vmayatsk@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Tested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Before SELinux is disabled at boot it can create AVC entries. This patch
will flush those entries before disabling SELinux.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Move the avc_cache flushing into it's own function so it can be reused when
disabling SELinux.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
__validate_process_creds should check if selinux is actually enabled before
running tests on the selinux portion of the credentials struct.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
This patch adds a setxattr handler to the file, directory, and symlink
inode_operations structures for sysfs. The patch uses hooks introduced in the
previous patch to handle the getting and setting of security information for
the sysfs inodes. As was suggested by Eric Biederman the struct iattr in the
sysfs_dirent structure has been replaced by a structure which contains the
iattr, secdata and secdata length to allow the changes to persist in the event
that the inode representing the sysfs_dirent is evicted. Because sysfs only
stores this information when a change is made all the optional data is moved
into one dynamically allocated field.
This patch addresses an issue where SELinux was denying virtd access to the PCI
configuration entries in sysfs. The lack of setxattr handlers for sysfs
required that a single label be assigned to all entries in sysfs. Granting virtd
access to every entry in sysfs is not an acceptable solution so fine grained
labeling of sysfs is required such that individual entries can be labeled
appropriately.
[sds: Fixed compile-time warnings, coding style, and setting of inode security init flags.]
Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
This patch introduces three new hooks. The inode_getsecctx hook is used to get
all relevant information from an LSM about an inode. The inode_setsecctx is
used to set both the in-core and on-disk state for the inode based on a context
derived from inode_getsecctx.The final hook inode_notifysecctx will notify the
LSM of a change for the in-core state of the inode in question. These hooks are
for use in the labeled NFS code and addresses concerns of how to set security
on an inode in a multi-xattr LSM. For historical reasons Stephen Smalley's
explanation of the reason for these hooks is pasted below.
Quote Stephen Smalley
inode_setsecctx: Change the security context of an inode. Updates the
in core security context managed by the security module and invokes the
fs code as needed (via __vfs_setxattr_noperm) to update any backing
xattrs that represent the context. Example usage: NFS server invokes
this hook to change the security context in its incore inode and on the
backing file system to a value provided by the client on a SETATTR
operation.
inode_notifysecctx: Notify the security module of what the security
context of an inode should be. Initializes the incore security context
managed by the security module for this inode. Example usage: NFS
client invokes this hook to initialize the security context in its
incore inode to the value provided by the server for the file when the
server returned the file's attributes to the client.
Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add a keyctl to install a process's session keyring onto its parent. This
replaces the parent's session keyring. Because the COW credential code does
not permit one process to change another process's credentials directly, the
change is deferred until userspace next starts executing again. Normally this
will be after a wait*() syscall.
To support this, three new security hooks have been provided:
cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
the blank security creds and key_session_to_parent() - which asks the LSM if
the process may replace its parent's session keyring.
The replacement may only happen if the process has the same ownership details
as its parent, and the process has LINK permission on the session keyring, and
the session keyring is owned by the process, and the LSM permits it.
Note that this requires alteration to each architecture's notify_resume path.
This has been done for all arches barring blackfin, m68k* and xtensa, all of
which need assembly alteration to support TIF_NOTIFY_RESUME. This allows the
replacement to be performed at the point the parent process resumes userspace
execution.
This allows the userspace AFS pioctl emulation to fully emulate newpag() and
the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
alter the parent process's PAG membership. However, since kAFS doesn't use
PAGs per se, but rather dumps the keys into the session keyring, the session
keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
the newpag flag.
This can be tested with the following program:
#include <stdio.h>
#include <stdlib.h>
#include <keyutils.h>
#define KEYCTL_SESSION_TO_PARENT 18
#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)
int main(int argc, char **argv)
{
key_serial_t keyring, key;
long ret;
keyring = keyctl_join_session_keyring(argv[1]);
OSERROR(keyring, "keyctl_join_session_keyring");
key = add_key("user", "a", "b", 1, keyring);
OSERROR(key, "add_key");
ret = keyctl(KEYCTL_SESSION_TO_PARENT);
OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");
return 0;
}
Compiled and linked with -lkeyutils, you should see something like:
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
355907932 --alswrv 4043 -1 \_ keyring: _uid.4043
[dhowells@andromeda ~]$ /tmp/newpag
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
1055658746 --alswrv 4043 4043 \_ user: a
[dhowells@andromeda ~]$ /tmp/newpag hello
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: hello
340417692 --alswrv 4043 4043 \_ user: a
Where the test program creates a new session keyring, sticks a user key named
'a' into it and then installs it on its parent.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking
for credential management. The additional code keeps track of the number of
pointers from task_structs to any given cred struct, and checks to see that
this number never exceeds the usage count of the cred struct (which includes
all references, not just those from task_structs).
Furthermore, if SELinux is enabled, the code also checks that the security
pointer in the cred struct is never seen to be invalid.
This attempts to catch the bug whereby inode_has_perm() faults in an nfsd
kernel thread on seeing cred->security be a NULL pointer (it appears that the
credential struct has been previously released):
http://www.kerneloops.org/oops.php?number=252883
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add support for the new TUN LSM hooks: security_tun_dev_create(),
security_tun_dev_post_create() and security_tun_dev_attach(). This includes
the addition of a new object class, tun_socket, which represents the socks
associated with TUN devices. The _tun_dev_create() and _tun_dev_post_create()
hooks are fairly similar to the standard socket functions but _tun_dev_attach()
is a bit special. The _tun_dev_attach() is unique because it involves a
domain attaching to an existing TUN device and its associated tun_socket
object, an operation which does not exist with standard sockets and most
closely resembles a relabel operation.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@parisplace.org>
Signed-off-by: James Morris <jmorris@namei.org>
As suggested by OGAWA Hirofumi in thread:
http://lkml.org/lkml/2009/8/7/132, we should let selinux_inode_setattr()
to match our ATTR_* rules. ATTR_FORCE should not force things like
ATTR_SIZE.
[hirofumi@mail.parknet.co.jp: tweaks]
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: Eugene Teo <eteo@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable. This patch causes SELinux to
ignore the tunable and instead use a seperate Kconfig option specific to how
much space the LSM should protect.
The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.
This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Currently SELinux does not check CAP_SYS_RAWIO in the file_mmap hook. This
means there is no DAC check on the ability to mmap low addresses in the
memory space. This function adds the DAC check for CAP_SYS_RAWIO while
maintaining the selinux check on mmap_zero. This means that processes
which need to mmap low memory will need CAP_SYS_RAWIO and mmap_zero but will
NOT need the SELinux sys_rawio capability.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Convert avc_audit in security/selinux/avc.c to use lsm_audit.h,
for better maintainability.
- changed selinux to use common_audit_data instead of
avc_audit_data
- eliminated code in avc.c and used code from lsm_audit.h instead.
Had to add a LSM_AUDIT_NO_AUDIT to lsm_audit.h so that avc_audit
can call common_lsm_audit and do the pre and post callbacks without
doing the actual dump. This makes it so that the patched version
behaves the same way as the unpatched version.
Also added a denied field to the selinux_audit_data private space,
once again to make it so that the patched version behaves like the
unpatched.
I've tested and confirmed that AVCs look the same before and after
this patch.
Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
This patch adds a new selinux hook so SELinux can arbitrate if a given
process should be allowed to trigger a request for the kernel to try to
load a module. This is a different operation than a process trying to load
a module itself, which is already protected by CAP_SYS_MODULE.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Fix memory leakage in /security/selinux/hooks.c
The buffer always needs to be freed here; we either error
out or allocate more memory.
Reported-by: iceberg <strakh@ispras.ru>
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable. This patch causes SELinux to
ignore the tunable and instead use a seperate Kconfig option specific to how
much space the LSM should protect.
The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.
This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Currently SELinux does not check CAP_SYS_RAWIO in the file_mmap hook. This
means there is no DAC check on the ability to mmap low addresses in the
memory space. This function adds the DAC check for CAP_SYS_RAWIO while
maintaining the selinux check on mmap_zero. This means that processes
which need to mmap low memory will need CAP_SYS_RAWIO and mmap_zero but will
NOT need the SELinux sys_rawio capability.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
- is_single_threaded(task) is not safe unless task == current,
we can't use task->signal or task->mm.
- it doesn't make sense unless task == current, the task can
fork right after the check.
Rename it to current_is_single_threaded() and kill the argument.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Convert avc_audit in security/selinux/avc.c to use lsm_audit.h,
for better maintainability and for less code duplication.
- changed selinux to use common_audit_data instead of
avc_audit_data
- eliminated code in avc.c and used code from lsm_audit.h instead.
I have tested to make sure that the avcs look the same before and
after this patch.
Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Added a call to free the avc_node_cache when inside selinux_disable because
it should not waste resources allocated during avc_init if SELinux is disabled
and the cache will never be used.
Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
The ->ptrace_may_access() methods are named confusingly - the real
ptrace_may_access() returns a bool, while these security checks have
a retval convention.
Rename it to ptrace_access_check, to reduce the confusion factor.
[ Impact: cleanup, no code changed ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: James Morris <jmorris@namei.org>
Restore the optimization to skip revalidation in selinux_file_permission
if nothing has changed since the dentry_open checks, accidentally removed by
389fb800. Also remove redundant test from selinux_revalidate_file_permission.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
The attached patch adds support to generate audit messages on two cases.
The first one is a case when a multi-thread process tries to switch its
performing security context using setcon(3), but new security context is
not bounded by the old one.
type=SELINUX_ERR msg=audit(1245311998.599:17): \
op=security_bounded_transition result=denied \
oldcontext=system_u:system_r:httpd_t:s0 \
newcontext=system_u:system_r:guest_webapp_t:s0
The other one is a case when security_compute_av() masked any permissions
due to the type boundary violation.
type=SELINUX_ERR msg=audit(1245312836.035:32): \
op=security_compute_av reason=bounds \
scontext=system_u:object_r:user_webapp_t:s0 \
tcontext=system_u:object_r:shadow_t:s0:c0 \
tclass=file perms=getattr,open
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
It is a cleanup patch to cut down a line within 80 columns.
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
--
security/selinux/ss/services.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
Define three accessors to get/set dst attached to a skb
struct dst_entry *skb_dst(const struct sk_buff *skb)
void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;
Delete skb->dst field
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Audit trees defined 2 new netlink messages but the netlink mapping tables for
selinux permissions were not set up. This patch maps these 2 new operations
to AUDIT_WRITE.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
On Tue, 2009-05-19 at 00:05 -0400, Eamon Walsh wrote:
> Recent versions of coreutils have bumped the read buffer size from 4K to
> 32K in several of the utilities.
>
> This means that "cat /selinux/booleans/xserver_object_manager" no longer
> works, it returns "Invalid argument" on F11. getsebool works fine.
>
> sel_read_bool has a check for "count > PAGE_SIZE" that doesn't seem to
> be present in the other read functions. Maybe it could be removed?
Yes, that check is obsoleted by the conversion of those functions to
using simple_read_from_buffer(), which will reduce count if necessary to
what is available in the buffer.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
The selinuxfs superblock magic is used inside the IMA code, but is being
defined in two places and could someday get out of sync. This patch moves the
declaration into magic.h so it is only done once.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
The CRED patch incorrectly converted the SELinux send_sigiotask hook to
use the current task SID rather than the target task SID in its
permission check, yielding the wrong permission check. This fixes the
hook function. Detected by the ltp selinux testsuite and confirmed to
correct the test failure.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
We shouldn't worry about the tracer if current is ptraced, exec() must not
succeed if the tracer has no rights to trace this task after cred changing.
But we should notify ->real_parent which is, well, real parent.
Also, we don't need _irq to take tasklist, and we don't need parent's
->siglock to wake_up_interruptible(real_parent->signal->wait_chldexit).
Since we hold tasklist, real_parent->signal must be stable. Otherwise
spin_lock(siglock) is not safe too and can't help anyway.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Don't flush inherited SIGKILL during execve() in SELinux's post cred commit
hook. This isn't really a security problem: if the SIGKILL came before the
credentials were changed, then we were right to receive it at the time, and
should honour it; if it came after the creds were changed, then we definitely
should honour it; and in any case, all that will happen is that the process
will be scrapped before it ever returns to userspace.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
We are still calling secondary_ops->sysctl even though the capabilities
module does not define a sysctl operation.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
This patch enables applications to handle permissive domain correctly.
Since the v2.6.26 kernel, SELinux has supported an idea of permissive
domain which allows certain processes to work as if permissive mode,
even if the global setting is enforcing mode.
However, we don't have an application program interface to inform
what domains are permissive one, and what domains are not.
It means applications focuses on SELinux (XACE/SELinux, SE-PostgreSQL
and so on) cannot handle permissive domain correctly.
This patch add the sixth field (flags) on the reply of the /selinux/access
interface which is used to make an access control decision from userspace.
If the first bit of the flags field is positive, it means the required
access control decision is on permissive domain, so application should
allow any required actions, as the kernel doing.
This patch also has a side benefit. The av_decision.flags is set at
context_struct_compute_av(). It enables to check required permissions
without read_lock(&policy_rwlock).
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
--
security/selinux/avc.c | 2 +-
security/selinux/include/security.h | 4 +++-
security/selinux/selinuxfs.c | 4 ++--
security/selinux/ss/services.c | 30 +++++-------------------------
4 files changed, 11 insertions(+), 29 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
The SELinux "compat_net" is marked as deprecated, the time has come to
finally remove it from the kernel. Further code simplifications are
likely in the future, but this patch was intended to be a simple,
straight-up removal of the compat_net code.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The current NetLabel/SELinux behavior for incoming TCP connections works but
only through a series of happy coincidences that rely on the limited nature of
standard CIPSO (only able to convey MLS attributes) and the write equality
imposed by the SELinux MLS constraints. The problem is that network sockets
created as the result of an incoming TCP connection were not on-the-wire
labeled based on the security attributes of the parent socket but rather based
on the wire label of the remote peer. The issue had to do with how IP options
were managed as part of the network stack and where the LSM hooks were in
relation to the code which set the IP options on these newly created child
sockets. While NetLabel/SELinux did correctly set the socket's on-the-wire
label it was promptly cleared by the network stack and reset based on the IP
options of the remote peer.
This patch, in conjunction with a prior patch that adjusted the LSM hook
locations, works to set the correct on-the-wire label format for new incoming
connections through the security_inet_conn_request() hook. Besides the
correct behavior there are many advantages to this change, the most significant
is that all of the NetLabel socket labeling code in SELinux now lives in hooks
which can return error codes to the core stack which allows us to finally get
ride of the selinux_netlbl_inode_permission() logic which greatly simplfies
the NetLabel/SELinux glue code. In the process of developing this patch I
also ran into a small handful of AF_INET6 cleanliness issues that have been
fixed which should make the code safer and easier to extend in the future.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
Drop the printk message when an inode is found without an associated
dentry. This should only happen when userspace can't be accessing those
inodes and those labels will get set correctly on the next d_instantiate.
Thus there is no reason to send this message.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
New selinux permission to separate the ability to turn on tty auditing from
the ability to set audit rules.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
When I did open permissions I didn't think any sockets would have an open.
Turns out AF_UNIX sockets can have an open when they are bound to the
filesystem namespace. This patch adds a new SOCK_FILE__OPEN permission.
It's safe to add this as the open perms are already predicated on
capabilities and capabilities means we have unknown perm handling so
systems should be as backwards compatible as the policy wants them to
be.
https://bugzilla.redhat.com/show_bug.cgi?id=475224
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Rick McNeal from LSI identified a panic in selinux_netlbl_inode_permission()
caused by a certain sequence of SUNRPC operations. The problem appears to be
due to the lack of NULL pointer checking in the function; this patch adds the
pointer checks so the function will exit safely in the cases where the socket
is not completely initialized.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
At some point we (okay, I) managed to break the ability for users to use the
setsockopt() syscall to set IPv4 options when NetLabel was not active on the
socket in question. The problem was noticed by someone trying to use the
"-R" (record route) option of ping:
# ping -R 10.0.0.1
ping: record route: No message of desired type
The solution is relatively simple, we catch the unlabeled socket case and
clear the error code, allowing the operation to succeed. Please note that we
still deny users the ability to override IPv4 options on socket's which have
NetLabel labeling active; this is done to ensure the labeling remains intact.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
We do not need O(1) access to the tail of the avc cache lists and so we are
wasting lots of space using struct list_head instead of struct hlist_head.
This patch converts the avc cache to use hlists in which there is a single
pointer from the head which saves us about 4k of global memory.
Resulted in about a 1.5% decrease in time spent in avc_has_perm_noaudit based
on oprofile sampling of tbench. Although likely within the noise....
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The code making use of struct avc_cache was not easy to read thanks to liberal
use of &avc_cache.{slots_lock,slots}[hvalue] throughout. This patch simply
creates local pointers and uses those instead of the long global names.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
It appears there was an intention to have the security server only decide
certain permissions and leave other for later as some sort of a portential
performance win. We are currently always deciding all 32 bits of
permissions and this is a useless couple of branches and wasted space.
This patch completely drops the av.decided concept.
This in a 17% reduction in the time spent in avc_has_perm_noaudit
based on oprofile sampling of a tbench benchmark.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
we are often needlessly jumping through hoops when it comes to avd
entries in avc_has_perm_noaudit and we have extra initialization and memcpy
which are just wasting performance. Try to clean the function up a bit.
This patch resulted in a 13% drop in time spent in avc_has_perm_noaudit in my
oprofile sampling of a tbench benchmark.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Currently SELinux code has an atomic which was intended to track how many
times an avc entry was used and to evict entries when they haven't been
used recently. Instead we never let this atomic get above 1 and evict when
it is first checked for eviction since it hits zero. This is a total waste
of time so I'm completely dropping ae.used.
This change resulted in about a 3% faster avc_has_perm_noaudit when running
oprofile against a tbench benchmark.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
The avc update node callbacks do not check the seqno of the caller with the
seqno of the node found. It is possible that a policy change could happen
(although almost impossibly unlikely) in which a permissive or
permissive_domain decision is not valid for the entry found. Simply pass
and check that the seqno of the caller and the seqno of the node found
match.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>