Commit Graph

1060 Commits

Author SHA1 Message Date
Linus Torvalds 0342cbcfce Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rcu: Fix wrong check in list_splice_init_rcu()
  net,rcu: Convert call_rcu(xt_rateest_free_rcu) to kfree_rcu()
  sysctl,rcu: Convert call_rcu(free_head) to kfree
  vmalloc,rcu: Convert call_rcu(rcu_free_vb) to kfree_rcu()
  vmalloc,rcu: Convert call_rcu(rcu_free_va) to kfree_rcu()
  ipc,rcu: Convert call_rcu(ipc_immediate_free) to kfree_rcu()
  ipc,rcu: Convert call_rcu(free_un) to kfree_rcu()
  security,rcu: Convert call_rcu(sel_netport_free) to kfree_rcu()
  security,rcu: Convert call_rcu(sel_netnode_free) to kfree_rcu()
  ia64,rcu: Convert call_rcu(sn_irq_info_free) to kfree_rcu()
  block,rcu: Convert call_rcu(disk_free_ptbl_rcu_cb) to kfree_rcu()
  scsi,rcu: Convert call_rcu(fc_rport_free_rcu) to kfree_rcu()
  audit_tree,rcu: Convert call_rcu(__put_tree) to kfree_rcu()
  security,rcu: Convert call_rcu(whitelist_item_free) to kfree_rcu()
  md,rcu: Convert call_rcu(free_conf) to kfree_rcu()
2011-07-22 16:44:08 -07:00
Linus Torvalds 8209f53d79 Merge branch 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc
* 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: (39 commits)
  ptrace: do_wait(traced_leader_killed_by_mt_exec) can block forever
  ptrace: fix ptrace_signal() && STOP_DEQUEUED interaction
  connector: add an event for monitoring process tracers
  ptrace: dont send SIGSTOP on auto-attach if PT_SEIZED
  ptrace: mv send-SIGSTOP from do_fork() to ptrace_init_task()
  ptrace_init_task: initialize child->jobctl explicitly
  has_stopped_jobs: s/task_is_stopped/SIGNAL_STOP_STOPPED/
  ptrace: make former thread ID available via PTRACE_GETEVENTMSG after PTRACE_EVENT_EXEC stop
  ptrace: wait_consider_task: s/same_thread_group/ptrace_reparented/
  ptrace: kill real_parent_is_ptracer() in in favor of ptrace_reparented()
  ptrace: ptrace_reparented() should check same_thread_group()
  redefine thread_group_leader() as exit_signal >= 0
  do not change dead_task->exit_signal
  kill task_detached()
  reparent_leader: check EXIT_DEAD instead of task_detached()
  make do_notify_parent() __must_check, update the callers
  __ptrace_detach: avoid task_detached(), check do_notify_parent()
  kill tracehook_notify_death()
  make do_notify_parent() return bool
  ptrace: s/tracehook_tracer_task()/ptrace_parent()/
  ...
2011-07-22 15:06:50 -07:00
Lai Jiangshan 449a68cc65 security,rcu: Convert call_rcu(sel_netport_free) to kfree_rcu()
The rcu callback sel_netport_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(sel_netport_free).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Paris <eparis@parisplace.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-07-20 14:10:15 -07:00
Lai Jiangshan 9801c60e99 security,rcu: Convert call_rcu(sel_netnode_free) to kfree_rcu()
The rcu callback sel_netnode_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(sel_netnode_free).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Paris <eparis@parisplace.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-07-20 14:10:14 -07:00
Al Viro cf1dd1dae8 selinux: don't transliterate MAY_NOT_BLOCK to IPERM_FLAG_RCU
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:27 -04:00
Al Viro e74f71eb78 ->permission() sanitizing: don't pass flags to ->inode_permission()
pass that via mask instead.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:26 -04:00
Tejun Heo 06d984737b ptrace: s/tracehook_tracer_task()/ptrace_parent()/
tracehook.h is on the way out.  Rename tracehook_tracer_task() to
ptrace_parent() and move it from tracehook.h to ptrace.h.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:29 +02:00
James Morris 82b88bb24e Merge branch 'for-linus' of git://git.infradead.org/users/eparis/selinux into for-linus 2011-06-15 09:41:48 +10:00
Roy.Li ded509880f SELinux: skip file_name_trans_write() when policy downgraded.
When policy version is less than POLICYDB_VERSION_FILENAME_TRANS,
skip file_name_trans_write().

Signed-off-by: Roy.Li <rongqing.li@windriver.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-06-14 12:58:51 -04:00
Linus Torvalds 95f4efb2d7 selinux: simplify and clean up inode_has_perm()
This is a rather hot function that is called with a potentially NULL
"struct common_audit_data" pointer argument.  And in that case it has to
provide and initialize its own dummy common_audit_data structure.

However, all the _common_ cases already pass it a real audit-data
structure, so that uncommon NULL case not only creates a silly run-time
test, more importantly it causes that function to have a big stack frame
for the dummy variable that isn't even used in the common case!

So get rid of that stupid run-time behavior, and make the (few)
functions that currently call with a NULL pointer just call a new helper
function instead (naturally called inode_has_perm_noapd(), since it has
no adp argument).

This makes the run-time test be a static code generation issue instead,
and allows for a much denser stack since none of the common callers need
the dummy structure.  And a denser stack not only means less stack space
usage, it means better cache behavior.  So we have a win-win-win from
this simplification: less code executed, smaller stack footprint, and
better cache behavior.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-08 15:11:56 -07:00
Linus Torvalds f01e1af445 selinux: don't pass in NULL avd to avc_has_perm_noaudit
Right now security_get_user_sids() will pass in a NULL avd pointer to
avc_has_perm_noaudit(), which then forces that function to have a dummy
entry for that case and just generally test it.

Don't do it.  The normal callers all pass a real avd pointer, and this
helper function is incredibly hot.  So don't make avc_has_perm_noaudit()
do conditional stuff that isn't needed for the common case.

This also avoids some duplicated stack space.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 18:13:57 -07:00
Kohei Kaigai 0f7e4c33eb selinux: fix case of names with whitespace/multibytes on /selinux/create
I submit the patch again, according to patch submission convension.

This patch enables to accept percent-encoded object names as forth
argument of /selinux/create interface to avoid possible bugs when we
give an object name including whitespace or multibutes.

E.g) if and when a userspace object manager tries to create a new object
 named as "resolve.conf but fake", it shall give this name as the forth
 argument of the /selinux/create. But sscanf() logic in kernel space
 fetches only the part earlier than the first whitespace.
 In this case, selinux may unexpectedly answer a default security context
 configured to "resolve.conf", but it is bug.

Although I could not test this patch on named TYPE_TRANSITION rules
actually, But debug printk() message seems to me the logic works
correctly.
I assume the libselinux provides an interface to apply this logic
transparently, so nothing shall not be changed from the viewpoint of
application.

Signed-off-by: KaiGai Kohei <kohei.kaigai@emea.nec.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-05-26 17:20:53 -04:00
Eric Paris ea77f7a2e8 Merge commit 'v2.6.39' into 20110526
Conflicts:
	lib/flex_array.c
	security/selinux/avc.c
	security/selinux/hooks.c
	security/selinux/ss/policydb.c
	security/smack/smack_lsm.c
2011-05-26 17:20:14 -04:00
James Morris b7b57551bb Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into for-linus
Conflicts:
	lib/flex_array.c
	security/selinux/avc.c
	security/selinux/hooks.c
	security/selinux/ss/policydb.c
	security/smack/smack_lsm.c

Manually resolve conflicts.

Signed-off-by: James Morris <jmorris@namei.org>
2011-05-24 23:20:19 +10:00
Linus Torvalds 57d19e80f4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  b43: fix comment typo reqest -> request
  Haavard Skinnemoen has left Atmel
  cris: typo in mach-fs Makefile
  Kconfig: fix copy/paste-ism for dell-wmi-aio driver
  doc: timers-howto: fix a typo ("unsgined")
  perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
  md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
  treewide: fix a few typos in comments
  regulator: change debug statement be consistent with the style of the rest
  Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
  audit: acquire creds selectively to reduce atomic op overhead
  rtlwifi: don't touch with treewide double semicolon removal
  treewide: cleanup continuations and remove logging message whitespace
  ath9k_hw: don't touch with treewide double semicolon removal
  include/linux/leds-regulator.h: fix syntax in example code
  tty: fix typo in descripton of tty_termios_encode_baud_rate
  xtensa: remove obsolete BKL kernel option from defconfig
  m68k: fix comment typo 'occcured'
  arch:Kconfig.locks Remove unused config option.
  treewide: remove extra semicolons
  ...
2011-05-23 09:12:26 -07:00
Linus Torvalds 257313b2a8 selinux: avoid unnecessary avc cache stat hit count
There is no point in counting hits - we can calculate it from the number
of lookups and misses.

This makes the avc statistics a bit smaller, and makes the code
generation better too.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-19 21:22:53 -07:00
Linus Torvalds 044aea9b83 selinux: de-crapify avc cache stat code generation
You can turn off the avc cache stats, but distributions seem to not do
that (perhaps because several performance tuning how-to's talk about the
avc cache statistics).

Which is sad, because the code it generates is truly horrendous, with
the statistics update being sandwitched between get_cpu/put_cpu which in
turn causes preemption disables etc.  We're talking ten+ instructions
just to increment a per-cpu variable in some pretty hot code.

Fix the craziness by just using 'this_cpu_inc()' instead.  Suddenly we
only need a single 'inc' instruction to increment the statistics.  This
is quite noticeable in the incredibly hot avc_has_perm_noaudit()
function (which triggers all the statistics by virtue of doing an
avc_lookup() call).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-19 18:59:47 -07:00
Linus Torvalds eb04f2f04e Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits)
  Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"
  net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree
  batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu
  batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree()
  batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu
  net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu()
  net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu()
  net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu()
  perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu()
  perf,rcu: convert call_rcu(free_ctx) to kfree_rcu()
  net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(net_generic_release) to kfree_rcu()
  net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu()
  net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu()
  security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()
  net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu()
  net,rcu: convert call_rcu(xps_map_release) to kfree_rcu()
  net,rcu: convert call_rcu(rps_map_release) to kfree_rcu()
  ...
2011-05-19 18:14:34 -07:00
James Morris ca7d120008 Merge branch 'for-linus' of git://git.infradead.org/users/eparis/selinux into for-linus 2011-05-13 09:52:16 +10:00
Eric Paris 93826c092c SELinux: delete debugging printks from filename_trans rule processing
The filename_trans rule processing has some printk(KERN_ERR ) messages
which were intended as debug aids in creating the code but weren't removed
before it was submitted.  Remove them.

Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-05-12 16:02:42 -04:00
Greg Kroah-Hartman 7a627e3b9a SELINUX: add /sys/fs/selinux mount point to put selinuxfs
In the interest of keeping userspace from having to create new root
filesystems all the time, let's follow the lead of the other in-kernel
filesystems and provide a proper mount point for it in sysfs.

For selinuxfs, this mount point should be in /sys/fs/selinux/

Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Lennart Poettering <mzerqung@0pointer.de>
Cc: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[include kobject.h - Eric Paris]
[use selinuxfs_obj throughout - Eric Paris]
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-05-11 12:58:09 -04:00
Lai Jiangshan 690273fc70 security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()
The rcu callback sel_netif_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(sel_netif_free).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-07 22:51:05 -07:00
James Morris 6f23928454 Merge branch 'for-linus' of git://git.infradead.org/users/eparis/selinux into for-linus 2011-05-04 11:59:34 +10:00
Eric Paris 5d30b10bd6 flex_array: flex_array_prealloc takes a number of elements, not an end
Change flex_array_prealloc to take the number of elements for which space
should be allocated instead of the last (inclusive) element. Users
and documentation are updated accordingly.  flex_arrays got introduced before
they had users.  When folks started using it, they ended up needing a
different API than was coded up originally.  This swaps over to the API that
folks apparently need.

Based-on-patch-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Tested-by: Chris Richards <gizmo@giz-works.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: stable@kernel.org [2.6.38+]
2011-04-28 16:12:47 -04:00
Eric Paris cb1e922fa1 SELinux: pass last path component in may_create
New inodes are created in a two stage process.  We first will compute the
label on a new inode in security_inode_create() and check if the
operation is allowed.  We will then actually re-compute that same label and
apply it in security_inode_init_security().  The change to do new label
calculations based in part on the last component of the path name only
passed the path component information all the way down the
security_inode_init_security hook.  Down the security_inode_create hook the
path information did not make it past may_create.  Thus the two calculations
came up differently and the permissions check might not actually be against
the label that is created.  Pass and use the same information in both places
to harmonize the calculations and checks.

Reported-by: Dominick Grift <domg472@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-04-28 16:12:41 -04:00
Eric Paris 2875fa0083 SELinux: introduce path_has_perm
We currently have inode_has_perm and dentry_has_perm.  dentry_has_perm just
calls inode_has_perm with additional audit data.  But dentry_has_perm can
take either a dentry or a path.  Split those to make the code obvious and
to fix the previous problem where I thought dentry_has_perm always had a
valid dentry and mnt.

Signed-off-by: Eric Paris <eparis@redhat.com>
2011-04-28 16:09:59 -04:00
Eric Paris 5a3ea8782c flex_array: flex_array_prealloc takes a number of elements, not an end
Change flex_array_prealloc to take the number of elements for which space
should be allocated instead of the last (inclusive) element. Users
and documentation are updated accordingly.  flex_arrays got introduced before
they had users.  When folks started using it, they ended up needing a
different API than was coded up originally.  This swaps over to the API that
folks apparently need.

Based-on-patch-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Tested-by: Chris Richards <gizmo@giz-works.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: stable@kernel.org [2.6.38+]
2011-04-28 15:56:06 -04:00
Eric Paris 562abf6241 SELinux: pass last path component in may_create
New inodes are created in a two stage process.  We first will compute the
label on a new inode in security_inode_create() and check if the
operation is allowed.  We will then actually re-compute that same label and
apply it in security_inode_init_security().  The change to do new label
calculations based in part on the last component of the path name only
passed the path component information all the way down the
security_inode_init_security hook.  Down the security_inode_create hook the
path information did not make it past may_create.  Thus the two calculations
came up differently and the permissions check might not actually be against
the label that is created.  Pass and use the same information in both places
to harmonize the calculations and checks.

Reported-by: Dominick Grift <domg472@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-04-28 15:15:54 -04:00
Eric Paris 2463c26d50 SELinux: put name based create rules in a hashtable
To shorten the list we need to run if filename trans rules exist for the type
of the given parent directory I put them in a hashtable.  Given the policy we
are expecting to use in Fedora this takes the worst case list run from about
5,000 entries to 17.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
2011-04-28 15:15:53 -04:00
Eric Paris 3f058ef778 SELinux: generic hashtab entry counter
Instead of a hashtab entry counter function only useful for range
transition rules make a function generic for any hashtable to use.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
2011-04-28 15:15:52 -04:00
Eric Paris be30b16d43 SELinux: calculate and print hashtab stats with a generic function
We have custom debug functions like rangetr_hash_eval and symtab_hash_eval
which do the same thing.  Just create a generic function that takes the name
of the hash table as an argument instead of having custom functions.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
2011-04-28 15:15:52 -04:00
Eric Paris 03a4c0182a SELinux: skip filename trans rules if ttype does not match parent dir
Right now we walk to filename trans rule list for every inode that is
created.  First passes at policy using this facility creates around 5000
filename trans rules.  Running a list of 5000 entries every time is a bad
idea.  This patch adds a new ebitmap to policy which has a bit set for each
ttype that has at least 1 filename trans rule.  Thus when an inode is
created we can quickly determine if any rules exist for this parent
directory type and can skip the list if we know there is definitely no
relevant entry.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
2011-04-28 15:15:52 -04:00
Eric Paris 2667991f60 SELinux: rename filename_compute_type argument to *type instead of *con
filename_compute_type() takes as arguments the numeric value of the type of
the subject and target.  It does not take a context.  Thus the names are
misleading.  Fix the argument names.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
2011-04-28 15:15:51 -04:00
Eric Paris 4742600cf5 SELinux: fix comment to state filename_compute_type takes an objname not a qstr
filename_compute_type used to take a qstr, but it now takes just a name.
Fix the comments to indicate it is an objname, not a qstr.

Signed-off-by: Eric Paris <eparis@redhat.com>
2011-04-28 15:15:51 -04:00
Jiri Kosina 07f9479a40 Merge branch 'master' into for-next
Fast-forwarded to current state of Linus' tree as there are patches to be
applied for files that didn't exist on the old branch.
2011-04-26 10:22:59 +02:00
Eric Paris 9ade0cf440 SELINUX: Make selinux cache VFS RCU walks safe
Now that the security modules can decide whether they support the
dcache RCU walk or not it's possible to make selinux a bit more
RCU friendly.  The SELinux AVC and security server access decision
code is RCU safe.  A specific piece of the LSM audit code may not
be RCU safe.

This patch makes the VFS RCU walk retry if it would hit the non RCU
safe chunk of code.  It will normally just work under RCU.  This is
done simply by passing the VFS RCU state as a flag down into the
avc_audit() code and returning ECHILD there if it would have an issue.

Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-25 18:16:32 -07:00
Eric Paris a269434d2f LSM: separate LSM_AUDIT_DATA_DENTRY from LSM_AUDIT_DATA_PATH
This patch separates and audit message that only contains a dentry from
one that contains a full path.  This allows us to make it harder to
misuse the interfaces or for the interfaces to be implemented wrong.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2011-04-25 18:14:07 -04:00
Eric Paris f48b739984 LSM: split LSM_AUDIT_DATA_FS into _PATH and _INODE
The lsm common audit code has wacky contortions making sure which pieces
of information are set based on if it was given a path, dentry, or
inode.  Split this into path and inode to get rid of some of the code
complexity.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2011-04-25 18:13:15 -04:00
Eric Paris 0dc1ba24f7 SELINUX: Make selinux cache VFS RCU walks safe
Now that the security modules can decide whether they support the
dcache RCU walk or not it's possible to make selinux a bit more
RCU friendly.  The SELinux AVC and security server access decision
code is RCU safe.  A specific piece of the LSM audit code may not
be RCU safe.

This patch makes the VFS RCU walk retry if it would hit the non RCU
safe chunk of code.  It will normally just work under RCU.  This is
done simply by passing the VFS RCU state as a flag down into the
avc_audit() code and returning ECHILD there if it would have an issue.

Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-04-25 16:24:41 -04:00
Andi Kleen 1c99042974 SECURITY: Move exec_permission RCU checks into security modules
Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
is enabled, even though just the standard capability module is active.
This is because security_inode_exec_permission unconditionally fails
RCU walks.

Move this decision to the low level security module. This requires
passing the RCU flags down the security hook. This way at least
the capability module and a few easy cases in selinux/smack work
with RCU walks with CONFIG_SECURITY=y

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-04-25 10:20:32 -04:00
Eric Paris 6b697323a7 SELinux: security_read_policy should take a size_t not ssize_t
The len should be an size_t but is a ssize_t.  Easy enough fix to silence
build warnings.  We have no need for signed-ness.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
2011-04-25 10:19:02 -04:00
Eric Paris a35c6c8368 SELinux: silence build warning when !CONFIG_BUG
If one builds a kernel without CONFIG_BUG there are a number of 'may be
used uninitialized' warnings.  Silence these by returning after the BUG().

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
2011-04-25 10:18:27 -04:00
Andi Kleen 8c9e80ed27 SECURITY: Move exec_permission RCU checks into security modules
Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
is enabled, even though just the standard capability module is active.
This is because security_inode_exec_permission unconditionally fails
RCU walks.

Move this decision to the low level security module. This requires
passing the RCU flags down the security hook. This way at least
the capability module and a few easy cases in selinux/smack work
with RCU walks with CONFIG_SECURITY=y

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-22 16:17:29 -07:00
Eric Paris 425b473de5 SELinux: delete debugging printks from filename_trans rule processing
The filename_trans rule processing has some printk(KERN_ERR ) messages
which were intended as debug aids in creating the code but weren't removed
before it was submitted.  Remove them.

Signed-off-by: Eric Paris <eparis@redhat.com>
2011-04-20 11:45:14 -04:00
Justin P. Mattock 6eab04a876 treewide: remove extra semicolons
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-04-10 17:01:05 +02:00
Harry Ciao 1214eac73f Initialize policydb.process_class eariler.
Initialize policydb.process_class once all symtabs read from policy image,
so that it could be used to setup the role_trans.tclass field when a lower
version policy.X is loaded.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-04-07 12:00:26 -04:00
Stephen Smalley eba71de2cb selinux: Fix regression for Xorg
Commit 6f5317e730 introduced a bug in the
handling of userspace object classes that is causing breakage for Xorg
when XSELinux is enabled.  Fix the bug by changing map_class() to return
SECCLASS_NULL when the class cannot be mapped to a kernel object class.

Reported-by:  "Justin P. Mattock" <justinmattock@gmail.com>
Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2011-04-07 12:00:12 -04:00
Kohei Kaigai f50a3ec961 selinux: add type_transition with name extension support for selinuxfs
The attached patch allows /selinux/create takes optional 4th argument
to support TYPE_TRANSITION with name extension for userspace object
managers.
If 4th argument is not supplied, it shall perform as existing kernel.
In fact, the regression test of SE-PostgreSQL works well on the patched
kernel.

Thanks,

Signed-off-by: KaiGai Kohei <kohei.kaigai@eu.nec.com>
[manually verify fuzz was not an issue, and it wasn't: eparis]
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-04-01 17:13:23 -04:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Stephen Smalley 85cd6da53a selinux: Fix regression for Xorg
Commit 6f5317e730 introduced a bug in the
handling of userspace object classes that is causing breakage for Xorg
when XSELinux is enabled.  Fix the bug by changing map_class() to return
SECCLASS_NULL when the class cannot be mapped to a kernel object class.

Reported-by:  "Justin P. Mattock" <justinmattock@gmail.com>
Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2011-03-29 10:26:30 +11:00
Harry Ciao c900ff323d SELinux: Write class field in role_trans_write.
If kernel policy version is >= 26, then write the class field of the
role_trans structure into the binary reprensentation.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-03-28 14:21:05 -04:00
Harry Ciao 63a312ca55 SELinux: Compute role in newcontext for all classes
Apply role_transition rules for all kinds of classes.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-03-28 14:21:01 -04:00
Harry Ciao 8023976cf4 SELinux: Add class support to the role_trans structure
If kernel policy version is >= 26, then the binary representation of
the role_trans structure supports specifying the class for the current
subject or the newly created object.

If kernel policy version is < 26, then the class field would be default
to the process class.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-03-28 14:20:58 -04:00
Serge E. Hallyn 2e14967075 userns: rename is_owner_or_cap to inode_owner_or_capable
And give it a kernel-doc comment.

[akpm@linux-foundation.org: btrfs changed in linux-next]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:47:13 -07:00
Serge E. Hallyn 3486740a4f userns: security: make capabilities relative to the user namespace
- Introduce ns_capable to test for a capability in a non-default
  user namespace.
- Teach cap_capable to handle capabilities in a non-default
  user namespace.

The motivation is to get to the unprivileged creation of new
namespaces.  It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.

I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.

Changelog:
	11/05/2010: [serge] add apparmor
	12/14/2010: [serge] fix capabilities to created user namespaces
	Without this, if user serge creates a user_ns, he won't have
	capabilities to the user_ns he created.  THis is because we
	were first checking whether his effective caps had the caps
	he needed and returning -EPERM if not, and THEN checking whether
	he was the creator.  Reverse those checks.
	12/16/2010: [serge] security_real_capable needs ns argument in !security case
	01/11/2011: [serge] add task_ns_capable helper
	01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
	02/16/2011: [serge] fix a logic bug: the root user is always creator of
		    init_user_ns, but should not always have capabilities to
		    it!  Fix the check in cap_capable().
	02/21/2011: Add the required user_ns parameter to security_capable,
		    fixing a compile failure.
	02/23/2011: Convert some macros to functions as per akpm comments.  Some
		    couldn't be converted because we can't easily forward-declare
		    them (they are inline if !SECURITY, extern if SECURITY).  Add
		    a current_user_ns function so we can use it in capability.h
		    without #including cred.h.  Move all forward declarations
		    together to the top of the #ifdef __KERNEL__ section, and use
		    kernel-doc format.
	02/23/2011: Per dhowells, clean up comment in cap_capable().
	02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.

(Original written and signed off by Eric;  latest, modified version
acked by him)

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:47:02 -07:00
Linus Torvalds 7a6362800c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits)
  bonding: enable netpoll without checking link status
  xfrm: Refcount destination entry on xfrm_lookup
  net: introduce rx_handler results and logic around that
  bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
  bonding: wrap slave state work
  net: get rid of multiple bond-related netdevice->priv_flags
  bonding: register slave pointer for rx_handler
  be2net: Bump up the version number
  be2net: Copyright notice change. Update to Emulex instead of ServerEngines
  e1000e: fix kconfig for crc32 dependency
  netfilter ebtables: fix xt_AUDIT to work with ebtables
  xen network backend driver
  bonding: Improve syslog message at device creation time
  bonding: Call netif_carrier_off after register_netdevice
  bonding: Incorrect TX queue offset
  net_sched: fix ip_tos2prio
  xfrm: fix __xfrm_route_forward()
  be2net: Fix UDP packet detected status in RX compl
  Phonet: fix aligned-mode pipe socket buffer header reserve
  netxen: support for GbE port settings
  ...

Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
with the staging updates.
2011-03-16 16:29:25 -07:00
David S. Miller 1d28f42c1b net: Put flowi_* prefix on AF independent members of struct flowi
I intend to turn struct flowi into a union of AF specific flowi
structs.  There will be a common structure that each variant includes
first, much like struct sock_common.

This is the first step to move in that direction.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:44 -08:00
James Morris fe3fa43039 Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into next 2011-03-08 11:38:10 +11:00
James Morris 1cc26bada9 Merge branch 'master'; commit 'v2.6.38-rc7' into next 2011-03-08 10:55:06 +11:00
Eric Paris 026eb167ae SELinux: implement the new sb_remount LSM hook
For SELinux we do not allow security information to change during a remount
operation.  Thus this hook simply strips the security module options from
the data and verifies that those are the same options as exist on the
current superblock.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
2011-03-03 16:12:28 -05:00
Harry Ciao 2ad18bdf3b SELinux: Compute SID for the newly created socket
The security context for the newly created socket shares the same
user, role and MLS attribute as its creator but may have a different
type, which could be specified by a type_transition rule in the relevant
policy package.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
[fix call to security_transition_sid to include qstr, Eric Paris]
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2011-03-03 15:19:44 -05:00
Harry Ciao 6f5317e730 SELinux: Socket retains creator role and MLS attribute
The socket SID would be computed on creation and no longer inherit
its creator's SID by default. Socket may have a different type but
needs to retain the creator's role and MLS attribute in order not
to break labeled networking and network access control.

The kernel value for a class would be used to determine if the class
if one of socket classes. If security_compute_sid is called from
userspace the policy value for a class would be mapped to the relevant
kernel value first.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2011-03-03 15:19:43 -05:00
Harry Ciao 4bc6c2d5d8 SELinux: Auto-generate security_is_socket_class
The security_is_socket_class() is auto-generated by genheaders based
on classmap.h to reduce maintenance effort when a new class is defined
in SELinux kernel. The name for any socket class should be suffixed by
"socket" and doesn't contain more than one substr of "socket".

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2011-03-03 15:19:43 -05:00
Patrick McHardy c53fa1ed92 netlink: kill loginuid/sessionid/sid members from struct netlink_skb_parms
Netlink message processing in the kernel is synchronous these days, the
session information can be collected when needed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03 10:55:40 -08:00
Eric Paris 0b24dcb7f2 Revert "selinux: simplify ioctl checking"
This reverts commit 242631c49d.

Conflicts:

	security/selinux/hooks.c

SELinux used to recognize certain individual ioctls and check
permissions based on the knowledge of the individual ioctl.  In commit
242631c49d the SELinux code stopped trying to understand
individual ioctls and to instead looked at the ioctl access bits to
determine in we should check read or write for that operation.  This
same suggestion was made to SMACK (and I believe copied into TOMOYO).
But this suggestion is total rubbish.  The ioctl access bits are
actually the access requirements for the structure being passed into the
ioctl, and are completely unrelated to the operation of the ioctl or the
object the ioctl is being performed upon.

Take FS_IOC_FIEMAP as an example.  FS_IOC_FIEMAP is defined as:

FS_IOC_FIEMAP _IOWR('f', 11, struct fiemap)

So it has access bits R and W.  What this really means is that the
kernel is going to both read and write to the struct fiemap.  It has
nothing at all to do with the operations that this ioctl might perform
on the file itself!

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2011-02-25 15:40:00 -05:00
Eric Paris 47ac19ea42 selinux: drop unused packet flow permissions
These permissions are not used and can be dropped in the kernel
definitions.

Suggested-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2011-02-25 15:40:00 -05:00
Steffen Klassert 4a7ab3dcad selinux: Fix packet forwarding checks on postrouting
The IPSKB_FORWARDED and IP6SKB_FORWARDED flags are used only in the
multicast forwarding case to indicate that a packet looped back after
forward. So these flags are not a good indicator for packet forwarding.
A better indicator is the incoming interface. If we have no socket context,
but an incoming interface and we see the packet in the ip postroute hook,
the packet is going to be forwarded.

With this patch we use the incoming interface as an indicator on packet
forwarding.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-02-25 15:00:51 -05:00
Steffen Klassert b9679a7618 selinux: Fix wrong checks for selinux_policycap_netpeer
selinux_sock_rcv_skb_compat and selinux_ip_postroute_compat are just
called if selinux_policycap_netpeer is not set. However in these
functions we check if selinux_policycap_netpeer is set. This leads
to some dead code and to the fact that selinux_xfrm_postroute_last
is never executed. This patch removes the dead code and the checks
for selinux_policycap_netpeer in the compatibility functions.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-02-25 15:00:47 -05:00
Steffen Klassert 8f82a6880d selinux: Fix check for xfrm selinux context algorithm
selinux_xfrm_sec_ctx_alloc accidentally checks the xfrm domain of
interpretation against the selinux context algorithm. This patch
fixes this by checking ctx_alg against the selinux context algorithm.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-02-25 15:00:44 -05:00
David S. Miller e33f770426 xfrm: Mark flowi arg to security_xfrm_state_pol_flow_match() const.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 18:13:15 -08:00
Tetsuo Handa 2edeaa34a6 CRED: Fix BUG() upon security_cred_alloc_blank() failure
In cred_alloc_blank() since 2.6.32, abort_creds(new) is called with
new->security == NULL and new->magic == 0 when security_cred_alloc_blank()
returns an error.  As a result, BUG() will be triggered if SELinux is enabled
or CONFIG_DEBUG_CREDENTIALS=y.

If CONFIG_DEBUG_CREDENTIALS=y, BUG() is called from __invalid_creds() because
cred->magic == 0.  Failing that, BUG() is called from selinux_cred_free()
because selinux_cred_free() is not expecting cred->security == NULL.  This does
not affect smack_cred_free(), tomoyo_cred_free() or apparmor_cred_free().

Fix these bugs by

(1) Set new->magic before calling security_cred_alloc_blank().

(2) Handle null cred->security in creds_are_invalid() and selinux_cred_free().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-07 14:04:00 -08:00
Lucian Adrian Grijincu 8e6c96935f security/selinux: fix /proc/sys/ labeling
This fixes an old (2007) selinux regression: filesystem labeling for
/proc/sys returned
     -r--r--r-- unknown                          /proc/sys/fs/file-nr
instead of
     -r--r--r-- system_u:object_r:sysctl_fs_t:s0 /proc/sys/fs/file-nr

Events that lead to breaking of /proc/sys/ selinux labeling:

1) sysctl was reimplemented to route all calls through /proc/sys/

    commit 77b14db502
    [PATCH] sysctl: reimplement the sysctl proc support

2) proc_dir_entry was removed from ctl_table:

    commit 3fbfa98112
    [PATCH] sysctl: remove the proc_dir_entry member for the sysctl tables

3) selinux still walked the proc_dir_entry tree to apply
   labeling. Because ctl_tables don't have a proc_dir_entry, we did
   not label /proc/sys/ inodes any more. To achieve this the /proc/sys/
   inodes were marked private and private inodes were ignored by
   selinux.

    commit bbaca6c2e7
    [PATCH] selinux: enhance selinux to always ignore private inodes

    commit 86a71dbd3e
    [PATCH] sysctl: hide the sysctl proc inodes from selinux

Access control checks have been done by means of a special sysctl hook
that was called for read/write accesses to any /proc/sys/ entry.

We don't have to do this because, instead of walking the
proc_dir_entry tree we can walk the dentry tree (as done in this
patch). With this patch:
* we don't mark /proc/sys/ inodes as private
* we don't need the sysclt security hook
* we walk the dentry tree to find the path to the inode.

We have to strip the PID in /proc/PID/ entries that have a
proc_dir_entry because selinux does not know how to label paths like
'/1/net/rpc/nfsd.fh' (and defaults to 'proc_t' labeling). Selinux does
know of '/net/rpc/nfsd.fh' (and applies the 'sysctl_rpc_t' label).

PID stripping from the path was done implicitly in the previous code
because the proc_dir_entry tree had the root in '/net' in the example
from above. The dentry tree has the root in '/1'.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-02-01 11:53:54 -05:00
Eric Paris 652bb9b0d6 SELinux: Use dentry name in new object labeling
Currently SELinux has rules which label new objects according to 3 criteria.
The label of the process creating the object, the label of the parent
directory, and the type of object (reg, dir, char, block, etc.)  This patch
adds a 4th criteria, the dentry name, thus we can distinguish between
creating a file in an etc_t directory called shadow and one called motd.

There is no file globbing, regex parsing, or anything mystical.  Either the
policy exactly (strcmp) matches the dentry name of the object or it doesn't.
This patch has no changes from today if policy does not implement the new
rules.

Signed-off-by: Eric Paris <eparis@redhat.com>
2011-02-01 11:12:30 -05:00
Eric Paris 2a7dba391e fs/vfs/security: pass last path component to LSM on inode creation
SELinux would like to implement a new labeling behavior of newly created
inodes.  We currently label new inodes based on the parent and the creating
process.  This new behavior would also take into account the name of the
new object when deciding the new label.  This is not the (supposed) full path,
just the last component of the path.

This is very useful because creating /etc/shadow is different than creating
/etc/passwd but the kernel hooks are unable to differentiate these
operations.  We currently require that userspace realize it is doing some
difficult operation like that and than userspace jumps through SELinux hoops
to get things set up correctly.  This patch does not implement new
behavior, that is obviously contained in a seperate SELinux patch, but it
does pass the needed name down to the correct LSM hook.  If no such name
exists it is fine to pass NULL.

Signed-off-by: Eric Paris <eparis@redhat.com>
2011-02-01 11:12:29 -05:00
Davidlohr Bueso 3ac285ff23 selinux: return -ENOMEM when memory allocation fails
Return -ENOMEM when memory allocation fails in cond_init_bool_indexes,
correctly propagating error code to caller.

Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Signed-off-by: James Morris <jmorris@namei.org>
2011-01-24 11:35:47 +11:00
Shan Wei ced3b93018 security:selinux: kill unused MAX_AVTAB_HASH_MASK and ebitmap_startbit
Kill unused MAX_AVTAB_HASH_MASK and ebitmap_startbit.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-01-24 10:36:11 +11:00
Linus Torvalds e0e736fc0d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (30 commits)
  MAINTAINERS: Add tomoyo-dev-en ML.
  SELinux: define permissions for DCB netlink messages
  encrypted-keys: style and other cleanup
  encrypted-keys: verify datablob size before converting to binary
  trusted-keys: kzalloc and other cleanup
  trusted-keys: additional TSS return code and other error handling
  syslog: check cap_syslog when dmesg_restrict
  Smack: Transmute labels on specified directories
  selinux: cache sidtab_context_to_sid results
  SELinux: do not compute transition labels on mountpoint labeled filesystems
  This patch adds a new security attribute to Smack called SMACK64EXEC. It defines label that is used while task is running.
  SELinux: merge policydb_index_classes and policydb_index_others
  selinux: convert part of the sym_val_to_name array to use flex_array
  selinux: convert type_val_to_struct to flex_array
  flex_array: fix flex_array_put_ptr macro to be valid C
  SELinux: do not set automatic i_ino in selinuxfs
  selinux: rework security_netlbl_secattr_to_sid
  SELinux: standardize return code handling in selinuxfs.c
  SELinux: standardize return code handling in selinuxfs.c
  SELinux: standardize return code handling in policydb.c
  ...
2011-01-10 11:18:59 -08:00
Alexey Dobriyan 37721e1b0c headers: path.h redux
Remove path.h from sched.h and other files.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-10 08:51:44 -08:00
James Morris aeda4ac3ef Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into next 2011-01-10 10:40:42 +11:00
James Morris d2e7ad1922 Merge branch 'master' into next
Conflicts:
	security/smack/smack_lsm.c

Verified and added fix by Stephen Rothwell <sfr@canb.auug.org.au>
Ok'd by Casey Schaufler <casey@schaufler-ca.com>

Signed-off-by: James Morris <jmorris@namei.org>
2011-01-10 09:46:24 +11:00
Linus Torvalds b4a45f5fe8 Merge branch 'vfs-scale-working' of git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin
* 'vfs-scale-working' of git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin: (57 commits)
  fs: scale mntget/mntput
  fs: rename vfsmount counter helpers
  fs: implement faster dentry memcmp
  fs: prefetch inode data in dcache lookup
  fs: improve scalability of pseudo filesystems
  fs: dcache per-inode inode alias locking
  fs: dcache per-bucket dcache hash locking
  bit_spinlock: add required includes
  kernel: add bl_list
  xfs: provide simple rcu-walk ACL implementation
  btrfs: provide simple rcu-walk ACL implementation
  ext2,3,4: provide simple rcu-walk ACL implementation
  fs: provide simple rcu-walk generic_check_acl implementation
  fs: provide rcu-walk aware permission i_ops
  fs: rcu-walk aware d_revalidate method
  fs: cache optimise dentry and inode for rcu-walk
  fs: dcache reduce branches in lookup path
  fs: dcache remove d_mounted
  fs: fs_struct use seqlock
  fs: rcu-walk for path lookup
  ...
2011-01-07 08:56:33 -08:00
Nick Piggin dc0474be3e fs: dcache rationalise dget variants
dget_locked was a shortcut to avoid the lazy lru manipulation when we already
held dcache_lock (lru manipulation was relatively cheap at that point).
However, how that the lru lock is an innermost one, we never hold it at any
caller, so the lock cost can now be avoided. We already have well working lazy
dcache LRU, so it should be fine to defer LRU manipulations to scan time.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07 17:50:24 +11:00
Nick Piggin b5c84bf6f6 fs: dcache remove dcache_lock
dcache_lock no longer protects anything. remove it.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07 17:50:23 +11:00
Nick Piggin 2fd6b7f507 fs: dcache scale subdirs
Protect d_subdirs and d_child with d_lock, except in filesystems that aren't
using dcache_lock for these anyway (eg. using i_mutex).

Note: if we change the locking rule in future so that ->d_child protection is
provided only with ->d_parent->d_lock, it may allow us to reduce some locking.
But it would be an exception to an otherwise regular locking scheme, so we'd
have to see some good results. Probably not worthwhile.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07 17:50:21 +11:00
David S. Miller 3610cda53f af_unix: Avoid socket->sk NULL OOPS in stream connect security hooks.
unix_release() can asynchornously set socket->sk to NULL, and
it does so without holding the unix_state_lock() on "other"
during stream connects.

However, the reverse mapping, sk->sk_socket, is only transitioned
to NULL under the unix_state_lock().

Therefore make the security hooks follow the reverse mapping instead
of the forward mapping.

Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-05 15:38:53 -08:00
David S. Miller 17f7f4d9fc Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/ipv4/fib_frontend.c
2010-12-26 22:37:05 -08:00
Eric Paris 350e4f31e0 SELinux: define permissions for DCB netlink messages
Commit 2f90b865 added two new netlink message types to the netlink route
socket.  SELinux has hooks to define if netlink messages are allowed to
be sent or received, but it did not know about these two new message
types.  By default we allow such actions so noone likely noticed.  This
patch adds the proper definitions and thus proper permissions
enforcement.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-12-16 12:50:17 -05:00
Eric Paris 73ff5fc0a8 selinux: cache sidtab_context_to_sid results
sidtab_context_to_sid takes up a large share of time when creating large
numbers of new inodes (~30-40% in oprofile runs).  This patch implements a
cache of 3 entries which is checked before we do a full context_to_sid lookup.
On one system this showed over a x3 improvement in the number of inodes that
could be created per second and around a 20% improvement on another system.

Any time we look up the same context string sucessivly (imagine ls -lZ) we
should hit this cache hot.  A cache miss should have a relatively minor affect
on performance next to doing the full table search.

All operations on the cache are done COMPLETELY lockless.  We know that all
struct sidtab_node objects created will never be deleted until a new policy is
loaded thus we never have to worry about a pointer being dereferenced.  Since
we also know that pointer assignment is atomic we know that the cache will
always have valid pointers.  Given this information we implement a FIFO cache
in an array of 3 pointers.  Every result (whether a cache hit or table lookup)
will be places in the 0 spot of the cache and the rest of the entries moved
down one spot.  The 3rd entry will be lost.

Races are possible and are even likely to happen.  Lets assume that 4 tasks
are hitting sidtab_context_to_sid.  The first task checks against the first
entry in the cache and it is a miss.  Now lets assume a second task updates
the cache with a new entry.  This will push the first entry back to the second
spot.  Now the first task might check against the second entry (which it
already checked) and will miss again.  Now say some third task updates the
cache and push the second entry to the third spot.  The first task my check
the third entry (for the third time!) and again have a miss.  At which point
it will just do a full table lookup.  No big deal!

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-12-07 16:44:01 -05:00
Eric Paris 415103f993 SELinux: do not compute transition labels on mountpoint labeled filesystems
selinux_inode_init_security computes transitions sids even for filesystems
that use mount point labeling.  It shouldn't do that.  It should just use
the mount point label always and no matter what.

This causes 2 problems.  1) it makes file creation slower than it needs to be
since we calculate the transition sid and 2) it allows files to be created
with a different label than the mount point!

# id -Z
staff_u:sysadm_r:sysadm_t:s0-s0:c0.c1023
# sesearch --type --class file --source sysadm_t --target tmp_t
Found 1 semantic te rules:
   type_transition sysadm_t tmp_t : file user_tmp_t;

# mount -o loop,context="system_u:object_r:tmp_t:s0"  /tmp/fs /mnt/tmp

# ls -lZ /mnt/tmp
drwx------. root root system_u:object_r:tmp_t:s0       lost+found
# touch /mnt/tmp/file1
# ls -lZ /mnt/tmp
-rw-r--r--. root root staff_u:object_r:user_tmp_t:s0   file1
drwx------. root root system_u:object_r:tmp_t:s0       lost+found

Whoops, we have a mount point labeled filesystem tmp_t with a user_tmp_t
labeled file!

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Reviewed-by: James Morris <jmorris@namei.org>
2010-12-02 16:14:51 -05:00
Eric Paris 1d9bc6dc5b SELinux: merge policydb_index_classes and policydb_index_others
We duplicate functionality in policydb_index_classes() and
policydb_index_others().  This patch merges those functions just to make it
clear there is nothing special happening here.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30 17:28:58 -05:00
Eric Paris ac76c05bec selinux: convert part of the sym_val_to_name array to use flex_array
The sym_val_to_name type array can be quite large as it grows linearly with
the number of types.  With known policies having over 5k types these
allocations are growing large enough that they are likely to fail.  Convert
those to flex_array so no allocation is larger than PAGE_SIZE

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30 17:28:58 -05:00
Eric Paris 23bdecb000 selinux: convert type_val_to_struct to flex_array
In rawhide type_val_to_struct will allocate 26848 bytes, an order 3
allocations.  While this hasn't been seen to fail it isn't outside the
realm of possibiliy on systems with severe memory fragmentation.  Convert
to flex_array so no allocation will ever be bigger than PAGE_SIZE.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30 17:28:57 -05:00
Eric Paris c9e86a9b95 SELinux: do not set automatic i_ino in selinuxfs
selinuxfs carefully uses i_ino to figure out what the inode refers to.  The
VFS used to generically set this value and we would reset it to something
useable.  After 85fe4025c6 each filesystem sets this value to a default
if needed.  Since selinuxfs doesn't use the default value and it can only
lead to problems (I'd rather have 2 inodes with i_ino == 0 than one
pointing to the wrong data) lets just stop setting a default.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
2010-11-30 17:28:57 -05:00
Eric Paris 7ae9f23cbd selinux: rework security_netlbl_secattr_to_sid
security_netlbl_secattr_to_sid is difficult to follow, especially the
return codes.  Try to make the function obvious.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30 17:28:57 -05:00
Eric Paris 4b02b52448 SELinux: standardize return code handling in selinuxfs.c
selinuxfs.c has lots of different standards on how to handle return paths on
error.  For the most part transition to

	rc=errno
	if (failure)
		goto out;
[...]
out:
	cleanup()
	return rc;

Instead of doing cleanup mid function, or having multiple returns or other
options.  This doesn't do that for every function, but most of the complex
functions which have cleanup routines on error.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30 17:28:57 -05:00
Eric Paris b77a493b1d SELinux: standardize return code handling in selinuxfs.c
selinuxfs.c has lots of different standards on how to handle return paths on
error.  For the most part transition to

	rc=errno
	if (failure)
		goto out;
[...]
out:
	cleanup()
	return rc;

Instead of doing cleanup mid function, or having multiple returns or other
options.  This doesn't do that for every function, but most of the complex
functions which have cleanup routines on error.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30 17:28:57 -05:00
Eric Paris 9398c7f794 SELinux: standardize return code handling in policydb.c
policydb.c has lots of different standards on how to handle return paths on
error.  For the most part transition to

	rc=errno
	if (failure)
		goto out;
[...]
out:
	cleanup()
	return rc;

Instead of doing cleanup mid function, or having multiple returns or other
options.  This doesn't do that for every function, but most of the complex
functions which have cleanup routines on error.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30 17:28:56 -05:00
Serge E. Hallyn ce6ada35bd security: Define CAP_SYSLOG
Privileged syslog operations currently require CAP_SYS_ADMIN.  Split
this off into a new CAP_SYSLOG privilege which we can sanely take away
from a container through the capability bounding set.

With this patch, an lxc container can be prevented from messing with
the host's syslog (i.e. dmesg -c).

Changelog: mar 12 2010: add selinux capability2:cap_syslog perm
Changelog: nov 22 2010:
	. port to new kernel
	. add a WARN_ONCE if userspace isn't using CAP_SYSLOG

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-By: Kees Cook <kees.cook@canonical.com>
Cc: James Morris <jmorris@namei.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: "Christopher J. PeBenito" <cpebenito@tresys.com>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: James Morris <jmorris@namei.org>
2010-11-29 08:35:12 +11:00
Eric Paris 2fe66ec242 SELinux: indicate fatal error in compat netfilter code
The SELinux ip postroute code indicates when policy rejected a packet and
passes the error back up the stack.  The compat code does not.  This patch
sends the same kind of error back up the stack in the compat code.

Based-on-patch-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-23 10:50:17 -08:00
Eric Paris 04f6d70f6e SELinux: Only return netlink error when we know the return is fatal
Some of the SELinux netlink code returns a fatal error when the error might
actually be transient.  This patch just silently drops packets on
potentially transient errors but continues to return a permanant error
indicator when the denial was because of policy.

Based-on-comments-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-23 10:50:17 -08:00
Eric Paris 1f1aaf8282 SELinux: return -ECONNREFUSED from ip_postroute to signal fatal error
The SELinux netfilter hooks just return NF_DROP if they drop a packet.  We
want to signal that a drop in this hook is a permanant fatal error and is not
transient.  If we do this the error will be passed back up the stack in some
places and applications will get a faster interaction that something went
wrong.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17 10:54:35 -08:00
Eric Paris 12b3052c3e capabilities/syslog: open code cap_syslog logic to fix build failure
The addition of CONFIG_SECURITY_DMESG_RESTRICT resulted in a build
failure when CONFIG_PRINTK=n.  This is because the capabilities code
which used the new option was built even though the variable in question
didn't exist.

The patch here fixes this by moving the capabilities checks out of the
LSM and into the caller.  All (known) LSMs should have been calling the
capabilities hook already so it actually makes the code organization
better to eliminate the hook altogether.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-15 15:40:01 -08:00
Al Viro fc14f2fef6 convert get_sb_single() users
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-29 04:16:28 -04:00
Christoph Hellwig 85fe4025c6 fs: do not assign default i_ino in new_inode
Instead of always assigning an increasing inode number in new_inode
move the call to assign it into those callers that actually need it.
For now callers that need it is estimated conservatively, that is
the call is added to all filesystems that do not assign an i_ino
by themselves.  For a few more filesystems we can avoid assigning
any inode number given that they aren't user visible, and for others
it could be done lazily when an inode number is actually needed,
but that's left for later patches.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25 21:26:11 -04:00
Stephen Rothwell f0d3d9894e selinux: include vmalloc.h for vmalloc_user
Include vmalloc.h for vmalloc_user (fixes ppc build warning).
Acked-by: Eric Paris <eparis@redhat.com>

Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:13:01 +11:00
Eric Paris 845ca30fe9 selinux: implement mmap on /selinux/policy
/selinux/policy allows a user to copy the policy back out of the kernel.
This patch allows userspace to actually mmap that file and use it directly.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:59 +11:00
Eric Paris cee74f47a6 SELinux: allow userspace to read policy back out of the kernel
There is interest in being able to see what the actual policy is that was
loaded into the kernel.  The patch creates a new selinuxfs file
/selinux/policy which can be read by userspace.  The actual policy that is
loaded into the kernel will be written back out to userspace.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:58 +11:00
Eric Paris 00d85c83ac SELinux: drop useless (and incorrect) AVTAB_MAX_SIZE
AVTAB_MAX_SIZE was a define which was supposed to be used in userspace to
define a maximally sized avtab when userspace wasn't sure how big of a table
it needed.  It doesn't make sense in the kernel since we always know our table
sizes.  The only place it is used we have a more appropiately named define
called AVTAB_MAX_HASH_BUCKETS, use that instead.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:57 +11:00
Eric Paris 4419aae1f4 SELinux: deterministic ordering of range transition rules
Range transition rules are placed in the hash table in an (almost)
arbitrary order.  This patch inserts them in a fixed order to make policy
retrival more predictable.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:56 +11:00
Eric Paris d5630b9d27 security: secid_to_secctx returns len when data is NULL
With the (long ago) interface change to have the secid_to_secctx functions
do the string allocation instead of having the caller do the allocation we
lost the ability to query the security server for the length of the
upcoming string.  The SECMARK code would like to allocate a netlink skb
with enough length to hold the string but it is just too unclean to do the
string allocation twice or to do the allocation the first time and hold
onto the string and slen.  This patch adds the ability to call
security_secid_to_secctx() with a NULL data pointer and it will just set
the slen pointer.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:50 +11:00
Eric Paris 2606fd1fa5 secmark: make secmark object handling generic
Right now secmark has lots of direct selinux calls.  Use all LSM calls and
remove all SELinux specific knowledge.  The only SELinux specific knowledge
we leave is the mode.  The only point is to make sure that other LSMs at
least test this generic code before they assume it works.  (They may also
have to make changes if they do not represent labels as strings)

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:48 +11:00
KOSAKI Motohiro b0ae198113 security: remove unused parameter from security_task_setscheduler()
All security modules shouldn't change sched_param parameter of
security_task_setscheduler().  This is not only meaningless, but also
make a harmful result if caller pass a static variable.

This patch remove policy and sched_param parameter from
security_task_setscheduler() becuase none of security module is
using it.

Cc: James Morris <jmorris@namei.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:44 +11:00
KaiGai Kohei 36f7f28416 selinux: fix up style problem on /selinux/status
This patch fixes up coding-style problem at this commit:

 4f27a7d49789b04404eca26ccde5f527231d01d5
 selinux: fast status update interface (/selinux/status)

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:41 +11:00
matt mooney 8b0c543e5c selinux: change to new flag variable
Replace EXTRA_CFLAGS with ccflags-y.

Signed-off-by: matt mooney <mfm@muteddisk.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:40 +11:00
Paul Gortmaker 60272da034 selinux: really fix dependency causing parallel compile failure.
While the previous change to the selinux Makefile reduced the window
significantly for this failure, it is still possible to see a compile
failure where cpp starts processing selinux files before the auto
generated flask.h file is completed.  This is easily reproduced by
adding the following temporary change to expose the issue everytime:

-      cmd_flask = scripts/selinux/genheaders/genheaders ...
+      cmd_flask = sleep 30 ; scripts/selinux/genheaders/genheaders ...

This failure happens because the creation of the object files in the ss
subdir also depends on flask.h.  So simply incorporate them into the
parent Makefile, as the ss/Makefile really doesn't do anything unique.

With this change, compiling of all selinux files is dependent on
completion of the header file generation, and this test case with
the "sleep 30" now confirms it is functioning as expected.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:39 +11:00
Paul Gortmaker ceba72a68d selinux: fix parallel compile error
Selinux has an autogenerated file, "flask.h" which is included by
two other selinux files.  The current makefile has a single dependency
on the first object file in the selinux-y list, assuming that will get
flask.h generated before anyone looks for it, but that assumption breaks
down in a "make -jN" situation and you get:

   selinux/selinuxfs.c:35: fatal error: flask.h: No such file or directory
   compilation terminated.
   remake[9]: *** [security/selinux/selinuxfs.o] Error 1

Since flask.h is included by security.h which in turn is included
nearly everywhere, make the dependency apply to all of the selinux-y
list of objs.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:38 +11:00
KaiGai Kohei 1190416725 selinux: fast status update interface (/selinux/status)
This patch provides a new /selinux/status entry which allows applications
read-only mmap(2).
This region reflects selinux_kernel_status structure in kernel space.
  struct selinux_kernel_status
  {
          u32     length;         /* length of this structure */
          u32     sequence;       /* sequence number of seqlock logic */
          u32     enforcing;      /* current setting of enforcing mode */
          u32     policyload;     /* times of policy reloaded */
          u32     deny_unknown;   /* current setting of deny_unknown */
  };

When userspace object manager caches access control decisions provided
by SELinux, it needs to invalidate the cache on policy reload and setenforce
to keep consistency.
However, the applications need to check the kernel state for each accesses
on userspace avc, or launch a background worker process.
In heuristic, frequency of invalidation is much less than frequency of
making access control decision, so it is annoying to invoke a system call
to check we don't need to invalidate the userspace cache.
If we can use a background worker thread, it allows to receive invalidation
messages from the kernel. But it requires us an invasive coding toward the
base application in some cases; E.g, when we provide a feature performing
with SELinux as a plugin module, it is unwelcome manner to launch its own
worker thread from the module.

If we could map /selinux/status to process memory space, application can
know updates of selinux status; policy reload or setenforce.

A typical application checks selinux_kernel_status::sequence when it tries
to reference userspace avc. If it was changed from the last time when it
checked userspace avc, it means something was updated in the kernel space.
Then, the application can reset userspace avc or update current enforcing
mode, without any system call invocations.
This sequence number is updated according to the seqlock logic, so we need
to wait for a while if it is odd number.

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by: Eric Paris <eparis@redhat.com>
--
 security/selinux/include/security.h |   21 ++++++
 security/selinux/selinuxfs.c        |   56 +++++++++++++++
 security/selinux/ss/Makefile        |    2 +-
 security/selinux/ss/services.c      |    3 +
 security/selinux/ss/status.c        |  129 +++++++++++++++++++++++++++++++++++
 5 files changed, 210 insertions(+), 1 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:36 +11:00
Eric Paris daa6d83a28 selinux: type_bounds_sanity_check has a meaningless variable declaration
type is not used at all, stop declaring and assigning it.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:33 +11:00
Nick Piggin d996b62a8d tty: fix fu_list abuse
tty: fix fu_list abuse

tty code abuses fu_list, which causes a bug in remount,ro handling.

If a tty device node is opened on a filesystem, then the last link to the inode
removed, the filesystem will be allowed to be remounted readonly. This is
because fs_may_remount_ro does not find the 0 link tty inode on the file sb
list (because the tty code incorrectly removed it to use for its own purpose).
This can result in a filesystem with errors after it is marked "clean".

Taking idea from Christoph's initial patch, allocate a tty private struct
at file->private_data and put our required list fields in there, linking
file and tty. This makes tty nodes behave the same way as other device nodes
and avoid meddling with the vfs, and avoids this bug.

The error handling is not trivial in the tty code, so for this bugfix, I take
the simple approach of using __GFP_NOFAIL and don't worry about memory errors.
This is not a problem because our allocator doesn't fail small allocs as a rule
anyway. So proper error handling is left as an exercise for tty hackers.

[ Arguably filesystem's device inode would ideally be divorced from the
driver's pseudo inode when it is opened, but in practice it's not clear whether
that will ever be worth implementing. ]

Cc: linux-kernel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-18 08:35:47 -04:00
Nick Piggin ee2ffa0dfd fs: cleanup files_lock locking
fs: cleanup files_lock locking

Lock tty_files with a new spinlock, tty_files_lock; provide helpers to
manipulate the per-sb files list; unexport the files_lock spinlock.

Cc: linux-kernel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-18 08:35:47 -04:00
Linus Torvalds b34d8915c4 Merge branch 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux
* 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux:
  unistd: add __NR_prlimit64 syscall numbers
  rlimits: implement prlimit64 syscall
  rlimits: switch more rlimit syscalls to do_prlimit
  rlimits: redo do_setrlimit to more generic do_prlimit
  rlimits: add rlimit64 structure
  rlimits: do security check under task_lock
  rlimits: allow setrlimit to non-current tasks
  rlimits: split sys_setrlimit
  rlimits: selinux, do rlimits changes under task_lock
  rlimits: make sure ->rlim_max never grows in sys_setrlimit
  rlimits: add task_struct to update_rlimit_cpu
  rlimits: security, add task_struct to setrlimit

Fix up various system call number conflicts.  We not only added fanotify
system calls in the meantime, but asm-generic/unistd.h added a wait4
along with a range of reserved per-architecture system calls.
2010-08-10 12:07:51 -07:00
Ralf Baechle a7a387cc59 SELINUX: Fix build error.
Fix build error caused by a stale security/selinux/av_permissions.h in the $(src)
directory which will override a more recent version in $(obj) that is it
appears to strike only when building with a separate object directory.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-06 18:11:39 -04:00
Eric Paris 6371dcd36f selinux: convert the policy type_attr_map to flex_array
Current selinux policy can have over 3000 types.  The type_attr_map in
policy is an array sized by the number of types times sizeof(struct ebitmap)
(12 on x86_64).  Basic math tells us the array is going to be of length
3000 x 12 = 36,000 bytes.  The largest 'safe' allocation on a long running
system is 16k.  Most of the time a 32k allocation will work.  But on long
running systems a 64k allocation (what we need) can fail quite regularly.
In order to deal with this I am converting the type_attr_map to use
flex_arrays.  Let the library code deal with breaking this into PAGE_SIZE
pieces.

-v2
rework some of the if(!obj) BUG() to be BUG_ON(!obj)
drop flex_array_put() calls and just use a _get() object directly

-v3
make apply to James' tree (drop the policydb_write changes)

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:38:39 +10:00
Eric Paris b424485abe SELinux: Move execmod to the common perms
execmod "could" show up on non regular files and non chr files.  The current
implementation would actually make these checks against non-existant bits
since the code assumes the execmod permission is same for all file types.
To make this line up for chr files we had to define execute_no_trans and
entrypoint permissions.  These permissions are unreachable and only existed
to to make FILE__EXECMOD and CHR_FILE__EXECMOD the same.  This patch drops
those needless perms as well.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:09 +10:00
Eric Paris 49b7b8de46 selinux: place open in the common file perms
kernel can dynamically remap perms.  Drop the open lookup table and put open
in the common file perms.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:08 +10:00
Eric Paris b782e0a68d SELinux: special dontaudit for access checks
Currently there are a number of applications (nautilus being the main one) which
calls access() on files in order to determine how they should be displayed.  It
is normal and expected that nautilus will want to see if files are executable
or if they are really read/write-able.  access() should return the real
permission.  SELinux policy checks are done in access() and can result in lots
of AVC denials as policy denies RWX on files which DAC allows.  Currently
SELinux must dontaudit actual attempts to read/write/execute a file in
order to silence these messages (and not flood the logs.)  But dontaudit rules
like that can hide real attacks.  This patch addes a new common file
permission audit_access.  This permission is special in that it is meaningless
and should never show up in an allow rule.  Instead the only place this
permission has meaning is in a dontaudit rule like so:

dontaudit nautilus_t sbin_t:file audit_access

With such a rule if nautilus just checks access() we will still get denied and
thus userspace will still get the correct answer but we will not log the denial.
If nautilus attempted to actually perform one of the forbidden actions
(rather than just querying access(2) about it) we would still log a denial.
This type of dontaudit rule should be used sparingly, as it could be a
method for an attacker to probe the system permissions without detection.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:07 +10:00
Eric Paris d09ca73979 security: make LSMs explicitly mask off permissions
SELinux needs to pass the MAY_ACCESS flag so it can handle auditting
correctly.  Presently the masking of MAY_* flags is done in the VFS.  In
order to allow LSMs to decide what flags they care about and what flags
they don't just pass them all and the each LSM mask off what they don't
need.  This patch should contain no functional changes to either the VFS or
any LSM.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:07 +10:00
Eric Paris 692a8a231b SELinux: break ocontext reading into a separate function
Move the reading of ocontext type data out of policydb_read() in a separate
function ocontext_read()

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:06 +10:00
Eric Paris d1b43547e5 SELinux: move genfs read to a separate function
move genfs read functionality out of policydb_read() and into a new
function called genfs_read()

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:05 +10:00
Dan Carpenter 9a7982793c selinux: fix error codes in symtab_init()
hashtab_create() only returns NULL on allocation failures to -ENOMEM is
appropriate here.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:04 +10:00
Dan Carpenter 338437f6a0 selinux: fix error codes in cond_read_bool()
The original code always returned -1 (-EPERM) on error.  The new code
returns either -ENOMEM, or -EINVAL or it propagates the error codes from
lower level functions next_entry() or hashtab_insert().

next_entry() returns -EINVAL.
hashtab_insert() returns -EINVAL, -EEXIST, or -ENOMEM.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:04 +10:00
Dan Carpenter 38184c5222 selinux: fix error codes in cond_policydb_init()
It's better to propagate the error code from avtab_init() instead of
returning -1 (-EPERM).  It turns out that avtab_init() never fails so
this patch doesn't change how the code runs but it's still a clean up.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:03 +10:00
Dan Carpenter fc5c126e47 selinux: fix error codes in cond_read_node()
Originally cond_read_node() returned -1 (-EPERM) on errors which was
incorrect.  Now it either propagates the error codes from lower level
functions next_entry() or cond_read_av_list() or it returns -ENOMEM or
-EINVAL.

next_entry() returns -EINVAL.
cond_read_av_list() returns -EINVAL or -ENOMEM.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:02 +10:00
Dan Carpenter 9d623b17a7 selinux: fix error codes in cond_read_av_list()
After this patch cond_read_av_list() no longer returns -1 for any
errors.  It just propagates error code back from lower levels.  Those can
either be -EINVAL or -ENOMEM.

I also modified cond_insertf() since cond_read_av_list() passes that as a
function pointer to avtab_read_item().  It isn't used anywhere else.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:02 +10:00
Dan Carpenter 5241c1074f selinux: propagate error codes in cond_read_list()
These are passed back when the security module gets loaded.

The original code always returned -1 (-EPERM) on error but after this
patch it can return -EINVAL, or -ENOMEM or propagate the error code from
cond_read_node().  cond_read_node() still returns -1 all the time, but I
fix that in a later patch.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:01 +10:00
Dan Carpenter 9e0bd4cba4 selinux: cleanup return codes in avtab_read_item()
The avtab_read_item() function tends to return -1 as a default error
code which is wrong (-1 means -EPERM).  I modified it to return
appropriate error codes which is -EINVAL or the error code from
next_entry() or insertf().

next_entry() returns -EINVAL.
insertf() is a function pointer to either avtab_insert() or
cond_insertf().
avtab_insert() returns -EINVAL, -ENOMEM, and -EEXIST.
cond_insertf() currently returns -1, but I will fix it in a later patch.

There is code in avtab_read() which translates the -1 returns from
avtab_read_item() to -EINVAL. The translation is no longer needed, so I
removed it.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:01 +10:00
Arnd Bergmann 57a62c2317 selinux: use generic_file_llseek
The default for llseek will change to no_llseek,
so selinuxfs needs to add explicit .llseek
assignments. Since we're dealing with regular
files from a VFS perspective, use generic_file_llseek.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:34:59 +10:00
Mimi Zohar af4f136056 security: move LSM xattrnames to xattr.h
Make the security extended attributes names global. Updated to move
the remaining Smack xattrs.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:34:57 +10:00
Paul Moore 5fb49870e6 selinux: Use current_security() when possible
There were a number of places using the following code pattern:

  struct cred *cred = current_cred();
  struct task_security_struct *tsec = cred->security;

... which were simplified to the following:

  struct task_security_struct *tsec = current_security();

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:34:39 +10:00
Paul Moore 253bfae6e0 selinux: Convert socket related access controls to use socket labels
At present, the socket related access controls use a mix of inode and
socket labels; while there should be no practical difference (they
_should_ always be the same), it makes the code more confusing.  This
patch attempts to convert all of the socket related access control
points (with the exception of some of the inode/fd based controls) to
use the socket's own label.  In the process, I also converted the
socket_has_perm() function to take a 'sock' argument instead of a
'socket' since that was adding a bit more overhead in some cases.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:34:39 +10:00
Paul Moore 84914b7ed1 selinux: Shuffle the sk_security_struct alloc and free routines
The sk_alloc_security() and sk_free_security() functions were only being
called by the selinux_sk_alloc_security() and selinux_sk_free_security()
functions so we just move the guts of the alloc/free routines to the
callers and eliminate a layer of indirection.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:34:38 +10:00
Paul Moore d4f2d97841 selinux: Consolidate sockcreate_sid logic
Consolidate the basic sockcreate_sid logic into a single helper function
which allows us to do some cleanups in the related code.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:34:37 +10:00
Paul Moore 4d1e24514d selinux: Set the peer label correctly on connected UNIX domain sockets
Correct a problem where we weren't setting the peer label correctly on
the client end of a pair of connected UNIX sockets.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:34:37 +10:00
Eric Paris 9ee0c823c1 SELinux: seperate range transition rules to a seperate function
Move the range transition rule to a separate function, range_read(), rather
than doing it all in policydb_read()

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:34:30 +10:00
Paul E. McKenney babcd37821 selinux: remove all rcu head initializations
Remove all rcu head inits. We don't care about the RCU head state before passing
it to call_rcu() anyway. Only leave the "on_stack" variants so debugobjects can
keep track of objects on stack.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <jmorris@namei.org>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:33:35 +10:00
Oleg Nesterov eb2d55a32b rlimits: selinux, do rlimits changes under task_lock
When doing an exec, selinux updates rlimits in its code of current
process depending on current max. Make sure max or cur doesn't change
in the meantime by grabbing task_lock which do_prlimit needs for
changing limits too.

While at it, use rlimit helper for accessing CPU rlimit a line below.
To have a volatile access too.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Oleg Nesterov <oleg@redhat.com>
2010-07-16 09:48:46 +02:00
Jiri Slaby 5ab46b345e rlimits: add task_struct to update_rlimit_cpu
Add task_struct as a parameter to update_rlimit_cpu to be able to set
rlimit_cpu of different task than current.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
2010-07-16 09:48:45 +02:00
Jiri Slaby 8fd00b4d70 rlimits: security, add task_struct to setrlimit
Add task_struct to task_setrlimit of security_operations to be able to set
rlimit of task other than current.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
2010-07-16 09:48:45 +02:00
Al Viro e8c2625599 switch selinux delayed superblock handling to iterate_supers()
... kill their private list, while we are at it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:17 -04:00
Julia Lawall b3139bbc52 security/selinux/ss: Use kstrdup
Use kstrdup when the goal of an allocation is copy a string into the
allocated region.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to;
expression flag,E1,E2;
statement S;
@@

-  to = kmalloc(strlen(from) + 1,flag);
+  to = kstrdup(from, flag);
   ... when != \(from = E1 \| to = E1 \)
   if (to==NULL || ...) S
   ... when != \(from = E2 \| to = E2 \)
-  strcpy(to, from);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-05-17 09:00:27 +10:00
James Morris 0ffbe2699c Merge branch 'master' into next 2010-05-06 10:56:07 +10:00
Stephen Smalley fcaaade1db selinux: generalize disabling of execmem for plt-in-heap archs
On Tue, 2010-04-27 at 11:47 -0700, David Miller wrote:
> From: "Tom \"spot\" Callaway" <tcallawa@redhat.com>
> Date: Tue, 27 Apr 2010 14:20:21 -0400
>
> > [root@apollo ~]$ cat /proc/2174/maps
> > 00010000-00014000 r-xp 00000000 fd:00 15466577
> >  /sbin/mingetty
> > 00022000-00024000 rwxp 00002000 fd:00 15466577
> >  /sbin/mingetty
> > 00024000-00046000 rwxp 00000000 00:00 0
> >  [heap]
>
> SELINUX probably barfs on the executable heap, the PLT is in the HEAP
> just like powerpc32 and that's why VM_DATA_DEFAULT_FLAGS has to set
> both executable and writable.
>
> You also can't remove the CONFIG_PPC32 ifdefs in selinux, since
> because of the VM_DATA_DEFAULT_FLAGS setting used still in that arch,
> the heap will always have executable permission, just like sparc does.
> You have to support those binaries forever, whether you like it or not.
>
> Let's just replace the CONFIG_PPC32 ifdef in SELINUX with CONFIG_PPC32
> || CONFIG_SPARC as in Tom's original patch and let's be done with
> this.
>
> In fact I would go through all the arch/ header files and check the
> VM_DATA_DEFAULT_FLAGS settings and add the necessary new ifdefs to the
> SELINUX code so that other platforms don't have the pain of having to
> go through this process too.

To avoid maintaining per-arch ifdefs, it seems that we could just
directly use (VM_DATA_DEFAULT_FLAGS & VM_EXEC) as the basis for deciding
whether to enable or disable these checks.   VM_DATA_DEFAULT_FLAGS isn't
constant on some architectures but instead depends on
current->personality, but we want this applied uniformly.  So we'll just
use the initial task state to determine whether or not to enable these
checks.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: James Morris <jmorris@namei.org>
2010-04-29 08:58:45 +10:00
Eric Paris cb84aa9b42 LSM Audit: rename LSM_AUDIT_NO_AUDIT to LSM_AUDIT_DATA_NONE
Most of the LSM common audit work uses LSM_AUDIT_DATA_* for the naming.
This was not so for LSM_AUDIT_NO_AUDIT which means the generic initializer
cannot be used.  This patch just renames the flag so the generic
initializer can be used.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-04-28 08:51:12 +10:00
Eric Paris a200005038 SELinux: return error codes on policy load failure
policy load failure always return EINVAL even if the failure was for some
other reason (usually ENOMEM).  This patch passes error codes back up the
stack where they will make their way to userspace.  This might help in
debugging future problems with policy load.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-04-21 08:58:49 +10:00
Stephen Smalley 6c9ff1013b SELinux: Reduce max avtab size to avoid page allocation failures
Reduce MAX_AVTAB_HASH_BITS so that the avtab allocation is an order 2
allocation rather than an order 4 allocation on x86_64.  This
addresses reports of page allocation failures:
http://marc.info/?l=selinux&m=126757230625867&w=2
https://bugzilla.redhat.com/show_bug.cgi?id=570433

Reported-by:  Russell Coker <russell@coker.com.au>
Signed-off-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-04-15 09:26:01 +10:00
wzt.wzt@gmail.com c1a7368a6f Security: Fix coding style in security/
Fix coding style in security/

Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-04-09 15:13:48 +10:00
Eric Paris dd3e7836bf selinux: always call sk_security_struct sksec
trying to grep everything that messes with a sk_security_struct isn't easy
since we don't always call it sksec.  Just rename everything sksec.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-04-08 09:17:02 +10:00
James Morris d25d6fa1a9 Merge branch 'master' into next 2010-03-31 08:39:27 +11:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Stephen Smalley 77c160e779 SELinux: Reduce max avtab size to avoid page allocation failures
Reduce MAX_AVTAB_HASH_BITS so that the avtab allocation is an order 2
allocation rather than an order 4 allocation on x86_64.  This
addresses reports of page allocation failures:
http://marc.info/?l=selinux&m=126757230625867&w=2
https://bugzilla.redhat.com/show_bug.cgi?id=570433

Reported-by:  Russell Coker <russell@coker.com.au>
Signed-off-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-03-16 08:31:02 +11:00
James Morris c43a752347 Merge branch 'next-queue' into next 2010-03-09 12:46:47 +11:00
Jiri Kosina 318ae2edc3 Merge branch 'for-next' into for-linus
Conflicts:
	Documentation/filesystems/proc.txt
	arch/arm/mach-u300/include/mach/debug-macro.S
	drivers/net/qlge/qlge_ethtool.c
	drivers/net/qlge/qlge_main.c
	drivers/net/typhoon.c
2010-03-08 16:55:37 +01:00
Stephen Hemminger 634a539e16 selinux: const strings in tables
Several places strings tables are used that should be declared
const.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-03-08 09:33:53 +11:00
wzt.wzt@gmail.com 06b9b72df4 Selinux: Remove unused headers skbuff.h in selinux/nlmsgtab.c
skbuff.h is already included by netlink.h, so remove it.

Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-03-04 08:51:06 +11:00
wzt.wzt@gmail.com dbba541f9d Selinux: Remove unused headers slab.h in selinux/ss/symtab.c
slab.h is unused in symtab.c, so remove it.

Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-03-03 09:22:16 +11:00
wzt.wzt@gmail.com 31637b55b0 Selinux: Remove unused headers list.h in selinux/netlink.c
list.h is unused in netlink.c, so remove it.

Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-03-03 09:20:57 +11:00
James Morris b4ccebdd37 Merge branch 'next' into for-linus 2010-03-01 09:36:31 +11:00
David Howells ef57471a73 SELinux: Make selinux_kernel_create_files_as() shouldn't just always return 0
Make selinux_kernel_create_files_as() return an error when it gets one, rather
than unconditionally returning 0.

Without this, cachefiles doesn't return an error if the SELinux policy doesn't
let it create files with the label of the directory at the base of the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-02-26 14:54:23 +11:00
Joshua Roys c36f74e67f netlabel: fix export of SELinux categories > 127
This fixes corrupted CIPSO packets when SELinux categories greater than 127
are used.  The bug occured on the second (and later) loops through the
while; the inner for loop through the ebitmap->maps array used the same
index as the NetLabel catmap->bitmap array, even though the NetLabel bitmap
is twice as long as the SELinux bitmap.

Signed-off-by: Joshua Roys <joshua.roys@gtri.gatech.edu>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-02-25 17:49:20 +11:00
wzt.wzt@gmail.com 189b3b1c89 Security: add static to security_ops and default_security_ops variable
Enhance the security framework to support resetting the active security
module. This eliminates the need for direct use of the security_ops and
default_security_ops variables outside of security.c, so make security_ops
and default_security_ops static. Also remove the secondary_ops variable as
a cleanup since there is no use for that. secondary_ops was originally used by
SELinux to call the "secondary" security module (capability or dummy),
but that was replaced by direct calls to capability and the only
remaining use is to save and restore the original security ops pointer
value if SELinux is disabled by early userspace based on /etc/selinux/config.
Further, if we support this directly in the security framework, then we can
just use &default_security_ops for this purpose since that is now available.

Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-02-24 08:11:02 +11:00
KaiGai Kohei 2ae3ba3938 selinux: libsepol: remove dead code in check_avtab_hierarchy_callback()
This patch revert the commit of 7d52a155e3
which removed a part of type_attribute_bounds_av as a dead code.
However, at that time, we didn't find out the target side boundary allows
to handle some of pseudo /proc/<pid>/* entries with its process's security
context well.

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>

--
 security/selinux/ss/services.c |   43 ++++++++++++++++++++++++++++++++++++---
 1 files changed, 39 insertions(+), 4 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
2010-02-22 08:27:41 +11:00
James Morris 2da5d31bc7 security: fix a couple of sparse warnings
Fix a couple of sparse warnings for callers of
context_struct_to_string, which takes a *u32, not an *int.

These cases are harmless as the values are not used.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2010-02-16 17:29:06 +11:00
Xiaotian Feng 8007f10259 selinux: fix memory leak in sel_make_bools
In sel_make_bools, kernel allocates memory for bool_pending_names[i]
with security_get_bools. So if we just free bool_pending_names, those
memories for bool_pending_names[i] will be leaked.

This patch resolves dozens of following kmemleak report after resuming
from suspend:
unreferenced object 0xffff88022e4c7380 (size 32):
  comm "init", pid 1, jiffies 4294677173
  backtrace:
    [<ffffffff810f76b5>] create_object+0x1a2/0x2a9
    [<ffffffff810f78bb>] kmemleak_alloc+0x26/0x4b
    [<ffffffff810ef3eb>] __kmalloc+0x18f/0x1b8
    [<ffffffff811cd511>] security_get_bools+0xd7/0x16f
    [<ffffffff811c48c0>] sel_write_load+0x12e/0x62b
    [<ffffffff810f9a39>] vfs_write+0xae/0x10b
    [<ffffffff810f9b56>] sys_write+0x4a/0x6e
    [<ffffffff81011b82>] system_call_fastpath+0x16/0x1b
    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-02-09 08:22:24 +11:00
Justin P. Mattock 6382dc3340 fix comment typos in avc.c
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-02-05 12:22:35 +01:00
Kees Cook d78ca3cd73 syslog: use defined constants instead of raw numbers
Right now the syslog "type" action are just raw numbers which makes
the source difficult to follow.  This patch replaces the raw numbers
with defined constants for some level of sanity.

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-02-04 14:20:41 +11:00
Kees Cook 002345925e syslog: distinguish between /proc/kmsg and syscalls
This allows the LSM to distinguish between syslog functions originating
from /proc/kmsg access and direct syscalls.  By default, the commoncaps
will now no longer require CAP_SYS_ADMIN to read an opened /proc/kmsg
file descriptor.  For example the kernel syslog reader can now drop
privileges after opening /proc/kmsg, instead of staying privileged with
CAP_SYS_ADMIN.  MAC systems that implement security_syslog have unchanged
behavior.

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-02-04 14:20:12 +11:00
Guido Trentalancia 0719aaf5ea selinux: allow MLS->non-MLS and vice versa upon policy reload
Allow runtime switching between different policy types (e.g. from a MLS/MCS
policy to a non-MLS/non-MCS policy or viceversa).

Signed-off-by: Guido Trentalancia <guido@trentalancia.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-02-04 09:06:36 +11:00
Guido Trentalancia 42596eafdd selinux: load the initial SIDs upon every policy load
Always load the initial SIDs, even in the case of a policy
reload and not just at the initial policy load. This comes
particularly handy after the introduction of a recent
patch for enabling runtime switching between different
policy types, although this patch is in theory independent
from that feature.

Signed-off-by: Guido Trentalancia <guido@trentalancia.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-02-04 08:48:17 +11:00
Stephen Smalley b6cac5a30b selinux: Only audit permissions specified in policy
Only audit the permissions specified by the policy rules.

Before:
type=AVC msg=audit(01/28/2010 14:30:46.690:3250) : avc:  denied  { read
append } for  pid=14092 comm=foo name=test_file dev=dm-1 ino=132932
scontext=unconfined_u:unconfined_r:load_policy_t:s0-s0:c0.c1023
tcontext=unconfined_u:object_r:rpm_tmp_t:s0 tclass=file

After:
type=AVC msg=audit(01/28/2010 14:52:37.448:26) : avc:  denied
{ append } for  pid=1917 comm=foo name=test_file dev=dm-1 ino=132932
scontext=unconfined_u:unconfined_r:load_policy_t:s0-s0:c0.c1023
tcontext=unconfined_u:object_r:rpm_tmp_t:s0 tclass=file

Reference:
https://bugzilla.redhat.com/show_bug.cgi?id=558499

Reported-by: Tom London <selinux@gmail.com>
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-02-03 08:49:10 +11:00
KaiGai Kohei 7d52a155e3 selinux: remove dead code in type_attribute_bounds_av()
This patch removes dead code in type_attribute_bounds_av().

Due to the historical reason, the type boundary feature is delivered
from hierarchical types in libsepol, it has supported boundary features
both of subject type (domain; in most cases) and target type.

However, we don't have any actual use cases in bounded target types,
and it tended to make conceptual confusion.
So, this patch removes the dead code to apply boundary checks on the
target types. I makes clear the TYPEBOUNDS restricts privileges of
a certain domain bounded to any other domain.

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>

--
 security/selinux/ss/services.c |   43 +++------------------------------------
 1 files changed, 4 insertions(+), 39 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
2010-01-25 08:31:38 +11:00
Stephen Smalley 2f3e82d694 selinux: convert range transition list to a hashtab
Per https://bugzilla.redhat.com/show_bug.cgi?id=548145
there are sufficient range transition rules in modern (Fedora) policy to
make mls_compute_sid a significant factor on the shmem file setup path
due to the length of the range_tr list.  Replace the simple range_tr
list with a hashtab inside the security server to help mitigate this
problem.

Signed-off-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2010-01-25 08:29:05 +11:00
James Morris 2457552d1e Merge branch 'master' into next 2010-01-18 09:56:22 +11:00
Stephen Smalley 19439d05b8 selinux: change the handling of unknown classes
If allow_unknown==deny, SELinux treats an undefined kernel security
class as an error condition rather than as a typical permission denial
and thus does not allow permissions on undefined classes even when in
permissive mode.  Change the SELinux logic so that this case is handled
as a typical permission denial, subject to the usual permissive mode and
permissive domain handling.

Also drop the 'requested' argument from security_compute_av() and
helpers as it is a legacy of the original security server interface and
is unused.

Changes:
- Handle permissive domains consistently by moving up the test for a
permissive domain.
- Make security_compute_av_user() consistent with security_compute_av();
the only difference now is that security_compute_av() performs mapping
between the kernel-private class and permission indices and the policy
values.  In the userspace case, this mapping is handled by libselinux.
- Moved avd_init inside the policy lock.

Based in part on a patch by Paul Moore <paul.moore@hp.com>.

Reported-by: Andrew Worsley <amworsley@gmail.com>
Signed-off-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-01-18 09:54:26 +11:00
Jiri Slaby 17740d8978 SECURITY: selinux, fix update_rlimit_cpu parameter
Don't pass current RLIMIT_RTTIME to update_rlimit_cpu() in
selinux_bprm_committing_creds, since update_rlimit_cpu expects
RLIMIT_CPU limit.

Use proper rlim[RLIMIT_CPU].rlim_cur instead to fix that.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: David Howells <dhowells@redhat.com>
2010-01-04 11:27:18 +01:00
Linus Torvalds 4ef58d4e2a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
  tree-wide: fix misspelling of "definition" in comments
  reiserfs: fix misspelling of "journaled"
  doc: Fix a typo in slub.txt.
  inotify: remove superfluous return code check
  hdlc: spelling fix in find_pvc() comment
  doc: fix regulator docs cut-and-pasteism
  mtd: Fix comment in Kconfig
  doc: Fix IRQ chip docs
  tree-wide: fix assorted typos all over the place
  drivers/ata/libata-sff.c: comment spelling fixes
  fix typos/grammos in Documentation/edac.txt
  sysctl: add missing comments
  fs/debugfs/inode.c: fix comment typos
  sgivwfb: Make use of ARRAY_SIZE.
  sky2: fix sky2_link_down copy/paste comment error
  tree-wide: fix typos "couter" -> "counter"
  tree-wide: fix typos "offest" -> "offset"
  fix kerneldoc for set_irq_msi()
  spidev: fix double "of of" in comment
  comment typo fix: sybsystem -> subsystem
  ...
2009-12-09 19:43:33 -08:00
James Morris 1ad1f10cd9 Merge branch 'master' into next 2009-12-09 19:01:03 +11:00
Amerigo Wang 08e3daff21 selinux: remove a useless return
The last return is unreachable, remove the 'return'
in default, let it fall through.

Signed-off-by: WANG Cong <amwang@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-12-08 14:58:11 +11:00
Julia Lawall 9f59f90bf5 security/selinux/ss: correct size computation
The size argument to kcalloc should be the size of desired structure,
not the pointer to it.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@expression@
expression *x;
@@

x =
 <+...
-sizeof(x)
+sizeof(*x)
...+>// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-12-08 14:57:54 +11:00
Jiri Kosina d014d04386 Merge branch 'for-next' into for-linus
Conflicts:

	kernel/irq/chip.c
2009-12-07 18:36:35 +01:00
David S. Miller 28b4d5cc17 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	drivers/net/pcmcia/fmvj18x_cs.c
	drivers/net/pcmcia/nmclan_cs.c
	drivers/net/pcmcia/xirc2ps_cs.c
	drivers/net/wireless/ray_cs.c
2009-12-05 15:22:26 -08:00
André Goddard Rosa af901ca181 tree-wide: fix assorted typos all over the place
That is "success", "unknown", "through", "performance", "[re|un]mapping"
, "access", "default", "reasonable", "[con]currently", "temperature"
, "channel", "[un]used", "application", "example","hierarchy", "therefore"
, "[over|under]flow", "contiguous", "threshold", "enough" and others.

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 15:39:55 +01:00
Eric Paris 0bce952799 SELinux: print denials for buggy kernel with unknown perms
Historically we've seen cases where permissions are requested for classes
where they do not exist.  In particular we have seen CIFS forget to set
i_mode to indicate it is a directory so when we later check something like
remove_name we have problems since it wasn't defined in tclass file.  This
used to result in a avc which included the permission 0x2000 or something.
Currently the kernel will deny the operations (good thing) but will not
print ANY information (bad thing).  First the auditdeny field is no
extended to include unknown permissions.  After that is fixed the logic in
avc_dump_query to output this information isn't right since it will remove
the permission from the av and print the phrase "<NULL>".  This takes us
back to the behavior before the classmap rewrite.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-11-24 14:30:49 +11:00
Eric Dumazet 8964be4a9a net: rename skb->iif to skb->skb_iif
To help grep games, rename iif to skb_iif

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-20 15:35:04 -08:00
Eric Paris dd8dbf2e68 security: report the module name to security_module_request
For SELinux to do better filtering in userspace we send the name of the
module along with the AVC denial when a program is denied module_request.

Example output:

type=SYSCALL msg=audit(11/03/2009 10:59:43.510:9) : arch=x86_64 syscall=write success=yes exit=2 a0=3 a1=7fc28c0d56c0 a2=2 a3=7fffca0d7440 items=0 ppid=1727 pid=1729 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=rpc.nfsd exe=/usr/sbin/rpc.nfsd subj=system_u:system_r:nfsd_t:s0 key=(null)
type=AVC msg=audit(11/03/2009 10:59:43.510:9) : avc:  denied  { module_request } for  pid=1729 comm=rpc.nfsd kmod="net-pf-10" scontext=system_u:system_r:nfsd_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=system

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-11-10 09:33:46 +11:00
Eric Paris 6e8e16c7bc SELinux: add .gitignore files for dynamic classes
The SELinux dynamic class work in c6d3aaa4e3
creates a number of dynamic header files and scripts.  Add .gitignore files
so git doesn't complain about these.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-10-24 09:42:27 +08:00
Stephen Smalley b7f3008ad1 SELinux: fix locking issue introduced with c6d3aaa4e3
Ensure that we release the policy read lock on all exit paths from
security_compute_av.

Signed-off-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-10-20 09:22:07 +09:00
Stephen Smalley 941fc5b2bf selinux: drop remapping of netlink classes
Drop remapping of netlink classes and bypass of permission checking
based on netlink message type for policy version < 18.  This removes
compatibility code introduced when the original single netlink
security class used for all netlink sockets was split into
finer-grained netlink classes based on netlink protocol and when
permission checking was added based on netlink message type in Linux
2.6.8.  The only known distribution that shipped with SELinux and
policy < 18 was Fedora Core 2, which was EOL'd on 2005-04-11.

Given that the remapping code was never updated to address the
addition of newer netlink classes, that the corresponding userland
support was dropped in 2005, and that the assumptions made by the
remapping code about the fixed ordering among netlink classes in the
policy may be violated in the future due to the dynamic class/perm
discovery support, we should drop this compatibility code now.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-10-07 21:56:46 +11:00
Stephen Smalley 8753f6bec3 selinux: generate flask headers during kernel build
Add a simple utility (scripts/selinux/genheaders) and invoke it to
generate the kernel-private class and permission indices in flask.h
and av_permissions.h automatically during the kernel build from the
security class mapping definitions in classmap.h.  Adding new kernel
classes and permissions can then be done just by adding them to classmap.h.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-10-07 21:56:44 +11:00
Stephen Smalley c6d3aaa4e3 selinux: dynamic class/perm discovery
Modify SELinux to dynamically discover class and permission values
upon policy load, based on the dynamic object class/perm discovery
logic from libselinux.  A mapping is created between kernel-private
class and permission indices used outside the security server and the
policy values used within the security server.

The mappings are only applied upon kernel-internal computations;
similar mappings for the private indices of userspace object managers
is handled on a per-object manager basis by the userspace AVC.  The
interfaces for compute_av and transition_sid are split for kernel
vs. userspace; the userspace functions are distinguished by a _user
suffix.

The kernel-private class indices are no longer tied to the policy
values and thus do not need to skip indices for userspace classes;
thus the kernel class index values are compressed.  The flask.h
definitions were regenerated by deleting the userspace classes from
refpolicy's definitions and then regenerating the headers.  Going
forward, we can just maintain the flask.h, av_permissions.h, and
classmap.h definitions separately from policy as they are no longer
tied to the policy values.  The next patch introduces a utility to
automate generation of flask.h and av_permissions.h from the
classmap.h definitions.

The older kernel class and permission string tables are removed and
replaced by a single security class mapping table that is walked at
policy load to generate the mapping.  The old kernel class validation
logic is completely replaced by the mapping logic.

The handle unknown logic is reworked.  reject_unknown=1 is handled
when the mappings are computed at policy load time, similar to the old
handling by the class validation logic.  allow_unknown=1 is handled
when computing and mapping decisions - if the permission was not able
to be mapped (i.e. undefined, mapped to zero), then it is
automatically added to the allowed vector.  If the class was not able
to be mapped (i.e. undefined, mapped to zero), then all permissions
are allowed for it if allow_unknown=1.

avc_audit leverages the new security class mapping table to lookup the
class and permission names from the kernel-private indices.

The mdp program is updated to use the new table when generating the
class definitions and allow rules for a minimal boot policy for the
kernel.  It should be noted that this policy will not include any
userspace classes, nor will its policy index values for the kernel
classes correspond with the ones in refpolicy (they will instead match
the kernel-private indices).

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-10-07 21:56:42 +11:00
Eric Paris af8ff04917 SELinux: reset the security_ops before flushing the avc cache
This patch resets the security_ops to the secondary_ops before it flushes
the avc.  It's still possible that a task on another processor could have
already passed the security_ops dereference and be executing an selinux hook
function which would add a new avc entry.  That entry would still not be
freed.  This should however help to reduce the number of needless avcs the
kernel has when selinux is disabled at run time.  There is no wasted
memory if selinux is disabled on the command line or not compiled.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-30 19:17:06 +10:00
Oleg Nesterov 0b7570e77f do_wait() wakeup optimization: change __wake_up_parent() to use filtered wakeup
Ratan Nalumasu reported that in a process with many threads doing
unnecessary wakeups.  Every waiting thread in the process wakes up to loop
through the children and see that the only ones it cares about are still
not ready.

Now that we have struct wait_opts we can change do_wait/__wake_up_parent
to use filtered wakeups.

We can make child_wait_callback() more clever later, right now it only
checks eligible_child().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ratan Nalumasu <rnalumasu@gmail.com>
Cc: Vitaly Mayatskikh <vmayatsk@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Tested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24 07:20:59 -07:00
Eric Paris 5224ee0863 SELinux: do not destroy the avc_cache_nodep
The security_ops reset done when SELinux is disabled at run time is done
after the avc cache is freed and after the kmem_cache for the avc is also
freed.  This means that between the time the selinux disable code destroys
the avc_node_cachep another process could make a security request and could
try to allocate from the cache.  We are just going to leave the cachep around,
like we always have.

SELinux:  Disabled at runtime.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81122537>] kmem_cache_alloc+0x9a/0x185
PGD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file:
CPU 1
Modules linked in:
Pid: 12, comm: khelper Not tainted 2.6.31-tip-05525-g0eeacc6-dirty #14819
System Product Name
RIP: 0010:[<ffffffff81122537>]  [<ffffffff81122537>]
kmem_cache_alloc+0x9a/0x185
RSP: 0018:ffff88003f9258b0  EFLAGS: 00010086
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000078c0129e
RDX: 0000000000000000 RSI: ffffffff8130b626 RDI: ffffffff81122528
RBP: ffff88003f925900 R08: 0000000078c0129e R09: 0000000000000001
R10: 0000000000000000 R11: 0000000078c0129e R12: 0000000000000246
R13: 0000000000008020 R14: ffff88003f8586d8 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff880002b00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: ffffffff827bd420 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process khelper (pid: 12, threadinfo ffff88003f924000, task
ffff88003f928000)
Stack:
 0000000000000246 0000802000000246 ffffffff8130b626 0000000000000001
<0> 0000000078c0129e 0000000000000000 ffff88003f925a70 0000000000000002
<0> 0000000000000001 0000000000000001 ffff88003f925960 ffffffff8130b626
Call Trace:
 [<ffffffff8130b626>] ? avc_alloc_node+0x36/0x273
 [<ffffffff8130b626>] avc_alloc_node+0x36/0x273
 [<ffffffff8130b545>] ? avc_latest_notif_update+0x7d/0x9e
 [<ffffffff8130b8b4>] avc_insert+0x51/0x18d
 [<ffffffff8130bcce>] avc_has_perm_noaudit+0x9d/0x128
 [<ffffffff8130bf20>] avc_has_perm+0x45/0x88
 [<ffffffff8130f99d>] current_has_perm+0x52/0x6d
 [<ffffffff8130fbb2>] selinux_task_create+0x2f/0x45
 [<ffffffff81303bf7>] security_task_create+0x29/0x3f
 [<ffffffff8105c6ba>] copy_process+0x82/0xdf0
 [<ffffffff81091578>] ? register_lock_class+0x2f/0x36c
 [<ffffffff81091a13>] ? mark_lock+0x2e/0x1e1
 [<ffffffff8105d596>] do_fork+0x16e/0x382
 [<ffffffff81091578>] ? register_lock_class+0x2f/0x36c
 [<ffffffff810d9166>] ? probe_workqueue_execution+0x57/0xf9
 [<ffffffff81091a13>] ? mark_lock+0x2e/0x1e1
 [<ffffffff810d9166>] ? probe_workqueue_execution+0x57/0xf9
 [<ffffffff8100cdb2>] kernel_thread+0x82/0xe0
 [<ffffffff81078b1f>] ? ____call_usermodehelper+0x0/0x139
 [<ffffffff8100ce10>] ? child_rip+0x0/0x20
 [<ffffffff81078aea>] ? __call_usermodehelper+0x65/0x9a
 [<ffffffff8107a5c7>] run_workqueue+0x171/0x27e
 [<ffffffff8107a573>] ? run_workqueue+0x11d/0x27e
 [<ffffffff81078a85>] ? __call_usermodehelper+0x0/0x9a
 [<ffffffff8107a7bc>] worker_thread+0xe8/0x10f
 [<ffffffff810808e2>] ? autoremove_wake_function+0x0/0x63
 [<ffffffff8107a6d4>] ? worker_thread+0x0/0x10f
 [<ffffffff8108042e>] kthread+0x91/0x99
 [<ffffffff8100ce1a>] child_rip+0xa/0x20
 [<ffffffff8100c754>] ? restore_args+0x0/0x30
 [<ffffffff8108039d>] ? kthread+0x0/0x99
 [<ffffffff8100ce10>] ? child_rip+0x0/0x20
Code: 0f 85 99 00 00 00 9c 58 66 66 90 66 90 49 89 c4 fa 66 66 90 66 66 90
e8 83 34 fb ff e8 d7 e9 26 00 48 98 49 8b 94 c6 10 01 00 00 <48> 8b 1a 44
8b 7a 18 48 85 db 74 0f 8b 42 14 48 8b 04 c3 ff 42
RIP  [<ffffffff81122537>] kmem_cache_alloc+0x9a/0x185
 RSP <ffff88003f9258b0>
CR2: 0000000000000000
---[ end trace 42f41a982344e606 ]---

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-23 11:16:20 -07:00
Eric Paris 4e6d0bffd3 SELinux: flush the avc before disabling SELinux
Before SELinux is disabled at boot it can create AVC entries.  This patch
will flush those entries before disabling SELinux.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-14 12:34:11 +10:00
Eric Paris 008574b111 SELinux: seperate avc_cache flushing
Move the avc_cache flushing into it's own function so it can be reused when
disabling SELinux.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-14 12:34:09 +10:00
Eric Paris ed868a5698 Creds: creds->security can be NULL is selinux is disabled
__validate_process_creds should check if selinux is actually enabled before
running tests on the selinux portion of the credentials struct.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-14 12:34:07 +10:00
David P. Quigley ddd29ec659 sysfs: Add labeling support for sysfs
This patch adds a setxattr handler to the file, directory, and symlink
inode_operations structures for sysfs. The patch uses hooks introduced in the
previous patch to handle the getting and setting of security information for
the sysfs inodes. As was suggested by Eric Biederman the struct iattr in the
sysfs_dirent structure has been replaced by a structure which contains the
iattr, secdata and secdata length to allow the changes to persist in the event
that the inode representing the sysfs_dirent is evicted. Because sysfs only
stores this information when a change is made all the optional data is moved
into one dynamically allocated field.

This patch addresses an issue where SELinux was denying virtd access to the PCI
configuration entries in sysfs. The lack of setxattr handlers for sysfs
required that a single label be assigned to all entries in sysfs. Granting virtd
access to every entry in sysfs is not an acceptable solution so fine grained
labeling of sysfs is required such that individual entries can be labeled
appropriately.

[sds:  Fixed compile-time warnings, coding style, and setting of inode security init flags.]

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-10 10:11:29 +10:00
David P. Quigley 1ee65e37e9 LSM/SELinux: inode_{get,set,notify}secctx hooks to access LSM security context information.
This patch introduces three new hooks. The inode_getsecctx hook is used to get
all relevant information from an LSM about an inode. The inode_setsecctx is
used to set both the in-core and on-disk state for the inode based on a context
derived from inode_getsecctx.The final hook inode_notifysecctx will notify the
LSM of a change for the in-core state of the inode in question. These hooks are
for use in the labeled NFS code and addresses concerns of how to set security
on an inode in a multi-xattr LSM. For historical reasons Stephen Smalley's
explanation of the reason for these hooks is pasted below.

Quote Stephen Smalley

inode_setsecctx:  Change the security context of an inode.  Updates the
in core security context managed by the security module and invokes the
fs code as needed (via __vfs_setxattr_noperm) to update any backing
xattrs that represent the context.  Example usage:  NFS server invokes
this hook to change the security context in its incore inode and on the
backing file system to a value provided by the client on a SETATTR
operation.

inode_notifysecctx:  Notify the security module of what the security
context of an inode should be.  Initializes the incore security context
managed by the security module for this inode.  Example usage:  NFS
client invokes this hook to initialize the security context in its
incore inode to the value provided by the server for the file when the
server returned the file's attributes to the client.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-10 10:11:24 +10:00
David Howells ee18d64c1f KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]
Add a keyctl to install a process's session keyring onto its parent.  This
replaces the parent's session keyring.  Because the COW credential code does
not permit one process to change another process's credentials directly, the
change is deferred until userspace next starts executing again.  Normally this
will be after a wait*() syscall.

To support this, three new security hooks have been provided:
cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
the blank security creds and key_session_to_parent() - which asks the LSM if
the process may replace its parent's session keyring.

The replacement may only happen if the process has the same ownership details
as its parent, and the process has LINK permission on the session keyring, and
the session keyring is owned by the process, and the LSM permits it.

Note that this requires alteration to each architecture's notify_resume path.
This has been done for all arches barring blackfin, m68k* and xtensa, all of
which need assembly alteration to support TIF_NOTIFY_RESUME.  This allows the
replacement to be performed at the point the parent process resumes userspace
execution.

This allows the userspace AFS pioctl emulation to fully emulate newpag() and
the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
alter the parent process's PAG membership.  However, since kAFS doesn't use
PAGs per se, but rather dumps the keys into the session keyring, the session
keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
the newpag flag.

This can be tested with the following program:

	#include <stdio.h>
	#include <stdlib.h>
	#include <keyutils.h>

	#define KEYCTL_SESSION_TO_PARENT	18

	#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)

	int main(int argc, char **argv)
	{
		key_serial_t keyring, key;
		long ret;

		keyring = keyctl_join_session_keyring(argv[1]);
		OSERROR(keyring, "keyctl_join_session_keyring");

		key = add_key("user", "a", "b", 1, keyring);
		OSERROR(key, "add_key");

		ret = keyctl(KEYCTL_SESSION_TO_PARENT);
		OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");

		return 0;
	}

Compiled and linked with -lkeyutils, you should see something like:

	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: _ses
	355907932 --alswrv   4043    -1   \_ keyring: _uid.4043
	[dhowells@andromeda ~]$ /tmp/newpag
	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: _ses
	1055658746 --alswrv   4043  4043   \_ user: a
	[dhowells@andromeda ~]$ /tmp/newpag hello
	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: hello
	340417692 --alswrv   4043  4043   \_ user: a

Where the test program creates a new session keyring, sticks a user key named
'a' into it and then installs it on its parent.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:22 +10:00
David Howells e0e817392b CRED: Add some configurable debugging [try #6]
Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking
for credential management.  The additional code keeps track of the number of
pointers from task_structs to any given cred struct, and checks to see that
this number never exceeds the usage count of the cred struct (which includes
all references, not just those from task_structs).

Furthermore, if SELinux is enabled, the code also checks that the security
pointer in the cred struct is never seen to be invalid.

This attempts to catch the bug whereby inode_has_perm() faults in an nfsd
kernel thread on seeing cred->security be a NULL pointer (it appears that the
credential struct has been previously released):

	http://www.kerneloops.org/oops.php?number=252883

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:01 +10:00
Paul Moore ed6d76e4c3 selinux: Support for the new TUN LSM hooks
Add support for the new TUN LSM hooks: security_tun_dev_create(),
security_tun_dev_post_create() and security_tun_dev_attach().  This includes
the addition of a new object class, tun_socket, which represents the socks
associated with TUN devices.  The _tun_dev_create() and _tun_dev_post_create()
hooks are fairly similar to the standard socket functions but _tun_dev_attach()
is a bit special.  The _tun_dev_attach() is unique because it involves a
domain attaching to an existing TUN device and its associated tun_socket
object, an operation which does not exist with standard sockets and most
closely resembles a relabel operation.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@parisplace.org>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-01 08:29:52 +10:00
Amerigo Wang bc6a6008e5 selinux: adjust rules for ATTR_FORCE
As suggested by OGAWA Hirofumi in thread:
http://lkml.org/lkml/2009/8/7/132, we should let selinux_inode_setattr()
to match our ATTR_* rules.  ATTR_FORCE should not force things like
ATTR_SIZE.

[hirofumi@mail.parknet.co.jp: tweaks]
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: Eugene Teo <eteo@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-21 14:25:30 +10:00
James Morris ece13879e7 Merge branch 'master' into next
Conflicts:
	security/Kconfig

Manual fix.

Signed-off-by: James Morris <jmorris@namei.org>
2009-08-20 09:18:42 +10:00
Eric Paris 788084aba2 Security/SELinux: seperate lsm specific mmap_min_addr
Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable.  This patch causes SELinux to
ignore the tunable and instead use a seperate Kconfig option specific to how
much space the LSM should protect.

The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.

This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-17 15:09:11 +10:00
Eric Paris 8cf948e744 SELinux: call cap_file_mmap in selinux_file_mmap
Currently SELinux does not check CAP_SYS_RAWIO in the file_mmap hook.  This
means there is no DAC check on the ability to mmap low addresses in the
memory space.  This function adds the DAC check for CAP_SYS_RAWIO while
maintaining the selinux check on mmap_zero.  This means that processes
which need to mmap low memory will need CAP_SYS_RAWIO and mmap_zero but will
NOT need the SELinux sys_rawio capability.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-17 15:08:48 +10:00
Thomas Liu 2bf4969032 SELinux: Convert avc_audit to use lsm_audit.h
Convert avc_audit in security/selinux/avc.c to use lsm_audit.h,
for better maintainability.

 - changed selinux to use common_audit_data instead of
    avc_audit_data
 - eliminated code in avc.c and used code from lsm_audit.h instead.

Had to add a LSM_AUDIT_NO_AUDIT to lsm_audit.h so that avc_audit
can call common_lsm_audit and do the pre and post callbacks without
doing the actual dump.  This makes it so that the patched version
behaves the same way as the unpatched version.

Also added a denied field to the selinux_audit_data private space,
once again to make it so that the patched version behaves like the
unpatched.

I've tested and confirmed that AVCs look the same before and after
this patch.

Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-17 08:37:18 +10:00
Eric Paris 25354c4fee SELinux: add selinux_kernel_module_request
This patch adds a new selinux hook so SELinux can arbitrate if a given
process should be allowed to trigger a request for the kernel to try to
load a module.  This is a different operation than a process trying to load
a module itself, which is already protected by CAP_SYS_MODULE.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-14 11:18:40 +10:00
James Morris 314dabb83a SELinux: fix memory leakage in /security/selinux/hooks.c
Fix memory leakage in /security/selinux/hooks.c

The buffer always needs to be freed here; we either error
out or allocate more memory.

Reported-by: iceberg <strakh@ispras.ru>
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
2009-08-11 08:37:13 +10:00
Eric Paris a2551df7ec Security/SELinux: seperate lsm specific mmap_min_addr
Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable.  This patch causes SELinux to
ignore the tunable and instead use a seperate Kconfig option specific to how
much space the LSM should protect.

The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.

This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-06 09:02:23 +10:00
Eric Paris 84336d1a77 SELinux: call cap_file_mmap in selinux_file_mmap
Currently SELinux does not check CAP_SYS_RAWIO in the file_mmap hook.  This
means there is no DAC check on the ability to mmap low addresses in the
memory space.  This function adds the DAC check for CAP_SYS_RAWIO while
maintaining the selinux check on mmap_zero.  This means that processes
which need to mmap low memory will need CAP_SYS_RAWIO and mmap_zero but will
NOT need the SELinux sys_rawio capability.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-06 09:02:21 +10:00
Oleg Nesterov 5bb459bb45 kernel: rename is_single_threaded(task) to current_is_single_threaded(void)
- is_single_threaded(task) is not safe unless task == current,
  we can't use task->signal or task->mm.

- it doesn't make sense unless task == current, the task can
  fork right after the check.

Rename it to current_is_single_threaded() and kill the argument.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-07-17 09:10:42 +10:00
James Morris be940d6279 Revert "SELinux: Convert avc_audit to use lsm_audit.h"
This reverts commit 8113a8d80f.

The patch causes a stack overflow on my system during boot.

Signed-off-by: James Morris <jmorris@namei.org>
2009-07-13 10:39:36 +10:00
Thomas Liu 8113a8d80f SELinux: Convert avc_audit to use lsm_audit.h
Convert avc_audit in security/selinux/avc.c to use lsm_audit.h,
for better maintainability and for less code duplication.

 - changed selinux to use common_audit_data instead of
   avc_audit_data
 - eliminated code in avc.c and used code from lsm_audit.h instead.

I have tested to make sure that the avcs look the same before and
after this patch.

Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-07-13 07:54:48 +10:00
Thomas Liu 89c86576ec selinux: clean up avc node cache when disabling selinux
Added a call to free the avc_node_cache when inside selinux_disable because
it should not waste resources allocated during avc_init if SELinux is disabled
and the cache will never be used.

Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-25 08:29:16 +10:00
Ingo Molnar 9e48858f7d security: rename ptrace_may_access => ptrace_access_check
The ->ptrace_may_access() methods are named confusingly - the real
ptrace_may_access() returns a bool, while these security checks have
a retval convention.

Rename it to ptrace_access_check, to reduce the confusion factor.

[ Impact: cleanup, no code changed ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-25 00:18:05 +10:00
Stephen Smalley 20dda18be9 selinux: restore optimization to selinux_file_permission
Restore the optimization to skip revalidation in selinux_file_permission
if nothing has changed since the dentry_open checks, accidentally removed by
389fb800.  Also remove redundant test from selinux_revalidate_file_permission.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-23 08:19:58 +10:00
James Morris d905163c5b Merge branch 'master' into next 2009-06-19 08:20:55 +10:00
KaiGai Kohei 44c2d9bdd7 Add audit messages on type boundary violations
The attached patch adds support to generate audit messages on two cases.

The first one is a case when a multi-thread process tries to switch its
performing security context using setcon(3), but new security context is
not bounded by the old one.

  type=SELINUX_ERR msg=audit(1245311998.599:17):        \
      op=security_bounded_transition result=denied      \
      oldcontext=system_u:system_r:httpd_t:s0           \
      newcontext=system_u:system_r:guest_webapp_t:s0

The other one is a case when security_compute_av() masked any permissions
due to the type boundary violation.

  type=SELINUX_ERR msg=audit(1245312836.035:32):	\
      op=security_compute_av reason=bounds              \
      scontext=system_u:object_r:user_webapp_t:s0       \
      tcontext=system_u:object_r:shadow_t:s0:c0         \
      tclass=file perms=getattr,open

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-19 00:12:28 +10:00
KaiGai Kohei caabbdc07d cleanup in ss/services.c
It is a cleanup patch to cut down a line within 80 columns.

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
--
 security/selinux/ss/services.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-18 21:53:44 +10:00
David S. Miller 9cbc1cb8cd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:
	Documentation/feature-removal-schedule.txt
	drivers/scsi/fcoe/fcoe.c
	net/core/drop_monitor.c
	net/core/net-traces.c
2009-06-15 03:02:23 -07:00
Eric Dumazet adf30907d6 net: skb->dst accessors
Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-03 02:51:04 -07:00
Eric Paris 850b0cee16 SELinux: define audit permissions for audit tree netlink messages
Audit trees defined 2 new netlink messages but the netlink mapping tables for
selinux permissions were not set up.  This patch maps these 2 new operations
to AUDIT_WRITE.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-03 07:44:53 +10:00
Stephen Smalley c5642f4bba selinux: remove obsolete read buffer limit from sel_read_bool
On Tue, 2009-05-19 at 00:05 -0400, Eamon Walsh wrote:
> Recent versions of coreutils have bumped the read buffer size from 4K to
> 32K in several of the utilities.
>
> This means that "cat /selinux/booleans/xserver_object_manager" no longer
> works, it returns "Invalid argument" on F11.  getsebool works fine.
>
> sel_read_bool has a check for "count > PAGE_SIZE" that doesn't seem to
> be present in the other read functions.  Maybe it could be removed?

Yes, that check is obsoleted by the conversion of those functions to
using simple_read_from_buffer(), which will reduce count if necessary to
what is available in the buffer.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-05-19 23:56:11 +10:00
Eric Paris 75834fc3b6 SELinux: move SELINUX_MAGIC into magic.h
The selinuxfs superblock magic is used inside the IMA code, but is being
defined in two places and could someday get out of sync.  This patch moves the
declaration into magic.h so it is only done once.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-05-19 08:19:00 +10:00
James Morris d254117099 Merge branch 'master' into next 2009-05-08 17:56:47 +10:00
Stephen Smalley 65c90bca0d selinux: Fix send_sigiotask hook
The CRED patch incorrectly converted the SELinux send_sigiotask hook to
use the current task SID rather than the target task SID in its
permission check, yielding the wrong permission check.  This fixes the
hook function.  Detected by the ltp selinux testsuite and confirmed to
correct the test failure.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-05-05 08:31:03 +10:00
Oleg Nesterov ecd6de3c88 selinux: selinux_bprm_committed_creds() should wake up ->real_parent, not ->parent.
We shouldn't worry about the tracer if current is ptraced, exec() must not
succeed if the tracer has no rights to trace this task after cred changing.
But we should notify ->real_parent which is, well, real parent.

Also, we don't need _irq to take tasklist, and we don't need parent's
->siglock to wake_up_interruptible(real_parent->signal->wait_chldexit).
Since we hold tasklist, real_parent->signal must be stable. Otherwise
spin_lock(siglock) is not safe too and can't help anyway.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-04-30 09:08:48 +10:00
David Howells 3bcac0263f SELinux: Don't flush inherited SIGKILL during execve()
Don't flush inherited SIGKILL during execve() in SELinux's post cred commit
hook.  This isn't really a security problem: if the SIGKILL came before the
credentials were changed, then we were right to receive it at the time, and
should honour it; if it came after the creds were changed, then we definitely
should honour it; and in any case, all that will happen is that the process
will be scrapped before it ever returns to userspace.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-04-30 09:07:13 +10:00
Eric Paris 88c48db978 SELinux: drop secondary_ops->sysctl
We are still calling secondary_ops->sysctl even though the capabilities
module does not define a sysctl operation.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-04-30 08:45:56 +10:00
KaiGai Kohei 8a6f83afd0 Permissive domain in userspace object manager
This patch enables applications to handle permissive domain correctly.

Since the v2.6.26 kernel, SELinux has supported an idea of permissive
domain which allows certain processes to work as if permissive mode,
even if the global setting is enforcing mode.
However, we don't have an application program interface to inform
what domains are permissive one, and what domains are not.
It means applications focuses on SELinux (XACE/SELinux, SE-PostgreSQL
and so on) cannot handle permissive domain correctly.

This patch add the sixth field (flags) on the reply of the /selinux/access
interface which is used to make an access control decision from userspace.
If the first bit of the flags field is positive, it means the required
access control decision is on permissive domain, so application should
allow any required actions, as the kernel doing.

This patch also has a side benefit. The av_decision.flags is set at
context_struct_compute_av(). It enables to check required permissions
without read_lock(&policy_rwlock).

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
--
 security/selinux/avc.c              |    2 +-
 security/selinux/include/security.h |    4 +++-
 security/selinux/selinuxfs.c        |    4 ++--
 security/selinux/ss/services.c      |   30 +++++-------------------------
 4 files changed, 11 insertions(+), 29 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
2009-04-02 09:23:45 +11:00
Paul Moore 58bfbb51ff selinux: Remove the "compat_net" compatibility code
The SELinux "compat_net" is marked as deprecated, the time has come to
finally remove it from the kernel.  Further code simplifications are
likely in the future, but this patch was intended to be a simple,
straight-up removal of the compat_net code.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-28 15:01:37 +11:00
Paul Moore 389fb800ac netlabel: Label incoming TCP connections correctly in SELinux
The current NetLabel/SELinux behavior for incoming TCP connections works but
only through a series of happy coincidences that rely on the limited nature of
standard CIPSO (only able to convey MLS attributes) and the write equality
imposed by the SELinux MLS constraints.  The problem is that network sockets
created as the result of an incoming TCP connection were not on-the-wire
labeled based on the security attributes of the parent socket but rather based
on the wire label of the remote peer.  The issue had to do with how IP options
were managed as part of the network stack and where the LSM hooks were in
relation to the code which set the IP options on these newly created child
sockets.  While NetLabel/SELinux did correctly set the socket's on-the-wire
label it was promptly cleared by the network stack and reset based on the IP
options of the remote peer.

This patch, in conjunction with a prior patch that adjusted the LSM hook
locations, works to set the correct on-the-wire label format for new incoming
connections through the security_inet_conn_request() hook.  Besides the
correct behavior there are many advantages to this change, the most significant
is that all of the NetLabel socket labeling code in SELinux now lives in hooks
which can return error codes to the core stack which allows us to finally get
ride of the selinux_netlbl_inode_permission() logic which greatly simplfies
the NetLabel/SELinux glue code.  In the process of developing this patch I
also ran into a small handful of AF_INET6 cleanliness issues that have been
fixed which should make the code safer and easier to extend in the future.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-28 15:01:36 +11:00
James Morris 703a3cd728 Merge branch 'master' into next 2009-03-24 10:52:46 +11:00
Eric Paris df7f54c012 SELinux: inode_doinit_with_dentry drop no dentry printk
Drop the printk message when an inode is found without an associated
dentry.  This should only happen when userspace can't be accessing those
inodes and those labels will get set correctly on the next d_instantiate.
Thus there is no reason to send this message.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-10 08:40:02 +11:00
Eric Paris dd34b5d75a SELinux: new permission between tty audit and audit socket
New selinux permission to separate the ability to turn on tty auditing from
the ability to set audit rules.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-06 08:50:21 +11:00
Eric Paris 6a25b27d60 SELinux: open perm for sock files
When I did open permissions I didn't think any sockets would have an open.
Turns out AF_UNIX sockets can have an open when they are bound to the
filesystem namespace.  This patch adds a new SOCK_FILE__OPEN permission.
It's safe to add this as the open perms are already predicated on
capabilities and capabilities means we have unknown perm handling so
systems should be as backwards compatible as the policy wants them to
be.

https://bugzilla.redhat.com/show_bug.cgi?id=475224

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-06 08:50:18 +11:00
Paul Moore d7f59dc464 selinux: Fix a panic in selinux_netlbl_inode_permission()
Rick McNeal from LSI identified a panic in selinux_netlbl_inode_permission()
caused by a certain sequence of SUNRPC operations.  The problem appears to be
due to the lack of NULL pointer checking in the function; this patch adds the
pointer checks so the function will exit safely in the cases where the socket
is not completely initialized.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-02 09:30:04 +11:00
Paul Moore 09c50b4a52 selinux: Fix the NetLabel glue code for setsockopt()
At some point we (okay, I) managed to break the ability for users to use the
setsockopt() syscall to set IPv4 options when NetLabel was not active on the
socket in question.  The problem was noticed by someone trying to use the
"-R" (record route) option of ping:

 # ping -R 10.0.0.1
 ping: record route: No message of desired type

The solution is relatively simple, we catch the unlabeled socket case and
clear the error code, allowing the operation to succeed.  Please note that we
still deny users the ability to override IPv4 options on socket's which have
NetLabel labeling active; this is done to ensure the labeling remains intact.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-23 10:05:55 +11:00
Eric Paris 26036651c5 SELinux: convert the avc cache hash list to an hlist
We do not need O(1) access to the tail of the avc cache lists and so we are
wasting lots of space using struct list_head instead of struct hlist_head.
This patch converts the avc cache to use hlists in which there is a single
pointer from the head which saves us about 4k of global memory.

Resulted in about a 1.5% decrease in time spent in avc_has_perm_noaudit based
on oprofile sampling of tbench.  Although likely within the noise....

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:23:48 +11:00
Eric Paris edf3d1aecd SELinux: code readability with avc_cache
The code making use of struct avc_cache was not easy to read thanks to liberal
use of &avc_cache.{slots_lock,slots}[hvalue] throughout.  This patch simply
creates local pointers and uses those instead of the long global names.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:23:45 +11:00
Eric Paris f1c6381a6e SELinux: remove unused av.decided field
It appears there was an intention to have the security server only decide
certain permissions and leave other for later as some sort of a portential
performance win.  We are currently always deciding all 32 bits of
permissions and this is a useless couple of branches and wasted space.
This patch completely drops the av.decided concept.

This in a 17% reduction in the time spent in avc_has_perm_noaudit
based on oprofile sampling of a tbench benchmark.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:23:08 +11:00
Eric Paris 21193dcd1f SELinux: more careful use of avd in avc_has_perm_noaudit
we are often needlessly jumping through hoops when it comes to avd
entries in avc_has_perm_noaudit and we have extra initialization and memcpy
which are just wasting performance.  Try to clean the function up a bit.

This patch resulted in a 13% drop in time spent in avc_has_perm_noaudit in my
oprofile sampling of a tbench benchmark.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:23:04 +11:00
Eric Paris 906d27d9d2 SELinux: remove the unused ae.used
Currently SELinux code has an atomic which was intended to track how many
times an avc entry was used and to evict entries when they haven't been
used recently.  Instead we never let this atomic get above 1 and evict when
it is first checked for eviction since it hits zero.  This is a total waste
of time so I'm completely dropping ae.used.

This change resulted in about a 3% faster avc_has_perm_noaudit when running
oprofile against a tbench benchmark.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:22:37 +11:00
Eric Paris a5dda68332 SELinux: check seqno when updating an avc_node
The avc update node callbacks do not check the seqno of the caller with the
seqno of the node found.  It is possible that a policy change could happen
(although almost impossibly unlikely) in which a permissive or
permissive_domain decision is not valid for the entry found.  Simply pass
and check that the seqno of the caller and the seqno of the node found
match.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:22:34 +11:00
Eric Paris 4cb912f1d1 SELinux: NULL terminate al contexts from disk
When a context is pulled in from disk we don't know that it is null
terminated.  This patch forecebly null terminates contexts when we pull
them from disk.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:22:30 +11:00
Eric Paris 4ba0a8ad63 SELinux: better printk when file with invalid label found
Currently when an inode is read into the kernel with an invalid label
string (can often happen with removable media) we output a string like:

SELinux: inode_doinit_with_dentry:  context_to_sid([SOME INVALID LABEL])
returned -22 dor dev=[blah] ino=[blah]

Which is all but incomprehensible to all but a couple of us.  Instead, on
EINVAL only, I plan to output a much more user friendly string and I plan to
ratelimit the printk since many of these could be generated very rapidly.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:22:27 +11:00
Eric Paris 200ac532a4 SELinux: call capabilities code directory
For cleanliness and efficiency remove all calls to secondary-> and instead
call capabilities code directly.  capabilities are the only module that
selinux stacks with and so the code should not indicate that other stacking
might be possible.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:22:24 +11:00
James Morris cb5629b10d Merge branch 'master' into next
Conflicts:
	fs/namei.c

Manually merged per:

diff --cc fs/namei.c
index 734f2b5,bbc15c2..0000000
--- a/fs/namei.c
+++ b/fs/namei.c
@@@ -860,9 -848,8 +849,10 @@@ static int __link_path_walk(const char
  		nd->flags |= LOOKUP_CONTINUE;
  		err = exec_permission_lite(inode);
  		if (err == -EAGAIN)
- 			err = vfs_permission(nd, MAY_EXEC);
+ 			err = inode_permission(nd->path.dentry->d_inode,
+ 					       MAY_EXEC);
 +		if (!err)
 +			err = ima_path_check(&nd->path, MAY_EXEC);
   		if (err)
  			break;

@@@ -1525,14 -1506,9 +1509,14 @@@ int may_open(struct path *path, int acc
  		flag &= ~O_TRUNC;
  	}

- 	error = vfs_permission(nd, acc_mode);
+ 	error = inode_permission(inode, acc_mode);
  	if (error)
  		return error;
 +
- 	error = ima_path_check(&nd->path,
++	error = ima_path_check(path,
 +			       acc_mode & (MAY_READ | MAY_WRITE | MAY_EXEC));
 +	if (error)
 +		return error;
  	/*
  	 * An append-only file must be opened in append mode for writing.
  	 */

Signed-off-by: James Morris <jmorris@namei.org>
2009-02-06 11:01:45 +11:00
James Morris 5626d3e861 selinux: remove hooks which simply defer to capabilities
Remove SELinux hooks which do nothing except defer to the capabilites
hooks (or in one case, replicates the function).

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
2009-02-02 09:20:34 +11:00
James Morris 95c14904b6 selinux: remove secondary ops call to shm_shmat
Remove secondary ops call to shm_shmat, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:16 +11:00
James Morris 5c4054ccfa selinux: remove secondary ops call to unix_stream_connect
Remove secondary ops call to unix_stream_connect, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:15 +11:00
James Morris 2cbbd19812 selinux: remove secondary ops call to task_kill
Remove secondary ops call to task_kill, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:14 +11:00
James Morris ef76e748fa selinux: remove secondary ops call to task_setrlimit
Remove secondary ops call to task_setrlimit, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:13 +11:00
James Morris ca5143d3ff selinux: remove unused cred_commit hook
Remove unused cred_commit hook from SELinux.   This
currently calls into the capabilities hook, which is a noop.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:12 +11:00
James Morris af294e41d0 selinux: remove secondary ops call to task_create
Remove secondary ops call to task_create, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:11 +11:00
James Morris d541bbee69 selinux: remove secondary ops call to file_mprotect
Remove secondary ops call to file_mprotect, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:11 +11:00
James Morris 438add6b32 selinux: remove secondary ops call to inode_setattr
Remove secondary ops call to inode_setattr, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:10 +11:00
James Morris 188fbcca9d selinux: remove secondary ops call to inode_permission
Remove secondary ops call to inode_permission, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:09 +11:00
James Morris f51115b9ab selinux: remove secondary ops call to inode_follow_link
Remove secondary ops call to inode_follow_link, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:08 +11:00
James Morris dd4907a6d4 selinux: remove secondary ops call to inode_mknod
Remove secondary ops call to inode_mknod, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:07 +11:00
James Morris e4737250b7 selinux: remove secondary ops call to inode_unlink
Remove secondary ops call to inode_unlink, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:06 +11:00
James Morris efdfac4376 selinux: remove secondary ops call to inode_link
Remove secondary ops call to inode_link, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:06 +11:00
James Morris 97422ab9ef selinux: remove secondary ops call to sb_umount
Remove secondary ops call to sb_umount, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:05 +11:00
James Morris ef935b9136 selinux: remove secondary ops call to sb_mount
Remove secondary ops call to sb_mount, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:04 +11:00
James Morris 5565b0b865 selinux: remove secondary ops call to bprm_committed_creds
Remove secondary ops call to bprm_committed_creds, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:03 +11:00
James Morris 2ec5dbe23d selinux: remove secondary ops call to bprm_committing_creds
Remove secondary ops call to bprm_committing_creds, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:02 +11:00
James Morris bc05595845 selinux: remove unused bprm_check_security hook
Remove unused bprm_check_security hook from SELinux.   This
currently calls into the capabilities hook, which is a noop.

Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:01 +11:00
David P. Quigley cd89596f0c SELinux: Unify context mount and genfs behavior
Context mounts and genfs labeled file systems behave differently with respect to
setting file system labels. This patch brings genfs labeled file systems in line
with context mounts in that setxattr calls to them should return EOPNOTSUPP and
fscreate calls will be ignored.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@macbook.localdomain>
2009-01-19 09:47:14 +11:00
David P. Quigley 11689d47f0 SELinux: Add new security mount option to indicate security label support.
There is no easy way to tell if a file system supports SELinux security labeling.
Because of this a new flag is being added to the super block security structure
to indicate that the particular super block supports labeling. This flag is set
for file systems using the xattr, task, and transition labeling methods unless
that behavior is overridden by context mounts.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@macbook.localdomain>
2009-01-19 09:47:06 +11:00
David P. Quigley 0d90a7ec48 SELinux: Condense super block security structure flags and cleanup necessary code.
The super block security structure currently has three fields for what are
essentially flags.  The flags field is used for mount options while two other
char fields are used for initialization and proc flags. These latter two fields are
essentially bit fields since the only used values are 0 and 1.  These fields
have been collapsed into the flags field and new bit masks have been added for
them. The code is also fixed to work with these new flags.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@macbook.localdomain>
2009-01-19 09:46:40 +11:00
James Morris ac8cc0fa53 Merge branch 'next' into for-linus 2009-01-07 09:58:22 +11:00
David Howells 3699c53c48 CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #3]
Fix a regression in cap_capable() due to:

	commit 3b11a1dece
	Author: David Howells <dhowells@redhat.com>
	Date:   Fri Nov 14 10:39:26 2008 +1100

	    CRED: Differentiate objective and effective subjective credentials on a task

The problem is that the above patch allows a process to have two sets of
credentials, and for the most part uses the subjective credentials when
accessing current's creds.

There is, however, one exception: cap_capable(), and thus capable(), uses the
real/objective credentials of the target task, whether or not it is the current
task.

Ordinarily this doesn't matter, since usually the two cred pointers in current
point to the same set of creds.  However, sys_faccessat() makes use of this
facility to override the credentials of the calling process to make its test,
without affecting the creds as seen from other processes.

One of the things sys_faccessat() does is to make an adjustment to the
effective capabilities mask, which cap_capable(), as it stands, then ignores.

The affected capability check is in generic_permission():

	if (!(mask & MAY_EXEC) || execute_ok(inode))
		if (capable(CAP_DAC_OVERRIDE))
			return 0;

This change passes the set of credentials to be tested down into the commoncap
and SELinux code.  The security functions called by capable() and
has_capability() select the appropriate set of credentials from the process
being checked.

This can be tested by compiling the following program from the XFS testsuite:

/*
 *  t_access_root.c - trivial test program to show permission bug.
 *
 *  Written by Michael Kerrisk - copyright ownership not pursued.
 *  Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html
 */
#include <limits.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>

#define UID 500
#define GID 100
#define PERM 0
#define TESTPATH "/tmp/t_access"

static void
errExit(char *msg)
{
    perror(msg);
    exit(EXIT_FAILURE);
} /* errExit */

static void
accessTest(char *file, int mask, char *mstr)
{
    printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask));
} /* accessTest */

int
main(int argc, char *argv[])
{
    int fd, perm, uid, gid;
    char *testpath;
    char cmd[PATH_MAX + 20];

    testpath = (argc > 1) ? argv[1] : TESTPATH;
    perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM;
    uid = (argc > 3) ? atoi(argv[3]) : UID;
    gid = (argc > 4) ? atoi(argv[4]) : GID;

    unlink(testpath);

    fd = open(testpath, O_RDWR | O_CREAT, 0);
    if (fd == -1) errExit("open");

    if (fchown(fd, uid, gid) == -1) errExit("fchown");
    if (fchmod(fd, perm) == -1) errExit("fchmod");
    close(fd);

    snprintf(cmd, sizeof(cmd), "ls -l %s", testpath);
    system(cmd);

    if (seteuid(uid) == -1) errExit("seteuid");

    accessTest(testpath, 0, "0");
    accessTest(testpath, R_OK, "R_OK");
    accessTest(testpath, W_OK, "W_OK");
    accessTest(testpath, X_OK, "X_OK");
    accessTest(testpath, R_OK | W_OK, "R_OK | W_OK");
    accessTest(testpath, R_OK | X_OK, "R_OK | X_OK");
    accessTest(testpath, W_OK | X_OK, "W_OK | X_OK");
    accessTest(testpath, R_OK | W_OK | X_OK, "R_OK | W_OK | X_OK");

    exit(EXIT_SUCCESS);
} /* main */

This can be run against an Ext3 filesystem as well as against an XFS
filesystem.  If successful, it will show:

	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
	---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx
	access(/tmp/xxx, 0) returns 0
	access(/tmp/xxx, R_OK) returns 0
	access(/tmp/xxx, W_OK) returns 0
	access(/tmp/xxx, X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK) returns 0
	access(/tmp/xxx, R_OK | X_OK) returns -1
	access(/tmp/xxx, W_OK | X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

If unsuccessful, it will show:

	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
	---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx
	access(/tmp/xxx, 0) returns 0
	access(/tmp/xxx, R_OK) returns -1
	access(/tmp/xxx, W_OK) returns -1
	access(/tmp/xxx, X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK) returns -1
	access(/tmp/xxx, R_OK | X_OK) returns -1
	access(/tmp/xxx, W_OK | X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

I've also tested the fix with the SELinux and syscalls LTP testsuites.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-07 09:38:48 +11:00
James Morris 29881c4502 Revert "CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]"
This reverts commit 14eaddc967.

David has a better version to come.
2009-01-07 09:21:54 +11:00
Linus Torvalds 520c853466 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  inotify: fix type errors in interfaces
  fix breakage in reiserfs_new_inode()
  fix the treatment of jfs special inodes
  vfs: remove duplicate code in get_fs_type()
  add a vfs_fsync helper
  sys_execve and sys_uselib do not call into fsnotify
  zero i_uid/i_gid on inode allocation
  inode->i_op is never NULL
  ntfs: don't NULL i_op
  isofs check for NULL ->i_op in root directory is dead code
  affs: do not zero ->i_op
  kill suid bit only for regular files
  vfs: lseek(fd, 0, SEEK_CUR) race condition
2009-01-05 18:32:06 -08:00
Al Viro 56ff5efad9 zero i_uid/i_gid on inode allocation
... and don't bother in callers.  Don't bother with zeroing i_blocks,
while we are at it - it's already been zeroed.

i_mode is not worth the effort; it has no common default value.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-05 11:54:28 -05:00
Eric Paris 76f7ba35d4 SELinux: shrink sizeof av_inhert selinux_class_perm and context
I started playing with pahole today and decided to put it against the
selinux structures.  Found we could save a little bit of space on x86_64
(and no harm on i686) just reorganizing some structs.

Object size changes:
av_inherit: 24 -> 16
selinux_class_perm: 48 -> 40
context: 80 -> 72

Admittedly there aren't many of av_inherit or selinux_class_perm's in
the kernel (33 and 1 respectively) But the change to the size of struct
context reverberate out a bit.  I can get some hard number if they are
needed, but I don't see why they would be.  We do change which cacheline
context->len and context->str would be on, but I don't see that as a
problem since we are clearly going to have to load both if the context
is to be of any value.  I've run with the patch and don't seem to be
having any problems.

An example of what's going on using struct av_inherit would be:

form: to:
struct av_inherit {			struct av_inherit {
	u16 tclass;				const char **common_pts;
	const char **common_pts;		u32 common_base;
	u32 common_base;			u16 tclass;
};

(notice all I did was move u16 tclass to the end of the struct instead
of the beginning)

Memory layout before the change:
struct av_inherit {
	u16 tclass; /* 2 */
	/* 6 bytes hole */
	const char** common_pts; /* 8 */
	u32 common_base; /* 4 */
	/* 4 byes padding */

	/* size: 24, cachelines: 1 */
	/* sum members: 14, holes: 1, sum holes: 6 */
	/* padding: 4 */
};

Memory layout after the change:
struct av_inherit {
	const char ** common_pts; /* 8 */
	u32 common_base; /* 4 */
	u16 tclass; /* 2 */
	/* 2 bytes padding */

	/* size: 16, cachelines: 1 */
	/* sum members: 14, holes: 0, sum holes: 0 */
	/* padding: 2 */
};

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-05 19:19:55 +11:00
David Howells 14eaddc967 CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]
Fix a regression in cap_capable() due to:

	commit 5ff7711e635b32f0a1e558227d030c7e45b4a465
	Author: David Howells <dhowells@redhat.com>
	Date:   Wed Dec 31 02:52:28 2008 +0000

	    CRED: Differentiate objective and effective subjective credentials on a task

The problem is that the above patch allows a process to have two sets of
credentials, and for the most part uses the subjective credentials when
accessing current's creds.

There is, however, one exception: cap_capable(), and thus capable(), uses the
real/objective credentials of the target task, whether or not it is the current
task.

Ordinarily this doesn't matter, since usually the two cred pointers in current
point to the same set of creds.  However, sys_faccessat() makes use of this
facility to override the credentials of the calling process to make its test,
without affecting the creds as seen from other processes.

One of the things sys_faccessat() does is to make an adjustment to the
effective capabilities mask, which cap_capable(), as it stands, then ignores.

The affected capability check is in generic_permission():

	if (!(mask & MAY_EXEC) || execute_ok(inode))
		if (capable(CAP_DAC_OVERRIDE))
			return 0;

This change splits capable() from has_capability() down into the commoncap and
SELinux code.  The capable() security op now only deals with the current
process, and uses the current process's subjective creds.  A new security op -
task_capable() - is introduced that can check any task's objective creds.

strictly the capable() security op is superfluous with the presence of the
task_capable() op, however it should be faster to call the capable() op since
two fewer arguments need be passed down through the various layers.

This can be tested by compiling the following program from the XFS testsuite:

/*
 *  t_access_root.c - trivial test program to show permission bug.
 *
 *  Written by Michael Kerrisk - copyright ownership not pursued.
 *  Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html
 */
#include <limits.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>

#define UID 500
#define GID 100
#define PERM 0
#define TESTPATH "/tmp/t_access"

static void
errExit(char *msg)
{
    perror(msg);
    exit(EXIT_FAILURE);
} /* errExit */

static void
accessTest(char *file, int mask, char *mstr)
{
    printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask));
} /* accessTest */

int
main(int argc, char *argv[])
{
    int fd, perm, uid, gid;
    char *testpath;
    char cmd[PATH_MAX + 20];

    testpath = (argc > 1) ? argv[1] : TESTPATH;
    perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM;
    uid = (argc > 3) ? atoi(argv[3]) : UID;
    gid = (argc > 4) ? atoi(argv[4]) : GID;

    unlink(testpath);

    fd = open(testpath, O_RDWR | O_CREAT, 0);
    if (fd == -1) errExit("open");

    if (fchown(fd, uid, gid) == -1) errExit("fchown");
    if (fchmod(fd, perm) == -1) errExit("fchmod");
    close(fd);

    snprintf(cmd, sizeof(cmd), "ls -l %s", testpath);
    system(cmd);

    if (seteuid(uid) == -1) errExit("seteuid");

    accessTest(testpath, 0, "0");
    accessTest(testpath, R_OK, "R_OK");
    accessTest(testpath, W_OK, "W_OK");
    accessTest(testpath, X_OK, "X_OK");
    accessTest(testpath, R_OK | W_OK, "R_OK | W_OK");
    accessTest(testpath, R_OK | X_OK, "R_OK | X_OK");
    accessTest(testpath, W_OK | X_OK, "W_OK | X_OK");
    accessTest(testpath, R_OK | W_OK | X_OK, "R_OK | W_OK | X_OK");

    exit(EXIT_SUCCESS);
} /* main */

This can be run against an Ext3 filesystem as well as against an XFS
filesystem.  If successful, it will show:

	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
	---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx
	access(/tmp/xxx, 0) returns 0
	access(/tmp/xxx, R_OK) returns 0
	access(/tmp/xxx, W_OK) returns 0
	access(/tmp/xxx, X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK) returns 0
	access(/tmp/xxx, R_OK | X_OK) returns -1
	access(/tmp/xxx, W_OK | X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

If unsuccessful, it will show:

	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
	---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx
	access(/tmp/xxx, 0) returns 0
	access(/tmp/xxx, R_OK) returns -1
	access(/tmp/xxx, W_OK) returns -1
	access(/tmp/xxx, X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK) returns -1
	access(/tmp/xxx, R_OK | X_OK) returns -1
	access(/tmp/xxx, W_OK | X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

I've also tested the fix with the SELinux and syscalls LTP testsuites.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-05 11:17:04 +11:00
Al Viro 5af75d8d58 audit: validate comparison operations, store them in sane form
Don't store the field->op in the messy (and very inconvenient for e.g.
audit_comparator()) form; translate to dense set of values and do full
validation of userland-submitted value while we are at it.

->audit_init_rule() and ->audit_match_rule() get new values now; in-tree
instances updated.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-04 15:14:42 -05:00
Rusty Russell 4f4b6c1a94 cpumask: prepare for iterators to only go to nr_cpu_ids/nr_cpumask_bits.: core
Impact: cleanup

In future, all cpumask ops will only be valid (in general) for bit
numbers < nr_cpu_ids.  So use that instead of NR_CPUS in iterators
and other comparisons.

This is always safe: no cpu number can be >= nr_cpu_ids, and
nr_cpu_ids is initialized to NR_CPUS at boot.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: James Morris <jmorris@namei.org>
Cc: Eric Biederman <ebiederm@xmission.com>
2009-01-01 10:12:15 +10:30
Paul Moore 277d342fc4 selinux: Deprecate and schedule the removal of the the compat_net functionality
This patch is the first step towards removing the old "compat_net" code from
the kernel.  Secmark, the "compat_net" replacement was first introduced in
2.6.18 (September 2006) and the major Linux distributions with SELinux support
have transitioned to Secmark so it is time to start deprecating the "compat_net"
mechanism.  Testing a patched version of 2.6.28-rc6 with the initial release of
Fedora Core 5 did not show any problems when running in enforcing mode.

This patch adds an entry to the feature-removal-schedule.txt file and removes
the SECURITY_SELINUX_ENABLE_SECMARK_DEFAULT configuration option, forcing
Secmark on by default although it can still be disabled at runtime.  The patch
also makes the Secmark permission checks "dynamic" in the sense that they are
only executed when Secmark is configured; this should help prevent problems
with older distributions that have not yet migrated to Secmark.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-12-31 12:54:11 -05:00
Linus Torvalds 0191b625ca Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits)
  net: Allow dependancies of FDDI & Tokenring to be modular.
  igb: Fix build warning when DCA is disabled.
  net: Fix warning fallout from recent NAPI interface changes.
  gro: Fix potential use after free
  sfc: If AN is enabled, always read speed/duplex from the AN advertising bits
  sfc: When disabling the NIC, close the device rather than unregistering it
  sfc: SFT9001: Add cable diagnostics
  sfc: Add support for multiple PHY self-tests
  sfc: Merge top-level functions for self-tests
  sfc: Clean up PHY mode management in loopback self-test
  sfc: Fix unreliable link detection in some loopback modes
  sfc: Generate unique names for per-NIC workqueues
  802.3ad: use standard ethhdr instead of ad_header
  802.3ad: generalize out mac address initializer
  802.3ad: initialize ports LACPDU from const initializer
  802.3ad: remove typedef around ad_system
  802.3ad: turn ports is_individual into a bool
  802.3ad: turn ports is_enabled into a bool
  802.3ad: make ntt bool
  ixgbe: Fix set_ringparam in ixgbe to use the same memory pools.
  ...

Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due
to the conversion to %pI (in this networking merge) and the addition of
doing IPv6 addresses (from the earlier merge of CIFS).
2008-12-28 12:49:40 -08:00
James Morris 7419224691 SELinux: don't check permissions for kernel mounts
Don't bother checking permissions when the kernel performs an
internal mount, as this should always be allowed.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2008-12-20 09:03:39 +11:00
James Morris 12204e24b1 security: pass mount flags to security_sb_kern_mount()
Pass mount flags to security_sb_kern_mount(), so security modules
can determine if a mount operation is being performed by the kernel.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2008-12-20 09:02:39 +11:00
Stephen Smalley 459c19f524 SELinux: correctly detect proc filesystems of the form "proc/foo"
Map all of these proc/ filesystem types to "proc" for the policy lookup at
filesystem mount time.

Signed-off-by: James Morris <jmorris@namei.org>
2008-12-20 09:01:03 +11:00
David Howells 3a3b7ce933 CRED: Allow kernel services to override LSM settings for task actions
Allow kernel services to override LSM settings appropriate to the actions
performed by a task by duplicating a set of credentials, modifying it and then
using task_struct::cred to point to it when performing operations on behalf of
a task.

This is used, for example, by CacheFiles which has to transparently access the
cache on behalf of a process that thinks it is doing, say, NFS accesses with a
potentially inappropriate (with respect to accessing the cache) set of
credentials.

This patch provides two LSM hooks for modifying a task security record:

 (*) security_kernel_act_as() which allows modification of the security datum
     with which a task acts on other objects (most notably files).

 (*) security_kernel_create_files_as() which allows modification of the
     security datum that is used to initialise the security data on a file that
     a task creates.

The patch also provides four new credentials handling functions, which wrap the
LSM functions:

 (1) prepare_kernel_cred()

     Prepare a set of credentials for a kernel service to use, based either on
     a daemon's credentials or on init_cred.  All the keyrings are cleared.

 (2) set_security_override()

     Set the LSM security ID in a set of credentials to a specific security
     context, assuming permission from the LSM policy.

 (3) set_security_override_from_ctx()

     As (2), but takes the security context as a string.

 (4) set_create_files_as()

     Set the file creation LSM security ID in a set of credentials to be the
     same as that on a particular inode.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> [Smack changes]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:28 +11:00
David Howells 1bfdc75ae0 CRED: Add a kernel_service object class to SELinux
Add a 'kernel_service' object class to SELinux and give this object class two
access vectors: 'use_as_override' and 'create_files_as'.

The first vector is used to grant a process the right to nominate an alternate
process security ID for the kernel to use as an override for the SELinux
subjective security when accessing stuff on behalf of another process.

For example, CacheFiles when accessing the cache on behalf on a process
accessing an NFS file needs to use a subjective security ID appropriate to the
cache rather then the one the calling process is using.  The cachefilesd
daemon will nominate the security ID to be used.

The second vector is used to grant a process the right to nominate a file
creation label for a kernel service to use.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:27 +11:00
David Howells 3b11a1dece CRED: Differentiate objective and effective subjective credentials on a task
Differentiate the objective and real subjective credentials from the effective
subjective credentials on a task by introducing a second credentials pointer
into the task_struct.

task_struct::real_cred then refers to the objective and apparent real
subjective credentials of a task, as perceived by the other tasks in the
system.

task_struct::cred then refers to the effective subjective credentials of a
task, as used by that task when it's actually running.  These are not visible
to the other tasks in the system.

__task_cred(task) then refers to the objective/real credentials of the task in
question.

current_cred() refers to the effective subjective credentials of the current
task.

prepare_creds() uses the objective creds as a base and commit_creds() changes
both pointers in the task_struct (indeed commit_creds() requires them to be the
same).

override_creds() and revert_creds() change the subjective creds pointer only,
and the former returns the old subjective creds.  These are used by NFSD,
faccessat() and do_coredump(), and will by used by CacheFiles.

In SELinux, current_has_perm() is provided as an alternative to
task_has_perm().  This uses the effective subjective context of current,
whereas task_has_perm() uses the objective/real context of the subject.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:26 +11:00
David Howells a6f76f23d2 CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.

This patch makes several logical sets of alteration:

 (1) execve().

     The credential bits from struct linux_binprm are, for the most part,
     replaced with a single credentials pointer (bprm->cred).  This means that
     all the creds can be calculated in advance and then applied at the point
     of no return with no possibility of failure.

     I would like to replace bprm->cap_effective with:

	cap_isclear(bprm->cap_effective)

     but this seems impossible due to special behaviour for processes of pid 1
     (they always retain their parent's capability masks where normally they'd
     be changed - see cap_bprm_set_creds()).

     The following sequence of events now happens:

     (a) At the start of do_execve, the current task's cred_exec_mutex is
     	 locked to prevent PTRACE_ATTACH from obsoleting the calculation of
     	 creds that we make.

     (a) prepare_exec_creds() is then called to make a copy of the current
     	 task's credentials and prepare it.  This copy is then assigned to
     	 bprm->cred.

  	 This renders security_bprm_alloc() and security_bprm_free()
     	 unnecessary, and so they've been removed.

     (b) The determination of unsafe execution is now performed immediately
     	 after (a) rather than later on in the code.  The result is stored in
     	 bprm->unsafe for future reference.

     (c) prepare_binprm() is called, possibly multiple times.

     	 (i) This applies the result of set[ug]id binaries to the new creds
     	     attached to bprm->cred.  Personality bit clearance is recorded,
     	     but now deferred on the basis that the exec procedure may yet
     	     fail.

         (ii) This then calls the new security_bprm_set_creds().  This should
	     calculate the new LSM and capability credentials into *bprm->cred.

	     This folds together security_bprm_set() and parts of
	     security_bprm_apply_creds() (these two have been removed).
	     Anything that might fail must be done at this point.

         (iii) bprm->cred_prepared is set to 1.

	     bprm->cred_prepared is 0 on the first pass of the security
	     calculations, and 1 on all subsequent passes.  This allows SELinux
	     in (ii) to base its calculations only on the initial script and
	     not on the interpreter.

     (d) flush_old_exec() is called to commit the task to execution.  This
     	 performs the following steps with regard to credentials:

	 (i) Clear pdeath_signal and set dumpable on certain circumstances that
	     may not be covered by commit_creds().

         (ii) Clear any bits in current->personality that were deferred from
             (c.i).

     (e) install_exec_creds() [compute_creds() as was] is called to install the
     	 new credentials.  This performs the following steps with regard to
     	 credentials:

         (i) Calls security_bprm_committing_creds() to apply any security
             requirements, such as flushing unauthorised files in SELinux, that
             must be done before the credentials are changed.

	     This is made up of bits of security_bprm_apply_creds() and
	     security_bprm_post_apply_creds(), both of which have been removed.
	     This function is not allowed to fail; anything that might fail
	     must have been done in (c.ii).

         (ii) Calls commit_creds() to apply the new credentials in a single
             assignment (more or less).  Possibly pdeath_signal and dumpable
             should be part of struct creds.

	 (iii) Unlocks the task's cred_replace_mutex, thus allowing
	     PTRACE_ATTACH to take place.

         (iv) Clears The bprm->cred pointer as the credentials it was holding
             are now immutable.

         (v) Calls security_bprm_committed_creds() to apply any security
             alterations that must be done after the creds have been changed.
             SELinux uses this to flush signals and signal handlers.

     (f) If an error occurs before (d.i), bprm_free() will call abort_creds()
     	 to destroy the proposed new credentials and will then unlock
     	 cred_replace_mutex.  No changes to the credentials will have been
     	 made.

 (2) LSM interface.

     A number of functions have been changed, added or removed:

     (*) security_bprm_alloc(), ->bprm_alloc_security()
     (*) security_bprm_free(), ->bprm_free_security()

     	 Removed in favour of preparing new credentials and modifying those.

     (*) security_bprm_apply_creds(), ->bprm_apply_creds()
     (*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()

     	 Removed; split between security_bprm_set_creds(),
     	 security_bprm_committing_creds() and security_bprm_committed_creds().

     (*) security_bprm_set(), ->bprm_set_security()

     	 Removed; folded into security_bprm_set_creds().

     (*) security_bprm_set_creds(), ->bprm_set_creds()

     	 New.  The new credentials in bprm->creds should be checked and set up
     	 as appropriate.  bprm->cred_prepared is 0 on the first call, 1 on the
     	 second and subsequent calls.

     (*) security_bprm_committing_creds(), ->bprm_committing_creds()
     (*) security_bprm_committed_creds(), ->bprm_committed_creds()

     	 New.  Apply the security effects of the new credentials.  This
     	 includes closing unauthorised files in SELinux.  This function may not
     	 fail.  When the former is called, the creds haven't yet been applied
     	 to the process; when the latter is called, they have.

 	 The former may access bprm->cred, the latter may not.

 (3) SELinux.

     SELinux has a number of changes, in addition to those to support the LSM
     interface changes mentioned above:

     (a) The bprm_security_struct struct has been removed in favour of using
     	 the credentials-under-construction approach.

     (c) flush_unauthorized_files() now takes a cred pointer and passes it on
     	 to inode_has_perm(), file_has_perm() and dentry_open().

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:24 +11:00
David Howells d84f4f992c CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management.  This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.

A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().

With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:

	struct cred *new = prepare_creds();
	int ret = blah(new);
	if (ret < 0) {
		abort_creds(new);
		return ret;
	}
	return commit_creds(new);

There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.

To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const.  The purpose of this is compile-time
discouragement of altering credentials through those pointers.  Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:

  (1) Its reference count may incremented and decremented.

  (2) The keyrings to which it points may be modified, but not replaced.

The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.

This patch makes several logical sets of alteration:

 (1) execve().

     This now prepares and commits credentials in various places in the
     security code rather than altering the current creds directly.

 (2) Temporary credential overrides.

     do_coredump() and sys_faccessat() now prepare their own credentials and
     temporarily override the ones currently on the acting thread, whilst
     preventing interference from other threads by holding cred_replace_mutex
     on the thread being dumped.

     This will be replaced in a future patch by something that hands down the
     credentials directly to the functions being called, rather than altering
     the task's objective credentials.

 (3) LSM interface.

     A number of functions have been changed, added or removed:

     (*) security_capset_check(), ->capset_check()
     (*) security_capset_set(), ->capset_set()

     	 Removed in favour of security_capset().

     (*) security_capset(), ->capset()

     	 New.  This is passed a pointer to the new creds, a pointer to the old
     	 creds and the proposed capability sets.  It should fill in the new
     	 creds or return an error.  All pointers, barring the pointer to the
     	 new creds, are now const.

     (*) security_bprm_apply_creds(), ->bprm_apply_creds()

     	 Changed; now returns a value, which will cause the process to be
     	 killed if it's an error.

     (*) security_task_alloc(), ->task_alloc_security()

     	 Removed in favour of security_prepare_creds().

     (*) security_cred_free(), ->cred_free()

     	 New.  Free security data attached to cred->security.

     (*) security_prepare_creds(), ->cred_prepare()

     	 New. Duplicate any security data attached to cred->security.

     (*) security_commit_creds(), ->cred_commit()

     	 New. Apply any security effects for the upcoming installation of new
     	 security by commit_creds().

     (*) security_task_post_setuid(), ->task_post_setuid()

     	 Removed in favour of security_task_fix_setuid().

     (*) security_task_fix_setuid(), ->task_fix_setuid()

     	 Fix up the proposed new credentials for setuid().  This is used by
     	 cap_set_fix_setuid() to implicitly adjust capabilities in line with
     	 setuid() changes.  Changes are made to the new credentials, rather
     	 than the task itself as in security_task_post_setuid().

     (*) security_task_reparent_to_init(), ->task_reparent_to_init()

     	 Removed.  Instead the task being reparented to init is referred
     	 directly to init's credentials.

	 NOTE!  This results in the loss of some state: SELinux's osid no
	 longer records the sid of the thread that forked it.

     (*) security_key_alloc(), ->key_alloc()
     (*) security_key_permission(), ->key_permission()

     	 Changed.  These now take cred pointers rather than task pointers to
     	 refer to the security context.

 (4) sys_capset().

     This has been simplified and uses less locking.  The LSM functions it
     calls have been merged.

 (5) reparent_to_kthreadd().

     This gives the current thread the same credentials as init by simply using
     commit_thread() to point that way.

 (6) __sigqueue_alloc() and switch_uid()

     __sigqueue_alloc() can't stop the target task from changing its creds
     beneath it, so this function gets a reference to the currently applicable
     user_struct which it then passes into the sigqueue struct it returns if
     successful.

     switch_uid() is now called from commit_creds(), and possibly should be
     folded into that.  commit_creds() should take care of protecting
     __sigqueue_alloc().

 (7) [sg]et[ug]id() and co and [sg]et_current_groups.

     The set functions now all use prepare_creds(), commit_creds() and
     abort_creds() to build and check a new set of credentials before applying
     it.

     security_task_set[ug]id() is called inside the prepared section.  This
     guarantees that nothing else will affect the creds until we've finished.

     The calling of set_dumpable() has been moved into commit_creds().

     Much of the functionality of set_user() has been moved into
     commit_creds().

     The get functions all simply access the data directly.

 (8) security_task_prctl() and cap_task_prctl().

     security_task_prctl() has been modified to return -ENOSYS if it doesn't
     want to handle a function, or otherwise return the return value directly
     rather than through an argument.

     Additionally, cap_task_prctl() now prepares a new set of credentials, even
     if it doesn't end up using it.

 (9) Keyrings.

     A number of changes have been made to the keyrings code:

     (a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
     	 all been dropped and built in to the credentials functions directly.
     	 They may want separating out again later.

     (b) key_alloc() and search_process_keyrings() now take a cred pointer
     	 rather than a task pointer to specify the security context.

     (c) copy_creds() gives a new thread within the same thread group a new
     	 thread keyring if its parent had one, otherwise it discards the thread
     	 keyring.

     (d) The authorisation key now points directly to the credentials to extend
     	 the search into rather pointing to the task that carries them.

     (e) Installing thread, process or session keyrings causes a new set of
     	 credentials to be created, even though it's not strictly necessary for
     	 process or session keyrings (they're shared).

(10) Usermode helper.

     The usermode helper code now carries a cred struct pointer in its
     subprocess_info struct instead of a new session keyring pointer.  This set
     of credentials is derived from init_cred and installed on the new process
     after it has been cloned.

     call_usermodehelper_setup() allocates the new credentials and
     call_usermodehelper_freeinfo() discards them if they haven't been used.  A
     special cred function (prepare_usermodeinfo_creds()) is provided
     specifically for call_usermodehelper_setup() to call.

     call_usermodehelper_setkeys() adjusts the credentials to sport the
     supplied keyring as the new session keyring.

(11) SELinux.

     SELinux has a number of changes, in addition to those to support the LSM
     interface changes mentioned above:

     (a) selinux_setprocattr() no longer does its check for whether the
     	 current ptracer can access processes with the new SID inside the lock
     	 that covers getting the ptracer's SID.  Whilst this lock ensures that
     	 the check is done with the ptracer pinned, the result is only valid
     	 until the lock is released, so there's no point doing it inside the
     	 lock.

(12) is_single_threaded().

     This function has been extracted from selinux_setprocattr() and put into
     a file of its own in the lib/ directory as join_session_keyring() now
     wants to use it too.

     The code in SELinux just checked to see whether a task shared mm_structs
     with other tasks (CLONE_VM), but that isn't good enough.  We really want
     to know if they're part of the same thread group (CLONE_THREAD).

(13) nfsd.

     The NFS server daemon now has to use the COW credentials to set the
     credentials it is going to use.  It really needs to pass the credentials
     down to the functions it calls, but it can't do that until other patches
     in this series have been applied.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:23 +11:00
David Howells 745ca2475a CRED: Pass credentials through dentry_open()
Pass credentials through dentry_open() so that the COW creds patch can have
SELinux's flush_unauthorized_files() pass the appropriate creds back to itself
when it opens its null chardev.

The security_dentry_open() call also now takes a creds pointer, as does the
dentry_open hook in struct security_operations.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:22 +11:00
David Howells 88e67f3b88 CRED: Make inode_has_perm() and file_has_perm() take a cred pointer
Make inode_has_perm() and file_has_perm() take a cred pointer rather than a
task pointer.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:21 +11:00
David Howells 275bb41e9d CRED: Wrap access to SELinux's task SID
Wrap access to SELinux's task SID, using task_sid() and current_sid() as
appropriate.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:19 +11:00
David Howells c69e8d9c01 CRED: Use RCU to access another task's creds and to release a task's own creds
Use RCU to access another task's creds and to release a task's own creds.
This means that it will be possible for the credentials of a task to be
replaced without another task (a) requiring a full lock to read them, and (b)
seeing deallocated memory.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:19 +11:00
David Howells 86a264abe5 CRED: Wrap current->cred and a few other accessors
Wrap current->cred and a few other accessors to hide their actual
implementation.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:18 +11:00
David Howells f1752eec61 CRED: Detach the credentials from task_struct
Detach the credentials from task_struct, duplicating them in copy_process()
and releasing them in __put_task_struct().

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:17 +11:00
David Howells b6dff3ec5e CRED: Separate task security context from task_struct
Separate the task security context from task_struct.  At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:16 +11:00
David Howells 15a2460ed0 CRED: Constify the kernel_cap_t arguments to the capset LSM hooks
Constify the kernel_cap_t arguments to the capset LSM hooks.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:15 +11:00
David Howells 1cdcbec1a3 CRED: Neuter sys_capset()
Take away the ability for sys_capset() to affect processes other than current.

This means that current will not need to lock its own credentials when reading
them against interference by other processes.

This has effectively been the case for a while anyway, since:

 (1) Without LSM enabled, sys_capset() is disallowed.

 (2) With file-based capabilities, sys_capset() is neutered.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:14 +11:00
Eric Paris 066746796b Currently SELinux jumps through some ugly hoops to not audit a capbility
check when determining if a process has additional powers to override
memory limits or when trying to read/write illegal file labels.  Use
the new noaudit call instead.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-11 22:02:57 +11:00
Eric Paris 06112163f5 Add a new capable interface that will be used by systems that use audit to
make an A or B type decision instead of a security decision.  Currently
this is the case at least for filesystems when deciding if a process can use
the reserved 'root' blocks and for the case of things like the oom
algorithm determining if processes are root processes and should be less
likely to be killed.  These types of security system requests should not be
audited or logged since they are not really security decisions.  It would be
possible to solve this problem like the vm_enough_memory security check did
by creating a new LSM interface and moving all of the policy into that
interface but proves the needlessly bloat the LSM and provide complex
indirection.

This merely allows those decisions to be made where they belong and to not
flood logs or printk with denials for thing that are not security decisions.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-11 22:02:50 +11:00
Eric Paris 39c9aede2b SELinux: Use unknown perm handling to handle unknown netlink msg types
Currently when SELinux has not been updated to handle a netlink message
type the operation is denied with EINVAL.  This patch will leave the
audit/warning message so things get fixed but if policy chose to allow
unknowns this will allow the netlink operation.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-09 07:33:18 +08:00
David S. Miller 9eeda9abd1 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/ath5k/base.c
	net/8021q/vlan_core.c
2008-11-06 22:43:03 -08:00
James Morris e21e696edb Merge branch 'master' into next 2008-11-06 07:12:34 +08:00
Michal Schmidt 2f99db28af selinux: recognize netlink messages for 'ip addrlabel'
In enforcing mode '/sbin/ip addrlabel' results in a SELinux error:
type=SELINUX_ERR msg=audit(1225698822.073:42): SELinux:  unrecognized
netlink message type=74 for sclass=43

The problem is missing RTM_*ADDRLABEL entries in SELinux's netlink
message types table.

Reported in https://bugzilla.redhat.com/show_bug.cgi?id=469423

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-06 07:08:36 +08:00
Eric Paris 41d9f9c524 SELinux: hold tasklist_lock and siglock while waking wait_chldexit
SELinux has long been calling wake_up_interruptible() on
current->parent->signal->wait_chldexit without holding any locks.  It
appears that this operation should hold the tasklist_lock to dereference
current->parent and we should hold the siglock when waking up the
signal->wait_chldexit.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-05 08:44:11 +11:00
Eric Paris 37dd0bd04a SELinux: properly handle empty tty_files list
SELinux has wrongly (since 2004) had an incorrect test for an empty
tty->tty_files list.  With an empty list selinux would be pointing to part
of the tty struct itself and would then proceed to dereference that value
and again dereference that result.  An F10 change to plymouth on a ppc64
system is actually currently triggering this bug.  This patch uses
list_empty() to handle empty lists rather than looking at a meaningless
location.

[note, this fixes the oops reported in
https://bugzilla.redhat.com/show_bug.cgi?id=469079]

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-01 09:38:48 +11:00
Harvey Harrison 3685f25de1 misc: replace NIPQUAD()
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:56:49 -07:00
Eric Paris 8b6a5a37f8 SELinux: check open perms in dentry_open not inode_permission
Some operations, like searching a directory path or connecting a unix domain
socket, make explicit calls into inode_permission.  Our choices are to
either try to come up with a signature for all of the explicit calls to
inode_permission and do not check open on those, or to move the open checks to
dentry_open where we know this is always an open operation.  This patch moves
the checks to dentry_open.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-10-31 02:00:52 +11:00
Harvey Harrison 5b095d9892 net: replace %p6 with %pI6
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-29 12:52:50 -07:00
Harvey Harrison 1afa67f5e7 misc: replace NIP6_FMT with %p6 format specifier
The iscsi_ibft.c changes are almost certainly a bugfix as the
pointer 'ip' is a u8 *, so they never print the last 8 bytes
of the IPv6 address, and the eight bytes they do print have
a zero byte with them in each 16-bit word.

Other than that, this should cause no difference in functionality.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 16:06:44 -07:00
Alexey Dobriyan def8b4faff net: reduce structures when XFRM=n
ifdef out
* struct sk_buff::sp		(pointer)
* struct dst_entry::xfrm	(pointer)
* struct sock::sk_policy	(2 pointers)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 13:24:06 -07:00
Thomas Gleixner c465a76af6 Merge branches 'timers/clocksource', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus 2008-10-20 13:14:06 +02:00
Steven Whitehouse a447c09324 vfs: Use const for kernel parser table
This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where its required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.

This was posted for review some time ago and I believe its been in -mm
since then.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-13 10:10:37 -07:00
Linus Torvalds 8d71ff0bef Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (24 commits)
  integrity: special fs magic
  As pointed out by Jonathan Corbet, the timer must be deleted before
  ERROR: code indent should use tabs where possible
  The tpm_dev_release function is only called for platform devices, not pnp
  Protect tpm_chip_list when transversing it.
  Renames num_open to is_open, as only one process can open the file at a time.
  Remove the BKL calls from the TPM driver, which were added in the overall
  netlabel: Add configuration support for local labeling
  cipso: Add support for native local labeling and fixup mapping names
  netlabel: Changes to the NetLabel security attributes to allow LSMs to pass full contexts
  selinux: Cache NetLabel secattrs in the socket's security struct
  selinux: Set socket NetLabel based on connection endpoint
  netlabel: Add functionality to set the security attributes of a packet
  netlabel: Add network address selectors to the NetLabel/LSM domain mapping
  netlabel: Add a generic way to create ordered linked lists of network addrs
  netlabel: Replace protocol/NetLabel linking with refrerence counts
  smack: Fix missing calls to netlbl_skbuff_err()
  selinux: Fix missing calls to netlbl_skbuff_err()
  selinux: Fix a problem in security_netlbl_sid_to_secattr()
  selinux: Better local/forward check in selinux_ip_postroute()
  ...
2008-10-13 10:00:44 -07:00
Alan Cox 934e6ebf96 tty: Redo current tty locking
Currently it is sometimes locked by the tty mutex and sometimes by the
sighand lock. The latter is in fact correct and now we can hand back referenced
objects we can fix this up without problems around sleeping functions.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-13 09:51:41 -07:00
Alan Cox 452a00d2ee tty: Make get_current_tty use a kref
We now return a kref covered tty reference. That ensures the tty structure
doesn't go away when you have a return from get_current_tty. This is not
enough to protect you from most of the resources being freed behind your
back - yet.

[Updated to include fixes for SELinux problems found by Andrew Morton and
 an s390 leak found while debugging the former]

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-13 09:51:41 -07:00
James Morris 0da939b005 Merge branch 'master' of git://git.infradead.org/users/pcmoore/lblnet-2.6_next into next 2008-10-11 09:26:14 +11:00
Paul Moore 8d75899d03 netlabel: Changes to the NetLabel security attributes to allow LSMs to pass full contexts
This patch provides support for including the LSM's secid in addition to
the LSM's MLS information in the NetLabel security attributes structure.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:33 -04:00
Paul Moore 6c5b3fc014 selinux: Cache NetLabel secattrs in the socket's security struct
Previous work enabled the use of address based NetLabel selectors, which
while highly useful, brought the potential for additional per-packet overhead
when used.  This patch attempts to mitigate some of that overhead by caching
the NetLabel security attribute struct within the SELinux socket security
structure.  This should help eliminate the need to recreate the NetLabel
secattr structure for each packet resulting in less overhead.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:33 -04:00
Paul Moore 014ab19a69 selinux: Set socket NetLabel based on connection endpoint
Previous work enabled the use of address based NetLabel selectors, which while
highly useful, brought the potential for additional per-packet overhead when
used.  This patch attempts to solve that by applying NetLabel socket labels
when sockets are connect()'d.  This should alleviate the per-packet NetLabel
labeling for all connected sockets (yes, it even works for connected DGRAM
sockets).

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:33 -04:00
Paul Moore 948bf85c1b netlabel: Add functionality to set the security attributes of a packet
This patch builds upon the new NetLabel address selector functionality by
providing the NetLabel KAPI and CIPSO engine support needed to enable the
new packet-based labeling.  The only new addition to the NetLabel KAPI at
this point is shown below:

 * int netlbl_skbuff_setattr(skb, family, secattr)

... and is designed to be called from a Netfilter hook after the packet's
IP header has been populated such as in the FORWARD or LOCAL_OUT hooks.

This patch also provides the necessary SELinux hooks to support this new
functionality.  Smack support is not currently included due to uncertainty
regarding the permissions needed to expand the Smack network access controls.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:32 -04:00
Paul Moore dfaebe9825 selinux: Fix missing calls to netlbl_skbuff_err()
At some point I think I messed up and dropped the calls to netlbl_skbuff_err()
which are necessary for CIPSO to send error notifications to remote systems.
This patch re-introduces the error handling calls into the SELinux code.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:31 -04:00
Paul Moore 99d854d231 selinux: Fix a problem in security_netlbl_sid_to_secattr()
Currently when SELinux fails to allocate memory in
security_netlbl_sid_to_secattr() the NetLabel LSM domain field is set to
NULL which triggers the default NetLabel LSM domain mapping which may not
always be the desired mapping.  This patch fixes this by returning an error
when the kernel is unable to allocate memory.  This could result in more
failures on a system with heavy memory pressure but it is the "correct"
thing to do.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:30 -04:00
Paul Moore d8395c876b selinux: Better local/forward check in selinux_ip_postroute()
It turns out that checking to see if skb->sk is NULL is not a very good
indicator of a forwarded packet as some locally generated packets also have
skb->sk set to NULL.  Fix this by not only checking the skb->sk field but also
the IP[6]CB(skb)->flags field for the IP[6]SKB_FORWARDED flag.  While we are
at it, we are calling selinux_parse_skb() much earlier than we really should
resulting in potentially wasted cycles parsing packets for information we
might no use; so shuffle the code around a bit to fix this.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:30 -04:00
Paul Moore aa86290089 selinux: Correctly handle IPv4 packets on IPv6 sockets in all cases
We did the right thing in a few cases but there were several areas where we
determined a packet's address family based on the socket's address family which
is not the right thing to do since we can get IPv4 packets on IPv6 sockets.
This patch fixes these problems by either taking the address family directly
from the packet.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:29 -04:00
Paul Moore accc609322 selinux: Cleanup the NetLabel glue code
We were doing a lot of extra work in selinux_netlbl_sock_graft() what wasn't
necessary so this patch removes that code.  It also removes the redundant
second argument to selinux_netlbl_sock_setsid() which allows us to simplify a
few other functions.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:29 -04:00
Paul Moore 3040a6d5a2 selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid()
At some point during the 2.6.27 development cycle two new fields were added
to the SELinux context structure, a string pointer and a length field.  The
code in selinux_secattr_to_sid() was not modified and as a result these two
fields were left uninitialized which could result in erratic behavior,
including kernel panics, when NetLabel is used.  This patch fixes the
problem by fully initializing the context in selinux_secattr_to_sid() before
use and reducing the level of direct context manipulation done to help
prevent future problems.

Please apply this to the 2.6.27-rcX release stream.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-10-04 08:25:18 +10:00
Paul Moore 81990fbdd1 selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid()
At some point during the 2.6.27 development cycle two new fields were added
to the SELinux context structure, a string pointer and a length field.  The
code in selinux_secattr_to_sid() was not modified and as a result these two
fields were left uninitialized which could result in erratic behavior,
including kernel panics, when NetLabel is used.  This patch fixes the
problem by fully initializing the context in selinux_secattr_to_sid() before
use and reducing the level of direct context manipulation done to help
prevent future problems.

Please apply this to the 2.6.27-rcX release stream.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-10-04 08:18:18 +10:00
Stephen Smalley ea6b184f7d selinux: use default proc sid on symlinks
As we are not concerned with fine-grained control over reading of
symlinks in proc, always use the default proc SID for all proc symlinks.
This should help avoid permission issues upon changes to the proc tree
as in the /proc/net -> /proc/self/net example.
This does not alter labeling of symlinks within /proc/pid directories.
ls -Zd /proc/net output before and after the patch should show the difference.

Signed-off-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-09-30 00:26:53 +10:00
James Morris ab2b49518e Merge branch 'master' into next
Conflicts:

	MAINTAINERS

Thanks for breaking my tree :-)

Signed-off-by: James Morris <jmorris@namei.org>
2008-09-21 17:41:56 -07:00
Frank Mayhar f06febc96b timers: fix itimer/many thread hang
Overview

This patch reworks the handling of POSIX CPU timers, including the
ITIMER_PROF, ITIMER_VIRT timers and rlimit handling.  It was put together
with the help of Roland McGrath, the owner and original writer of this code.

The problem we ran into, and the reason for this rework, has to do with using
a profiling timer in a process with a large number of threads.  It appears
that the performance of the old implementation of run_posix_cpu_timers() was
at least O(n*3) (where "n" is the number of threads in a process) or worse.
Everything is fine with an increasing number of threads until the time taken
for that routine to run becomes the same as or greater than the tick time, at
which point things degrade rather quickly.

This patch fixes bug 9906, "Weird hang with NPTL and SIGPROF."

Code Changes

This rework corrects the implementation of run_posix_cpu_timers() to make it
run in constant time for a particular machine.  (Performance may vary between
one machine and another depending upon whether the kernel is built as single-
or multiprocessor and, in the latter case, depending upon the number of
running processors.)  To do this, at each tick we now update fields in
signal_struct as well as task_struct.  The run_posix_cpu_timers() function
uses those fields to make its decisions.

We define a new structure, "task_cputime," to contain user, system and
scheduler times and use these in appropriate places:

struct task_cputime {
	cputime_t utime;
	cputime_t stime;
	unsigned long long sum_exec_runtime;
};

This is included in the structure "thread_group_cputime," which is a new
substructure of signal_struct and which varies for uniprocessor versus
multiprocessor kernels.  For uniprocessor kernels, it uses "task_cputime" as
a simple substructure, while for multiprocessor kernels it is a pointer:

struct thread_group_cputime {
	struct task_cputime totals;
};

struct thread_group_cputime {
	struct task_cputime *totals;
};

We also add a new task_cputime substructure directly to signal_struct, to
cache the earliest expiration of process-wide timers, and task_cputime also
replaces the it_*_expires fields of task_struct (used for earliest expiration
of thread timers).  The "thread_group_cputime" structure contains process-wide
timers that are updated via account_user_time() and friends.  In the non-SMP
case the structure is a simple aggregator; unfortunately in the SMP case that
simplicity was not achievable due to cache-line contention between CPUs (in
one measured case performance was actually _worse_ on a 16-cpu system than
the same test on a 4-cpu system, due to this contention).  For SMP, the
thread_group_cputime counters are maintained as a per-cpu structure allocated
using alloc_percpu().  The timer functions update only the timer field in
the structure corresponding to the running CPU, obtained using per_cpu_ptr().

We define a set of inline functions in sched.h that we use to maintain the
thread_group_cputime structure and hide the differences between UP and SMP
implementations from the rest of the kernel.  The thread_group_cputime_init()
function initializes the thread_group_cputime structure for the given task.
The thread_group_cputime_alloc() is a no-op for UP; for SMP it calls the
out-of-line function thread_group_cputime_alloc_smp() to allocate and fill
in the per-cpu structures and fields.  The thread_group_cputime_free()
function, also a no-op for UP, in SMP frees the per-cpu structures.  The
thread_group_cputime_clone_thread() function (also a UP no-op) for SMP calls
thread_group_cputime_alloc() if the per-cpu structures haven't yet been
allocated.  The thread_group_cputime() function fills the task_cputime
structure it is passed with the contents of the thread_group_cputime fields;
in UP it's that simple but in SMP it must also safely check that tsk->signal
is non-NULL (if it is it just uses the appropriate fields of task_struct) and,
if so, sums the per-cpu values for each online CPU.  Finally, the three
functions account_group_user_time(), account_group_system_time() and
account_group_exec_runtime() are used by timer functions to update the
respective fields of the thread_group_cputime structure.

Non-SMP operation is trivial and will not be mentioned further.

The per-cpu structure is always allocated when a task creates its first new
thread, via a call to thread_group_cputime_clone_thread() from copy_signal().
It is freed at process exit via a call to thread_group_cputime_free() from
cleanup_signal().

All functions that formerly summed utime/stime/sum_sched_runtime values from
from all threads in the thread group now use thread_group_cputime() to
snapshot the values in the thread_group_cputime structure or the values in
the task structure itself if the per-cpu structure hasn't been allocated.

Finally, the code in kernel/posix-cpu-timers.c has changed quite a bit.
The run_posix_cpu_timers() function has been split into a fast path and a
slow path; the former safely checks whether there are any expired thread
timers and, if not, just returns, while the slow path does the heavy lifting.
With the dedicated thread group fields, timers are no longer "rebalanced" and
the process_timer_rebalance() function and related code has gone away.  All
summing loops are gone and all code that used them now uses the
thread_group_cputime() inline.  When process-wide timers are set, the new
task_cputime structure in signal_struct is used to cache the earliest
expiration; this is checked in the fast path.

Performance

The fix appears not to add significant overhead to existing operations.  It
generally performs the same as the current code except in two cases, one in
which it performs slightly worse (Case 5 below) and one in which it performs
very significantly better (Case 2 below).  Overall it's a wash except in those
two cases.

I've since done somewhat more involved testing on a dual-core Opteron system.

Case 1: With no itimer running, for a test with 100,000 threads, the fixed
	kernel took 1428.5 seconds, 513 seconds more than the unfixed system,
	all of which was spent in the system.  There were twice as many
	voluntary context switches with the fix as without it.

Case 2: With an itimer running at .01 second ticks and 4000 threads (the most
	an unmodified kernel can handle), the fixed kernel ran the test in
	eight percent of the time (5.8 seconds as opposed to 70 seconds) and
	had better tick accuracy (.012 seconds per tick as opposed to .023
	seconds per tick).

Case 3: A 4000-thread test with an initial timer tick of .01 second and an
	interval of 10,000 seconds (i.e. a timer that ticks only once) had
	very nearly the same performance in both cases:  6.3 seconds elapsed
	for the fixed kernel versus 5.5 seconds for the unfixed kernel.

With fewer threads (eight in these tests), the Case 1 test ran in essentially
the same time on both the modified and unmodified kernels (5.2 seconds versus
5.8 seconds).  The Case 2 test ran in about the same time as well, 5.9 seconds
versus 5.4 seconds but again with much better tick accuracy, .013 seconds per
tick versus .025 seconds per tick for the unmodified kernel.

Since the fix affected the rlimit code, I also tested soft and hard CPU limits.

Case 4: With a hard CPU limit of 20 seconds and eight threads (and an itimer
	running), the modified kernel was very slightly favored in that while
	it killed the process in 19.997 seconds of CPU time (5.002 seconds of
	wall time), only .003 seconds of that was system time, the rest was
	user time.  The unmodified kernel killed the process in 20.001 seconds
	of CPU (5.014 seconds of wall time) of which .016 seconds was system
	time.  Really, though, the results were too close to call.  The results
	were essentially the same with no itimer running.

Case 5: With a soft limit of 20 seconds and a hard limit of 2000 seconds
	(where the hard limit would never be reached) and an itimer running,
	the modified kernel exhibited worse tick accuracy than the unmodified
	kernel: .050 seconds/tick versus .028 seconds/tick.  Otherwise,
	performance was almost indistinguishable.  With no itimer running this
	test exhibited virtually identical behavior and times in both cases.

In times past I did some limited performance testing.  those results are below.

On a four-cpu Opteron system without this fix, a sixteen-thread test executed
in 3569.991 seconds, of which user was 3568.435s and system was 1.556s.  On
the same system with the fix, user and elapsed time were about the same, but
system time dropped to 0.007 seconds.  Performance with eight, four and one
thread were comparable.  Interestingly, the timer ticks with the fix seemed
more accurate:  The sixteen-thread test with the fix received 149543 ticks
for 0.024 seconds per tick, while the same test without the fix received 58720
for 0.061 seconds per tick.  Both cases were configured for an interval of
0.01 seconds.  Again, the other tests were comparable.  Each thread in this
test computed the primes up to 25,000,000.

I also did a test with a large number of threads, 100,000 threads, which is
impossible without the fix.  In this case each thread computed the primes only
up to 10,000 (to make the runtime manageable).  System time dominated, at
1546.968 seconds out of a total 2176.906 seconds (giving a user time of
629.938s).  It received 147651 ticks for 0.015 seconds per tick, still quite
accurate.  There is obviously no comparable test without the fix.

Signed-off-by: Frank Mayhar <fmayhar@google.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 16:25:35 +02:00
Stephen Smalley f058925b20 Update selinux info in MAINTAINERS and Kconfig help text
Update the SELinux entry in MAINTAINERS and drop the obsolete information
from the selinux Kconfig help text.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-09-12 00:44:08 +10:00
Eric Paris 8e531af90f SELinux: memory leak in security_context_to_sid_core
Fix a bug and a philosophical decision about who handles errors.

security_context_to_sid_core() was leaking a context in the common case.
This was causing problems on fedora systems which recently have started
making extensive use of this function.

In discussion it was decided that if string_to_context_struct() had an
error it was its own responsibility to clean up any mess it created
along the way.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-09-04 08:35:13 +10:00
KaiGai Kohei d9250dea3f SELinux: add boundary support and thread context assignment
The purpose of this patch is to assign per-thread security context
under a constraint. It enables multi-threaded server application
to kick a request handler with its fair security context, and
helps some of userspace object managers to handle user's request.

When we assign a per-thread security context, it must not have wider
permissions than the original one. Because a multi-threaded process
shares a single local memory, an arbitary per-thread security context
also means another thread can easily refer violated information.

The constraint on a per-thread security context requires a new domain
has to be equal or weaker than its original one, when it tries to assign
a per-thread security context.

Bounds relationship between two types is a way to ensure a domain can
never have wider permission than its bounds. We can define it in two
explicit or implicit ways.

The first way is using new TYPEBOUNDS statement. It enables to define
a boundary of types explicitly. The other one expand the concept of
existing named based hierarchy. If we defines a type with "." separated
name like "httpd_t.php", toolchain implicitly set its bounds on "httpd_t".

This feature requires a new policy version.
The 24th version (POLICYDB_VERSION_BOUNDARY) enables to ship them into
kernel space, and the following patch enables to handle it.

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-29 00:33:33 +10:00
James Morris 86d688984d Merge branch 'master' into next 2008-08-28 10:47:34 +10:00
Vesa-Matti Kari dbc74c65b3 selinux: Unify for- and while-loop style
Replace "thing != NULL" comparisons with just "thing" to make
the code look more uniform (mixed styles were used even in the
same source file).

Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-15 08:40:47 +10:00
David Howells 5cd9c58fbe security: Fix setting of PF_SUPERPRIV by __capable()
Fix the setting of PF_SUPERPRIV by __capable() as it could corrupt the flags
the target process if that is not the current process and it is trying to
change its own flags in a different way at the same time.

__capable() is using neither atomic ops nor locking to protect t->flags.  This
patch removes __capable() and introduces has_capability() that doesn't set
PF_SUPERPRIV on the process being queried.

This patch further splits security_ptrace() in two:

 (1) security_ptrace_may_access().  This passes judgement on whether one
     process may access another only (PTRACE_MODE_ATTACH for ptrace() and
     PTRACE_MODE_READ for /proc), and takes a pointer to the child process.
     current is the parent.

 (2) security_ptrace_traceme().  This passes judgement on PTRACE_TRACEME only,
     and takes only a pointer to the parent process.  current is the child.

     In Smack and commoncap, this uses has_capability() to determine whether
     the parent will be permitted to use PTRACE_ATTACH if normal checks fail.
     This does not set PF_SUPERPRIV.

Two of the instances of __capable() actually only act on current, and so have
been changed to calls to capable().

Of the places that were using __capable():

 (1) The OOM killer calls __capable() thrice when weighing the killability of a
     process.  All of these now use has_capability().

 (2) cap_ptrace() and smack_ptrace() were using __capable() to check to see
     whether the parent was allowed to trace any process.  As mentioned above,
     these have been split.  For PTRACE_ATTACH and /proc, capable() is now
     used, and for PTRACE_TRACEME, has_capability() is used.

 (3) cap_safe_nice() only ever saw current, so now uses capable().

 (4) smack_setprocattr() rejected accesses to tasks other than current just
     after calling __capable(), so the order of these two tests have been
     switched and capable() is used instead.

 (5) In smack_file_send_sigiotask(), we need to allow privileged processes to
     receive SIGIO on files they're manipulating.

 (6) In smack_task_wait(), we let a process wait for a privileged process,
     whether or not the process doing the waiting is privileged.

I've tested this with the LTP SELinux and syscalls testscripts.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-14 22:59:43 +10:00
Vesa-Matti Kari 421fae06be selinux: conditional expression type validation was off-by-one
expr_isvalid() in conditional.c was off-by-one and allowed
invalid expression type COND_LAST. However, it is this header file
that needs to be fixed. That way the if-statement's disjunction's
second component reads more naturally, "if expr type is greater than
the last allowed value" ( rather than using ">=" in conditional.c):

  if (expr->expr_type <= 0 || expr->expr_type > COND_LAST)

Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-07 08:56:16 +10:00
David Howells cf9481e289 SELinux: Fix a potentially uninitialised variable in SELinux hooks
Fix a potentially uninitialised variable in SELinux hooks that's given a
pointer to the network address by selinux_parse_skb() passing a pointer back
through its argument list.  By restructuring selinux_parse_skb(), the compiler
can see that the error case need not set it as the caller will return
immediately.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-05 10:55:47 +10:00
Vesa-Matti J Kari 0c0e186f81 SELinux: trivial, remove unneeded local variable
Hello,

Remove unneeded local variable:

    struct avtab_node *newnode

Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-05 10:55:38 +10:00
Vesa-Matti J Kari df4ea865f0 SELinux: Trivial minor fixes that change C null character style
Trivial minor fixes that change C null character style.

Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-05 10:55:30 +10:00